Chemical culture

Nick Morris navigates through the world of computational chemistry

Scientific Computing World: February/March 2006

One consequence of the Kyoto Protocol, which seeks to reduce global carbon dioxide emissions, is that power generated from nuclear fission is once again being discussed as an alternative to electricity from fossil fuels. However, for major players in the nuclear industry - such as BNFL - to gain public trust, safety must come before all other concerns.

Understanding how chemicals and structural materials interact is particularly important for the safety of the nuclear industry. Nuclear engineers need to understand how chemicals behave when subjected to the very high temperatures and pressures within nuclear reactors. Material failure, perhaps due to faults or impurities within systems such as a nuclear fuel reprocessing plant or reactor coolant, could lead to catastrophic results. However, it is often dangerous and difficult, if not impossible, to carry out the necessary tests within the laboratory. BNFL uses Accelrys' software to model chemical behaviour under such harsh conditions, and hence to better understand chemical and material processes such as corrosion and solvent extraction.

When asked about the benefits of computational modelling, Scott Owens, research technologist at BNFL, said: 'In the nuclear industry, experimentation can be very expensive and comparatively high-risk. Virtual experiments allow us to carry out research that would have been otherwise impossible due to safety risks, reducing the amount of traditional pilot-and-process scale experimentation.'

All industries, nuclear or conventional, need to discharge some waste into the environment, and thus industry, government, environmental agencies and NGOs (non-governmental organisations) all have an interest in the behaviour of chemicals in the environment. Chemical developers and producers must comply with national and international regulations, and governmental labs must be able to check for leaks and spills of chemical pollutants. Open information exchange between concerned groups and stakeholders is vital to ensuring public acceptance that industry is complying with regulations and environmental legislation.

In 2003 the EU adopted a regulatory framework for Registration, Evaluation and Authorisation of Chemicals (REACH), to protect human health and the environment while maintaining the competitiveness and the innovative capacity of the European chemicals industry. Under the terms of REACH, every company that manufactures, imports, or exports more than one tonne of a chemical per year is required to register data about that substance with a central authority. Quantities of more than 10 tonnes require a specific chemical safety report.
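
The two tonnage thresholds described above can be written down as a simple rule. The following is only an illustrative sketch: the function name and the wording of the obligations are hypothetical, and the thresholds are taken directly from the figures quoted in this article rather than from the legal text.

```python
from typing import List

# Thresholds as quoted in the article (tonnes per year); not taken from the legal text.
REGISTRATION_THRESHOLD_TONNES = 1.0
SAFETY_REPORT_THRESHOLD_TONNES = 10.0


def reach_obligations(tonnes_per_year: float) -> List[str]:
    """Return the obligations the article describes for a given annual volume.

    Hypothetical helper for illustration only.
    """
    if tonnes_per_year < 0:
        raise ValueError("annual volume cannot be negative")
    obligations = []
    # More than one tonne a year: data must be registered centrally.
    if tonnes_per_year > REGISTRATION_THRESHOLD_TONNES:
        obligations.append("register substance data with the central authority")
    # More than 10 tonnes a year: a chemical safety report is also needed.
    if tonnes_per_year > SAFETY_REPORT_THRESHOLD_TONNES:
        obligations.append("prepare a chemical safety report")
    return obligations


if __name__ == "__main__":
    for volume in (0.5, 5.0, 50.0):
        print(volume, reach_obligations(volume))
```

A company handling half a tonne a year would, on this reading, have no obligations; five tonnes would trigger registration only; and 50 tonnes would trigger both.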

When the REACH programme is fully operational in 2008, chemical companies, such as the German firm Degussa, will have to carry out the tests necessary to register both existing and future chemicals with the central agency. Computational 'testing' methods, such as QSARs, offered by companies such as Quantum Pharmaceuticals and SemiChem, are particularly useful in terms of cost and time efficiency. Moreover, fewer chemicals need be tested on animals if chemical behaviour can be modelled accurately with software.
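
A QSAR (quantitative structure-activity relationship) is, at its simplest, a statistical model that relates a calculated molecular descriptor to a measured property. The sketch below fits a straight line by ordinary least squares and uses it to estimate the property of an untested compound. It is not the method of any vendor named here; the descriptor values and activities are made-up numbers, and the function names are hypothetical.

```python
# Hypothetical training data: one calculated descriptor per compound,
# and the corresponding measured activity. Values are invented for illustration.
TRAINING_DESCRIPTORS = [1.0, 2.0, 3.0, 4.0]
TRAINING_ACTIVITIES = [2.1, 3.9, 6.2, 7.8]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("descriptor values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, descriptor):
    """Estimate the activity of an untested compound from its descriptor."""
    slope, intercept = model
    return slope * descriptor + intercept


if __name__ == "__main__":
    model = fit_line(TRAINING_DESCRIPTORS, TRAINING_ACTIVITIES)
    print("slope, intercept:", model)
    print("predicted activity at descriptor 5.0:", predict(model, 5.0))
```

Real QSAR models use many descriptors and must be validated against compounds outside the training set, but the principle of predicting rather than measuring is the same.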

Advanced Chemistry Development (ACD/Labs) has supplied Degussa with its ACD/LogD suite to help Degussa meet the new requirements. The software is helping with investigations of chemical solubility, which is key to a compound's bioaccumulation and impact on the environment.

In the United States, cheminformatics techniques are also being used to enhance understanding of environmental impact. One current example is perchlorate, a chemical whose discharge into the environment is of increasing concern to researchers.

Naturally occurring perchlorate is used as an ingredient of fertiliser, while its man-made form is used in industry for tanning, rubber processing, paint and enamel production, lubricant oil additives, and in solid rocket propellant. Perchlorate is a water-soluble chemical, so if not used or disposed of correctly, it can contaminate soil. As so often, the concern is that it might seep into the water table and the domestic water system, polluting drinking and irrigation water. Perchlorate uptake is cumulative, so daily consumption of food and water polluted by perchlorate could create in vivo concentrations above safe levels. At high concentrations, perchlorate interferes with the uptake of iodine by the thyroid. This can lead to hypothyroidism, which affects the body's ability to regulate metabolism.

Chemists at Waters, the analytical instrumentation company, have developed a means of detecting perchlorate at parts-per-trillion levels, in only 15 minutes. The process combines chromatography and mass spectrometry methods without complicated, time-consuming sample preparation.

'As the toxic effects of perchlorate become better known, the need to confirm its presence at low levels grows,' says Dr James Willis, director of the chemical analysis market development group at Waters. 'Regulatory agencies and laboratories now have a one-step method to confirm levels of perchlorate in water samples.'

The Waters system forms the basis of the US Food and Drug Administration's CFSAN method of perchlorate detection. High levels have been found in 35 US states.

As these examples show, the cliché of chemistry in vitro - white-coated researchers peering over bubbling beakers and strangely coloured test tubes - has long since proved inadequate; nowadays a lot of chemistry is done in silico. Computational chemistry cuts down the time scientists have to spend testing and confirming results in the lab, which gives them more time to spend on the actual science. However, some experts believe that more than 80 per cent of chemical information is never published, due to the lack of any simple method for recording and accessing that data.

Databases such as the Cambridge Structural Database (CSD), from the Cambridge Crystallographic Data Centre, address this. The CSD currently contains crystal structures and chemical data for more than 365,000 organic and metal-organic compounds. The database, distributed on CD on an annual basis (with regular updates available on the web), can be searched, analysed, and visualised within the system and is used for research in structural chemistry, drug and materials design, and crystallography.

Electronic searching is changing too, to make chemical data easier to access. While basic, text-based searches are useful for the casual web surfer, they don't meet chemists' needs. Databases are now available that allow semantic searching - where common data 'tags' are associated with chemical database entries - and chemical structure-based searching. Companies such as ChemAxon are rolling out semantic and structure search engines designed specifically for corporate and web-based chemical searching. For example, the Protein Data Bank, an online molecular search engine, allows users to draw structures using ChemAxon's Marvin applet, then search the database using the JChem search engine. ChemAxon's JChem software can search for both exact matches and similarities in both substructures and superstructures. Users can combine structural searches with search criteria based on sample attributes, and even with non-chemical queries.
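
To give a flavour of what a similarity search involves, the sketch below ranks database entries against a query using the Tanimoto coefficient, a widely used measure of overlap between molecular fingerprints. This is not ChemAxon's API or JChem's actual algorithm: the fingerprints are toy sets of integer feature identifiers, and all names and data are hypothetical. True substructure searching requires graph matching, which this sketch does not attempt.

```python
# Toy fingerprints: each molecule is represented by the set of structural
# feature IDs it contains. All entries are invented for illustration.
TOY_DATABASE = {
    "mol_a": {1, 2, 3, 4},
    "mol_b": {1, 2, 3},
    "mol_c": {7, 8, 9},
    "mol_d": {2, 3, 4, 5, 6},
}
QUERY = {1, 2, 3, 4}


def tanimoto(a, b):
    """Tanimoto coefficient: shared features divided by total distinct features."""
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def similarity_search(query, database, threshold):
    """Return (name, score) pairs scoring at least `threshold`, best first."""
    hits = [(name, tanimoto(query, fp)) for name, fp in database.items()]
    hits = [hit for hit in hits if hit[1] >= threshold]
    # Sort by descending score, then by name so ties are ordered predictably.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


if __name__ == "__main__":
    for name, score in similarity_search(QUERY, TOY_DATABASE, 0.5):
        print(name, score)
```

A score of 1.0 means the two fingerprints are identical, and 0.0 means they share no features. Raising the threshold narrows the search towards near-exact matches.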

ChemAxon is now offering a 'FreeWeb' package, which includes chemical editing, viewing, search, property calculation and database management toolkits at no cost to freely accessible web resources being operated for non-commercial purposes. 'The FreeWeb package is our contribution towards making online chemical data which is publicly available more accessible and useful,' comments Ferenc Csizmadia, CEO of ChemAxon.

Database handling is especially important for large organisations that have many researchers handling chemical data. Chemical Computing Group (CCG) supplies database-handling software within its molecular design software, Molecular Operating Environment (MOE). MOE contains many cheminformatics-specific features, such as: molecular diversity and similarity analysis, including clustering and classification; descriptor calculations and advanced molecular fingerprinting; structural database handling; and design of virtual chemical libraries. However, perhaps MOE's most useful feature is the 'Scientific Vector Language' (SVL) upon which MOE applications are built.

SVL is an interpreted language for the popular 'C' programming system. SVL consists of around 2,500 chemical (and biological) specific functions, sub-defined in C. When a MOE session is started, an SVL-to-C compiler starts within the user's computer. The interface and the applications are then simply SVL program files, which are either automatically compiled and run, or run at the press of a button in the interface.

Writing software this way makes it very easy for CCG to customise MOE, either to run on different operating systems (it is supported by all common platforms), or to work within or in conjunction with other programs or workflows. CCG also supplies the source code to its customers. According to Dr Steve Maginn, director of scientific services at CCG, companies can write their own applications in MOE and create their own libraries.

CCG has a website where users can exchange applications, or other pieces of code. 'Our policy is that we should not be dictating what our customers' choices are - and SVL allows us to stick to this. Multiple platform support and the open-source nature of SVL code mean that we don't really care what platforms MOE users have. We don't restrict our customers to our applications, and encourage them to play around with the science,' says Maginn.

Computational chemistry is making industries safer and improving the environment, while making the business of the chemical and pharmaceutical industries faster and easier through better chemical searching and structure-activity calculations. It has already had a practical impact on the world in which we live, and its applications will spread more widely in the future.

Patent protection

According to the United States Patent and Trademark Office: 'Anyone who invents or discovers any process or composition of matter, or any new and useful improvement thereof, may obtain a patent.' Patent and trademark protection laws protect chemical manufacturers from other companies imitating their products or processes, thereby maintaining their market position. However, to validate a patent there must be some guarantee that the new chemical product is indeed unique.

The United States Patent and Trademark Office uses computer software tools from Cambridgesoft for checking whether patent applications are valid. Cambridgesoft's software can not only recognise chemical structures drawn using its dedicated modelling applications, such as ChemDraw, but can also generate structures from the exact chemical name of a substance. Patent lawyers can then search for chemicals with similar structures in Cambridgesoft's various chemical databases.