Maths maps tomorrow's drugs

David Robson on how molecular modelling is making headway in the laborious process of drug discovery.

'Computers are incredibly fast, accurate, and stupid; humans are incredibly slow, inaccurate and brilliant; together they are powerful beyond imagination,' said Albert Einstein more than 50 years ago.

He could have been talking about drug discovery in the 2000s. Gone are the days of Fleming discovering penicillin in a mouldy Petri dish, when chance and inspiration led to the biggest breakthroughs. With increasingly stringent criteria imposed by the FDA, research and development is taking longer than ever. And there is no better way of reducing discovery time than using the power of computers to virtually screen drug candidates, before performing time-consuming physical experiments on the few likely contenders.

However, the methods computational chemists are finding to reduce drug candidates from millions to thousands vary considerably, with competing results. An innovative approach by the University of California at San Diego (UCSD) recently made headlines for assembling a virtual human metabolic network using mathematical techniques traditionally used in signal processing and operations research.

The BiGG database will allow scientists to explore hundreds of human disorders involving the metabolism, such as diabetes and high cholesterol, and will even provide a tailor-made diet for the obese.

The database includes more than 3,300 known human biochemical reactions, and draws from the human genome sequence, allowing scientists to create any cell in silico.

Neema Jamshidi, an MD/PhD student at UCSD who has worked on the project, says: ‘This approach confirmed in a mathematically rigorous way what cell biologists already understand to be true: cells use compartmentalisation to coordinate their metabolism. Our technique provides scientists with a new way to investigate the role of compartmentalisation in metabolism.’

While signal processing and operations research are mathematical tools not typically associated with drug discovery, statistics have been used for molecular modelling since computers were first invented.

These programs build an algorithm based on a training set that may contain thousands of experimental results. The database contains unique information, called descriptors, which will be used for prediction of the molecules’ behaviour. Once the algorithm is built, it is tested on the training set and, if successful, can be used to predict new results.
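To make the descriptor idea concrete, here is a minimal sketch of the build-then-test loop described above. It is not the method used by any product named in this article: it fits an ordinary least-squares line to a single descriptor, and every number in it is invented for illustration. It also checks one compound that was held out of the training set, a common extra step that goes beyond what the text describes.

```python
def fit_linear(xs, ys):
    """Ordinary least squares for one descriptor: y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    slope, intercept = model
    return slope * x + intercept


def rmse(model, xs, ys):
    """Root-mean-square error of the model's predictions against measurements."""
    squared = [(predict(model, x) - y) ** 2 for x, y in zip(xs, ys)]
    return (sum(squared) / len(squared)) ** 0.5


# Invented training set: one descriptor value per compound, plus a measured activity.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [2.1, 3.9, 6.2, 7.8]
# Invented held-out compound, not used when fitting.
held_out_x = [5.0]
held_out_y = [10.1]

model = fit_linear(train_x, train_y)
train_error = rmse(model, train_x, train_y)
held_out_error = rmse(model, held_out_x, held_out_y)
new_prediction = predict(model, 6.0)  # prediction for an unmeasured compound
```

Real models use many descriptors per molecule and far larger training sets, but the sequence is the same: fit, check the error, then predict.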

Tripos was one of the early pioneers of a method – QSAR with CoMFA – that used 3D, rather than 2D descriptors of molecules. Developed in the 80s, the method is still cited in dozens of articles every year.

Constantly improving its products, Tripos is planning on releasing a superior version of this method later in the year as part of its Sybyl suite of molecular modelling software. topCoMFA will provide a more automated and quicker way of doing the analysis.

PredictionBase from IDBS also uses a descriptor-based method. It typically predicts potential problems drugs may have, particularly in the area of ADME-Tox (absorption, distribution, metabolism, excretion and toxicity) – a common application of molecular modelling.

'With ADME/Tox prediction, it is useful to know whether there will be any problems as early as possible in the development process,' says Glyn Williams, VP of marketing and product management, IDBS. ‘It saves time at the end of the trial. The scientist is given more and better information and it indicates problems that will become expensive later in the trial.’

The Molegro Virtual Docker at work.

In addition to detecting toxicity throughout the drug's lifecycle, many scientists screen potential drugs by investigating whether they will interact, and bind, with the target protein involved in the disease. The Molegro Virtual Docker uses a highly effective algorithm to run this – with benchmark studies showing an 87 per cent accuracy.

'This accuracy is due to a very advanced optimisation algorithm that can search the huge space of potential candidate solutions very effectively,' says René Thomsen, CEO of Molegro.

Usability seems to be an important feature of all these programs, as the end users are not always computational chemists. To this end, Virtual Docker provides wizards that allow the user to visualise the molecules before preparing them (typically determining structural properties such as partial charges, bond order and protonation states) and to select the potential binding sites on the protein. Once the simulation has run (typically taking just a few minutes), an energy score is calculated to rank the solutions most suitable for further study. A visual analysis to inspect molecular interactions such as hydrogen bonds allows scientists to see how the ligand is positioned in the binding site.
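The ranking step can be illustrated with a toy example. The sketch below is not Molegro's scoring function or search algorithm: it assumes a simple Lennard-Jones 12-6 term between every ligand atom and every protein atom, uses invented coordinates, and assumes that a lower energy means a better pose. The names `pair_energy`, `pose_energy` and `rank_poses` are hypothetical.

```python
import math


def pair_energy(distance, r_min=3.5, depth=1.0):
    """Lennard-Jones 12-6 term: lowest value (-depth) at distance r_min."""
    ratio = r_min / distance
    return depth * (ratio ** 12 - 2 * ratio ** 6)


def pose_energy(ligand_atoms, protein_atoms):
    """Sum the pair term over every ligand-protein atom pair."""
    return sum(
        pair_energy(math.dist(a, b))
        for a in ligand_atoms
        for b in protein_atoms
    )


def rank_poses(poses, protein_atoms):
    """Return (energy, name) tuples sorted from lowest to highest energy."""
    scored = [(pose_energy(atoms, protein_atoms), name)
              for name, atoms in poses.items()]
    return sorted(scored)


# Invented one-atom "protein" and three one-atom candidate poses.
protein = [(0.0, 0.0, 0.0)]
poses = {
    "snug": [(3.5, 0.0, 0.0)],      # sits at the energy minimum
    "clash": [(2.0, 0.0, 0.0)],     # too close: strongly repulsive
    "distant": [(10.0, 0.0, 0.0)],  # too far: almost no interaction
}

ranking = rank_poses(poses, protein)
best_energy, best_pose = ranking[0]
```

In this toy case the pose at the energy minimum ranks first and the clashing pose last. Real docking software scores many more interaction types and searches over poses rather than being handed them.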

Following a trend that is common to many, though not all, companies working in this field, the recently released version 2 of Virtual Docker includes tools that can simulate how the protein adapts to the shape of the ligand. The hope is that this will give more realistic, accurate results.

When performing these calculations, it is important to understand and choose the correct forms of the molecules being studied. In another sign that molecular modelling is taking an increasingly cheminformatics approach, ChemAxon (traditionally a cheminformatics company) has developed its Clean3D geometry calculation tool to predict stable 3D conformations, using energy calculations.

ChemAxon's MarvinSpace predicts Van der Waals' forces

In common with many companies, Molegro and ChemAxon are both developing a pharmacophore approach to molecular modelling, as did Accelrys in the latest release of Discovery Studio. It is a fragment-based approach, and allows the study of molecules where the exact 3D structure is not known, for example, in ion channels or embedded proteins.

It is an approach that potentially saved Biogen Idec $300,000 when looking for a potent inhibitor for a disease-implicated protein using the Catalyst module of Discovery Studio. The software reduced 200,000 commercially available compounds to just 87 that were examined experimentally.

'It is a more rational way of discovering drugs instead of the more random way of choosing what to test experimentally,' says Dr Samuel Toba, product marketing manager for life sciences at Accelrys. 'It intelligently uses information discovered over the years.'

Most of the pieces of software mentioned so far use a flexible approach to molecular modelling, by trying to simulate the way the target protein and ligand will adapt during docking. OpenEye Software, however, is somewhat controversially using a rigid approach – with significant success.

This success comes in part from extra care taken during the early stages of model development. 'The molecules generated must be sensible if you are to treat them as being rigid later,' says Paul Hawkins, a senior applications scientist at OpenEye. 'This approach saves time. It is intuitive to assume the flexible approach is better, but we have seen little evidence that this is actually better than our approach.'

Significantly, OpenEye is yet another company that considers a cheminformatics approach important in this field. 'We try hard to retain and preserve the integrity of the molecular data, as it is processed from application to application,' says Hawkins.

The data could be stored in a number of different formats – depending on whether it is 2D or 3D – and even then, there are many more methods. 'We need to be able to read and write all the file formats in a consistent and careful way. If we don’t do this, we're throwing landmines in front of the users.'

However, this may be one of the few similarities with other molecular modelling software. A fundamental difference comes in the way OpenEye envisages the problem – with a deep understanding of the physical phenomena, rather than experimental data and parameterisation.

Although a docking method clearly has its place, Hawkins seems to believe its importance is overrated: ‘Its status is entirely disproportionate to its power and performance.’ He believes the number of approximations used to solve problems sometimes causes an inherent inaccuracy in this method.

OpenEye's alternative follows an entirely different approach, focusing on the ligands, rather than the proteins. The software, Rocs, tries to find any other molecules that are similar in 3D shape, paying little attention to the actual chemical structure.

The idea is that if a molecule has a similar shape, it is reasonably likely to be just as active. This opens up a large number of compounds that the chemist could have had no idea were worth considering, unless they had an exceptional ability to visualise molecules. It could be particularly useful for overcoming problems in ADME-Tox – if one drug fails, researchers can move on to another compound, with a similar shape, but different properties.
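One common way to put a number on 'similar in shape' is a Tanimoto score: the volume two molecules share divided by the volume they occupy between them. The sketch below is an illustration only and makes no claim about how Rocs works internally. It assumes each molecule is a set of hard spheres (invented coordinates and radii), approximates volume by counting points on a coarse grid, and skips the alignment search that a real tool would need before comparing shapes.

```python
import itertools


def occupied_voxels(atoms, spacing=0.5):
    """Grid points (as integer indices) lying inside any atom sphere.

    atoms: list of (x, y, z, radius) tuples, in the same units as spacing.
    """
    voxels = set()
    for x, y, z, r in atoms:
        steps = int(r / spacing) + 1
        cx, cy, cz = round(x / spacing), round(y / spacing), round(z / spacing)
        for i, j, k in itertools.product(range(-steps, steps + 1), repeat=3):
            gx = (cx + i) * spacing
            gy = (cy + j) * spacing
            gz = (cz + k) * spacing
            if (gx - x) ** 2 + (gy - y) ** 2 + (gz - z) ** 2 <= r * r:
                voxels.add((cx + i, cy + j, cz + k))
    return voxels


def shape_tanimoto(atoms_a, atoms_b, spacing=0.5):
    """Shared volume divided by combined volume, both counted in grid points."""
    a = occupied_voxels(atoms_a, spacing)
    b = occupied_voxels(atoms_b, spacing)
    return len(a & b) / len(a | b)


# Invented two-atom "molecules": same shape, turned shape, and one far away.
mol_a = [(0.0, 0.0, 0.0, 1.7), (1.5, 0.0, 0.0, 1.7)]
mol_turned = [(0.0, 0.0, 0.0, 1.7), (0.0, 1.5, 0.0, 1.7)]
mol_far = [(20.0, 0.0, 0.0, 1.7)]

same = shape_tanimoto(mol_a, mol_a)
partial = shape_tanimoto(mol_a, mol_turned)
none = shape_tanimoto(mol_a, mol_far)
```

Note that the score depends only on where the spheres sit, not on which elements they represent, which is the point of a shape-based comparison.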

However, potentially the biggest business benefit for pharmaceutical companies could be the practice of lead hopping – from drug candidates patented by competitors, to thousands of unpatented molecules that are similar in shape, and likely to be similarly active. It is a practice that could save the company vast amounts of money, if the drug did prove to be successful.

It is an amazing thought that computers could soon provide not just the perspiration, but the inspiration, for drug discovery. Einstein had a mind that seemed an age ahead of his contemporaries, but could even he have imagined the advances computers are now making?


Parallel computing vs Malaria

Forget the power of single-PC molecular modelling – using the EGEE grid, biologists have been screening malaria drugs in silico on a previously unknown scale.

In an example of the powers of parallel supercomputing, Wisdom (Worldwide in Silico Docking of Malaria) used the equivalent of 420 years of single-PC power to screen more than 140 million compounds – all in a time spanning from 1 October 2006 to 31 January 2007. Incidentally, in that same time period, more than 250,000 people will have died from malaria.
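A back-of-envelope calculation puts those figures in perspective. It uses only the numbers quoted above and assumes, purely for illustration, that the work was spread evenly across the whole period, which the figures themselves do not establish.

```python
from datetime import date

# Figures quoted in the text (both compound count and dates are as stated there).
start, end = date(2006, 10, 1), date(2007, 1, 31)
cpu_years = 420            # equivalent single-PC years of computation
compounds = 140_000_000    # "more than 140 million", so a lower bound

# Count both the first and the last day of the campaign.
campaign_days = (end - start).days + 1

# Assumption: load spread evenly over the campaign.
equivalent_pcs = cpu_years * 365.25 / campaign_days
compounds_per_day = compounds / campaign_days
```

On those assumptions the grid did the work of roughly 1,250 PCs running non-stop, screening more than a million compounds a day.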

The search proved to be fruitful: it found three families of molecules that could be effective against the malaria parasite. With more than 140 million compounds screened to find just three families, the task is clearly harder than finding a needle in a haystack, and one that small academic institutions could barely play a part in: previously, they could only have screened thousands, not millions, in this timescale.

Doman Kim, director of the bioindustry and technology institute at the Jeonnam National University in Korea, explains: 'The impact of Wisdom goes much beyond malaria. Until now, the search for new drugs in the academic sector was done at a relatively small scale whereas the Wisdom approach allows a systematic inquiry of all the potentially interesting molecules.'
