The burgeoning needs of biology

Scientists collaborate using Symyx Software ELN applications, which are tailored to meet the specific needs and protocols of chemists in the discovery, process, analytical, formulations and bioprocess functions.

‘Chemists are from Mars, biologists are from Venus’. That is the eye-catching title of a recent article on the BioRails preclinical study management website, describing the informatics needs of chemists and biologists working in drug discovery. It may be an obvious spoof of John Gray’s well-known self-help book Men are from Mars, Women are from Venus, but the article still makes a serious point: experiments conducted at different points in the drug discovery process are best served by different types of information provision. Laboratory information management systems (LIMS) and electronic lab notebooks (ELNs) are information management tools that were first introduced to the industry within drug discovery chemistry and cheminformatics, but are becoming more or less universal. Most companies will adopt, or adapt, different software solutions to support different disciplines. However, as the available packages have become more sophisticated, they have gradually begun to merge, and even the boundaries between the ‘traditional’ LIMS and the ELN are now becoming blurred.

A LIMS is best seen as a tool for automating laboratory workflow, covering the management of samples, users, instruments and the data generated. In contrast, an ELN is designed – as the name implies – to replace a paper notebook, documenting procedures, analysing data and generating reports. Thus, in general terms, data produced by a LIMS will flow directly and seamlessly into an ELN system.

It almost goes without saying that knowing the structures of potential drug targets is an important step both in prioritising those targets and in discovering early leads. The process of protein structure determination has now been ‘industrialised’, with many labs setting up programs for high-throughput structure determination, or structural genomics. The Structural Genomics Consortium (SGC) is a not-for-profit organisation, based at the Universities of Oxford and Toronto and the Karolinska Institute in Sweden, set up in 2004 with the aim of solving the structures of more than 350 human and malarial proteins of medical importance, within the final three years of a four-year project. These include important drug targets for cancer and metabolic diseases. Each structure is released into the public domain when it is completed. At the start of the project, the directors of the SGC lab in Oxford made the decision to ‘go paperless’, choosing BeeHive, a data management system from the Californian company MolSoft, and an ELN solution from Contur. ‘We have been using Contur’s ELNs for almost three years,’ says Brian Marsden, one of the principal investigators in the SGC at Oxford. ‘We chose them because a few of our senior scientists had had good experiences with them at Biovitrum in Sweden.’ The largest difference between the system as used in Oxford and at Biovitrum is that workers in the not-for-profit Oxford lab have no need of the intellectual property modules that are used extensively in the company. The Swedish group has also adopted Contur and MolSoft’s products, although the group in Toronto is using its own solution.

‘We have a bottom line: “If it’s not in the ELN, it didn’t happen”,’ says Marsden. ‘This makes it easy to track the flow of information and share novel results and interpretations. Also, when someone leaves the lab, we can guarantee that their notes will remain available, and will be at least legible, if not necessarily written in brilliant English.’ In their system, the complex procedures involved in solving a protein structure are divided into ‘bite-sized’ chunks: for example, the expression, purification and crystallisation of one protein (‘X’) will be defined as a single ‘experiment’, separately from the use of X-ray crystallography or (much more rarely) NMR to solve that protein’s structure. It is, however, very straightforward to search the entire lab notebook system for every experiment involving ‘protein X’. The code of the laboratory management system, BeeHive, has been modified considerably to fit the specific requirements of structural genomics, and although it links closely to the ELN, the two systems are basically independent.
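The idea of splitting the work into ‘bite-sized’ experiments that can still be searched by protein can be illustrated with a minimal Python sketch. It is not how Contur’s ELN or BeeHive is implemented; the `Experiment` record, its field names and the `find_by_protein` helper are all hypothetical, chosen only to show the principle.

```python
from dataclasses import dataclass


@dataclass
class Experiment:
    """One 'bite-sized' notebook entry (hypothetical record layout)."""
    experiment_id: str
    protein: str
    stage: str
    author: str


def find_by_protein(notebook, protein):
    """Return every experiment in the notebook that involves the given protein."""
    wanted = protein.lower()
    return [e for e in notebook if e.protein.lower() == wanted]


# Illustrative entries only: preparation of the protein and the structure
# solution are recorded as separate experiments, as described above.
notebook = [
    Experiment("EXP-001", "Protein X",
               "expression, purification and crystallisation", "Researcher A"),
    Experiment("EXP-002", "Protein X", "X-ray crystallography", "Researcher B"),
    Experiment("EXP-003", "Protein Y",
               "expression, purification and crystallisation", "Researcher A"),
]

hits = find_by_protein(notebook, "protein x")
for e in hits:
    print(e.experiment_id, e.stage)
```

Because each entry carries the protein name, a single query pulls together the preparation and structure-solution steps even though they were recorded separately.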

The SGC has just announced that it has reached 350 structures, some months ahead of schedule. ‘This is one of the most successful structural genomics efforts in the world,’ says Marsden. Consortium members are now applying for a second tranche of funding, aiming to move more into early-stage drug discovery by studying the structures of proteins with their ligands. ‘It won’t be difficult to modify our existing LIMS and ELN products for this work, and we are hoping to standardise on a single system throughout the Consortium,’ adds Marsden. ‘We in Oxford would be very happy if this were the one we use at present.’

Early-stage drug discovery is one of the most promising areas for information management systems, involving, typically, the high-throughput analysis of enormous numbers of relatively straightforward assays. The LIMS and ELNs developed for this area are mature and have been adopted widely throughout the industry. It is also an area where the boundaries between information management and notebook systems are beginning to blur. Symyx, based in Santa Clara, California, has developed an integrated software suite that includes ELN, instrument execution and data analysis applications. Its customers include biopharmaceutical giants Eli Lilly and Merck. ‘Discovery Notebook version 4, released in January this year, combines these three applications into a single product. This… is designed to integrate easily into users’ existing systems so they can leverage their current resources… providing more opportunities to make breakthrough discoveries,’ says Paul Nowak, executive vice president and chief operating officer of Symyx.

Further down the drug discovery pipeline, IDBS’ BioBook, part of the E-WorkBook suite, is an ELN system with additional data management capabilities. It is designed specifically for pre-clinical drug development, assessing drug safety and pharmacology. The BioBook solution is designed to provide a complete environment for in vivo drug testing, involving data capture and management and laboratory record-keeping. At this late stage of drug discovery, a typical experiment will involve many different procedures. ‘Challenges in drug development are different from those in chemoinformatics,’ says Glyn Williams, vice president of marketing and product management at IDBS. ‘Here compound numbers are much smaller, but the complexity is in the procedures.’ BioBook builds on, and extends, a generic ELN framework. Its setup is based on multi-disciplinary studies, each with a designated study director (group leader). Each study consists of a number of experiments with the same compound set, focusing on pharmacokinetics and toxicology and, almost invariably, using animal models. The same system controls experiment setup, parameter setting, data capture, statistical analysis and report writing. Its setup facilitates two types of data searching that are computationally very different: contextual text searching, which is ‘not very different from Google’, and database analysis. Coding the software to be able to capture and analyse many different types of data in different settings, and handle it consistently, led to ‘complications and headaches’ for the development team. ‘Flexibility and compliance are both essential [in a system of this type],’ says Paul Denny-Gouldson, IDBS’ product manager for BioBook and related software, ‘but they are difficult to implement together: they don’t usually go hand in hand’.
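The difference between the two kinds of searching can be sketched in a few lines of Python. This is an illustration under stated assumptions, not BioBook’s implementation: the records, field names and values are invented, the text search is a naive keyword match, and the database analysis uses an in-memory SQLite table from the Python standard library.

```python
import sqlite3

# Invented example records: free-text notes alongside structured measurements.
records = [
    ("S1", "C-101", 10.0, 2.0, "Mild liver enzyme elevation observed at day 7."),
    ("S1", "C-102", 10.0, 4.0, "No adverse findings."),
    ("S2", "C-101", 30.0, 3.0, "Liver enzyme elevation confirmed; dose-dependent."),
]


def text_search(rows, query):
    """Contextual-style search: keep rows whose notes contain every query term."""
    terms = query.lower().split()
    return [(study, compound)
            for study, compound, _dose, _half_life, notes in rows
            if all(term in notes.lower() for term in terms)]


def mean_half_life_by_compound(rows):
    """Database-style analysis: aggregate a numeric field with SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results (study TEXT, compound TEXT, "
        "dose_mg_kg REAL, half_life_h REAL, notes TEXT)"
    )
    conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        "SELECT compound, AVG(half_life_h) FROM results "
        "GROUP BY compound ORDER BY compound"
    ).fetchall()
    conn.close()
    return result


text_hits = text_search(records, "liver enzyme")
averages = mean_half_life_by_compound(records)
print(text_hits)
print(averages)
```

The first function scans unstructured notes for words, while the second relies on typed columns and aggregation, which is why a system that offers both has to store and index the same experiment in two quite different ways.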

A screenshot from IDBS' BioBook ELN

Pfizer, described as the world’s biggest pharmaceutical company, is implementing BioBook in secondary pharmacology, toxicity, pharmacokinetics and dynamics studies across all its therapeutic areas. Graham Baker, associate director for Discovery Biology at Sandwich (UK), explains: ‘We introduced about 60 biologists across the company to BioBook during 2006, and evaluated it for performance, functionality and ease of use. We will be deploying it fully later this year once version 7 of the software has been launched, recommendations from the pilot study addressed and the product successfully integrated into the Pfizer informatics architecture.’

Following the failure of a novel HDL cholesterol raising agent, torcetrapib, in an important Phase III clinical trial, Pfizer is now facing the patent expiry on its anti-cholesterol blockbuster, Lipitor, with no obvious replacement in the pipeline. The company, therefore, has recently announced a worldwide business shake-up, aiming to secure long-term profitability by shedding thousands of jobs. Williams, however, is fully confident that this will not adversely affect the company’s take-up of the BioBook technology: in fact, the reverse may be true. ‘Pfizer is investing in the infrastructure and tools necessary to increase the efficiency and effectiveness of those research teams that they are keeping on,’ he says.

LabLogic, based in Sheffield, UK, develops LIMS designed to work through from pre-clinical to early clinical or ‘first in man’ drug studies. Its lead product, DEBRA, was first produced almost 20 years ago – ‘back in the DOS days’ – for the automatic monitoring of radio-labelled drugs. It is now a world leader in the niche area of drug metabolism and, in 2005, won LabLogic the accolade of ‘Sheffield Exporter of the Year’.

The system has matured considerably since those early years; it has been developed in a modular fashion, and companies choose to implement whichever parts they wish. ‘For example, some companies use DEBRA for bar-coding and labelling samples, while others prefer to opt out of this module,’ explains LabLogic’s systems director, Huw Loaring. This flexibility is particularly useful for smaller pharmaceutical companies, as they are able to introduce DEBRA module by module. One of the most recent developments has been the addition of a module to measure drug binding to proteins using a variety of techniques. This is particularly valuable in pre-clinical work; ‘often the same group of drug metabolism people are responsible for in vitro and in vivo studies, and it is very valuable to be able to use the same system in both areas,’ says Loaring.

Laboratory information management systems have come a long way since the 1980s, when they were used principally for sample tracking in chemistry. Now they, and ELNs, are almost ubiquitous as informatics aids to drug discovery. Many of the newer products are beginning to cross the boundaries between LIMS and ELNs, as well as between the lead identification, discovery chemistry, pre-clinical and early clinical stages. Nevertheless, there is no single product that will suit every need, and a company will do well to explore the many options in this area thoroughly before settling on one product: or, quite possibly, on more than one.
