Everything in sequence

Laboratory Information Management Systems (LIMS) are about to undergo a major change and accelerate into the next phase of their development. LIMS were originally used to handle data in the laboratory and for quality control and assurance (QA/QC). Information from a laboratory's instruments would be forwarded to an interfaced computer for sorting and analysis. The data would usually be stored in many different software packages and drawn together into a report format by the researcher as and when it was needed.

There is now a noticeable shift away from using LIMS packages just for QA/QC. Instead many laboratories, particularly those in the pharmaceutical sector, are seeking ways to implement whole solutions capable of supporting analytical research and development. This frequently takes the form of several LIMS of differing complexity operating under the same umbrella to support the whole range of management tasks in research and development. LIMS are now set to underpin the huge growth in genetic and proteomic information on disease markers and new drug candidates.

Too much of a good thing
The amount of data produced by today's instruments is much larger than that produced a few years ago and it is easy to see how the scale of this could get out of hand. In 2000, the central sequencing facility at the University Medical Centre, Nijmegen was faced with the task of implementing a software solution to help control and organise the large amounts of sequencing data it was producing.

Dr Erik Sistermans, who runs the facility in the Department of Human Genetics, explained: "Two years ago, when the facility was first set up, we created a simple Indexed Sequential Access Method (ISAM) system where users could log into the department server and then access the system to input their sample information. At this point, we had a gel-based sequencer that was handling approximately 15,000 sequences a year. However, an increased demand on our services meant we soon needed another instrument to cope. Our capacity expanded enormously with this 96-capillary machine and we knew we would have to eventually cope with this information. As expected, the database grew so large that we couldn't get all the data onto a normal computer, and we decided it was then time to get a more professional system."

The University Medical Centre opted for an Applied Biosystems SQL*GT LIMS. The system is now up and running, and is proving popular with the facility's researchers. "Often there is a technical language barrier between computer scientists and biomedical scientists. They're talking bytes and we're talking nucleotides. But I'm very pleased that, although the system itself is rather complicated, it is very easy for our researchers to use," explained Dr Sistermans. "The simple web interface allows them to type in their sample information and later use the same web browser to retrieve, download and view their data. We rarely have any complaints about the system, which is very important."

The Nijmegen system supports the implementation of other software and will eventually become the backbone for a number of central facilities. "We are planning to use the LIMS for expansion and integrate many other systems into it, for example, separate databases for SNP analysis, arraying and genome scanning facilities. We also want to integrate the patient database of the clinical genetics centre so that, one day in the future, a clinical molecular geneticist can click on a specific patient sample to check, for example, for the presence of a mutation. They could then be transferred to Sequence Collector software to look at that particular DNA sequence if further information is needed."

Genotyping a nation
The University Medical Centre used its informatics solution to solve an organisational problem. Some scientific companies are embracing informatics as a way of directly tackling complex genetic problems and have abandoned the more traditional means of generating results through hypothesis. deCODE genetics, the Icelandic genomics company, is using a population-based approach with advanced data-mining techniques in the hope of generating data that may lead to new products and services in the healthcare industry.

deCODE is using DNA samples taken from volunteers to detect genes linked to more than three dozen of the world's most common and devastating illnesses. It then uses this information, in combination with Iceland's unrivalled genealogical resources and the company's bioinformatics tools, to accelerate the identification of new drug targets and help develop more effective treatments.

deCODE's laboratories produce vast quantities of raw genomic data, approximately 12m genotypes per month, and the company has one of the largest and most advanced genotyping facilities in the world. deCODE is now in the process of adapting its genotyping software suite, which it developed in-house, for integration with Applied Biosystems laboratory management software and instruments. This will provide deCODE with a full range of solutions for the generation, management and analysis of its genotyping data.

Joining the dots
Flexibility in a LIMS that allows software programs to be added when necessary is also important for the development of systems that support many sites simultaneously, whether they are a corridor apart or a country away.

Installing software programs on a central server rather than on individual PCs saves considerable money through reduced software licensing fees, installation and support costs. A central installation also draws together all of the information produced by a facility's instruments and allows users easy access to their data via a simple web-based program, however distant they may be.

The Ludwig Institute of Cancer Research has research branches at academic centres in the United Kingdom, Switzerland, Sweden, Australia, Belgium, Brazil and the United States. The centre at University College London has recently implemented a LIMS for proteomics, based on an Oracle relational database. The branch structure of the Institute allows the interaction of a number of different research and clinical environments, with each branch focused on studying one aspect of cancer, such as tumour immunology, genetics or virology.

"Proteomics is the central focus of our work with the new informatics system," according to Dr Marketa Zvelebil, leader of the bioinformatics group at the UCL branch. The group analyses protein modifications and interactions and studies a number of biochemical reactions, including the interactions of protein kinase enzymes with their substrates and protein growth factors with cell surface receptors. "We are trying to understand the mechanisms within the critical steps of the signal transduction process," continued Dr Zvelebil. "Such steps may be useful for diagnosis of different conditions, or as targets for therapy."

The system provides many ways to view sample information from previous or current experiments, as well as trace associations between phenotypic data, molecular characterisations, patient data and clinical data. Through the unified system interface, the researcher can look at everything at once.
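Tracing associations of this kind is, at heart, a matter of joining related tables. The following is a minimal sketch only, using Python's built-in sqlite3 module in place of the Oracle database mentioned above. The table names, column names and values are all hypothetical and are not drawn from the Institute's system.

```python
import sqlite3

# In-memory database standing in for the LIMS back end (an assumption;
# the system described in the article is built on Oracle).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patients (
    patient_id INTEGER PRIMARY KEY,
    diagnosis  TEXT
);
CREATE TABLE samples (
    sample_id  INTEGER PRIMARY KEY,
    patient_id INTEGER REFERENCES patients(patient_id),
    tissue     TEXT
);
CREATE TABLE identifications (
    sample_id  INTEGER REFERENCES samples(sample_id),
    protein    TEXT
);
""")

# Invented records for illustration only.
conn.executemany("INSERT INTO patients VALUES (?, ?)",
                 [(1, "diagnosis_A"), (2, "diagnosis_B")])
conn.executemany("INSERT INTO samples VALUES (?, ?, ?)",
                 [(10, 1, "tissue_1"), (11, 2, "tissue_2"), (12, 2, "cell_line_1")])
conn.executemany("INSERT INTO identifications VALUES (?, ?)",
                 [(10, "protein_X"), (11, "protein_Y"), (12, "protein_X")])

# Trace one protein back through its samples to the patient records.
rows = conn.execute("""
    SELECT p.patient_id, p.diagnosis, s.sample_id, s.tissue
    FROM identifications AS i
    JOIN samples  AS s ON s.sample_id  = i.sample_id
    JOIN patients AS p ON p.patient_id = s.patient_id
    WHERE i.protein = ?
    ORDER BY p.patient_id, s.sample_id
""", ("protein_X",)).fetchall()

for row in rows:
    print(row)
```

Keeping patient, sample and molecular data in separate but linked tables is what lets a single query move from a protein identification to the clinical record behind it.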

The robust infrastructure of the LIMS has enabled the proteomics laboratory to integrate a wide range of bioinformatics and protein analysis tools to characterise the proteins separated and purified from various tissue samples and cell lines. Researchers treat cells with one or more specific growth factors and then profile the proteins expressed. This allows them to monitor downstream protein interactions, and the subsequent effects on cell physiology, using a combination of 2D gel electrophoresis, mass spectrometry and data analysis software, all integrated through the proteomics LIMS. The LIMS links to a computer that controls the departmental mass spectrometry systems.

Researchers then use Protein Prospector (University of California, San Francisco) and other software programs integrated into the infrastructure to compare protein mass information to the data in protein and genomic databases. The system allows database searches for protein identity across both public domain and proprietary sources using the Internet, or an intranet. Still, identifying the different proteins involved in signalling pathways only provides a framework for understanding the mechanisms involved in signalling in a single cell. "Once you identify a protein, you need to mine the data to discover how the protein functions," explained Dr Zvelebil.
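The idea of comparing measured masses with database entries can be illustrated with a short sketch. This is not Protein Prospector's algorithm; it is a simplified mass-matching routine written for illustration, and the protein names, mass values and the 0.2 Da tolerance are all invented assumptions.

```python
# Hypothetical reference table: protein name -> expected peptide masses (Da).
# Names and values are invented for illustration, not taken from a real database.
REFERENCE = {
    "protein_X": [842.51, 1045.56, 1479.80, 2211.10],
    "protein_Y": [927.49, 1045.56, 1638.86],
    "protein_Z": [700.40, 1300.70],
}


def count_matches(observed, expected, tolerance=0.2):
    """Count expected masses lying within `tolerance` Da of any observed mass."""
    return sum(
        1 for mass in expected
        if any(abs(mass - obs) <= tolerance for obs in observed)
    )


def rank_candidates(observed, reference, tolerance=0.2):
    """Rank proteins by number of matched masses, then by fraction matched."""
    scores = []
    for name, expected in reference.items():
        hits = count_matches(observed, expected, tolerance)
        scores.append((name, hits, hits / len(expected)))
    # Most hits first; ties broken by coverage, then alphabetically by name.
    return sorted(scores, key=lambda s: (-s[1], -s[2], s[0]))


# Masses as they might come from a mass spectrometer (invented values).
observed_masses = [842.55, 1045.60, 1479.75, 1900.00]
ranking = rank_candidates(observed_masses, REFERENCE)

for name, hits, coverage in ranking:
    print(f"{name}: {hits} matching masses, coverage {coverage:.2f}")
```

Real search tools use far more elaborate scoring, but the principle is the same: the candidate whose expected masses best account for the observed ones rises to the top of the list.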

Dr Zvelebil uses a statistical software program to find out how closely related the proteins on gel images are to one another and to indicate which proteins can be grouped into families. "In clustering, you can calculate any measure between two or more sets of data (proteins represented by spots on a gel) and then determine how close or distant each set is based on that measure." If two sets of data are closely related, they will cluster in subsets. The software draws a diagram that resembles a phylogenetic tree, except that it shows the interrelation between a group of proteins instead of between species.
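The clustering idea can be sketched in a few lines. The article does not name the statistical package or the distance measure used, so the example below makes its own assumptions: Euclidean distance between spot intensity profiles, average linkage between clusters, and invented spot names and values. The merge history it produces is the information a tree diagram would display.

```python
import math

# Hypothetical data: intensity of each gel spot measured across four gels.
SPOTS = {
    "spot_A": [1.0, 1.1, 0.9, 1.0],
    "spot_B": [1.1, 1.0, 1.0, 0.9],
    "spot_C": [5.0, 5.2, 4.9, 5.1],
    "spot_D": [5.1, 5.0, 5.0, 5.2],
    "spot_E": [9.0, 0.5, 9.1, 0.4],
}


def euclidean(a, b):
    """One possible distance measure between two intensity profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_distance(c1, c2, data):
    """Average linkage: mean pairwise distance between cluster members."""
    total = sum(euclidean(data[i], data[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(data):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [(name,) for name in sorted(data)]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cluster_distance(clusters[i], clusters[j], data)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = tuple(sorted(clusters[i] + clusters[j]))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


merges = agglomerate(SPOTS)
for left, right, dist in merges:
    print(f"{left} + {right} at distance {dist:.2f}")
```

Spots with similar profiles merge early and at small distances, while an outlier such as the hypothetical spot_E joins only at the final step. Reading the merges from first to last corresponds to reading a tree diagram from its tips towards its root.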

So where next?
The correlation of data right across R&D, analytical QA/QC and the wider enterprise is a significant driver in the development of new functionality in LIMS solutions by manufacturers.

Customers are becoming more aware of the capabilities and potential of LIMS, and know that it is not only scientific tasks that can be managed under this system but business tasks too. They are demanding more from their LIMS than simple data organisation, so communications and IT must develop rapidly to keep up with the emerging technologies and business infrastructure of science-based organisations.

You can use the online Reader Enquiry service at Scientific Computing World to make contact with organisations referred to in this article, or to visit relevant websites.
