Scientific Information Production comes to discovery

Scott Deutsch looks beyond the traditional LIMS quality control approach to scientific information production.

The August 2000 report on bioinformatics from Front Line Strategic Management Consulting, Foster City, CA, USA, says "the biggest and most urgent bioinformatics need is to provide scientists with tools to effectively manage, integrate, and analyse their data from numerous sources". As in any rapidly developing marketplace, participants look for the latest advances in technology and automation to simplify and speed processes.

A logical place to look for clues to the future of technology in informatics would be the traditional QA/QC laboratory market that embraces Laboratory Information Management Systems (LIMS). This article discusses the needs that must be addressed to provide an adequate informatics technology solution and explores the reasons why the traditional QC LIMS approach will not work for the early discovery processes in the informatics market.

Scientists in the biotechnology field have to streamline and automate their activities. Many need ways of electronically documenting protocols, hypotheses and results if they are to speed progress from concept to execution to result. Scientists also realise that they need to acquire and keep track of materials easily - chemicals, compounds, biological materials, plates, lab supplies, etc. - to execute the science.

To meet these multifaceted needs, the research community requires an easy-to-use, configurable technology to manage and produce scientific information.

Scientific Information Production (SIP) is a key element in the success of any biotechnology company. To deliver the business value of its science, a company's technological infrastructure must connect all stages of production - from research to information packaging. Rapid change in the discovery world is forcing today's leading biotech companies to search for software solutions that are scalable, configurable, and flexible enough to accommodate the changing requirements in the industry.

Scientific Information Production: the discovery market's unique needs
The growth of SIP is mainly driven by two factors. First, there is constant pressure to speed drug discovery and development. As a result, companies are relying increasingly upon outside vendors for laboratory automation and informatics solutions. As companies move from traditional methods of drug discovery to high-throughput screening (HTS), proteomics and cell-based as well as genomics-based techniques, several events are occurring:

  • There is an explosion of data. Companies need IT infrastructure and informatics solutions to manage the growing amounts of data;
  • New data streams are being created. New data management solutions are needed to deal with heterogeneous data and inherent incompatibility with current systems; and
  • The large-scale production and packaging of information for sale becomes a factor that limits a company's ability to thrive.

The second reason for the explosive growth of SIP is the rapid maturation of biotech companies that outgrow their internally developed information infrastructures. In-house systems that met initial requirements can no longer handle the data, nor can they meet the scalability requirements of information production processes.

IBM Life Sciences and Sun Microsystems' Life Sciences are both predicting a rapid increase in scientific information in the near term. "In this community, data is going from terabytes (a trillion) to petabytes (a quadrillion) sooner than in any other industry. Because the computational requirements of proteomics are orders of magnitude greater than genomics, we will see architectures scaled to petabytes installed in the next two to three years," according to Sia Zadeh, director of Sun Microsystems' life-science initiative.

The information production chasm

[Figure: the information lifecycle]

This figure depicts the information lifecycle and illustrates the point at which automation becomes crucial for business success. The importance of traditional IT is growing due to the rapid increase in data volumes and the ensuing greater data management requirements. The life-blood of any biotechnology company is its ability to successfully integrate and decipher the meaning of data. These objectives must be realised to achieve SIP. The need for SIP begins to take shape as efforts move from first research studies into research and development, which is driven by the need to generate revenue. Suddenly, the production of information determines the company's business success.

As companies move on from pure research, internally developed IT solutions no longer meet requirements, and the company falls into the information chasm. Management has two simple choices: either continue to invest heavily in internally developed solutions, which dilutes the effort required to focus on core competencies; or locate a software supplier who has a solution with the flexibility and adaptability needed to meet the changing business and technical requirements. This "build versus buy" scenario is an integral part of a business lifecycle. SIP bottlenecks occur when companies are unable to provide their customers - both internal and external - with "product" and service in a timely, cost-effective manner.

Jim Thompson of Front Line says: "SIP today is mainly addressed by internal informatics groups. The key factors that determine the strength of a packaged solution are the solution's ability to address compatibility, flexibility and scalability issues. Given the rapid growth in heterogeneous data, the ability to organise and integrate disparate data types is of vital importance. As well, with the diverse data management/production needs of researchers, it is very important for any solution to function with minimal customisation and be highly scalable. SIP's immediate business value comes through business process integration and the automation of business processes through workflows and an overall reduction in IT costs."

Although researchers may argue against workflows and automation for the sake of the science, those who are responsible for the business will argue that the science cannot go on if the company fails to meet customer demands. In the end, the researcher does benefit from automation. Documentation efforts are streamlined, which in turn frees time for the science, since as much as 20 per cent of the administrative burden has been automated. So a balance must be found between these opposing forces. With the correct software system, both sets of needs may be met.

Many biotechs are formed to conduct scientific research for an identified area. Once deeper into the research effort, the company begins to focus on the tested ideas, as opposed to a premise. It is this rapid change that disrupts biotech infrastructures more than anything else. Rigid systems have been built to meet needs at a specific time. Unfortunately for those who constructed the internal systems, the degree and the speed at which these changes occurred could not be anticipated, but at some point the flow of data overwhelms just about every in-house solution. Once the IT infrastructure pain exceeds a certain threshold - often earlier in those companies whose revenue source is the packaging and selling of their scientific information - management takes a hard look at how to solve the problem. Management assesses the situation and recommends either expansion of the existing in-house solution with some patching efforts or the commencement of a search for a professionally developed solution to meet their needs. Once management has become involved, the pain is no longer latent and the decision process to determine a solution becomes quite rapid - often less than 90 days. Why? A halt in revenue generation affects the company's performance.

Traditional QC LIMS in discovery: not the ideal solution
Anyone involved in the discovery market can see the rush of LIMS vendors trying to capitalise on an opportunity to sell into a new market. These LIMS suppliers, however, are not developing solutions to address the early discovery market's unique requirements; rather, they are adding some basic discovery functionality to their traditional QA/QC sample-tracking LIMS for a quick entrance into the market.

A LIMS solution may be adequate to satisfy the production-only need of the discovery market, but it does not have the inherent capabilities for backward integration into the early discovery part of the process - for which it is vital to have a technology solution to streamline efforts. Trying to address the needs of high throughput screening, genomics, proteomics or cellomics with a solution that uses a sample and aliquot model is a cumbersome approach. Solutions for the informatics world cannot be constrained by sample and task entities. Nonetheless, many LIMS companies have taken this approach by introducing functionality such as "plate aliquots". LIMS focused on QC are rigid, and the schema of sample and aliquot is completely meaningless in informatics. Samples come after a screening process, not before.

Another difference in software use between the LIMS and discovery worlds is that the system needs to be easily configurable to keep pace with the ever-changing discovery environment. For this reason, workflows and process flows cannot require development resources for reconfiguration. Development's involvement may have been acceptable in a process chemical QA/QC facility or in an established production environment where structured processes rarely change, but this is counterproductive in the always-changing world of discovery. This is the main issue that will prevent virtually all LIMS suppliers of today from being able successfully to support informatics requirements in the long term.
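One way to picture "configurable without development resources" is a workflow held as plain data and interpreted by a small generic engine. The sketch below is only an illustration under that assumption: the step names, the `WORKFLOW` table and the `advance` function are hypothetical and do not describe any particular vendor's product.

```python
# Hypothetical workflow definition: each step maps to the steps allowed next.
# Because this is plain data, it could be edited by a lab administrator
# without touching the engine code below.
WORKFLOW = {
    "received": ["plated"],
    "plated": ["screened"],
    "screened": ["hit_confirmed", "rejected"],
    "hit_confirmed": [],
    "rejected": [],
}


def advance(workflow, current, target):
    """Return the new step if the workflow allows current -> target."""
    allowed = workflow.get(current, [])
    if target not in allowed:
        raise ValueError(f"{current!r} -> {target!r} is not allowed; expected one of {allowed}")
    return target


# Normal use: move an item through the configured steps.
state = advance(WORKFLOW, "received", "plated")
state = advance(WORKFLOW, state, "screened")

# Reconfiguration: the science now calls for a re-screen step.
# Only the data changes; advance() is untouched.
WORKFLOW["screened"].append("rescreened")
WORKFLOW["rescreened"] = ["hit_confirmed", "rejected"]
state = advance(WORKFLOW, state, "rescreened")
```

In a rigid system the equivalent change would mean new code and a new release; here it is an edit to a table.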

The next obvious limitation of a LIMS in the early discovery market is its limited ability to manage various plate handling formats and functions. This world is changing so rapidly that having "fixed-only" formats forces the research organisation to wait for new code to be written to accommodate changes. Remember, change is the norm in discovery but is rare in traditional LIMS environments. Genealogy of plate wells is another example of functionality that is simply not sufficiently addressed in traditional LIMS. It is paramount that the data management solution has the ability to track the complete history of the movement of specific well contents. The number of possible combinations is a startling figure to most, and it is just this issue that can cause great system stress in the future.
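The two ideas in this paragraph, plate formats described by data rather than fixed in code, and genealogy of well contents, can be sketched briefly. This is a minimal illustration, not a description of any real product: `PlateFormat`, `WellGenealogy` and the plate names are hypothetical, and a production system would persist transfers in a database rather than in memory.

```python
from collections import defaultdict
from dataclasses import dataclass


def row_label(index):
    """Convert a 0-based row index to a letter label: 0 -> A, 25 -> Z, 26 -> AA."""
    label = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        label = chr(ord("A") + remainder) + label
    return label


@dataclass(frozen=True)
class PlateFormat:
    """A plate layout given as data, so a new format needs no new code."""
    rows: int
    columns: int

    def wells(self):
        """List well positions in row-major order, e.g. A1, A2, ..., H12."""
        return [f"{row_label(r)}{c}"
                for r in range(self.rows)
                for c in range(1, self.columns + 1)]


class WellGenealogy:
    """Hypothetical in-memory record of transfers between plate wells."""

    def __init__(self):
        # Maps each destination well to the source wells that fed it.
        self._parents = defaultdict(list)

    def record_transfer(self, source, destination):
        """Wells are (plate_id, position) tuples, e.g. ("MASTER", "A1")."""
        self._parents[destination].append(source)

    def ancestors(self, well):
        """Return every well whose contents contributed to `well`."""
        seen, stack = set(), [well]
        while stack:
            current = stack.pop()
            for parent in self._parents.get(current, []):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


plate_96 = PlateFormat(rows=8, columns=12)
plate_1536 = PlateFormat(rows=32, columns=48)

genealogy = WellGenealogy()
genealogy.record_transfer(("MASTER", "A1"), ("DAUGHTER", "A1"))
genealogy.record_transfer(("MASTER", "B1"), ("DAUGHTER", "A1"))  # two wells pooled
genealogy.record_transfer(("DAUGHTER", "A1"), ("ASSAY", "C3"))
history = genealogy.ancestors(("ASSAY", "C3"))
```

Even this toy shows why the combinations grow quickly: every pooling or reformatting step adds parents to a well, and a single 1536-well plate carries 1536 such histories.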

Build vs. buy
It wasn't until recently that biotech companies had any option other than building their own infrastructure. The research market for genomics and proteomics was very small and only a few companies were in a position actually to purchase software. Now, two years later, the world of informatics has exploded into a thriving, global business segment. Over this period of time software has matured to meet this market's unique and continuously changing needs.

Many in biotech have viewed their infrastructures as an "asset". The truth, though, is that internally developed systems are a liability. How can a system built by three or four software engineers - often in Visual Basic - be an asset? What happens to this asset when the competition recruits the company's top software engineer? Today, smart biotechs view their IT infrastructure as a key element for success and recognise that using a packaged solution is a viable and preferred option. Otherwise, the company would be responsible for staffing in the areas of:

  • Documenting the software functionality;
  • Documenting the software for future FDA validation purposes (21 CFR Part 11);
  • Training people on software application;
  • Continuous application feature enhancements; and
  • Leveraging new technologies to improve system capabilities and performance.

Outsourcing sounds great, but a company gives up a degree of configurability and flexibility. Today's informatics software must exceed the flexibility of in-house development. The software also must provide a scalable architecture that will not be outgrown. The replacement solution must be configurable and adaptable to the changing needs of the science and the business. Once an application is shown to provide a superior level of configurability and responsiveness to the changes in the internal science practices, then customers should entertain a purchased software solution. One final point about outsourcing: time is of the essence. Due to the rapid change in the science and the business, implementations must be operational in less than 120 days.

A new frontier
The informatics market is a new frontier in terms of technology solutions, and traditional LIMS alone cannot satisfy the complete needs of biotechnology companies involved in scientific information production. Facilitating the business and science processes to meet internal and external customer requirements is vital for a company to grow and prosper financially. Technology infrastructure must connect all the stages of production - from research to information packaging - to deliver the business of science.

Scott Deutsch is vice president of global marketing at LabVantage Solutions.