Discovering the tools that aid innovation

Work in the R&D laboratory starts with an idea and an experiment, not a sample. Sophia Ktori investigates how informatics packages are tailored to make life easier for scientists in the discovery laboratory

Prising paper notebooks out of some researchers’ hands and replacing them with a laptop or terminal can be a traumatic exercise, and nowhere is this more true than in the drug discovery laboratory. Reasons for not embracing the software tools that have been developed specifically to make scientists’ lives easier and maximise value from data are predictable, suggests Graham Langrish, sales manager at LabWare. ‘We hear things like, “Our work isn’t regulated, so we don’t need a laboratory system or an audit trail for every experiment”, or “The software isn’t flexible enough to capture and manage the huge breadth of both unstructured and structured data that we generate”.’

Inefficient and inconsistent

Yet we all know that writing on paper is a highly inefficient way of recording and storing data, Langrish points out. Pen and paper generate effectively dead data that can’t be queried or searched: ‘Everyone has their own way of jotting things down, and there is no consistency. Handwritten notes are also subject to misinterpretation, and scientists who leave the company will effectively take a huge amount of their accumulated knowledge with them. Consider two laboratories that are working on different aspects of the same research project at geographically distinct sites. Expecting them to pass around scanned pages from a paper notebook, or to email Microsoft Office files or Excel spreadsheets between sites, is totally impractical.’

Share and share alike

It is also a false economy not to invest in an informatics infrastructure, Langrish maintains. ‘At the very least, scientists need an electronic laboratory notebook (ELN) so that all their experiments and processes are logged systematically and are searchable. Results can be shared, and data mined and revisited to avoid repetition. A laboratory information management system (LIMS) adds more value by providing an overall management framework for workflows, projects and experiments, and many other useful tools to manage the lab.’

The need to have flexibility in an informatics platform becomes even more relevant when you work with external partners who may have very different in-house processes, or you outsource some aspects of discovery work to CROs, Langrish stresses: ‘There is a major industry drive to collaborate in order to innovate, which means that the whole R&D process, right from early discovery through to formal development, increasingly involves multiple partners who all need to put in, access, and analyse data. A common platform is an absolute must.’

Don’t underestimate the need

A LIMS is just as relevant to the discovery lab as it is to a manufacturing or QA/QC environment, he adds: ‘Although research at this level is not regulated, there is still a requirement for instrumentation, reagent and solution inventory, and document management, such as storing SOPs.’ The benefits of implementing an enterprise system in a discovery setting may also far outweigh the financial outlay, Langrish stresses. This is especially true in the biotech arena, where a company may have a five-year plan that involves developing a single-platform technology or product to the point where the company may be sold off to big pharma. ‘The first thing a potential buyer will want to see is all the data. Who did what, where, and when; what were the results, and how were they interpreted? If the biotech can’t easily retrieve their data or prove their findings, then they are unlikely to attract serious bidders.’

LabWare has developed its Enterprise Laboratory Platform (ELP), which combines LabWare LIMS and LabWare ELN as a seamless platform for any scientific laboratory, from the earliest discovery setting, to downstream development. The searchable organisational database offers increased data visibility and knowledge about compounds, drug formulations, and inventories from compound screening through to late-stage development and regulatory submission.

The company maintains that its approach of providing an integrated solution also allows organisations to manage both structured and unstructured data in a single platform. ‘Today the need to collaborate with both colleagues and external partners is seen as paramount,’ Langrish concludes. ‘LabWare’s ELP provides a single point of truth for all your scientific data that enhances organisations’ efficiency, improves collaboration, and drives innovations.’

Handling biological complexity

The pharmaceutical industry does have a major requirement for software that facilitates its drive to collaborate on drug discovery, comments Andrew Lemon, CEO at The Edge Software Consultancy. Pharma is teaming up with academia, biotech, and contract research organisations (CROs) to identify and develop targets and druggable biological pathways, as well as design small-molecule and large-molecule drug candidates. But it’s the large molecules that are taking centre stage, Lemon suggests. ‘Software for chemistry R&D is unremarkable, and widely available from multiple vendors, but with industry’s increasing focus on the development of large-molecule biotherapeutics, there has never been a greater need for software tools that can both manage the complexity of biological information and facilitate networking and external collaboration.’

Common informatics environment

The Edge specialises in software for biology R&D, and has developed its BioRails suite of tools to facilitate communication and cooperation, in parallel with data handling and reporting. BioRails combines a comprehensive biology data management platform with an ELN, and has the flexibility to manage information input and output for all discovery-stage and downstream R&D processes, Lemon claims. ‘The software establishes a common informatics environment that forms the foundation for functionality between and within interdisciplinary R&D teams in the pharmaceutical and biotech sectors.’

Automating assay requests

In addition to its overarching data management and ELN tools, BioRails features BioRails PTO (Project Tracking and Optimisation) and BioRails AP (Assay Planning). These modules have been developed to support networked R&D by offering the capability to plan and submit assays and assay requests – whether to in-house or external laboratories – define protocols, schedule, queue and track assays, and manage compound inventory, including the preparation of assay plates, formulation, and biological testing. ‘The platform enables automated progression, such that a biological entity registered into the system is automatically enrolled into the correct assay queues,’ Lemon explains. ‘Dependent on the results of these initial assays, the next set of assays is then also automatically assigned to the molecule.’
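The automated progression Lemon describes can be pictured as a small rule table. The sketch below is purely illustrative and is not BioRails code: the entity types, assay names, rule dictionaries and both functions are hypothetical, chosen only to show registration feeding assay queues and results triggering the next set of assays.

```python
from collections import defaultdict, deque

# Hypothetical rules: which assay queues an entity type joins on registration.
INITIAL_ASSAYS = {
    "antibody": ["binding"],
    "small_molecule": ["primary_screen"],
}

# Hypothetical progression rules: (completed assay, passed?) -> follow-up assays.
NEXT_ASSAYS = {
    ("binding", True): ["cell_potency"],
    ("primary_screen", True): ["dose_response"],
}

# Assay name -> queue of entity IDs waiting for that assay.
queues = defaultdict(deque)


def register(entity_id, entity_type):
    """Register an entity and enrol it in the initial assay queues for its type."""
    assays = INITIAL_ASSAYS.get(entity_type, [])
    for assay in assays:
        queues[assay].append(entity_id)
    return assays


def record_result(entity_id, assay, passed):
    """Take the entity off the finished assay's queue and enrol it in any follow-ups."""
    if entity_id in queues[assay]:
        queues[assay].remove(entity_id)
    follow_ups = NEXT_ASSAYS.get((assay, passed), [])
    for next_assay in follow_ups:
        queues[next_assay].append(entity_id)
    return follow_ups
```

In this toy version, registering an antibody places it in the binding queue, and recording a passing binding result moves it on to the cell potency queue. A failing result, or one with no matching rule, assigns nothing further.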

Functionality aids decision making

The level of functionality that BioRails provides for assay planning, design and tracking is relatively new in the informatics arena, he claims. ‘Our software effectively joins the biologists up with project teams inside their own organisation, and with collaborators or service providers, to streamline workflows, avoid repetition, and improve efficiency and data management at all levels. It also means that managers can view progress and results in real time, which facilitates reporting and decision-making.’

High-content assays

Research at the discovery stage is also becoming increasingly high-content, rather than just high-throughput, and is moving towards the phenotypic rather than purely pharmacologic, comments Paul Denny-Gouldson, VP strategic solutions at IDBS. ‘In the biology space particularly, scientists are trying to explore molecular mechanisms of action to understand how a disease works, so that they can turn a causal pathway into something druggable. High-content screening is putting a demand on the capabilities of the instrumentation, and also on the software platforms.’

IDBS recently launched a completely overhauled version of its ELN, E-WorkBook 10. Much of the work on the latest iteration has gone into migrating the platform to the web as part of a three-year modernisation programme, and the release features new capabilities for visualising and analysing complex datasets and results. This allows multidisciplinary researchers to collaborate more freely on data interpretation as well as input, and on the sharing of intellectual property, Denny-Gouldson suggests.


Solid and secure

‘Big pharma is leading the drive to tap into the wealth of innovation in the biotechs, platform, and virtual companies that have the specialist knowledge and research infrastructure to work at the level of the genetic basis of disease and the identification of protein or gene-based therapeutics that can impact on disease mechanism. Today’s informatics solutions need to offer a solid, secure repository for data handling, but also facilitate checking the integrity, standard and quality of that data. Our ActivityBase suite offers dedicated solutions for the management of traditional small molecule screening data and also for biological screening. The ActivityBase biology platform can be used with all assay types, including multiparametric and image-based, high-content screens, and allows all scientific data analysis and result derivation to be effected in one place.’

Supporting collaboration

In parallel with ActivityBase, IDBS’ E-WorkBook has been developed as an environment for supporting collaborative research at the discovery stage, and in particular capturing and managing IP. The platform’s ELN is complemented by BioBook, which integrates with other data systems including LIMS, so that data imported from multiple laboratories can be collated and all research kept in a single secure environment.

‘We are effectively a gatekeeper for all that data, providing a data curation element through ActivityBase and E-WorkBook,’ Denny-Gouldson says. ‘Our aim is to provide industry with the tools that will enable them to maximise the value of their data. Research at the early stage is about the experiment, not the sample. There are no “yes or no” answers, and data is often observational, so software must be able to store, manage and facilitate the interpretation of multifactorial and diverse data types.

‘The future for research is bright. This growing emphasis on high content and high throughput, combined with the move towards more collaboration and externalisation, will be defining factors in the years ahead. The nature of research is evolving, and we are evolving with it.’


The need for dedicated software that can handle the complexity of biological data input, output, and visualisation, and in parallel facilitate multidisciplinary networking, is similarly stressed by Alister Campbell, head of application science at Dotmatics. The firm specialises in the development of software solutions for the R&D sector, including early-stage discovery. ‘Our suite of web-based tools has been developed to help scientists become as self-sufficient as possible,’ Campbell explains.

‘The platform was born out of the need to give scientists an easy-to-use front end with querying and retrieval functionality, in combination with the flexibility to manipulate and view all types of data, both chemical and biological, in different ways.’

It’s this ability to visualise data in multiple formats that can be hugely important in terms of maximising the utility of experimental results, Campbell believes.

‘The Dotmatics suite enables the user to view data in numerous ways, for example, graphically, comparatively with other data, or structurally. But on top of the querying and reporting tools, the software gives users an ELN capability for data input, together with biological data analysis capabilities, and dedicated bioregistration and small molecule registration tools.’

Productivity tools

The company has, in parallel, developed software that allows scientists from collaborating or outsourced laboratories to import data from their own informatics platforms into the Dotmatics suite. Additional productivity tools enable scientists, project managers and decision makers to search and query data, in real time, from a Word document or an Excel spreadsheet.

‘The fact that all our software is web-based has many advantages for collaboration,’ Campbell notes. ‘You don’t need to install the same software at multiple sites, and monitoring who has access to what is much easier using a central, web-based hub. Web-based tools also reduce overheads and the need for complex implementation and in-house IT support, and people are used to navigating websites, so they find browser-led products very intuitive. Moreover, collaborators can start inputting data into the platform immediately, and view it in real time.’

The Dotmatics package currently encompasses 14 different tools that all interconnect seamlessly. ‘We are also continually working on multiple development projects with our clients – particularly in the biotech and pharma sectors – who have a shopping list of functionality that they’d like to see available. Much of this ongoing development is in the biology space.’

More complex than chemistry

It’s a theme that both Campbell and Lemon emphasise. ‘Although there is a wealth of commercially available tools and packages for handling chemistry, the complexity of biological information has meant that equivalent tools for the biology laboratory are only now catching up,’ Campbell notes. ‘The chemistry workflow tends to be much simpler than biological experimentation, especially at the discovery stage with the interplay of genomics, proteomics, next-generation sequencing and biological molecule design and expression.

‘With chemistry you are repeating well-rehearsed workflows for designing and manipulating synthetic small molecules that behave in predictable ways. With biology you have to design, produce, and test hugely complex large molecules such as DNA, RNA, siRNAs, proteins, antibodies, and even combinations of biological and synthetic compounds, against biological pathways. Biologists may also have a far more unstructured approach to experimentation than chemists, which means that the capabilities for data input as well as output have to be very flexible.’

Too much to handle?

The unstructured nature of early-stage discovery, in the biologics field especially, is something highlighted by Anthony Uzzo, Core Informatics’ president and cofounder. ‘You have to develop informatics tools that allow the scientist to think about, plan, and execute the experiment,’ he states, mirroring Denny-Gouldson’s sentiments. ‘Scientists start with an idea or an experiment, rather than a sample, which is problematic enough, but that then leads to huge libraries of samples of multiple biological types, associated with a wide range of results and metadata that are highly heterogeneous across different therapeutic fields. Consider the advanced detection technologies that are now being employed in discovery labs, such as next-generation sequencing (NGS) and high-content screening, and it becomes evident that the complexity and volume of data emerging from early experimentation is already too much for many legacy informatics systems to handle.’

Today’s discovery laboratories require software platforms that offer a flexible system architecture and powerful application programming interfaces (APIs) that empower the customer to be more agile, Uzzo continues. ‘We started with a highly flexible system. It’s impossible to add flexibility after the fact to a system – a real challenge for most other LIMS providers. Software must therefore address scientists’ unique workflows and integrate with a laboratory’s existing infrastructure, without the need to write and implement custom applications. Our enterprise system has been designed to empower scientists to rapidly configure the platform to accept any workflow, including tracking biological sample genealogy and lineage, for example, without writing a single line of custom code. This means scientists can deploy applications in a matter of hours instead of the months that it can take with customised software approaches.’

Efficient data analysis

Core Informatics’ cloud-based Platform for Science combines LIMS, ELN, and an SDMS (scientific data management system). SDMS is an automated data capture framework that is capable of driving the integration of Core LIMS and Core ELN with any laboratory instrument or workflow. ‘SDMS enables researchers to capture data directly from a wide range of instrumentation, including plate readers, imaging systems, NGS platforms, analytical equipment and liquid handlers, and to transform that raw information and load it into the database,’ Uzzo says. ‘This facilitates much more efficient decision-making and data analysis than if scientists had to manually take files off the instruments and convert them into usable data.’
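The capture, transform and load sequence Uzzo outlines can be sketched in a few lines. This is not Core Informatics’ SDMS or its API: the two-column ‘well,signal’ export format, the function names and the results table are all assumptions made for illustration, using only Python’s standard library.

```python
import csv
import io
import sqlite3


def parse_plate_csv(text):
    """Turn raw 'well,signal' rows from a (hypothetical) instrument export into typed records."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["well"], float(row["signal"])) for row in reader]


def load_results(conn, plate_id, records):
    """Store parsed records in a results table so they can be queried later."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results (plate_id TEXT, well TEXT, signal REAL)"
    )
    conn.executemany(
        "INSERT INTO results VALUES (?, ?, ?)",
        [(plate_id, well, signal) for well, signal in records],
    )
    conn.commit()
    return len(records)
```

Calling load_results(conn, "plate-42", parse_plate_csv(raw_text)) on a connection opened with sqlite3.connect(":memory:") would make the readings available to ordinary SQL queries, which is the step that otherwise falls to a scientist copying files off the instrument by hand.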

It’s the combination of a LIMS, ELN, and the SDMS infrastructure that provides the power and flexibility to handle and utilise the most complex unstructured and structured data, he believes. ‘As well as being particularly well suited to tracking samples of any type, LIMS facilitates the streamlining of data analysis into a series of workflows that can be configured to meet each laboratory’s unique needs. Unlike an ELN, which isn’t geared to producing structured reports, LIMS inherently gives the scientist the opportunity to generate structured output and perform multiparametric queries to aid decision making. But actually, what we really need to do is to change people’s perception of individual tools such as LIMS and ELN, and encourage them to think more about an integrated informatics infrastructure.’

Getting data across the firewall

The cloud-based nature of the Core Informatics Platform for Science also facilitates networking and collaboration between laboratories, Uzzo says: ‘Customers are telling us that they want to be able to work more effectively with their external collaborators. They come to us because their legacy systems simply don’t have the inherent ability to allow laboratories to capture results and other data from remote sites, and pull them back across their firewall into the local informatics infrastructure. We enable them to have a scalable, seamless connection with partners through a synchronous exchange of data from anywhere, on any device, at any time.’

Partnerships and outsourced contracts are made and broken with rapidity, and working in the cloud enables expedited provision of computing resources with less complexity of implementation and at lower cost, Uzzo concludes: ‘The bottom line is: when it comes to choosing a vendor the proof is in the pudding – you should expect to see your data and workflows in the system in the first vendor demonstration.’

