One of the most obvious and immediate challenges when bringing new software into a lab environment is the likely ‘spaghetti soup’ of existing platforms – possibly from multiple vendors – that are already installed, suggests Richard Milne, vice president and general manager, Digital Science, at Thermo Fisher Scientific. ‘Each of these will offer different levels of integration’.
The situation is compounded because, even within the same organisation, there may be different suites of instruments and software in separate labs and across departments. While there is an ambition to integrate instrumentation and software tools across a business and its geographic sites, the reality may be what Milne describes as an ‘unstructured legacy of decisions’, each of which represented a theoretically attractive investment at the time, but which in practice delivered a point solution that ultimately ‘confuses’ the whole environment.
This means many organisations will have some level of legacy investment – in instrumentation, in SOPs, in working practices, and in pieces of software that people use every day. It’s likely that most integration projects will be very much in a ‘brownfield’ setting, rather than in newly established ‘greenfield’ labs, Milne said. And that brownfield environment will almost certainly be very fragmented in terms of its legacy systems. So, connectivity needs to happen at the enterprise level, not just at the lab level, and encompass that existing ecosystem of digital technologies, Milne believes.
‘The overarching aim is to generate an environment that allows you to unravel that spaghetti soup of existing platforms, and make sure that there’s coherent organisation, and the ability to use it all,’ Milne said. ‘Ultimately, this will allow organisations to purchase tools based on the capabilities and features of those tools, rather than on whether they will talk to the lab’s existing equipment.’ And this means that wherever R&D teams, service providers or partners are located and whatever technology they use, they should be able to collaborate and share data across a cloud-based environment.
‘As well as solving the integration challenge for individual scientists, we want to make sure it can also be achieved at scale,’ he continued. ‘We are trying to square the circle a little bit, by creating a platform environment’ – and this will be a cloud-based environment, Milne noted – ‘that will give people the freedom to make those decisions at the level of their workgroup, lab or building, but expand the value of that integration at scale.’
This also means that the concept of integration doesn’t stop at the level of lab instrumentation and software, he believes. ‘Lab function will increasingly be married to asset performance management, inventory or resource management, and quality management. Again, businesses will have many options to choose from, and then the issue becomes, for example, how do I integrate my resource management tool into my existing laboratory information management system (LIMS).’
Leveraging advanced tools
The concept of holistic lab orchestration hinges on addressing three basic problems, says Trish Meek, director of marketing at Thermo Fisher Scientific. ‘First, getting all of the data together so that you can use it for analysis, visualisation, and increasingly, for leveraging AI and advanced machine learning tools. Second, there’s the human experience in the lab: how do you optimise the scientific experience for scientists day-to-day? Third is the need to improve and facilitate process optimisation.’
Instrument and software vendors are already making moves to facilitate easier integration, Milne acknowledged. ‘From a lab connectivity perspective, when we look at instrumentation and software used for everyday lab work, such as sequencers, qPCR, flow cytometry, etc., the vendors of these types of instrumentation are already starting to think about connectivity when they develop their new instruments.’ Thermo Fisher Scientific, for example, is building connectivity into all of its new instrumentation, he continued. ‘But then there will also need to be some sort of “retrofittability”, and that will be part of our initial offering. This will be achieved through the creation of a gateway that will make it possible to connect instruments and software in the lab into a cohesive environment.’
While there are possibly multiple aspects to the issue of achieving seamless connectivity in the lab, the ultimate aim is to make laboratory systems more effective at what they do, every day, Dave Dorsett, principal software architect at information technology consultancy Astrix Technology, suggests. ‘That’s a foundational concept; how to improve usage of systems – such as a LIMS or ELN platform – from the perspective of everyday use, and how to get these systems to work together to support the labs on a day-to-day basis.’
Consider the software and hardware tools that a lab ecosystem relies on, and much of the friction in integration will commonly stem from the diverse nature of instrument architecture, Dorsett noted. An organisation may have LIMS systems from multiple vendors in use across different departments, for example, he said, mirroring Milne’s sentiments. ‘Some of these systems, whether LIMS platforms or other hardware or software, are more challenging to integrate than others. And this makes it costly for individual companies to set up and maintain them from an integration perspective.’
What this means at the most basic level, is that many labs may still rely on manual data transcription or ‘scientist-facilitated integration’, Dorsett continued. ‘“Sneakernet” [physically transferring data from one PC to another using portable drives and devices] remains just part of everyday lab life. And no matter how careful you are with manual transcription and data input, or how effective your data review processes, the ultimate quality of that data is always going to be at risk.’
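Even the simplest automated alternative to ‘sneakernet’ removes the transcription risk Dorsett describes, because every transfer can be verified end-to-end. The sketch below is illustrative only – the paths and function name are hypothetical, not part of any product mentioned here – but it shows the basic pattern: copy the file, then confirm a checksum match so silent corruption or a wrong file is caught immediately rather than at data review.

```python
import hashlib
import pathlib
import shutil

def verified_copy(src: pathlib.Path, dst: pathlib.Path) -> str:
    """Copy an instrument output file and verify the copy bit-for-bit.

    Returns the SHA-256 digest of the file, which can also be logged
    for later audit. Raises if the destination does not match the source.
    """
    digest = hashlib.sha256(src.read_bytes()).hexdigest()
    shutil.copy2(src, dst)  # preserves timestamps as well as contents
    if hashlib.sha256(dst.read_bytes()).hexdigest() != digest:
        raise IOError(f"checksum mismatch copying {src} -> {dst}")
    return digest
```

A manual copy-and-retype step offers no equivalent check; here, any mismatch fails loudly at transfer time.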
There are two challenges, in fact, Dorsett suggested. Sometimes the issues lie not so much in getting systems to talk to each other as in aligning and harmonising the data that comes out: getting data out of point systems and enabling it to flow to the next stage is another stumbling block to seamless lab integration. If it’s hard to get data back out of a LIMS, ELN or other key piece of software in an accessible and meaningful form, then it may not be possible to use that tool or platform to maximum effect.
Dorsett continued: ‘One approach to addressing such issues is to bring data from multiple systems into data lakes, where it can feasibly be compared, but again, you have to ensure that your data are equivalent, particularly where your labs may be running multiple LIMS or ELNs, for example. You may have one LIMS for stability testing, and another for batch release, plus method data in an ELN.’
A typical problem organisations face is how to compare all of that data once their systems are technically integrated. ‘For any laboratory organisation, one of the biggest challenges to using the systems that they want to integrate is how to ensure both data quality and data comparability/equivalence across systems, even once they are interconnected. Are your experimental methods equivalent, for example, or does a sample ID from your LIMS match a sample ID from a CRO?’ Dorsett said.
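The sample-ID question Dorsett raises is often a normalisation problem: the two systems refer to the same sample under different formatting conventions. The following minimal sketch – with entirely invented ID formats (‘LIMS-000123’ versus ‘S123/CRO’) – shows one common approach, reducing each ID to its numeric core before reconciling the two record sets.

```python
import re

def normalise_sample_id(raw_id: str) -> str:
    """Strip prefixes, suffixes and leading zeros so IDs from
    different systems can be matched on their numeric core."""
    digits = re.search(r"\d+", raw_id)
    if digits is None:
        raise ValueError(f"no numeric core in sample ID: {raw_id!r}")
    return str(int(digits.group()))  # int() drops leading zeros

def reconcile(lims_ids, cro_ids):
    """Return (matched, lims_only, cro_only) as sets of normalised IDs."""
    lims = {normalise_sample_id(i) for i in lims_ids}
    cro = {normalise_sample_id(i) for i in cro_ids}
    return lims & cro, lims - cro, cro - lims

matched, lims_only, cro_only = reconcile(
    ["LIMS-000123", "LIMS-000124"], ["S123/CRO", "S125/CRO"]
)
print(matched)    # {'123'}
print(lims_only)  # {'124'}
print(cro_only)   # {'125'}
```

The unmatched sets on either side are exactly the records a data steward needs to investigate before the combined dataset can be trusted.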
It’s important to try to understand what tools are used at the level of the lab, facility and enterprise, as the basis for working out how to maximise use of that collective investment, identify key gaps, and define a longer-term roadmap that recognises the importance of sustainability and total cost of ownership. ‘You want to try to find ways of using integration technologies that are already there more effectively, as well as to be able to bring in new technologies,’ Dorsett said. For instrument integration, there are middleware companies who are positioned to offer specific software to facilitate instrument integration, Dorsett suggested, citing SmartLine Data Cockpit, TetraScience and BioBright – the latter having been acquired by Dotmatics in 2019. ‘These companies are focused on providing tools that can address how people gather all their data from the different instrumentation,’ said Dorsett.
Any rounded conversation on LIMS/ELN and software integration and management will at some point likely come around to the concept of data lakes, noted Robert D Brown, vice president, product marketing at Dotmatics. ‘The initial concept was that data lakes could house all the lab-derived data, and that people could then dip in and retrieve what they needed, when they needed to.’ But in a real-world setting there are two types of data, he explained. ‘You have the structured data, such as data in your ELN, but then you will also have the vast numbers of unstructured data files that are being generated by all of this automated instrumentation that labs now use. These files are typically output in proprietary, non-standardised formats, and the data they contain will first need to be parsed out before being put into a LIMS or an ELN.’
But with the right tools in place, we can now have the best of both worlds. ‘We can have hybrid systems where the unstructured data lives in the lake, and the structured part of that data can then go into the ELN. As long as both sides have a good API, and you have a way of parsing the data, then it’s possible to overcome most technical hurdles. The trick is to link the two types of data appropriately.’
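The hybrid pattern Brown describes can be sketched in a few lines. In this illustrative example, all the names (`parse_instrument_file`, `register_result`) are hypothetical stand-ins for a vendor parser and an ELN API, and a simple JSON export stands in for a proprietary instrument format: the raw file stays in the lake, a parsed structured record goes to the ELN, and a path plus content hash links the two.

```python
import hashlib
import json
import pathlib

def file_digest(path: pathlib.Path) -> str:
    """SHA-256 of the raw instrument file, used as a stable link key."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def parse_instrument_file(path: pathlib.Path) -> dict:
    # Stand-in parser: a real one would decode the vendor's
    # proprietary format; here we assume a plain JSON export.
    return json.loads(path.read_text())

def register_result(raw_path: pathlib.Path) -> dict:
    """Build the structured record for the ELN, keeping a pointer
    back to the unstructured file in the lake."""
    record = parse_instrument_file(raw_path)
    record["raw_file"] = str(raw_path)            # where the raw data lives
    record["raw_sha256"] = file_digest(raw_path)  # tamper-evident link
    return record  # a real system would POST this to the ELN's API
```

Linking by content hash rather than filename alone means the ELN record can always be traced to exactly the raw file it was parsed from, even if files are moved or renamed within the lake.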
From a software perspective, the ability to work with both biologics and small molecules using the same overall platform is founded on the use of software components that can be slotted together in multiple ways to establish the right workflow for the right outcome. A chemist may do things in one order, but a molecular biologist might do them in another. ‘That’s the real trick,’ he continued, ‘to be able to put the different pieces of the same overall solution together so that they match the workflow for the different scientists.’
And the next stage in software evolution will at least in part – and perhaps inevitably, Brown noted – focus on integrating AI into the everyday lab function. ‘First, you have to add AI and machine learning into your software stream,’ and this is more procedural, he indicated. ‘But critically – and this is perhaps the biggest problem – it’s imperative that you are getting absolutely clean data into those ML models in the first place. If you don’t put clean data in, you will get garbage out.’
This brings us back to the concept of data automation, so that you don’t have to use humans to move data around, which will at some stage run the risk of human errors in data manipulation, Brown said. Automating data generation, management and transfer will also facilitate that ability to pull complete datasets across from any system, into the ML learning pool.
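The ‘clean data in’ requirement Brown insists on is usually enforced as an automated gate in front of the ML pool. A minimal sketch of such a gate is shown below; the column names and the completeness rules are invented for illustration, not drawn from any product discussed here.

```python
def validate_rows(rows, required=("sample_id", "assay", "value")):
    """Split incoming records into those fit for the ML pool and
    those rejected for review: missing fields or non-numeric
    measurements are held back rather than silently ingested."""
    clean, rejected = [], []
    for row in rows:
        if any(row.get(key) in (None, "") for key in required):
            rejected.append(row)  # incomplete record
        elif not isinstance(row["value"], (int, float)):
            rejected.append(row)  # non-numeric measurement
        else:
            clean.append(row)
    return clean, rejected
```

Rejected rows are routed back for correction at source, which keeps the garbage out of the models instead of trying to clean it downstream.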
‘And here is where we have the advantage of the BioBright lab automation solution, which automates the complete process of getting data off instruments into the lake, parsing it, and putting it into the notebook. Compare this with the requirement for human transfer of files between systems, and the manual inputting of data, and sharing spreadsheets, which is inherently error-prone.’
With this goal of complete round-trip automation in mind, Dotmatics announced a partnership with HighRes Biosolutions, which designs and builds robotic systems and laboratory devices, in January. The collaboration focuses on marrying the high-throughput laboratory automation capability of the Dotmatics ELN with the HighRes instrument control software Cellario. This combination frees scientists to plan experiments, run individual instruments, and publish and analyse data in a single software interface.