Fields of data

Keeping track of agricultural data presents special problems for informatics systems, as Sophia Ktori finds out

Scientific Computing World: June/July 2014

One factor that sets the agricultural sector apart from other industries is the diverse range of scientific disciplines that go into developing a single product. As John Gabathuler, director, industrial and environmental sector at LabWare explains: ‘We are talking about an industry that encompasses both plant- and animal-based products, from seed strain development and genetics to pesticides and herbicides; mechanical sciences – such as seed and grain coating – animal nutrition and health; traditional breeding; and environmental monitoring of farms, waterways, and surrounding areas to ensure that the products applied to crops or livestock don’t have adverse effects on the environment. Add to that: data that needs to be delivered to, and be received back from, the contract service organisations that work with the agricultural sciences industry to test products and the environment; and the farmers who will be applying seeds and grains, pesticides, herbicides and feeds to their arable fields and livestock; and the breadth and scale of the informatics requirement becomes evident.’

Multidisciplinary client base
LabWare works with industry, commercial, academic, and national government organisations and initiatives involved in just about every segment of agricultural sciences. This multidisciplinary client base has helped the company to evolve its core LabWare LIMS solution and electronic laboratory notebook (ELN). Gabathuler continues: ‘Our clients in diverse areas of agricultural analysis and R&D rely on the end-to-end nature of the LabWare LIMS and electronic laboratory notebook, which can drive and instruct workflows and seamlessly integrate data from multiple scientific and business areas and geographies. We term this end-to-end capability the Enterprise Laboratory Platform.’
The LabWare solution supports mobile technologies for access and data-sampling in the field, a requirement that is a given for the environmental monitoring aspects relating to the agricultural industry, Gabathuler adds: ‘The agricultural sector is ultimately concerned with its products out in the field, and much of the product-testing and marketed-product monitoring is, by its nature, carried out on site – often at multiple locations in different countries. We provide technology support for browser access as well as mobile apps for monitoring, sampling, and data logging in the field, with access to the LabWare system optimised for mobile platforms. Information recorded on site on mobile devices can simply be uploaded into the LabWare system and made available enterprise-wide, in real time.’

From ideation to commercialisation
Paul Denny-Gouldson, VP Strategic Solutions at IDBS, reinforces Gabathuler’s sentiments on the diverse nature of agricultural R&D: ‘There is an exponential rise in the use of genomics and proteomics in agricultural sciences for engineering new strains. Data from each of these cross-functional R&D groups needs to be accessible for search and interrogation in a single, secure environment’. IDBS works closely with some of the world’s largest R&D organisations in the agricultural sciences arena, particularly in the areas of field trials, breeding, genealogy and testing, and crop optimisation (e.g. herbicides and pesticides). ‘Our E-WorkBook Suite offers applications that facilitate analysis, reporting and IP retention,’ Denny-Gouldson continues. ‘The platform is designed to support data management and traceability through the whole process of ideation to commercialisation, and inherently provides an audit trail for the regulators. Uptake of our screening data management solution, ActivityBase, is also on the increase for use in applications such as screening for new natural products’.

Questions, questions …
What is critical in agricultural sciences R&D is the ability to put data into context that will both help to direct commercialisation strategies and potentially drive the development of new products, Denny-Gouldson believes: ‘Our clients want to link up all the research information and data to help instruct commercialisation strategies. This is where a single infrastructure for data storage, management, aggregation, and mining becomes vital. Ongoing research will inform decisions on future product development and the commercialisation of near-market products. For example: “where are we going to sell it? What are the regulations in different geographies? How are we going to position the product with farmers?” Then, further downstream, you need to be able to tie in data from real-world use of the product: “How successful is it? Can we optimise it further? Is there evidence of any detrimental effects on the environment? Is it more successful in specific geographical locations or under specific climatic conditions?” Our clients are basically saying that they want a “joined-up” picture of their product, spanning development, manufacture, distribution, sales, and utilisation.’
This is particularly important from the perspective of aggregating real world data from the field, adds Christine Zubris, US solutions consulting manager at IDBS: ‘We know that product development in the agricultural sector hinges on the ability to marry data from laboratory-based R&D with that from field trials, including small-scale greenhouse studies and large-scale field trials. Mobile computing is a must, but it is not necessarily the right strategy to give everyone the same mobile device. You need to know what will fit into an individual’s workflow before you decide what mobile device they need. At the most basic level, our clients require data capture resources, such as iPads, for inputting information in real time, but at other levels it may be necessary to provide mobile devices that can analyse, as well as log and store data.’

Mobile analysis for farmers
The ability to apply analytical technologies in the field using mobile instrumentation has filtered down to the farmers themselves, according to David Joyce, senior product manager for laboratory informatics at Thermo Fisher Scientific: ‘Mobile technologies can be used to determine the content of green or grain feed, which can directly impact on livestock yield. Our battery-powered microPhazir analyser, for example, allows farmers to inspect a wide range of animal feeds and ingredients as they arrive, and analyse the constituents for levels of protein, ash, water, etc. The on-the-spot results from spectroscopic analysis may inform feeding strategies or result in a delivery of feed being rejected.’
Joyce’s particular area of expertise is in pesticide analysis, a tightly regulated segment of the agricultural sector that is designed to protect the consumer from high levels of potentially harmful chemicals. ‘Mass spectrometry is applied to assess the levels of multiple pesticide residues and ensure that legally defined limits are not exceeded. Such testing, which is often carried out by contract laboratories, encompasses a two-stage process that involves an initial mass spectroscopic screening to check for levels of a number of pesticides in potentially hundreds of thousands of samples sent to the laboratory. Individual samples that generate abnormal readings are then automatically flagged and taken through additional testing using much higher-resolution mass spectroscopic analysis.’

Instructing pesticide analysis
Thermo Fisher provides both the analytical instrumentation and the software to manage this process. ‘Our LIMS system drives the overall testing workflow, flags anomalous results, and directs the initiation of further testing. Dedicated software instructs the testing system about the pesticides to search for, and what the reference levels are for each pesticide. We also offer a laboratory execution manager (LEM) that sits within the LIMS to take laboratory personnel through the testing process step by step. The data management solution (DMS) ensures that every piece of information and data point is trackable, and can be retrieved for interrogation or review at a later date.’
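The screen-then-confirm logic Joyce describes can be pictured with a minimal Python sketch. It is not Thermo Fisher's software: the function name, the data layout, and the reference limits are all hypothetical, and the limit values are placeholders rather than regulatory figures. Each screened sample is compared against per-pesticide reference levels, and any sample that exceeds one is queued for the higher-resolution second stage.

```python
from dataclasses import dataclass

# Hypothetical reference limits in mg/kg. Placeholder values for
# illustration only; these are not real regulatory limits.
REFERENCE_LIMITS = {
    "pesticide_a": 0.05,
    "pesticide_b": 0.01,
}


@dataclass
class ScreeningResult:
    sample_id: str
    residues: dict  # pesticide name -> measured level (mg/kg)


def flag_for_confirmation(results, limits=REFERENCE_LIMITS):
    """Return {sample_id: [pesticides over limit]} for stage-two testing."""
    flagged = {}
    for result in results:
        exceeded = sorted(
            name
            for name, level in result.residues.items()
            if name in limits and level > limits[name]
        )
        if exceeded:
            flagged[result.sample_id] = exceeded
    return flagged


# Stage one: a small batch of initial screening results.
batch = [
    ScreeningResult("S-001", {"pesticide_a": 0.02, "pesticide_b": 0.005}),
    ScreeningResult("S-002", {"pesticide_a": 0.09, "pesticide_b": 0.004}),
    ScreeningResult("S-003", {"pesticide_a": 0.01, "pesticide_b": 0.03}),
]

# Stage two would run only on the samples collected here.
confirmation_queue = flag_for_confirmation(batch)
```

Keeping the limits in a table separate from the flagging logic mirrors the division described above, where dedicated software holds the reference levels and the LIMS decides which samples go forward.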
Joyce believes that what sets Thermo Fisher’s informatics solution apart is the ability to convert data coming out of disparate instrumentation into a human-friendly XML format. ‘It may be necessary to re-evaluate mass spec data from individual samples many years down the line, but this can be nigh-on impossible if the software that was originally used to derive the mass spectrometry data becomes obsolete,’ he states.
‘Our DMS can convert experimental data from a wide range of instrumentation from multiple vendors into the same XML format. This means you can overlay and compare mass spectrometry and chromatography data directly, and, if required, revisit historical data to look for the presence of compounds that may not have been the focus of the original test run. You can’t do that direct comparison with software-locked graphical data or PDFs, for example.’
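Why a shared, open format makes this kind of comparison possible can be shown with a short Python sketch. The XML layout here (a `trace` element holding `point` elements) is invented for illustration and is not the format used by Thermo Fisher's DMS; the instrument names and readings are likewise made up. Once two runs are stored in the same structure, they can be read back and lined up without the original acquisition software.

```python
import xml.etree.ElementTree as ET


def trace_to_xml(instrument, points):
    """Serialise (x, y) points into a hypothetical vendor-neutral XML trace."""
    root = ET.Element("trace", instrument=instrument)
    for x, y in points:
        ET.SubElement(root, "point", x=repr(x), y=repr(y))
    return ET.tostring(root, encoding="unicode")


def xml_to_points(xml_text):
    """Read the (x, y) points back out of a serialised trace."""
    root = ET.fromstring(xml_text)
    return [(float(p.get("x")), float(p.get("y"))) for p in root.iter("point")]


def overlay(xml_a, xml_b):
    """Pair up the y values at x positions present in both traces."""
    a = dict(xml_to_points(xml_a))
    b = dict(xml_to_points(xml_b))
    return {x: (a[x], b[x]) for x in sorted(a.keys() & b.keys())}


# Two runs from different (made-up) instruments, stored in the same format.
old_run = trace_to_xml("vendor_a_ms", [(100.0, 5.0), (150.0, 12.0), (200.0, 3.0)])
new_run = trace_to_xml("vendor_b_ms", [(150.0, 10.0), (200.0, 4.0), (250.0, 8.0)])

shared = overlay(old_run, new_run)
```

Real instrument data would rarely share exact x positions, so a practical comparison would need binning or interpolation. The sketch only shows that the comparison reduces to ordinary data handling once the format is common.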

Scalability and flexibility
Informatics is a core component of the R&D process at DuPont Pioneer, states Jochen Scheel, information management director for Trait Informatics. ‘Managing data, enabling workflow process efficiency, processing and analysing data, and decision support are four main areas of informatics contributing to our agriculture research and product development lifecycle. Requirements include scalable production systems – software, data processing – as well as flexible discovery tools and advanced analytics.’
DuPont Pioneer is a global developer and supplier of advanced plant genetics, providing agronomic support and services to help increase farmer productivity and profitability. The firm’s products are supported by state-of-the-art informatics technologies to enable company-wide information sharing and to increase efficiency. Pioneer’s informatics infrastructure is a mixture of custom solutions developed in-house, commercial tools, and the results of collaborations with leading experts, Scheel explains. ‘Our “must haves” vary for different systems, but some recurring topics for R&D informatics include agility, being able to develop, deploy, and change solutions rapidly and in phases; scalability, an upgrade path to being able to handle large volumes of data, processes, or users; and adaptation and integration, the ability to adapt solutions to specific R&D needs and integrate with other systems.’

Differential requirements for plant breeding and biotech
The vast majority of requirements for R&D informatics are conceptually similar across the agricultural sciences, he says. However, there are some differences between informatics for plant breeding and biotechnology approaches. ‘One difference is the level of centralisation of research. In biotech, much of the research is done in central laboratories and controlled environments, whereas in plant breeding most of the research is done at breeding stations that are dispersed and often in remote locations with limited network connectivity. This has profound implications for informatics infrastructure. Another difference is the need for understanding gene function which is high in biotech R&D but often less required for plant breeding, which makes for significant differences in informatics requirements.’
Technology used in R&D has also had a dramatic effect on the evolution of informatics. The huge volumes of data now generated mean that the scalability of systems has become even more important and requires the most advanced concepts and technologies and the competencies to develop, deploy, and support global solutions. Another implication for informatics is the increasing focus on advanced analytics and decision support, Scheel comments. ‘Development and application of quantitative models and predictive classifiers are as important in agriculture R&D to distil information from data and enable decisions as they are in any high-tech industry.’

Managing bottlenecks in the cloud
The agricultural sciences sector represents a major client base for Indiana-based GoInformatics, which offers a truly cloud-based knowledge management solution, GoR&D, for R&D groups in industries spanning animal health/nutrition, medical devices, manufacturing, food and beverage, pharma, biofuels and contract research organisations. Founded in 2010, the firm’s GoR&D cloud platform includes ELN, LIMS, and project and resource management solutions.
Tony Stearns, director, national accounts at GoInformatics, reiterates that problems associated with data-flow bottlenecks can occur because of the field-based nature of R&D within the agricultural industry. ‘That information flow, and ability to seamlessly track and transfer information between R&D groups, project managers, and decision makers, is one of the issues for the agricultural industry that cloud-based solutions can address. Data logged on mobile devices in the field, whether it be metadata, biological samples, or crop or animal physiological, growth, and yield changes, can be captured directly into our cloud platform, and accessed immediately by research teams anywhere in the world. The mobile nature of our platform also facilitates barcode scanning for easily tracking samples and photographing to catalogue visual reactions or physical results.’

GoInformatics has built its cloud solution from the ground up. ‘Whether the company is represented by a small laboratory or has multiple sites globally, GoR&D offers a platform that can be accessed enterprise-wide, without the need for costly implementation of traditional, on-premise LIMS or other software,’ adds Juan Medina, director of business development. ‘Providing a dedicated ELN and LIMS that facilitates the integration of other systems and instruments into the platform, GoR&D also offers comprehensive document management, reporting, security and accountability, while retaining a log of all activities performed, to keep data and information fully traceable.’
Moving forwards, GoInformatics is working to build increased analytical capabilities into its platform. ‘We do provide analytics for clients on a case-by-case basis, but our clients in the agricultural and other sectors are asking for analytics that will allow them to make increased use of their current and historical data in the cloud,’ Stearns notes. ‘Our move to build these new capabilities into the cloud platform represents a key strategic step forward for us.’

Seed testing
Wisconsin-based BioDiagnostics offers a suite of testing services, including genetic testing, seed treatment and trait analysis, to seed producers, seed retailers, plant breeders and companies that assess the quality of seed, grain and oil. The firm operates seven laboratories that use different technologies for disparate testing services, and the challenges of outsourced testing in this field are largely centred on rapid submission of test requirements and samples, testing turnaround time, data security, and communication between BioDiagnostics and its clients, explains the firm’s vice president, Denise Thiede.
‘The results of our testing can be vital for clients who may be conditioning seed, bagging seed and moving it to where it is to be planted. They need a rapid and easy-to-use platform for submitting test requests, and once they have submitted the samples for testing they want to know when they can expect to receive the results, which may directly affect business decisions.’ To this end, BioDiagnostics has implemented a web-based platform that allows the client to submit its testing requests online, generate labels for sample submission, and even plan a whole season’s testing programme and upload it into the BioDiagnostics system in advance. ‘We provide an organisational tool to help clients negotiate what can be a very complicated testing environment,’ Thiede notes. ‘The system is designed to provide guidance on the types of testing that the client may require for its products, along with the information that they need to provide to accompany the samples. The client can upload and complete a spreadsheet and then send this directly to our database prior to the samples arriving at our laboratory for testing.’

Results in real time
Two of the major informatics challenges for BioDiagnostics revolve around giving its clients the ability to track the progress of a testing programme in real time, and to upload test results directly from the platform. ‘At present we can give our clients an estimated time for test completion, but what they really want is the ability to see the progress of their testing programme and the results, directly,’ Thiede says. ‘We are in the process of building this capability into our LIMS infrastructure, so that clients can also access their test reports online, and have the results data available for export into other formats, such as Excel, rather than rely on the data being sent by email or in the mail in hard copy format. This facility would also mean that clients should no longer have to store their reports internally, and will allow them to access historical data directly on our system. Ultimately, we aim to have a platform in place that will allow customers to track their testing online, see the analytical data as well as the reports that we generate, and access their information from mobile devices. It’s a move that will increase efficiency both for us as the service provider, and for the client, who will be able to more seamlessly request their tests, access and make decisions on the results, and provide data requested by the regulator.’

Centralising disparate data
The primary advantage of centralising or federating disparate data is that data can be analysed and correlated to extract key indicators that will enable researchers to make holistic decisions, notes Alister Campbell, head of application science at Dotmatics.
The UK-based firm works primarily with the R&D side of the agricultural sciences sector, including chemistry, formulation and scale-up, as well as with managers and executive teams who use the platform to monitor the progress of R&D programmes and make projections.
‘As we see it, the main bottlenecks of traditional agrochemical companies include a lack of consistency in data recording, which hinders the possibility of finding correlation or trends, along with poor information-sharing across silos, and the extent of time and manpower it takes to query data and reporting when results are sourced from multiple experiments or projects,’ Campbell explains.
‘Streamlining communication is key to the success of modern research in agrochemicals. Many large companies in this sector have implemented informatics systems that cover their research processes, but many organisations of all sizes still rely heavily on pen and paper for recording data.’

Supporting the paperless lab
The Dotmatics Platform is a comprehensive web-based system that includes the Studies Notebook ELN, which offers end-to-end functionality for storing, querying and analysing R&D data. Central to the platform is Browser, which is a configurable web-based system to query, report and collaborate. This can be integrated with Vortex, an intuitive and versatile data visualisation and analysis solution that can act as an alternative to spreadsheets. It provides the plots and functionality required to explore and understand any complexity and size of data. One of the pivotal tools related to implementing a paperless environment is Cascade, which manages laboratory tasks and enables tracking of the progression of samples through their complete life cycle, from work request to testing and processing results.
‘Organisations are now realising that having a comprehensive informatics platform that enables laboratory personnel to work in a paperless fashion is no longer an option, but a must,’ Campbell stresses. ‘Most agrochemical companies share a common desire to become smarter by implementing laboratory environments that are completely digitalised. The focus is on the ability to query, analyse, report, and extract new knowledge from diverse experiments as well as data from multiple projects.’