Pistoia Alliance SEED project unlocks the value of data in Electronic Lab Notebooks

Credit: MongkolchonAkesin/Shutterstock

The Pistoia Alliance, a global, not-for-profit alliance that advocates for greater collaboration in life sciences R&D, has announced the second phase of its Semantic Enrichment of ELN Data (SEED) project. 

The SEED project aims to unlock valuable scientific data buried in Electronic Laboratory Notebooks (ELNs) using semantic tagging and relationship mapping tools. This will help scientists tackle a challenge facing research and development: vast volumes of captured experimental data are locked in ELNs.

These unusable and unsearchable data sets are a significant barrier to digital transformation, resulting in duplicated experiments and time spent tracking down and wrangling data. Phase 2 of the SEED project will build on the success of phase 1 by developing relationship mapping and a prototype of an agnostic solution for semantic enrichment for use by ELN vendors. This solution will make ELN data searchable and reusable by semantically enriching free text in ELNs with metadata for every relevant term, unlocking its value for future analysis and aligning it with the FAIR (findable, accessible, interoperable, reusable) principles.

Gabrielle Whittick, project leader and consultant at the Pistoia Alliance, comments: ‘Currently, pharmaceutical companies can only use a limited amount of valuable data held in ELNs, even though semantic technology is available. This situation has to change, and the Pistoia Alliance is in the unique position of being able to bring together multiple large pharma companies pre-competitively to address the challenge. This kind of cross-industry collaboration is made possible under the umbrella of the Pistoia Alliance as every member can contribute with their experience and knowledge gained and this shared input drives improvements that the whole community can benefit from. The SEED project will annotate and enrich the text to make it searchable – offering the possibility of uncovering new insights that can accelerate drug discovery and lead to new innovations. We are now calling for more companies to get involved in and provide funding for the second phase of the project so we can scale up our work.’

Phase 1 of the project developed new standard assay ontologies for ADME, PD and drug safety, which have now been added to the BioAssay Ontology (BAO), an open-source database of common assay metadata terms and definitions, and are freely available to the life science community. The project contributors include Pfizer, AstraZeneca, Bristol Myers Squibb, SciBite, Bayer, Biogen, Southampton University, GSK, CDD, Elsevier, Linguamatics, Merck, Sanofi, and Takeda.

Pistoia Alliance member Sanofi is already realising value from the project and plans to align its ADME assay metadata with the new ontology classes added by the SEED project to BAO. This will make Sanofi’s assay data compliant with the FAIR principles.

Steve Penn, SEED project champion and medicinal sciences information strategy lead at Pfizer, noted: ‘The driving motivation behind the initiation of the SEED project is to create a set of open standards for structuring ELN data across pharma and the life sciences. Delivery of Pharmacokinetic-Pharmacodynamic (PK/PD) and drug safety assay standards has been a tremendous start and of exceptional benefit across the many partners involved, as well as for those yet to join.

‘We are now looking to increase the benefit across pharma, incorporating additional data in the form of attributes, mappings and annotations to create relationships between ontology classes to help to describe and define them. The relationships formed between objects within an ontology and to other ontologies and/or standards form a framework for the creation of a graph ontology/knowledge graph, enabling a plethora of opportunities associated with usage of these data standards, both for legacy and go-forward data.’