Case study: How AstraZeneca is overcoming obstacles to analytical data access

AstraZeneca workflow

As the Director of Structural Chemistry in AstraZeneca’s Oncology R&D division, Nichola Davies recognised that analytical data was difficult for scientists to find and access. This led to inefficiencies and duplication of effort.

Existing methods for sharing data were designed for individual scientists. Data was exchanged via emails with links and attachments. Finding historical data, or that acquired by others, involved numerous applications and steps, and the process was different for each function. Accessing data generated by colleagues in another location was even more difficult.

Ineffective knowledge transfer between discovery and development

The initial synthesis of a drug product occurs years before it transitions into development. Analytical teams in discovery collect volumes of data to understand structure and develop chromatographic methods for purification. Unfortunately, this work is often lost in the transition to development. Analytical teams in development regenerate analytical data, reassign spectra, and redevelop chromatographic methods, because organisations lack effective analytical data management that would allow them to leverage the data and knowledge acquired in discovery. At best, they rely on personal networks and email, making data accessibility siloed, inconsistent, and time-consuming.

Davies and a global team, which included Richard Lewis, Principal Scientist, R&I; Prakash Rathi, Augmented Drug Design Engineering Lead; and John Ulander, Principal Scientist, Data Science and Modelling, recognised this was true for R&D at AstraZeneca and decided to be the changemakers.

Goals – breaking down barriers to data access:

  • Create a centralised, cloud-based solution to make analytical data accessible to all functions
  • Store raw and processed analytical data that is accessible and live (immediately reusable)
  • Make analytical data available for the development of machine learning models.

AstraZeneca’s analytical data management strategy

Groups at AstraZeneca were using applications on the Spectrus Platform from ACD/Labs. The software was ingrained in analytical workflows, where it was used to analyse and interpret NMR and LC/MS data and to manage analytical knowledge, primarily in development.

Embarking on the Global Analytical Database (GAD) project, the team decided to capitalise on its global licences of Spectrus applications and use ACD/Labs services to integrate hardware and software, automate workflows, and fully realise its vision of centralised, accessible analytical knowledge.

A pilot project for analytical data management was implemented in 2016 to support Oncology Chemistry workflows across the UK and US teams. Following its success, the deployment grew into a global project covering all therapeutic areas and CRO partners, deployed broadly across AstraZeneca.

AstraZeneca’s global analytical database (GAD)

Workflow for automated analytical data management at AstraZeneca:

  • Collect raw NMR and LC/MS data (from 92 analytical instruments, and expanding)
  • Extract metadata and add structures from the ELN or registry
  • Process the data according to datatype and create a database record in the cloud-based, searchable repository
  • Copy raw instrument data files to archive for regulatory and IP purposes.
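
The four workflow steps above can be illustrated with a minimal Python sketch. It is not AstraZeneca's or ACD/Labs' implementation. The file naming pattern, the `REGISTRY` lookup, the `PROCESSING_STEPS` table, the `Record` fields, and the `ingest` function are all hypothetical names chosen for illustration, and an in-memory list stands in for the cloud-based repository.

```python
import shutil
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

# Hypothetical stand-in for an ELN or registry lookup: sample ID -> structure (SMILES).
REGISTRY = {"AZ-0001": "CCO"}

# Hypothetical processing steps per datatype; real processing would run in analytical software.
PROCESSING_STEPS = {
    "NMR": ["fourier_transform", "phase_correction", "peak_picking"],
    "LCMS": ["baseline_correction", "peak_detection", "mass_extraction"],
}


@dataclass
class Record:
    """One searchable database entry (fields are illustrative assumptions)."""
    sample_id: str
    technique: str
    instrument: str
    structure: Optional[str]
    processing: list
    archived_raw: Path


def extract_metadata(raw_file: Path) -> dict:
    """Parse metadata from an assumed file name pattern: <sample>_<technique>_<instrument>.raw"""
    sample_id, technique, instrument = raw_file.stem.split("_")
    return {
        "sample_id": sample_id,
        "technique": technique.upper(),
        "instrument": instrument,
    }


def ingest(raw_file: Path, database: list, archive_dir: Path) -> Record:
    """Run one raw file through the collect -> annotate -> process -> archive steps."""
    meta = extract_metadata(raw_file)
    # Add the structure from the registry; None if the sample is not registered.
    structure = REGISTRY.get(meta["sample_id"])
    # Choose processing by datatype; an unsupported datatype raises KeyError.
    steps = PROCESSING_STEPS[meta["technique"]]
    # Copy (not move) the raw file to the archive so the original stays in place.
    archive_dir.mkdir(parents=True, exist_ok=True)
    archived = Path(shutil.copy2(raw_file, archive_dir / raw_file.name))
    record = Record(
        sample_id=meta["sample_id"],
        technique=meta["technique"],
        instrument=meta["instrument"],
        structure=structure,
        processing=list(steps),
        archived_raw=archived,
    )
    database.append(record)
    return record
```

Calling `ingest(Path("AZ-0001_nmr_spec01.raw"), database, Path("archive"))` on an existing file would append one record to `database` and leave a copy of the raw file under `archive/`. Keeping the archive copy separate from the processed record mirrors the split described above between the searchable repository and the raw files retained for regulatory and IP purposes.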

AstraZeneca’s GAD is used by hundreds of analytical, medicinal, and computational chemists in sites across the world.

Benefits that are being realised:

Consistent data

No matter which instrument generated the data, consistent metadata and processing make it easy to find, which makes data comparisons possible and data reuse easier.

Available data

Analytical data is available minutes after it has been acquired and searchable for users to find, regardless of where it was acquired. It is also accessible to automation workflows.

Time savings

Accessing historical data for patent and publication writing is quick and easy, and no longer requires contacting analysts for retrieval.

Easy, standardised reporting

Everyday reporting of data (NMR, MS, and analytical UHPLC) has been standardised across sites. Publishing results is faster than ever because scientists can gather all the data they need with a few mouse clicks.


Colleagues in downstream workflows no longer need to repeat experiments – project data is easily accessible for refinement and further investigation.


Scientists can pull trends from data that would be difficult to identify without the large volume of standardised data now available.

An AI outlook 

Standardised and contextualised analytical data is organised and accessible, beyond anything previously available, for data science and machine learning. 

Teams at AstraZeneca are investing heavily in methods to characterise, understand, and predict data, leveraging the ability to quickly build an understanding of relationships between data and properties – insights that would otherwise require more rigorous experimentation or be unavailable.

The future of AstraZeneca’s Global Analytical Database may be to supply data to pattern recognition algorithms that would make the traditional interpretation of spectra and chromatograms that scientists undertake today a thing of the past.

For more information, go to
