Why FAIR data accelerates research
Since the creation of the first digital computer, chemical research has experienced an explosion of data from multiple sources. Data is routinely collected from published papers, instruments, databases, and experiments, and then stored in multiple formats, often unstructured, in separate locations.
Today’s research involves huge quantities of data, all of which demands considerable time to upload and organise. Researchers can then spend almost as much time, if not more, trying to find information for analysis as they do actually working on the results.
Fundamentally, if researchers could collect, locate, and manage data more effectively, how much more time could be devoted to analysing, discovering, and innovating?
Many research institutes and departments have completed part of the journey to digitalisation. Data has moved from paper notebooks to spreadsheets and databases. But although data is held electronically, it is not easily shared with teams or integrated into research workflows.
If researchers rely on separate logins to databases, file servers, SharePoint, Dropbox, Azure, and AWS, as well as commercial resources, then finding and validating data can be time-consuming and often tedious, severely constraining research productivity.
In addition, the sheer volume of data can bring its own challenges. When faced with potentially millions of table rows, understanding data, identifying patterns, and pinpointing significant results may not be humanly possible.
Data can just as easily be isolated at the application level. Moving or linking data between applications, and ensuring the validity of sources, can also absorb time and resources that could otherwise be devoted to research.
And, of course, data can be more than numbers in tables. There may be no easy way to store and share information such as notes on procedures, and often unstructured collaboration relies on external services, such as Airtable, Google Docs, Google Sheets, SharePoint and more. In turn, these create their own silos, with another layer of administration, complexity, and cost.
First introduced in 2016, the FAIR data principles of findability, accessibility, interoperability, and reusability aim to address many of these issues. The principles emphasise machine-actionability: data should be described well enough that software can find and use it with little or no human intervention. This takes advantage of the exponential growth of computing and storage capacity and, perhaps more importantly, allows data curation tasks to be automated wherever possible.
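To make machine-actionability a little more concrete, the short Python sketch below shows how software might check a dataset description against the four principles. It is an illustration only: the DatasetRecord fields, the mapping of fields to principles, and the fair_gaps helper are all hypothetical choices made for this example, not part of the FAIR specification or of any particular product.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DatasetRecord:
    """Hypothetical metadata record; the field names are illustrative only."""
    identifier: str = ""    # persistent identifier such as a DOI (findable)
    keywords: List[str] = field(default_factory=list)  # search terms (findable)
    access_url: str = ""    # where the data can be retrieved (accessible)
    media_type: str = ""    # open, documented format (interoperable)
    licence: str = ""       # terms of reuse (reusable)
    provenance: str = ""    # how the data was produced (reusable)


# Assumed mapping from each principle to the fields that support it.
CHECKS: Dict[str, List[str]] = {
    "findable": ["identifier", "keywords"],
    "accessible": ["access_url"],
    "interoperable": ["media_type"],
    "reusable": ["licence", "provenance"],
}


def fair_gaps(record: DatasetRecord) -> Dict[str, List[str]]:
    """Return, per principle, the fields that are still empty."""
    gaps: Dict[str, List[str]] = {}
    for principle, names in CHECKS.items():
        missing = [name for name in names if not getattr(record, name)]
        if missing:
            gaps[principle] = missing
    return gaps


# Example record with a placeholder identifier and URL.
record = DatasetRecord(
    identifier="doi:10.0000/example",
    keywords=["solubility"],
    access_url="https://example.org/data/solubility.csv",
    media_type="text/csv",
)
print(fair_gaps(record))  # {'reusable': ['licence', 'provenance']}
```

A check like this is deliberately simple. Its value is that it can run automatically whenever data is saved, so gaps in the description are caught by software rather than by a researcher months later.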
Research productivity is all too easily dragged down by disparate applications, ad hoc data-sharing, and data that does not conform to FAIR principles. Yet, following the concept of marginal gains, small steps in the right direction combine to deliver significant improvements.
For example, imagine that data is available to all your applications, from experiment to analysis. Or that results are visible to all team participants, automatically. Or that all notification and commenting happens in an electronic notebook. Implementing just one of these changes could save each researcher up to a week per year simply by cutting out tedious manual work.
Attaining ‘data nirvana’ does more than simply eliminate administrative tasks. Better insights and analysis will enable better decisions, more quickly. If you currently run one project a month, you could release enough time and energy to run an additional project each year.
Modern cloud-based solutions are highly effective enablers of the new digital world, with fully integrated, intuitive working environments. Data is accessible to every application, collaboration notes are available to every user, and results are presented for drill-down analysis by any researcher.
Whether it is working with new molecules, designing new processes, or figuring out scalable production, almost every aspect of research relies on data. Taking a fully strategic approach to data will both deliver the tangible outcome of greater productivity through time saved, and enable innovation by freeing researchers to focus on insight and analysis. A coherent data strategy, integrated applications, and powerful collaboration tools are essential enablers for research productivity. And in the end, establishing the right data strategy is part of the research endeavour itself: the desire to make the world a better place.