What is the scientific opportunity cost of fragmented research data?

What are the challenges and opportunities around scientific data management? (Credit: Sapio Sciences)

Rob Brown, PhD, Global VP of Product and Pre-Sales at Sapio Sciences, takes a closer look at the challenges and opportunities around scientific data management, exploring the common obstacles to managing scientific data effectively and the benefits of taking a science-aware approach to consolidating it.

There is a subtle irony in scientific technology adoption. On one hand, the scientific community is predicated on progress—on asking questions and doing things differently. This is reflected in the current drug pipeline, a cornucopia of breakthrough therapeutic types focused on targeting more effectively, minimising side effects, and treating different indications. But IT and data solutions in labs often only change when systems break or fail to support new research requirements. For members of the scientific community, this irony makes sense. Disruption can significantly impact the speed of research. But what if a system or process is not broken, but not optimal?

One significant example of a functional yet sub-optimal approach is scientific data management. While many organisations have tried to solve the problem with data lakes and warehouses, their approaches to data management have remained laborious at best and all-consuming at worst. But pharma and biotech organisations have been getting by on a combination of unification tools and manual analysis for decades. Why change now?

If research IT’s ultimate charter is to support and accelerate science, then data management should be an urgent priority, not because it is broken, but because of the unrivalled scientific value of data and the opportunity cost of not fully exploiting it. Many scientists spend a disproportionate amount of their time on mundane, data-related tasks, which directly reduces the time they spend on research. Research IT is similarly consumed by data-related functions, such as integrating disparate applications or creating parsers for every instrument. This too negatively impacts science, as IT has less availability to support project-specific needs and explore innovative technologies that enhance life in the lab.

Quantifying the opportunity cost

How significant is the opportunity cost of fragmented data? The best way to play this out is with an illustration. Consider a hypothetical project scientist who can access assay data from his organisation’s data warehouse, but must search elsewhere for pharmacokinetic data. He then spends hours preparing the data for upload into his preferred modelling application. Let’s assume that, collectively, this tedious pattern takes 12-14 hours a week, or 30-35% of a 40-hour workweek. That is roughly one in three scientific hours that could be spent on research, experimentation, and progress.
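The arithmetic behind that estimate can be checked with a minimal sketch. It assumes a 40-hour workweek, which is the figure implied by the 12-14 hour and 30-35% numbers above; the function name `share_of_week` is purely illustrative.

```python
def share_of_week(hours_lost: float, week_hours: float = 40.0) -> float:
    """Return hours_lost as a fraction of the working week.

    week_hours defaults to 40, an assumption implied by the article's
    figures rather than a number it states.
    """
    if week_hours <= 0:
        raise ValueError("week_hours must be positive")
    return hours_lost / week_hours


# The hypothetical scientist's 12-14 hours of weekly data preparation.
low = share_of_week(12)
high = share_of_week(14)
print(f"{low:.0%} to {high:.0%} of the workweek")
```

Changing `week_hours` shows how sensitive the "one in three hours" framing is to the length of the working week assumed.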

Research IT teams are also negatively impacted by data fragmentation. Developing a custom integration between source data systems can take weeks. Multiply that by the number of LIMS, electronic lab notebooks, and other sources they need to integrate with, and the aggregate consumption of IT resources is staggering, reducing the time available for project support, infrastructure, and innovation.

Re-framing the solution

Industry solutions for scientific data have often taken an IT-first approach. If data is fragmented, why not put it in a data lake? But while this approach consolidates data, it can’t deliver the tangible scientific insight that helps drive research forward.

To solve the scientific data problem, we need to remember what research IT is about: advancing science. When we re-frame the problem through a scientific lens, it is easy to see that research data challenges are as much about utilisation, collaboration, and insight as they are about collection and unification. Solving the problem requires us to unify data from diverse sources while bringing rich data visualisation and analytics to the scientist, in the same place where the data already lives.

This means equipping scientists with uninhibited access to their data with full scientific context, supporting seamless searchability, and enabling rich analysis and visualisation in one experience. At Sapio, we call this approach science-aware™.

Charting the impact

The impact of a scientific data cloud that addresses both unification and utilisation, while enabling seamless scientific collaboration, is profound. Eliminating manual data preparation allows scientists to reclaim a significant portion of their workweek for research-related tasks. And research IT teams are no longer responsible for developing and maintaining custom parsers, freeing them to support scientists’ needs more effectively and pursue new technologies that enhance discovery.

At an organisational level, data errors are dramatically reduced by eradicating potential points of failure. Redundant experimentation is virtually eliminated by enabling centralised access to historical data and experiments. Most importantly, the organisation can finally transform the tremendous opportunity cost of fragmented data into tangible value.

If you’re interested in learning more about how a science-aware approach to scientific data management can empower your scientific and IT teams, contact Sapio.

