FEATURE

Knowledge: Data analytics

This chapter takes the theme of knowledge management beyond document handling into the analysis and mining of data. Technology by itself is not enough – laboratory staff need to understand the output from the data analysis tools – and so data analytics must be considered holistically, starting with the design of the experiment.

Data analytics is the process of analysing and visualising data in order to draw conclusions and build understanding. It is becoming increasingly important as laboratories have to process and interpret the ever-increasing volumes of data that their systems generate.

In the laboratory, the primary purpose of data analytics is to verify or disprove existing scientific models, and so provide a better understanding of the organisation’s current and future products or processes.

Data mining is a related process that uses software to uncover patterns, trends, and relationships within data sets. Although data analytics and data mining are often mentioned together, typically in connection with ‘Big Data’, they have different objectives.

Data mining can broadly be defined as a ‘secondary data analysis’ process for knowledge discovery: it analyses data that may originally have been collected for other reasons. This differentiates it from data analytics, where the primary objective is either exploratory data analysis (EDA), in which new features in the data are discovered, or confirmatory data analysis (CDA), in which existing hypotheses are tested and either supported or rejected.
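The difference between EDA and CDA can be illustrated with a minimal sketch. The measurements, the reference value of 10.0, and the function names below are all hypothetical and chosen purely for illustration. The confirmatory step uses a one-sample t-test as one possible choice of test, and compares the statistic with the tabulated two-sided 5% critical value for nine degrees of freedom.

```python
import math
import statistics

# Hypothetical replicate assay results (mg/L); illustrative values only.
measurements = [10.2, 9.8, 10.1, 10.4, 9.9, 10.3, 10.0, 10.2, 10.5, 10.1]


def explore(data):
    """EDA: summarise the data without any prior hypothesis."""
    return {
        "n": len(data),
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "stdev": statistics.stdev(data),  # sample standard deviation
        "min": min(data),
        "max": max(data),
    }


def one_sample_t(data, hypothesised_mean):
    """CDA: t statistic for the null hypothesis that the true mean
    equals hypothesised_mean."""
    standard_error = statistics.stdev(data) / math.sqrt(len(data))
    return (statistics.mean(data) - hypothesised_mean) / standard_error


summary = explore(measurements)

# Assumed reference value for the hypothesis being tested.
t_stat = one_sample_t(measurements, hypothesised_mean=10.0)

# Two-sided 5% critical value of Student's t for 9 degrees of freedom.
T_CRITICAL_DF9 = 2.262
reject_null = abs(t_stat) > T_CRITICAL_DF9

print(summary)
print(f"t = {t_stat:.3f}, reject null hypothesis at 5%: {reject_null}")
```

The exploratory function only describes what is in the data, whereas the confirmatory function needs a hypothesis stated in advance, which is one reason the analysis is best planned alongside the experiment itself.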

In recent years, some of the major laboratory informatics vendors have started to offer data analysis and visualisation tools within their product portfolios. These tools typically provide a range of statistical procedures to facilitate data analysis, and visual output to help with interpretation. Alongside these integrated tools, a growing number of vendors offer generic software that can extract and process data from anything from a single simple system to multiple platforms and formats. The benefit of integrated data analysis tools is that they provide seamless access to data, eliminating concerns about incompatible data formats.

As with any other laboratory software, defining functional and user requirements is an essential step in making the right choice. Key areas to focus on are that the tools have appropriate access to laboratory and other data sources; that they provide the required statistical procedures; and that they offer presentation and visualisation capabilities consistent with broader company preferences and standards.

Data analytics plays an important role in the generation of scientific knowledge and, as with other aspects of ‘knowledge management’, it is important to understand the relationship between technology, processes, and people. In particular, staff need to have the appropriate skills to interpret, rationalise, and articulate the output presented by the data analysis tools. To take full advantage of data analytics, laboratories should treat it as part of a holistic process that starts with the design of the experiment.

A quote attributed to Sir Ronald Fisher, ca 1938, captures this point: ‘To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: He may be able to say what the experiment died of.’
