
A question of confidence

In April 2016 the FDA issued for comment a new draft guidance for the pharmaceutical industry to help clarify ‘the role of data integrity in current good manufacturing practice (cGMP) for drugs’. The guidance has been drafted because, as the US drug regulator states, over recent years its inspections have uncovered increasing numbers of cGMP violations in which data integrity is involved. 

Put politely, the FDA is witnessing more cases of the pharmaceutical industry being less careful than it should be, when it comes to putting in place measures to secure and ensure the integrity of its data. It’s a serious issue, because decisions based on the wrong data can ultimately allow the release of medicines that aren’t safe. Drugs approved on the back of inaccurate trials data, for example, may even constitute a risk to patients’ lives. 

Real-world consequences

Coincidentally, not long after the FDA issued its latest draft guidance on data integrity, a relevant scenario hit the headlines, when the European Medicines Agency announced that it was to start reviewing all nationally approved drugs in Europe for which studies had been carried out by contract research organisation (CRO) Semler Research Centre in Bangalore, India.

The EMA review will follow up on serious concerns raised by both the FDA and the World Health Organisation (WHO) about the integrity of Semler’s data and the possibility that samples at the organisation’s bioanalytical and clinical sites had either been manipulated or substituted. The FDA and WHO questioned the reliability of Semler-generated data that would have been used to support national drug marketing applications and authorisations in individual EU countries, and the European drug regulator will now have to investigate each new national drug application and approval for which Semler data was included.

Data integrity refers to the completeness, consistency and accuracy of data. It’s a definition that FDA had already referred to more than 20 years ago, explains Daniela Jansen, senior product marketing manager at Dassault Systèmes Biovia. To be met, the definition also requires that the complete, consistent and accurate data should be attributable, legible, contemporaneously recorded, original or a true copy and accurate (ALCOA). ‘It’s by no means a new topic, but it is worth revisiting, particularly in light of FDA’s admission that it is seeing increased incidences of data integrity-related cGMP violations.’

‘While discussions around data integrity are often tethered to the pharmaceutical industry, the imperative to maintain the integrity of data is just as relevant to other sectors, including the food, environmental, chemicals, contract research and manufacturing industries,’ Jansen notes.

From the perspective of the regulators and industry itself, data integrity is ultimately about protecting public health and safety, whether you are talking about a new drug, or a child’s toy. ‘To be sure that products are safe, we need to have test data that fulfil that integrity definition of complete, consistent and accurate, and that are secure and traceable with no possibility of manipulation, whether that manipulation involves accidental or intentional modification, falsification or deletion.’ 

Locked into product life cycle

Data integrity should also be tied to data quality, Jansen adds: ‘The two go hand in hand; being able to verify data integrity means little if your data lacks quality or accuracy to start with. That quality will, to a large extent, depend on your instruments and the personnel who are operating them. You need to make sure that instruments are correctly maintained, calibrated and fit for purpose, that your experiments have been designed to generate the right type of data and that your personnel have been trained properly to execute the experiments.

‘Tracking this supporting information and ensuring that it correctly represents the real world to which it refers is just as important as the analytical data that you are looking to derive.’

Trickier management

Managing and maintaining such huge amounts of data and its integrity becomes trickier when you want to collate data from multiple and disparate sources in a single environment, Jansen admits: ‘Scientific sector organisations need to consider open platforms that will meet data security and validation requirements, but which will interface directly with the data sources and the data-consuming applications from multiple vendors, without the need to create a separate gigantic data warehouse. On top of this they should be able to handle and manage contextually disparate data formats that are generated across multiple disciplines.’

Opportunities for error

In an ideal world, raw data will be captured from every piece of instrumentation and transferred directly into a secure electronic system, from where it can be reported in appropriate formats or shuttled between systems for further analysis, without any manual input, notes Graham Langrish, sales manager for life sciences at LabWare: ‘In reality, however, pen and paper still feature fairly heavily in many laboratories, including those that work with electronic laboratory notebooks (ELN) and laboratory information management systems (LIMS). From the regulatory perspective, if you weigh a reagent on a balance, note the balance reading on a piece of paper and then go back to your bench and transfer that information into a LIMS or ELN, or even into an Excel spreadsheet, then the raw data isn’t what you put into your electronic system, but what you wrote down on your paper.

‘That gives two opportunities for error – writing down an incorrect reading in the first instance, and putting an incorrect reading into the LIMS/ELN, even if you’ve written it down correctly.’ 

The regulators are pushing for industry to capture all its data electronically and transfer that data directly into data management platforms. In parallel, the industry is pushing the informatics vendors to provide the software that can achieve this, Langrish continues: ‘LabWare LIMS and ELN have been developed to integrate with a wide range of instrumentation and other informatics platforms. Some organisations may have to shoulder the expense of upgrading analytical instrumentation because legacy equipment simply can’t be interfaced with data management platforms.’

While it is obviously vital to maintain the integrity of top-line data that come out of analytical instrumentation, it is just as important to maintain the integrity of data that will underpin any aspect of decision making, adds John Gabathuler, director, industrial and environmental at LabWare: ‘An informatics platform, such as LabWare LIMS/ELN, can help to do this by reducing the requirement to put pen to paper, and also by installing safeguards that will flag up a caution if any data that is added either manually – or automatically – is not within prescribed limits. The LIMS/ELN has a lot of functionality within it to try to ensure that overall data integrity is maintained, by helping people manually to enter information accurately, and in the right format.’
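The kind of safeguard Gabathuler describes, a caution raised whenever an entered value falls outside prescribed limits, can be illustrated with a minimal Python sketch. It is not LabWare's implementation: the `Limits` class, the `check_entry` function and the pH limits are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Limits:
    """Prescribed acceptance range for one test (hypothetical structure)."""
    low: float
    high: float


def check_entry(value: float, limits: Limits) -> Optional[str]:
    """Return a caution message if the value is outside limits, else None."""
    if value < limits.low or value > limits.high:
        return (f"CAUTION: {value} is outside prescribed limits "
                f"{limits.low} to {limits.high}")
    return None


# Hypothetical limits for a pH test. The same check runs whether the value
# was typed in by hand or received from an instrument interface.
ph_limits = Limits(low=6.8, high=7.4)
for source, value in [("manual", 7.1), ("instrument", 8.3)]:
    caution = check_entry(value, ph_limits)
    print(source, value, caution or "within limits")
```

In this sketch the out-of-range value is flagged rather than rejected, which matches the idea of a caution that prompts further investigation instead of silently blocking the entry.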

Extensive data trails

In the scientific field in particular, the development of ever more sensitive chemical and biological analyses, and advances in genomic and proteomic techniques, are increasing the depth and breadth of data that is generated, Langrish points out. In parallel, the amount of data that may be associated with even simple tasks, such as taking a pH reading of a chemical solution, can be significant: ‘An extensive trail of data will often be required to demonstrate the integrity of a simple pH reading. All this data has to be backed by instrumentation maintenance and calibration, to verify the data accuracy and its integrity, as well as operator competence. LabWare LIMS/ELN has been developed to store and manage all these data points, and overlay that data with records of standard operating procedures, personnel training, instrumentation calibration and maintenance records, which can all be transferred directly into the LIMS/ELN platform to support the final analytical data.’
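To make the idea of a data trail behind a single pH reading more concrete, here is a rough Python sketch that checks whether a reading is backed by an in-date instrument calibration and an operator training record. The `Reading` structure, the `supporting_record_gaps` function, the identifiers and the dates are all assumptions made for illustration; a real LIMS/ELN would hold far richer records.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Set


@dataclass(frozen=True)
class Reading:
    """One analytical result plus the context needed to trace it."""
    value: float
    instrument_id: str
    operator_id: str
    taken_on: date


def supporting_record_gaps(
    reading: Reading,
    calibration_valid_until: Dict[str, date],  # instrument ID -> expiry date
    trained_operators: Set[str],               # operators with a training record
) -> List[str]:
    """List the supporting records that are missing for this reading."""
    gaps = []
    valid_until = calibration_valid_until.get(reading.instrument_id)
    if valid_until is None or reading.taken_on > valid_until:
        gaps.append("no in-date calibration for instrument " + reading.instrument_id)
    if reading.operator_id not in trained_operators:
        gaps.append("no training record for operator " + reading.operator_id)
    return gaps


# Hypothetical example data.
reading = Reading(7.2, "PH-01", "op-jones", date(2016, 5, 10))
calibrations = {"PH-01": date(2016, 6, 1)}
trained = {"op-jones"}
print(supporting_record_gaps(reading, calibrations, trained) or "fully supported")
```

An empty list means the reading is supported by both records; any entry in the list identifies a gap in the trail that would need to be resolved before the result is relied on.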

It’s all heading towards the paperless lab, and we are making good progress, Gabathuler suggests: ‘Sometimes it’s not possible to capture or transmit every piece of data electronically, and some data entry will have to be done manually. The key is to ensure that data management systems that are in place help to ensure that mistakes are not made, that irregularities are at least flagged up for further investigation, and that once captured, data is secure and fully traceable.’  

Limit the risks of manipulation

In most cases companies put systems in place to try to ensure that they can comply with data integrity guidelines and mandates, and that they are confident of their own data integrity. Relying on data that has been generated by a CRO is a different matter, points out Paul Denny-Gouldson, vice president of strategic solutions at IDBS, citing the Semler case: ‘Whether samples or data are manipulated intentionally or unintentionally is a question that needs to be answered, but there are IT approaches that can be implemented by the sponsor and the CRO that will limit significantly the risks of either intentional or unintentional data manipulation.’

In parallel with sample management and tracking, organisations can also implement a study master data management practice, to define all of the other sources of metadata around a study and make that data available to all other applications used in the collection and analysis of study data. ‘This can then be used by secondary applications to check at run time if samples and data are associated with the given study, alerting the user if they are not,’ Denny-Gouldson explains. 

Error by exception

Error by exception is a third IT element that can cross check samples against study data, alerting operators when there may be a problem. Error by exception applications will flag up out-of-scope data automatically to the user at the time of entry. ‘For example, a simple correlation of sample ID against subject ID against project ID can properly qualify whether a particular sample belongs to a particular study.’ This highlights another area of master data tracking around study and subjects – essentially the study design and subject.  ‘These errors by exception can also be automatically flagged to a QA/QC official for witnessing and handling. And when an error by exception has been flagged, process control applications can direct the user to help explain the cause of the error and what the corrective action should be.’ 
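The correlation of sample ID against subject ID against project ID that Denny-Gouldson mentions can be sketched in a few lines of Python. This is an illustrative assumption of how such a check might look, not a description of any IDBS product: the `MASTER` lookup, the `exception_for` function and all identifiers are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical study master data: sample ID -> (subject ID, project ID).
MASTER: Dict[str, Tuple[str, str]] = {
    "S-001": ("SUBJ-01", "PROJ-A"),
    "S-002": ("SUBJ-02", "PROJ-A"),
}


def exception_for(
    sample_id: str,
    subject_id: str,
    project_id: str,
    master: Dict[str, Tuple[str, str]],
) -> Optional[str]:
    """Return an exception message at the time of entry, or None if consistent."""
    expected = master.get(sample_id)
    if expected is None:
        return f"sample {sample_id} is not registered in the study master data"
    if expected != (subject_id, project_id):
        return (f"sample {sample_id} is registered to subject {expected[0]} "
                f"in project {expected[1]}, not to subject {subject_id} "
                f"in project {project_id}")
    return None


# Only entries that fail the cross-check produce output for review.
entries = [
    ("S-001", "SUBJ-01", "PROJ-A"),
    ("S-002", "SUBJ-01", "PROJ-A"),
    ("S-999", "SUBJ-03", "PROJ-A"),
]
for entry in entries:
    message = exception_for(*entry, MASTER)
    if message:
        print("EXCEPTION:", message)
```

Consistent entries pass silently, so the operator (or a QA/QC reviewer) only sees the cases that need attention, which is the point of handling errors by exception.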

The obvious extension to these elements is to remove as much human interference in the process as possible, Denny-Gouldson notes. ‘Full lab automation of bioanalytical laboratories is not as farfetched as it might first seem – various organisations have developed near 100 per cent automation environments for processing bioanalytical lab samples and analysis. This step requires all the above elements to be in place, but it does reduce significantly the risk of bad data getting into the value chain – and it also significantly improves sample throughput.’

The right environment

The final option in the CRO space is either to push the validated data collection and execution environment that the sponsor organisation wants to use to the CRO itself – or to integrate elements of the master data management and exception handling elements into the CRO’s systems, Denny-Gouldson suggests. ‘The advent of secure, validated cloud-based collaboration environments that are designed to support this type of detailed and process-centric workflow makes it easily possible to give the CRO access to the validated data collection and execution environment. 

‘The alternative option, of integrating application services and master data management services between multiple organisations, is perhaps another few years away yet – but it is something that is an active area of development now. It may never be possible to claim a zero risk of losing data integrity, but if multiple risk reduction elements are brought together, then the risk of using the wrong sample, creating data that is associated with the wrong sample, or reporting the wrong data to a study report, can be significantly reduced.’

