Unlocking new treatments with AI

Drug discovery is undergoing a radical evolution of its capabilities due to the growing use of computational methods, including artificial intelligence (AI) and machine learning (ML) methods. These increasingly ubiquitous methods are driving companies to find new ways to develop candidate drug compounds and are opening the path toward more personalised medicine initiatives in the future.

One method that has gained a lot of popularity is harnessing AI and ML to find potential drug molecules faster. In a paper that will be presented at the International Conference on Machine Learning (ICML), MIT researchers describe a geometric deep-learning model called EquiBind that is 1,200 times faster than one of the fastest existing computational molecular docking models, QuickVina2-W, in successfully binding drug-like molecules to proteins. The researchers found that a geometric deep-learning model is faster and more accurate than state-of-the-art computational models, reducing the chances and costs of drug trial failures.

EquiBind is based on its predecessor, EquiDock, which specialises in binding two proteins using a technique developed by the late Octavian-Eugen Ganea, a recent MIT Computer Science and Artificial Intelligence Laboratory researcher and Abdul Latif Jameel Clinic for Machine Learning in Health (Jameel Clinic) postdoc, who also co-authored the EquiBind paper.

Before drug development can even take place, drug researchers must find promising drug-like molecules that can bind or ‘dock’ properly onto certain protein targets in a process known as drug discovery. After successfully docking to the protein, the binding drug, also known as the ligand, can stop a protein from functioning. If this happens to an essential protein of a bacterium, it can kill the bacterium, conferring protection to the human body.

However, the process of drug discovery can be costly both financially and computationally, with billions of dollars poured into the process and over a decade of development and testing before final approval from the US Food and Drug Administration (FDA). 

Moreover, 90 per cent of all drugs fail once they are tested in humans because they have either no effect or too many side effects. One of the ways drug companies recoup the costs of these failures is by raising the prices of the drugs that are successful.

The current computational process for finding promising drug candidate molecules goes like this: most state-of-the-art computational models rely upon heavy candidate sampling coupled with methods like scoring, ranking, and fine-tuning to get the best ‘fit’ between the ligand and the protein. Hannes Stärk, lead author of the paper and a first-year graduate student advised by Regina Barzilay and Tommi Jaakkola in the MIT Department of Electrical Engineering and Computer Science, likens typical ligand-to-protein binding methodologies to ‘trying to fit a key into a lock with a lot of keyholes.’ Typical models time-consumingly score each ‘fit’ before choosing the best one. In contrast, EquiBind directly predicts the precise key location in a single step without prior knowledge of the protein’s target pocket, which is known as ‘blind docking’.
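The sample, score and rank loop described above can be illustrated with a deliberately simplified sketch. It is not EquiBind, QuickVina2-W or any real docking code: the 'pose' is just a 3D point, the score is the negative squared distance to an assumed pocket centre, and the names score_pose and sample_and_rank are hypothetical, chosen only for illustration.

```python
import random


def score_pose(pose, pocket):
    """Toy score (higher is better): negative squared distance to the pocket centre."""
    return -sum((p - q) ** 2 for p, q in zip(pose, pocket))


def sample_and_rank(pocket, n_samples=1000, box=10.0, seed=0):
    """Sample random candidate poses in a cube, score each one, and rank them best-first."""
    rng = random.Random(seed)
    candidates = [
        tuple(rng.uniform(-box, box) for _ in range(3))
        for _ in range(n_samples)
    ]
    return sorted(candidates, key=lambda pose: score_pose(pose, pocket), reverse=True)


# Assumed pocket centre, purely illustrative.
pocket = (2.0, -1.0, 4.0)
ranked = sample_and_rank(pocket)
best = ranked[0]
print("best pose:", best, "score:", score_pose(best, pocket))
```

Every candidate has to be scored before the best one can be chosen, so the cost grows with the number of samples. That is the expense a single-step predictor of the kind the article describes is meant to avoid.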

Unlike most models that require several attempts to find a favourable position for the ligand in the protein, EquiBind already has built-in geometric reasoning that helps the model learn the underlying physics of molecules and successfully generalise to make better predictions when encountering new, unseen data.

The release of these findings quickly attracted the attention of industry professionals, including Pat Walters, the chief data officer for Relay Therapeutics. Walters suggested the team try their model on an already existing drug and protein for lung cancer, leukaemia, and gastrointestinal tumours. Whereas most of the traditional docking methods failed to successfully bind the ligands that worked on those proteins, EquiBind succeeded.

‘EquiBind provides a unique solution to the docking problem that incorporates both pose prediction and binding site identification,’ Walters says. ‘This approach, which leverages information from thousands of publicly available crystal structures, has the potential to impact the field in new ways.’

‘We were amazed that while all other methods got it completely wrong or only got one correct, EquiBind was able to put it into the correct pocket, so we were very happy to see the results for this,’ added Stärk.

While EquiBind has received a great deal of feedback from industry professionals that has helped the team consider practical uses for the computational model, Stärk hopes to find different perspectives at the upcoming ICML in July.

‘The feedback I’m most looking forward to is suggestions on how to improve the model further,’ he says. ‘I want to discuss with those researchers … to tell them what I think can be the next steps and encourage them to go ahead and use the model for their own papers and for their own methods … we’ve had many researchers already reaching out and asking if we think the model could be useful for their problem.’

This work was funded, in part, by the Pharmaceutical Discovery and Synthesis consortium; the Jameel Clinic; the DTRA Discovery of Medical Countermeasures Against New and Emerging threats program; the DARPA Accelerated Molecular Discovery program; the MIT-Takeda Fellowship; and the NSF Expeditions grant Collaborative Research: Understanding the World Through Code.

Understanding cardiovascular disease

CardiaTec Biosciences recently announced that it had secured a £1.4 million pre-seed investment led by Laidlaw Scholars Ventures and APEX Ventures, with participation from Crista Galli Ventures, o2h ventures and Cambridge Enterprise.

The AI drug target discovery company, which specialises in cardiovascular disease, was co-founded in 2021 by a trio of AI academics and ambitious alumni from the University of Cambridge. Of the three – Raphael Peralta (CEO), Thelma Zablocki (COO), and Namshik Han (CTO) – Raphael and Thelma are graduates of the University of Cambridge MPhil in bioscience enterprise. Dr Han is an academic in AI applications for target and drug discovery, and he holds positions at the University of Cambridge as head of AI at the Milner Therapeutics Institute and as associate faculty of the Cambridge Centre for AI in Medicine.

The company is developing a target discovery platform that uses AI to make sense of large-scale multi-omic cardiovascular data. As opposed to conventional single-omic analysis, CardiaTec’s proprietary platform unravels relationships that span every level of biology, from gene variation, methylation and expression to their connection to proteomic and metabolomic functions, in order to better understand disease development.

Dr Han said: ‘Recent advances in artificial intelligence are generating novel ways to interpret multi-omic data. I am excited to lead CardiaTec’s technology strategy to establish a new paradigm for understanding the pathophysiology of cardiovascular diseases.’

Raphael Peralta, CEO of CardiaTec, said: ‘We strongly believe, after several decades of stagnated investment and innovation, cardiovascular disease is re-emerging with a newfound interest, driven not only by the increasing requirements to fulfil the unmet need as it persists as the world-leading cause of death, but in the application of AI in being able to drive new and meaningful insights to help meet patients’ needs. Therefore, CardiaTec finds itself incredibly well placed to help drive innovation forwards within this space, now supported by a great syndicate of investors.’

Expanding AI 

AI-powered drug discovery company Healx has announced that it is opening new labs at Chesterford Research Park. Healx specialises in using AI models to find candidate drugs to help treat rare diseases. This is particularly important because rare diseases can be much harder to treat: investment in these conditions is insufficient due to the small number of people suffering from them. In addition, rare diseases are often not well studied and there is a limited understanding of many of the aspects necessary to support a drug discovery programme. Healx aims to change this through the application of AI. The company’s AI models help to solve these challenges by analysing millions of drug and disease data points to find novel connections that could be turned into new treatment opportunities.

Dr Neil Thompson, Chief Scientific Officer, Healx said: ‘Historically, we have worked through partnerships to access the experimental systems we require for our preclinical and clinical programmes, but, as our team and operations have scaled, we started looking to secure our own labs in order to support our growing portfolio of disease projects and to expand the proprietary data types we use in our AI platform.

‘The Chesterford Research Park facility is ideal for us and will play a pivotal role in our vision for the next generation of drug discovery for rare diseases. The modern, purpose-built lab has the flexibility to support our scientific and technical ambitions, and the location, close to our Cambridge headquarters, will enable us to expand our team with experienced local talent. Importantly, the Park Management staff have been very supportive in enabling a speedy acquisition, and we are looking forward to getting up and running in the space,’ Thompson continued.

Focused on finding novel treatments for patients with rare diseases, Healx is passionate about the application of AI to improve access to treatment and the new lab will allow the Healx team to accelerate the discovery and validation of potential new therapies for a range of rarer conditions.

Julian Cobourne, head of regional offices at Aviva Investors, joint owners of Chesterford Research Park with Uttlesford District Council, commented: ‘We are thrilled to provide space for a new breed of healthcare technology companies, like Healx, to grow amongst our existing community of cutting-edge, global life science companies. With its new lab space in the Park’s innovative community, Healx will be able to continue its crucial research, rapid expansion and recruitment.’

CDD Vault helps lab users overcome data challenges in modern drug discovery workflows

‘We had spreadsheets all over the place, and data from different projects that were just separated in different folders. It got to the point where we didn’t know where to put our data, or where to later find it.’ This is one of the common issues that has led global biotechs of all sizes to Collaborative Drug Discovery (CDD), a software provider for research and development data management.

Drug discovery is data driven, and that data underpins every scientific and commercial decision, which could ultimately spell the difference between the success and failure of a research and developmental program for new drugs. Yet in today’s labs the handling and management of data doesn’t necessarily maximise its value or usability. 

Scientists commonly store and manage their data in insecure, disparate documents and spreadsheets that are often difficult to find. While this method might be okay for a lone scientist working in a vacuum, it is unlikely to be a smart approach for collaborative scientists working in drug discovery or in other chemical or biological fields that rely on the ability to store, recall, process and share large amounts of data quickly.

CDD Vault acts as a central smart warehouse for all drug discovery data, explains Kellan Gregory, the informatics firm’s head of product excellence. ‘Our platform offers a comprehensive set of core utilities to allow lab members to access all of their results data, in its contextual format.’ This means the ability to handle any type of data. ‘As well as being able to capture numbers and text, we can also capture the native file in line with the data.’ A formula builder, tools for activity and physical chemistry property calculations, the ELN and dynamic visualisation tools then provide extra layers of intuitive analyses. 

Julio Martin is director and head of the Kinetoplastid Discovery Performance Unit (DPU) at GlaxoSmithKline (GSK) R&D’s Tres Cantos Open Lab Foundation, a ground-breaking PPP initiative set up at GSK’s dedicated diseases of the developing world (DDW) research facility at Tres Cantos in Madrid. 

The Open Lab Foundation supports collaboration by giving external partners access to GSK compounds, infrastructure and drug discovery expertise, with a view to accelerating research in multiple areas, from target discovery and validation to compound screening, lead identification and optimisation. The organisation chose CDD Vault as its data management platform for all Tres Cantos anti-kinetoplastid screening data generated internally and through external partnerships.

‘This kind of hosted solution means there is no need to have to navigate firewalls, and it also scales up in parallel with project growth. Using CDD Vault means we can put more of our internal resources into scientific research, rather than have increased costs associated with setting up, maintaining and upgrading complex platforms,’ Julio says. ‘Researchers can interrogate the database for their own and externally submitted data, and be confident of confidentiality and security for every user.’ 

Importantly, CDD Vault can be set up quickly, with no specialist IT input, and is very agile. ‘The days of heavy, custom solutions are numbered,’ Kellan states. ‘They are expensive, difficult to roll out, and can require extensive ongoing IT expertise. In contrast, we can get a new customer up and running with CDD Vault in minutes. The CDD team manages all of the infrastructure and provides ongoing support for any additional configurations required at any point. The ability to become productive in a short amount of time is a really big plus point for our customers.’

Security is a pressing concern: UK government figures released in April 2018 indicated that more than four in 10 of all UK businesses – and 72 per cent of large businesses – had suffered a cyber breach or attack in the previous 12 months. CDD Vault is built on industry-standard SSAE 16 Type II certified cloud storage and uses two-factor authentication and IP tracking, and is designed to minimise the chance of a hack either from the outside or from inside the client’s organisation.

James Moe is president, CEO and co-founder of Oligomerix, a small biotech company interested in understanding the role of tau protein in neurodegenerative diseases, and the discovery and development of treatments for Alzheimer’s disease and related tauopathies. 

Moe comments: ‘We’re using CDD Vault as a way of storing our chemical structures, as a way of searching them, as a way of storing all of our assay data. We’re also using it for doing calculations, for analysing our data, for creating reports and communicating the data to others, and then also, very importantly, for working securely with collaborators. So it’s been instrumental for all of those purposes. Another primary concern is having a database where we have better security over our molecules.’  
