In silico discovery and overcoming bias with AI
Credit: Dr Raminderpal Singh
Incubate Bio is an in silico drug discovery company dedicated to supporting the development of novel cancer therapies by democratising access to Causal AI across the life sciences industry. Its flagship product, ALaSCA, has been developed specifically to enable rapid investigation and in silico experimentation in support of that goal, with an initial focus on the DNA Damage Response (DDR) pathway.
Can you tell our readers how your AI platform can benefit scientists working in drug discovery?
Forget the notion of AI and words like that. Let's just think about the biologist trying to solve a problem. They have a question. And their choice is to run some experiments. They can run the experiments in the lab. They can outsource it, but it is a slow, expensive process. They know that, and their management knows that. They’re limited in how much they can do. There's a bandwidth problem, but there's also a solution space problem.
What computers do, in general, is allow you to go global and deep-dive almost for free, at speed. We use cloud systems; ALaSCA, for example, is on the cloud. It is parallel. If you push a button, it will run massively parallel jobs, and you can run them across different systems.
Suppose you've got a scientist who has a hypothesis about an experiment, and they want to take that hypothesis to a lab. Before stepping into the lab, the scientist can model that hypothesis as a network diagram. They can then run it 1,000 times in different directions on a computer, faster and at far lower cost.
The scientist may well run the lab experiment anyway, but they can first explore in silico. We enable this at scale, cost-effectively and very fast. This provides the scientist with lots of additional evidence very quickly. But, crucially, this approach also looks at spaces they wouldn't have thought of. Think of it as a suite of informative experiments, not just an application of AI.
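As a purely illustrative sketch of running a hypothesis network many times in silico, the Python below simulates a toy three-node mechanism under different interventions. It is not ALaSCA's implementation, which the interview does not describe: the node names, edge weights, noise level, and the `simulate` and `mean_outcome` helpers are all hypothetical assumptions chosen for illustration.

```python
import random

# Hypothetical toy mechanism: damage -> repair -> survival, plus damage -> survival.
# Edge weights are made-up illustration values, not biological measurements.
EDGES = {
    ("damage", "repair"): 0.8,
    ("repair", "survival"): 0.6,
    ("damage", "survival"): -0.9,
}
ORDER = ["damage", "repair", "survival"]  # topological order of the network


def simulate(interventions, rng, noise_sd=0.1):
    """Run one simulated experiment; intervened nodes are clamped to fixed values."""
    values = {}
    for node in ORDER:
        if node in interventions:
            values[node] = interventions[node]
            continue
        # Linear effect of all parent nodes, plus Gaussian noise.
        total = sum(w * values[src] for (src, dst), w in EDGES.items() if dst == node)
        values[node] = total + rng.gauss(0.0, noise_sd)
    return values


def mean_outcome(interventions, outcome="survival", runs=1000, seed=0):
    """Average the outcome node over many simulated runs of the same intervention."""
    rng = random.Random(seed)
    return sum(simulate(interventions, rng)[outcome] for _ in range(runs)) / runs


if __name__ == "__main__":
    baseline = mean_outcome({"damage": 1.0})
    boosted = mean_outcome({"damage": 1.0, "repair": 2.0})
    print(f"baseline survival: {baseline:.2f}, with boosted repair: {boosted:.2f}")
```

Comparing the two averages shows, in this toy setting, how clamping one node changes a downstream outcome; a real mechanism would have many more nodes and weights estimated from data.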
And we're extending this by developing that approach into metascience. Metascience is a very broad, nebulous term, but for our purposes, it is the notion of workflow processes and the metadata created along them. Literature is inherently biased, both by the interests of the author and funding party and by the non-publication of negative results. We are researching how to understand, measure, and reduce these known biases in the approach. That's very important, especially when working with and using lots of literature. There are statistical approaches to seeing and measuring bias, and we will be building them into the system.
I have personal experience of this from when IBM Research took Watson Genomics to market. I worked with several hospitals and saw the resistance to public literature and publications because of these issues. At Incubate Bio, I am being careful not to hide from these problems but to find ways through them.
The ALaSCA platform uses Causal AI augmented with Large Language Models (LLMs). Why did you choose to design the platform in this way?
The LLM itself is a commodity technology. Lots of people are doing excellent work on LLMs, and the technology is becoming mainstream fast. Of course, there are challenges with its utility. In some contexts, it can be easier to implement than in others.
To address this, we are using the LLM as an applet to the black box application, and we're fine-tuning it in the right way. To your question, the win is that Causal AI will provide part of the bias and requirements for fine-tuning. So think of a learning loop: the LLM generates, the Causal AI does its work, and the results feed back to further fine-tune the LLM.
What level of expertise or AI-specific skills would scientists need to use Causal AI?
We aim to make this technology available to all. However, some experience is required to maximise the value of its use. When you're working with smaller organisations, they often do not have access to the right range of data science skills internally. For these teams, we provide direct support as a service because we have strong, in-house biological skills, and we can use that to provide a bridge between understanding the results of the ALaSCA system and what that means in the biological context. We understand the scientific questions they're trying to answer.
Looking forward, the ALaSCA system itself, as it matures, will deploy straight to biologists. When working with a system where the biological mechanism is the model, the results relate directly back: there is no translation point. The scientist already understands their biological mechanism, i.e. their input. Whatever you annotate on the output, whether it’s adding strength values, simulating, or optimising the mechanism based on interventions, you're bringing it back into their language, frame, and biological mechanism diagram.
If you've got a black box system, it needs to be interpreted so that other people can deploy it in their workflow, and therefore this question becomes very real: “If they don't have the skills, how can the user translate those results?” But our approach is different. Because it's native to the language of scientists, work feeds from them and back to them immediately.
How will the platform evolve?
As we go forward, we're reducing the requirement for scientists to have decades of deep biological experience. This is where ALaSCA becomes compelling. We're reducing the onus on biologists to be Key Opinion Leaders (KOLs), essentially democratising access to the breadth of data and depth of knowledge required. We can build the underlying mechanism for them using the LLM engine, which itself will improve and evolve over time as more data becomes available, whether open source or proprietary.
In industry today, if you don't have a KOL in your organisation or within your direct network, it's tough to go after complex disease mechanisms because you need those 30-40 years of experience. But that bottlenecks the whole industry because there aren't enough KOLs, and most are already committed. So how do you allow new companies to grow and extract that knowledge as if they're getting access to a KOL? We're lowering the bar to access expertise and encapsulating a lot of that mechanism, including insights, experience, and thinking, into the ALaSCA engine.
Our vision is simple: we want to enable the lab scientist to explore their hypothesis quickly and comprehensively, speeding up the drug discovery and development process and increasing the chances of successful outcomes from their efforts.
Dr Raminderpal Singh is the CEO and co-Founder of Incubate.bio