Brain-inspired AI computing: merging GPU, CPU and neuromorphic processing

Christian Mayr, Professor at the Technische Universität Dresden

Credit: TUD

Artificial intelligence is playing a growing role in everyday life, but today’s AI hardware and algorithms still fall short of the brain’s efficiency and processing capabilities. In this article, Professor Christian Mayr discusses SpiNNaker2, a computing platform inspired by the brain that combines GPU, CPU, neuromorphic, and probabilistic elements in a single system. 

Unlike traditional neuromorphic chips that focus only on mimicking individual neurons, SpiNNaker2 takes cues from biology across all levels of its architecture. Mayr also explains how key principles from neuroscience, such as hierarchical structure, asynchronous communication, dynamic sparsity, and distance-based connectivity, can be applied to reshape standard AI models. These approaches lead to significant gains in energy use and processing speed for both training and inference.

What inspired the development of SpiNNaker2, and how does it differ from conventional AI hardware?

Traditional AI hardware, like GPUs, is still far from matching the human brain’s incredible energy efficiency, low latency, and large-scale parallelism. Our aim with SpiNNaker2 is to close that gap.

SpiNNaker2 is bio-inspired throughout its architecture, not just at the neuron level like most neuromorphic chips, but at all levels of design. It merges features of GPUs, CPUs, and neuromorphic systems, and integrates probabilistic computing. We’re not just mimicking how a neuron fires; we’re implementing principles like dynamic sparsity, hierarchy, and asynchronous communication, all things the brain does naturally and efficiently.

What are the key architectural features of SpiNNaker2?

SpiNNaker2 consists of 4,848 chips, each with 152 parallel cores (essentially mini GPUs) and local memory. A CPU is co-located with each core, which lets us run a deep learning model in the GPU section while running symbolic AI on the adjacent CPU, for example in defence applications such as airspace anomaly detection.

What’s crucial here is the distributed, local nature of memory and compute. Instead of funnelling everything through a single memory system, we activate only a fraction of resources at a time. For example, when running transformer models, only one chip might be active for a given token layer. This results in massive energy savings and bandwidth advantages.
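The idea of chip-local memory and selective activation can be sketched in a few lines. This is purely illustrative and assumes nothing about the real SpiNNaker2 software stack: each toy "chip" holds one transformer layer's weights in its own local memory, so processing a token touches exactly one chip per step instead of funnelling every access through a shared memory system.

```python
# Toy sketch of chip-local memory and compute (illustrative only, not the
# SpiNNaker2 API): each "chip" stores a single layer's weights locally.
import numpy as np

rng = np.random.default_rng(0)
dim, n_layers = 16, 8

class Chip:
    """One chip: one layer's weights, held in local memory."""
    def __init__(self):
        self.weights = rng.standard_normal((dim, dim)) / np.sqrt(dim)
        self.touches = 0  # how often this chip's local memory was read

    def forward(self, x):
        self.touches += 1
        return np.tanh(self.weights @ x)

chips = [Chip() for _ in range(n_layers)]  # one layer per chip

def run_token(x):
    for chip in chips:          # at each step only this chip is active;
        x = chip.forward(x)     # the others consume no memory bandwidth
    return x

out = run_token(rng.standard_normal(dim))
print([c.touches for c in chips])  # [1, 1, 1, 1, 1, 1, 1, 1]
```

Each chip's local memory is read exactly once per token; there is no central memory that every layer must route through, which is the bandwidth and energy advantage described above.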

The machine originated in the EU Human Brain Project. It began with brain simulation in mind, but today we use it for a wide range of tasks, from smart city AI and robotics to drug screening.

What kind of real-world performance gains are you seeing?

SpiNNaker2 significantly outperforms conventional hardware in energy and speed for specific tasks. We see one to two orders of magnitude improvement in both performance and energy efficiency. In drug discovery, we’ve seen up to a 100× speed-up compared to GPUs.

That’s a game changer. If you're tailoring a drug to a specific patient and the necessary simulations currently take a GPU-based data centre a week, that’s prohibitively expensive. But if SpiNNaker2 can do the same task in minutes or hours, personalised medicine suddenly becomes a real option.

How do you achieve low power consumption and high parallelism?

The system borrows heavily from how the brain handles information: selectively activating resources only when needed, whether it's compute, communication, or memory. Everything is task-dependent. If a unit isn’t needed, it doesn’t consume energy. That makes the whole machine extremely efficient.

Also, SpiNNaker2 is designed for strict real-time performance. Across its full 20-rack configuration, we guarantee sub-millisecond response times: push data in one side and results come out the other, fast.

This architecture is inherently parallel, like the brain. The brain has 80 billion neurons, each with thousands of synapses, all working in parallel at frequencies between 100 Hz and 1 kHz. It doesn’t rely on centralised scheduling or fixed pipelines.
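The asynchronous, event-driven style described here can be illustrated with a minimal spiking sketch. This is a toy model, not SpiNNaker2 code: neurons do work only when a spike event reaches them, with no global clock ticking every neuron, mirroring the brain's sparse, parallel activity.

```python
# Minimal event-driven spiking sketch (illustrative, not SpiNNaker2 code):
# computation happens per spike event, not per global clock tick.
import heapq

THRESHOLD, WEIGHT, DELAY = 1.0, 0.6, 1.0   # toy constants

potentials = {n: 0.0 for n in range(3)}    # membrane potentials
fanout = {0: [2], 1: [2], 2: []}           # neurons 0 and 1 feed neuron 2

events = [(0.0, 0, 1.0), (0.0, 1, 1.0)]   # external input at t = 0
heapq.heapify(events)
spikes = []

while events:                              # process events, not clock ticks
    t, n, current = heapq.heappop(events)
    potentials[n] += current
    if potentials[n] >= THRESHOLD:
        potentials[n] = 0.0                # reset after firing
        spikes.append((t, n))
        for target in fanout[n]:           # a spike is a routed message
            heapq.heappush(events, (t + DELAY, target, WEIGHT))

print(spikes)  # [(0.0, 0), (0.0, 1), (1.0, 2)]
```

Neuron 2 fires only because two spikes coincide at t = 1; between events, no neuron consumes any compute at all.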

What brain-inspired algorithmic principles do you apply in SpiNNaker2?

We look at neurobiological computing principles like dynamic sparsity, hierarchy, distance-dependent topologies, and asynchronous updates. These principles help us reframe conventional AI algorithms in ways that drastically improve the energy-delay product, by up to an order of magnitude in both inference and training.
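The energy-delay product mentioned here is simply energy multiplied by latency, so modest gains on each axis compound. The numbers below are purely hypothetical, chosen only to show the arithmetic, and are not measured SpiNNaker2 figures.

```python
# Energy-delay product (EDP) sketch with purely illustrative numbers:
# EDP = energy * delay, so joint energy and latency gains compound.
baseline = {"energy_j": 3.0, "delay_s": 1.0}   # hypothetical dense baseline
sparse = {"energy_j": 1.0, "delay_s": 0.3}     # hypothetical sparse version

def edp(m):
    return m["energy_j"] * m["delay_s"]

gain = edp(baseline) / edp(sparse)
print(round(gain))  # 10 -- an order of magnitude in energy-delay product
```

A 3× energy reduction combined with a roughly 3× latency reduction already yields the order-of-magnitude EDP improvement cited above.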

A key concept is information gating: the idea that you only activate memory, computation, or communication when necessary. This selective activation mimics how the brain dynamically focuses attention and resources. In large language models, for example, you want to activate weights only when a prompt demands them, not burn energy across the whole model.
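One way to picture information gating is a mixture-of-experts-style sketch. This is a hypothetical illustration, not the SpiNNaker2 software stack: a cheap gate scores expert sub-networks per input, and only the winning expert's weights are ever read and multiplied.

```python
# Illustrative information-gating sketch (hypothetical, not SpiNNaker2 code):
# the gate is cheap; only the selected expert's weights consume compute.
import numpy as np

rng = np.random.default_rng(1)
dim, n_experts = 8, 4

gate_w = rng.standard_normal((n_experts, dim))         # tiny gating network
experts = [rng.standard_normal((dim, dim)) for _ in range(n_experts)]
touched = [0] * n_experts                              # weight blocks read

def gated_forward(x):
    scores = gate_w @ x                    # cheap: score every expert
    k = int(np.argmax(scores))             # gate picks one expert
    touched[k] += 1
    return experts[k] @ x                  # expensive: only expert k runs

y = gated_forward(rng.standard_normal(dim))
print(sum(touched))   # 1 -- three of four expert weight blocks stayed cold
```

The untouched weight blocks are never fetched from memory, which is where the energy saving of gating actually comes from.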

Another crucial insight: the brain doesn’t only run deep neural networks. The associative cortex operates more like symbolic AI. And it’s highly probabilistic, another principle we’ve integrated into the hardware.

How important is locality in your system design?

Hugely important. Energy is consumed when bits move, not just across chips, but across boards and racks. The closer compute and memory are to each other, the less energy is wasted on communication. Co-location isn’t just at the die level, it matters at every scale.

That’s why we’ve built SpiNNaker2 around chiplets, racks, and memory systems that favour tight physical integration. But that also means you need to rethink your algorithms. You can’t just assume centralised data access or synchronous execution. You have to recast your algorithms around locality and asynchronous processing.

What role does Saxony play in enabling this kind of innovation?

Saxony is Europe’s hidden semiconductor powerhouse. Around 36% of Europe’s chips come out of Saxony. But when people think of semiconductors in Europe, they think of France or Belgium, not Dresden.

There’s a complete ecosystem here: from SMEs building fab tools, to design houses, to high-density processing and chiplet ecosystems. That’s what enables high-performance hardware like SpiNNaker2 to be built and refined locally.

We’re working on expanding SpiNNaker2’s application footprint, from AI accelerators to hybrid symbolic-probabilistic AI for autonomous systems and advanced robotics. And we’re focusing more on sustainable AI: AI that is fast, accurate, and energy-efficient enough for real-world, large-scale deployments.

Christian Mayr is a Professor at the Technische Universität Dresden.
