Researchers transmit genome sequence using quantum cryptography

Tohoku University Medical Megabank Organisation (ToMMo), in collaboration with Toshiba, has demonstrated the world’s first quantum cryptography transmission of whole-genome sequence data. The transmission exceeded several hundred gigabytes, demonstrating that quantum encryption technology is now capable of large-capacity data transmission and opening up practical applications in genomic medicine and research.

Key distribution speeds in quantum cryptographic communication technologies are currently about 10 Mbps, which limits the speed at which data can be encrypted and transmitted with the one-time pad method. There is therefore room for improvement in large-scale data transmission.

ToMMo developed a system for sequential encryption and transmission of large-scale data, thereby realising real-time transmission of whole-genome sequence data with the one-time pad method. One-time pad is an encryption technique that requires the use of a one-time pre-shared key the same size as, or longer than, the message being sent. 
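The combination of one-time pad encryption and sequential transmission described above can be illustrated with a minimal Python sketch. It is not ToMMo's or Toshiba's implementation: the `KeyPool` class, the `otp_stream` function and the sample data are hypothetical names introduced here, and `secrets.token_bytes` merely stands in for key material that would in practice be shared by quantum key distribution. The sketch shows only the core idea, namely that each chunk of data is XORed with fresh key bytes of the same length and that no key byte is ever used twice.

```python
import secrets
from typing import Iterable, Iterator


class KeyPool:
    """Hypothetical store of pre-shared key bytes (stand-in for QKD output)."""

    def __init__(self, key_material: bytes) -> None:
        self._key = key_material
        self._offset = 0  # index of the first unused key byte

    def take(self, n: int) -> bytes:
        """Return the next n unused key bytes; each byte is handed out once."""
        if self._offset + n > len(self._key):
            raise ValueError("not enough unused key material")
        chunk = self._key[self._offset:self._offset + n]
        self._offset += n
        return chunk


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR two equal-length byte strings (the one-time pad operation)."""
    if len(data) != len(key):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


def otp_stream(chunks: Iterable[bytes], pool: KeyPool) -> Iterator[bytes]:
    """Encrypt (or decrypt) chunks one at a time as they become available."""
    for chunk in chunks:
        yield xor_bytes(chunk, pool.take(len(chunk)))


# Illustrative data only: three short "reads" arriving one after another.
shared_key = secrets.token_bytes(64)  # assumption: both sides hold this key
reads = [b"ACGTACGT", b"TTGACC", b"GGA"]

ciphertext = list(otp_stream(reads, KeyPool(shared_key)))
recovered = list(otp_stream(ciphertext, KeyPool(shared_key)))
assert recovered == reads  # XOR with the same key bytes restores the data
```

Because XOR is its own inverse, the receiver runs the same routine with its own copy of the key, consuming key bytes in the same order as the sender.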

Genomic data is information tied closely to human characteristics and as such is treated as personally identifiable information (PII). This means it falls under legal frameworks such as GDPR. Furthermore, while human genome information comprises approximately 3.2 billion bases, high-precision analysis using the latest sequencers obtains more than 90 billion bases, nearly thirty times that number. 

In particular, when multiple individuals are analysed simultaneously, next-generation sequencers output several hundred gigabytes of data or more at a time. Storing and transporting such large amounts of confidential data requires very high-level security. Genome researchers have long been concerned about the security of transferring large-scale genome sequence data. They sometimes physically transport hard disks in locked security boxes, which is problematic in terms of cost and time.

Impact of quantum cryptography on healthcare

Quantum cryptographic communication technologies apply the principles of quantum mechanics to realise secure cryptographic communications against any form of wiretapping or decryption. These technologies are therefore expected to be used for backups of confidential data and for encrypting medical data transmissions requiring high confidentiality. However, speeds for key distribution in quantum cryptographic communication technologies are currently about 10 Mbps at maximum, making it difficult to apply them to the transmission of large-scale, highly confidential data such as whole-genome sequence data.
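A rough calculation shows why the key rate is the bottleneck. Because a one-time pad needs one key bit for every data bit, the time needed to generate enough key grows linearly with the size of the data. The short Python sketch below uses the 10 Mbps figure quoted in the article; the data sizes are illustrative assumptions (the article says only "several hundred gigabytes"), a gigabyte is taken as 10^9 bytes, and the function name `otp_transfer_seconds` is hypothetical. The estimate ignores protocol overhead and any key accumulated in advance.

```python
def otp_transfer_seconds(data_bytes: float, key_rate_bps: float) -> float:
    """Seconds needed to generate one key bit per data bit at the given rate."""
    return data_bytes * 8 / key_rate_bps


KEY_RATE_BPS = 10e6  # about 10 Mbps, the maximum key rate cited in the article

for size_gb in (100, 300, 500):  # illustrative sizes, not from the article
    hours = otp_transfer_seconds(size_gb * 1e9, KEY_RATE_BPS) / 3600
    print(f"{size_gb} GB -> {hours:.1f} hours of key generation")

# 100 GB -> 22.2 hours of key generation
# 300 GB -> 66.7 hours of key generation
# 500 GB -> 111.1 hours of key generation
```

Under these assumptions, a dataset of a few hundred gigabytes needs days of key generation, which is comparable in scale to a long sequencing run rather than to an ordinary file transfer.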

Toshiba and ToMMo realised the efficient transmission of large-scale data by using high-speed quantum cryptographic communication technologies developed by Toshiba and Toshiba Europe’s Cambridge Research Laboratory. When transmitting large-scale, highly confidential genome analysis data, the developed technologies use quantum cryptography to transmit the data sequentially as it is output from next-generation sequencers, instead of transmitting all the data at once. Transmitting data as it comes out of the sequencers reduces the delay in transmission processing that would normally be expected for such large amounts of whole-genome analysis data.

Using this technology, Toshiba and ToMMo transmitted data over an approximately 7-km dedicated optical fibre line between ToMMo (Seiryo, Aoba Ward, Sendai City) and a next-generation sequencer installed at the Toshiba Life Science Analysis Center (Minamiyoshinari, Aoba Ward, Sendai City). In this test, whole-genome sequence data output from the analysis of DNA samples held by ToMMo was sequentially encrypted for quantum cryptography communications and transmitted in real time, with no delay after analysis processing was completed. These results confirmed that quantum cryptography technologies can be practically applied to the cryptographic transmission of large-scale, highly confidential genome analysis data.