
Summary Bullets:
- Synthetic DNA is seen as a solution to the challenge of how to store rising volumes of digital data generated by smartphones, tablets, and Internet-connected sensors.
- Innovations by U.S.-based startup Catalog promise to speed up and reduce the cost of encoding digital data for DNA storage, potentially benefitting commercial adoption.
U.S.-based startup Catalog recently revealed that it had successfully stored all 16 gigabytes of Wikipedia’s English-language text on tiny DNA strands within a laboratory vial, in the latest demonstration of the power and potential of synthetic DNA as a medium for storing digital data. The accomplishment marks a new record for the amount of digital information stored on DNA. To store the Wikipedia data, Catalog used prefabricated synthetic DNA strands together with a DNA writing machine. The machine currently writes data at a rate of 4 megabits per second, and Catalog wants to make it at least a thousand times faster.
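To put the quoted write speed in perspective, the short calculation below estimates how long it would take to write 16 gigabytes at 4 megabits per second, and at a rate a thousand times faster. It is a rough sketch only: it assumes decimal gigabytes (10^9 bytes) and a sustained rate with no overhead, neither of which is stated by Catalog, and the function name is illustrative.

```python
def write_time_seconds(size_bytes: int, rate_bits_per_second: int) -> float:
    """Return the time needed to write size_bytes at a constant bit rate."""
    return size_bytes * 8 / rate_bits_per_second


# Assumption: "16 gigabytes" means 16 * 10^9 bytes (decimal units).
WIKIPEDIA_BYTES = 16 * 10**9
# Rate quoted in the article: 4 megabits per second.
CURRENT_RATE = 4 * 10**6

current = write_time_seconds(WIKIPEDIA_BYTES, CURRENT_RATE)
target = write_time_seconds(WIKIPEDIA_BYTES, CURRENT_RATE * 1000)

print(f"Current rate: {current / 3600:.1f} hours")   # Current rate: 8.9 hours
print(f"1,000x faster: {target:.0f} seconds")        # 1,000x faster: 32 seconds
```

Under these assumptions, the Wikipedia data set takes roughly nine hours to write today and about half a minute at the target speed.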
Catalog is one of a growing number of technology companies (along with Microsoft, Intel, IBM, and Samsung) that see synthetic DNA as a potential solution to the challenge of how to store rising volumes of digital data generated by smartphones, tablets, and Internet-connected sensors. According to Cisco, the world will generate some 4.8 zettabytes of digital data by 2022, up from 1.5 zettabytes in 2017. The growing volume of data will challenge existing storage technologies such as magnetic tape, disk drives, and flash memory to keep pace with the rapidly expanding storage requirement. The attractions and benefits of DNA as a medium for digital data storage include its longevity; DNA lasts 1,000 times longer than silicon. In addition, DNA offers higher levels of storage density, with a single cubic millimeter of DNA able to hold a quintillion bytes of data.
DNA data storage works by taking digital content that is typically stored using a binary code of zeros and ones and converting it into the genetic code of As, Cs, Gs, and Ts that make up DNA’s chemical building blocks. The converted DNA code is then used to create synthetic strands of DNA, which can be put into cold storage. When needed, the DNA strands can be removed from cold storage and their information decoded using a DNA sequencing machine. The DNA sequence is then translated back into binary format.
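The conversion described above can be illustrated with a minimal sketch. It assumes the simplest possible mapping, two bits per base (00 to A, 01 to C, 10 to G, 11 to T), which is a textbook illustration rather than the scheme used by Catalog or any other vendor. Practical systems add error correction and avoid sequences that are hard to synthesize or read. The function names are hypothetical.

```python
# Hypothetical mapping for illustration only: two bits per DNA base.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Convert bytes to a string of A/C/G/T, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Convert a string of A/C/G/T back to the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


message = b"Hi"
strand = encode(message)
print(strand)                      # CAGACGGC
print(decode(strand) == message)   # True
```

Each byte becomes four bases, so the round trip from binary to DNA code and back is lossless in this idealized setting.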
However, existing DNA data storage techniques face two main challenges: the prohibitively high cost of DNA sequencing technology, and the slow speed at which digital data is converted to DNA and the stored DNA is later sequenced and decoded back into digital format. Catalog is addressing these challenges with a method that it claims is faster and cheaper than existing synthesis approaches. First, Catalog separates the process of synthesizing DNA molecules from that of encoding the digital data. Second, Catalog relies on a relatively small pool of pre-synthesized DNA molecules, fewer than 200, that can be combined in an exponential number of ways. Because the approach requires less DNA synthesis, it speeds up and reduces the overall cost of encoding data for storage.
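The claim that a small pool can be combined in an exponential number of ways can be made concrete with a little arithmetic. The sketch below is not Catalog's published method. It assumes a hypothetical scheme in which the pool is divided into ordered layers and each stored identifier is built by choosing one component from every layer. The layer counts and the function name are illustrative.

```python
import math


def distinct_identifiers(layers: int, components_per_layer: int) -> int:
    """Count identifiers formed by picking one component from each layer."""
    return components_per_layer ** layers


# Hypothetical split: 10 layers of 10 components, a pool of 100 molecules.
layers, per_layer = 10, 10
pool_size = layers * per_layer
count = distinct_identifiers(layers, per_layer)

print(f"Pool size: {pool_size} pre-synthesized components")
print(f"Distinct identifiers: {count:,}")
print(f"Equivalent address space: about {math.log2(count):.1f} bits")
```

In this toy setup, 100 pre-made components yield ten billion distinct combinations, which shows why the number of molecules that must be synthesized can stay small while the number of encodable values grows exponentially with the number of layers.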
Last year, Catalog announced that it had raised US$9 million from investors to help commercialize its DNA sequencing and storage technology. And although it has said little about who it expects will use the technology, Catalog is currently in discussions with government agencies, major international science projects, oil and gas firms, and businesses from media and entertainment, finance, and other industries, with a view to lining up pilot agreements.