DNA Data Storage Alliance Makes Its Case in New White Paper, Website
By Bio-IT World Staff
June 10, 2021 | The DNA Data Storage Alliance, an organization formed last year by Twist Bioscience Corporation, Illumina, and Western Digital together with Microsoft Research, has launched its website and released its first white paper, an introduction to DNA data storage, endorsed by 29 member organizations.
“There are phases in the lifecycle of technologies in which huge transformations lurk. We believe we are on the cusp of one such transformation today regarding archival storage and DNA,” the white paper authors write. They present DNA data storage fundamentals in an accessible way, both for technically curious readers and for IT business, computer science, or electrical engineering readers interested in the benefits, technical overview, and cost of ownership of this potential new storage medium.
The paper begins with why DNA data storage is needed: the data overwhelm. In 2020, humans likely generated in excess of 400 zettabytes (ZB) of digital data, according to a recent Gartner report. And it’s not slowing down: Gartner predicts a 35% per-year growth rate. While the white paper authors acknowledge “staggering improvements in density, size and total capacity,” they also contend that “key challenges remain for today’s storage technologies when considered for zettabyte scale and long storage duration.”
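To put that growth rate in perspective, the compounding can be worked through in a few lines of Python. This is only a back-of-the-envelope sketch: it assumes the 35% rate holds constant every year, which is a simplification, and the function names are illustrative rather than taken from the white paper or the Gartner report.

```python
import math


def projected_volume(initial_zb: float, annual_growth: float, years: int) -> float:
    """Data volume after `years` of compound growth at a constant annual rate."""
    return initial_zb * (1 + annual_growth) ** years


def doubling_time(annual_growth: float) -> float:
    """Years needed for the volume to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


if __name__ == "__main__":
    # Figures quoted in the article: 400 ZB in 2020, growing 35% per year.
    for years in (1, 5, 10):
        print(f"after {years:2d} years: {projected_volume(400, 0.35, years):,.0f} ZB")
    print(f"doubling time: {doubling_time(0.35):.2f} years")
```

Under that constant-rate assumption, the volume doubles a little more often than every two and a half years, which is the scaling pressure the authors point to.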
“It’s undeniable that data growth is outpacing the scalability of today’s storage solutions. Literally, everything we do revolves around data—and capturing, storing, processing and mining it only serves to create even more data. The density and stability of DNA storage will help the industry cost-effectively cope with the expected future growth of archival data for many decades to come,” said Steffen Hellmold, vice president, corporate strategic initiatives, Western Digital, in a statement announcing the white paper.
DNA addresses many of those challenges including costs, density limitations, and energy and sustainability. DNA as a storage medium is durable, simple, dense, and the format is universal and lasting. As a stable archive, DNA storage won’t incur significant expenses over time, and, the authors write, “Compared to today’s datacenters with today’s storage technologies, data stored in DNA consumes minimal to no resources while at rest.”
The white paper is careful to point out that DNA used for storage is synthetic, or manufactured, DNA. In bold the authors make their point: “DNA data storage medium does not require or use—nor does the resulting stored data result in the creation or modification of—any cells, organisms, or life.”
Instead, DNA is synthesized based on the code of the digital data being stored. “Currently, all commercial synthetic DNA is custom-built using the phosphoramidite synthesis method, developed by the biochemist Marvin H. Caruthers,” the authors write. “In this approach, oligonucleotides are synthesized from building blocks that replicate natural bases. The process has been automated since the late 1980s and is used to form desired genetic sequences for applications in medicine and molecular biology as well as for data storage. This method is currently the most robust, best tested and highest quality way to construct synthetic DNA.”
The Alliance also highlights two other DNA synthesis methods: enzymatic synthesis and synthesis by ligation. Twist Bioscience, an Alliance founder, is pioneering a new method of manufacturing synthetic DNA by “writing” DNA on a silicon chip.
The process of DNA storage moves from encoding digital data into DNA bases to synthesizing that DNA and storing it. Encapsulated DNA has been shown to remain stable for thousands of years, even in harsh conditions, the authors write. When it’s time to retrieve the data, the DNA is simply sequenced and decoded.
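The encode and decode steps can be illustrated with a minimal Python sketch. It assumes the simplest possible mapping, two bits per base (00→A, 01→C, 10→G, 11→T); this mapping and the function names are illustrative assumptions, not the scheme described in the white paper. Practical encodings typically also add error correction, addressing, and constraints on the resulting sequence, none of which is shown here.

```python
# Hypothetical 2-bits-per-base mapping: index 0..3 -> A, C, G, T.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, four bases per byte (high bits first)."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode(strand: str) -> bytes:
    """Turn a string of bases back into the original bytes."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4 bases")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # BASES.index raises ValueError on anything other than A, C, G, T.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # CAGACGGC
    print(decode(strand))  # b'Hi'
```

In this toy version, synthesis and sequencing are stand-ins: the string returned by `encode` represents what would be written as DNA, and `decode` represents what happens after the DNA is read back.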
This is a particular strength of DNA as a storage medium, the white paper authors believe. With existing storage technologies, the physical structure and format of the media and the methods used to read and write to it are fundamentally coupled. In contrast, DNA’s structure means that any generation of DNA readers and writers will be able to read and write DNA as long as the bit encoding formats are saved.
“The method for reading back data—either periodically to check data quality, or when needed for processing—is critically important in the life sciences as well as for data storage. Due to DNA’s universal format, DNA media will always be readable and writable,” stated Alex Aravanis, M.D., Ph.D., chief technology officer at Illumina, in the statement. “We continue to drive sequencing technology forward at a rapid rate, with new applications like DNA data storage, and anticipate an active role in this market.”
In all, the Alliance argues that the timing and technologies are now ripe for DNA data storage and that, as a storage medium, DNA will alleviate the challenges of traditional storage, including expense, volume limits, energy use, and sustainability.
“In addition to density, stability and eternal relevance, DNA data storage provides a far more sustainable option, requiring negligible space and energy when compared to current data centers that use an ever-growing amount of power and land,” said Emily M. Leproust, Ph.D., CEO and co-founder of Twist Bioscience. “Taken together, DNA’s storage density, durability and minimal maintenance costs radically reduce the cost of maintaining digital data in DNA over time, making it a viable option for long-term archival data retention.”