
Solving the Big Data storage problem with DNA: we only need one gram for 700TB

Sogeti Labs
December 03, 2012

Where are we going to store all that Big Data? DNA might be a somewhat unusual, but not completely unrealistic, answer. The infographic I posted on the blog showed how the data pile keeps growing to heights we could not even imagine a few years ago. So the question is: where can we store this data in a way that is accessible, low cost and safe?

How about encoding data in single grams of DNA? George Church and Sriram Kosuri from Harvard’s Wyss Institute and Yuan Gao, now at Johns Hopkins, have done just that by storing 700 terabytes of scanned book data in a single gram of DNA. Church et al. treated the DNA as a digital storage device, but instead of encoding binary data as magnetic regions on a hard drive, they manipulated the composition of the DNA molecules. This actually isn’t a strange idea, since our genes, made of strands of DNA, are information storehouses.

How did they do it? The researchers converted a book into 10 MB of binary code and made 70 million copies of it. They then translated the code into DNA. This resulted in almost 55,000 strands of DNA, each containing a portion of the text and an address block indicating where that portion occurred in the book. What’s also interesting is that they were able to write and read it using commercially available gene synthesis and sequencing machinery, in just a couple of weeks.

Advantages of DNA

As I found out myself while waiting for a 2.5 TB hard drive to drop below 100 euros, the capacity of hard drives isn’t increasing fast enough to keep up with the explosion of data. But DNA is another story: four grams could theoretically hold 1.82 trillion gigabytes, all the data the world produces in a year. The only problem is how to make DNA storage accessible enough that someone like me can use it the way I use my USB drive: plug it in and that’s it.
Of course, this type of storage is still under development and won’t reach the masses in the coming months, or maybe even years. But it caught my attention because it is another example, like the deep neural network research Google is doing, of how nature and technology are converging.
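
To make the encoding step described above a bit more concrete, here is a minimal sketch in Python of how binary data could be split into addressed DNA strands and read back. It rests on assumptions that are not in this post: a one-bit-per-base mapping (0 becomes A or C, 1 becomes G or T), and illustrative sizes for the address and data blocks. The function names are hypothetical, and this is not the researchers' actual pipeline, which also involves physically synthesizing and sequencing the strands.

```python
import random

# Assumed mapping (not stated in the post): one bit per base,
# with two possible bases for each bit value.
BIT_TO_BASES = {"0": "AC", "1": "GT"}
BASE_TO_BIT = {"A": "0", "C": "0", "G": "1", "T": "1"}

ADDRESS_BITS = 19  # illustrative width of the address block
DATA_BITS = 96     # illustrative number of data bits per strand


def bytes_to_bits(data: bytes) -> str:
    """Turn bytes into a string of '0' and '1' characters."""
    return "".join(f"{byte:08b}" for byte in data)


def bits_to_bytes(bits: str) -> bytes:
    """Inverse of bytes_to_bits; expects a multiple of 8 bits."""
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def encode(data: bytes, seed: int = 0) -> list[str]:
    """Split data into strands of the form: address block + data block."""
    rng = random.Random(seed)
    bits = bytes_to_bits(data)
    strands = []
    for index, start in enumerate(range(0, len(bits), DATA_BITS)):
        if index >= 2 ** ADDRESS_BITS:
            raise ValueError("data too large for the address width")
        # Pad the final block with zeros so every strand has equal length.
        block = bits[start:start + DATA_BITS].ljust(DATA_BITS, "0")
        address = f"{index:0{ADDRESS_BITS}b}"
        strands.append(
            "".join(rng.choice(BIT_TO_BASES[b]) for b in address + block)
        )
    return strands


def decode(strands: list[str], length: int) -> bytes:
    """Rebuild the original bytes from strands given in any order."""
    blocks = {}
    for strand in strands:
        bits = "".join(BASE_TO_BIT[base] for base in strand)
        blocks[int(bits[:ADDRESS_BITS], 2)] = bits[ADDRESS_BITS:]
    ordered = "".join(blocks[i] for i in sorted(blocks))
    return bits_to_bytes(ordered[:length * 8])  # drop the zero padding


if __name__ == "__main__":
    message = b"Where are we going to store all that Big Data?"
    strands = encode(message)
    random.shuffle(strands)  # strands in a test tube have no fixed order
    assert decode(strands, len(message)) == message
    print(len(strands), "strands, first one:", strands[0])
```

The address block is what makes the shuffle harmless: because each strand carries its own position, the pieces can be put back in order no matter how they come out of the sequencer.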

About the author

SogetiLabs gathers distinguished technology leaders from around the Sogeti world. It is an initiative explaining not how IT works, but what IT means for business.


    One thought on “Solving the Big Data storage problem with DNA: we only need one gram for 700TB”

    1. “storing 700 terabytes of scanned book data into a single gram of DNA”
      No, Church and his team did not do that; they did not even say that. They only said that they encoded 650 kbits of data at a density of 5.5 petabit/mm3 (5.5 petabit may not be far from 700 terabytes, but it is per mm3, not per gram). In fact, the theoretical amount of data that can be put in DNA (and retrieved with a reasonable number of errors) is much larger: something like 1 exabit per gram.
      So such a density is theoretically possible, but saying so is like saying that it is theoretically possible to emigrate to Mars to solve Earth's overpopulation.
      Maybe the most interesting point is that the stored information can be kept secure for thousands of years at room temperature. However, this may hold only if the DNA is carefully protected from the atmosphere.
