SOLVING THE BIG DATA STORAGE PROBLEM WITH DNA: WE ONLY NEED ONE GRAM FOR 700TB
December 3, 2012
Where are we going to store all that Big Data? Well, DNA might be a somewhat unusual but not completely unrealistic answer. The infographic I posted on the blog showed how the data pile keeps growing to heights we could not even imagine a few years ago. So the question is: where can we store this data in a way that is accessible, low-cost and safe?
So how about encoding data in single grams of DNA? George Church and Sriram Kosuri from Harvard’s Wyss Institute and Yuan Gao, now at Johns Hopkins, have done just that by storing 700 terabytes of scanned book data in a single gram of DNA.
Church et al. treated the DNA as a digital storage device, but instead of encoding binary data as magnetic regions on a hard drive, they manipulated the composition of the DNA molecules. This actually isn’t a strange idea, since our genes, made of strands of DNA, are information storehouses.
How did they do it?
The researchers converted a book into 10 MB of binary code and made 70 million copies of it. They then translated the code into DNA. This resulted in almost 55,000 strands of DNA, each containing a portion of the text and an address block indicating where that portion occurs in the book. What’s also interesting is that they were able to write and read the data using commercially available gene synthesis and sequencing machinery, in just a couple of weeks.
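To make the idea of strands with an address block more concrete, here is a minimal Python sketch. It is only an illustration and not the researchers’ actual scheme: the one-bit-per-base mapping (0 becomes A, 1 becomes G), the 16-bit address, the 32-bit data block and the function names encode and decode are all assumptions made up for this example.

```python
# Illustrative sketch only. The mapping and block sizes below are assumptions
# chosen for this example, not details taken from the research.
BIT_TO_BASE = {"0": "A", "1": "G"}
BASE_TO_BIT = {base: bit for bit, base in BIT_TO_BASE.items()}

ADDRESS_BITS = 16  # hypothetical width of the address block
PAYLOAD_BITS = 32  # hypothetical width of the data block per strand


def encode(data: bytes) -> list[str]:
    """Split data into addressed blocks and write each block as a DNA string."""
    bits = "".join(f"{byte:08b}" for byte in data)
    strands = []
    for index, start in enumerate(range(0, len(bits), PAYLOAD_BITS)):
        if index >= 2 ** ADDRESS_BITS:
            raise ValueError("data too large for the chosen address width")
        address = f"{index:0{ADDRESS_BITS}b}"
        payload = bits[start:start + PAYLOAD_BITS]
        strands.append("".join(BIT_TO_BASE[b] for b in address + payload))
    return strands


def decode(strands: list[str]) -> bytes:
    """Rebuild the data from strands in any order, using the address blocks."""
    blocks = {}
    for strand in strands:
        bits = "".join(BASE_TO_BIT[base] for base in strand)
        blocks[int(bits[:ADDRESS_BITS], 2)] = bits[ADDRESS_BITS:]
    bits = "".join(blocks[i] for i in sorted(blocks))
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strands = encode(b"DNA stores data")
    print(strands[0])
    print(decode(list(reversed(strands))))
```

Because every strand carries its own address, the decoder does not care in which order the strands come back from sequencing, which is why the example decodes them in reversed order.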
Advantages of DNA
As I found out myself while waiting for a 2.5 TB hard drive to drop below 100 euros, the capacity of hard drives isn’t increasing fast enough to keep up with the explosion of data. But DNA is another story: four grams could theoretically hold 1.82 trillion gigabytes, which is all the data the world produces in a year.
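The numbers quoted in this post can be checked with a few lines of Python. This is just back-of-the-envelope arithmetic and assumes decimal units (1 TB = 10^12 bytes), which is my assumption about how the figures are counted.

```python
# Back-of-the-envelope check of the figures quoted in this post.
# Decimal (SI) units are an assumption about how the bytes are counted.
MB = 10**6
GB = 10**9
TB = 10**12
EB = 10**18

# The experiment: a 10 MB book, copied 70 million times.
stored = 10 * MB * 70_000_000
print(stored // TB, "TB stored")

# The theoretical claim: 1.82 trillion gigabytes in four grams.
per_gram = (1_820_000_000_000 * GB) // 4
print(per_gram // EB, "EB per gram (theoretical)")
```

So the 700 TB of the experiment is the book size multiplied by the number of copies, while the theoretical figure works out to 455 exabytes per gram, far above what was actually written.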
The only problem is how to make DNA storage accessible in such a way that someone like me can use it like a USB drive: plug it in and that’s it. Of course, this type of storage is still under development and won’t reach the masses in the next months, or maybe even years. But it caught my attention because it is another example, like the deep neural network research Google is doing, of how nature and technology are converging.