DNA for Digital Storage: Decoded

 

By: Shankar Dev

 

A data storage device is a device for recording (storing) information (data). Today we use a variety of such devices, like flash drives, SD cards and optical disks, which are “pocket friendly”. But imagine the storage devices of the 1950s! They were bulky, cabinet-sized machines. Would you believe me if I told you that the data on 3 million optical disks could be packed into a single gram of microscopic DNA?

DNA is a helical molecule held together by four chemical groups, or nucleobases, which, when arranged in a specific order, carry the genetic instructions needed by a living organism to build and maintain itself. The European Bioinformatics Institute (EMBL-EBI) at Hinxton has used this property to turn DNA into a storage device. Although many international scholarly papers have been published on this subject, EBI’s contribution is cited as a major one because of the reliability of its model.

Woolly Mammoth:

The woolly mammoth was a species of mammoth, the common name for the extinct elephant genus Mammuthus; its last populations died out roughly 4,000 years ago. In the recent past, frozen and dried carcasses of the animal have been discovered in Siberia and Alaska. Despite such a long journey through time, their DNA had not lost its data or instructions. The EBI team drew the idea of reliable DNA storage from such woolly mammoth DNA samples. The team says: “If you keep it cold, dry and dark, DNA lasts for a very long time. We know that because we routinely sequence woolly mammoth DNA that is kept by chance in those sorts of conditions”.

Process:

[Figure: DNA for digital storage]

To achieve this, the EBI has also looked deeper into some of the issues of scalability and practicality.

  • Take a computer file to be copied, for example a text document. On the computer's hard drive the data (the document) is represented in binary form (zeros and ones). As a first step, this binary code has to be translated into the bespoke code (a code developed by the EBI team specially for the purpose).
  • A standard DNA synthesis machine then churns out the corresponding sequence. But the DNA is not one long molecule. Rather, it is multiple copies of overlapping fragments, with each fragment also carrying some indexing details that identify where in the overall sequence it should sit.
  • This builds redundancy into the system, i.e. if some fragments become corrupted, the data will not be lost.
  • To read the data back, the same standard equipment used in molecular biology labs to read the DNA of organisms is used to pull out the information so that it can be displayed on a computer screen once more.
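The steps above can be illustrated with a toy sketch. This is not the EBI team's bespoke code, whose details are not given here. Everything in it is an assumption made for illustration: bytes are written as base-3 digits, each digit picks a base that differs from the previous one (so no base repeats twice in a row), the sequence is cut into overlapping windows of 20 bases taken every 5 bases, and the "index" is simply kept next to each fragment as its start offset rather than being written into the DNA itself. Reading back is simulated with a per-position majority vote.

```python
from collections import Counter
from typing import List, Tuple

BASES = "ACGT"


def bytes_to_trits(data: bytes) -> List[int]:
    """Write each byte as six base-3 digits (3**6 = 729 covers 0..255)."""
    trits: List[int] = []
    for byte in data:
        digits = []
        for _ in range(6):
            digits.append(byte % 3)
            byte //= 3
        trits.extend(reversed(digits))
    return trits


def trits_to_bytes(trits: List[int]) -> bytes:
    """Inverse of bytes_to_trits: fold every six trits back into one byte."""
    out = bytearray()
    for i in range(0, len(trits), 6):
        value = 0
        for t in trits[i:i + 6]:
            value = value * 3 + t
        out.append(value)
    return bytes(out)


def trits_to_dna(trits: List[int], prev: str = "A") -> str:
    """Each trit selects one of the three bases that differ from the previous base."""
    out = []
    for t in trits:
        choices = [b for b in BASES if b != prev]
        prev = choices[t]
        out.append(prev)
    return "".join(out)


def dna_to_trits(dna: str, prev: str = "A") -> List[int]:
    """Inverse of trits_to_dna, using the same (assumed) starting base."""
    trits = []
    for base in dna:
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))
        prev = base
    return trits


def make_fragments(dna: str, length: int = 20, step: int = 5) -> List[Tuple[int, str]]:
    """Cut overlapping windows; the start offset plays the role of the index."""
    last_start = max(len(dna) - length, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the tail is covered
    return [(s, dna[s:s + length]) for s in starts]


def reassemble(fragments: List[Tuple[int, str]], total_len: int) -> str:
    """Rebuild the sequence by majority vote at every position."""
    votes = [Counter() for _ in range(total_len)]
    for start, seq in fragments:
        for offset, base in enumerate(seq):
            votes[start + offset][base] += 1
    if any(not v for v in votes):
        raise ValueError("some positions are not covered by any fragment")
    return "".join(v.most_common(1)[0][0] for v in votes)


message = b"DNA storage"
dna = trits_to_dna(bytes_to_trits(message))
fragments = make_fragments(dna)
# Simulate losing one fragment from the middle of the sequence.
survivors = [f for f in fragments if f[0] != 20]
recovered = trits_to_bytes(dna_to_trits(reassemble(survivors, len(dna))))
print(recovered == message)
```

In this sketch most positions are covered by four fragments, so losing one of them does no harm. The first and last few bases are covered by fewer fragments, which is one reason a real scheme would need more careful handling than this toy version.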

 

(To picture this, recall a DVD writer, which writes a movie to a disc that we later play back using a DVD player and a TV. The real process, however, is far more complex and sophisticated, and requires a lot of expertise.)

For its experiment, the EBI team encoded five files: a 26-second snippet of Martin Luther King’s classic anti-racism address from 1963; a “.jpg” photo of the EBI; a “.pdf” of the seminal 1953 paper by Crick and Watson describing the structure of DNA; a “.txt” file containing all of Shakespeare’s sonnets; and a file describing the encoding system itself. Together these take up about 760 kilobytes on a computer hard disk, and the information was read back out with 100% accuracy. Most strikingly, the DNA physically carrying all that information is no bigger than a speck of dust.

The molecule is an incredibly dense storage medium. One gram of DNA can hold about two petabytes of data, the equivalent of about three million CDs.
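The two figures can be checked against each other with a little arithmetic. The sketch below assumes decimal units (1 petabyte = 10^15 bytes) and takes the two-petabytes-per-gram density at face value. The helper name `grams_needed` is made up for this example, and the result ignores the many redundant copies a real experiment would synthesize.

```python
PETABYTE = 10**15  # assumption: decimal units

dna_bytes_per_gram = 2 * PETABYTE
cd_count = 3_000_000

# Capacity each CD would need for the comparison to hold.
bytes_per_cd = dna_bytes_per_gram / cd_count
print(f"{bytes_per_cd / 10**6:.0f} MB per CD")


def grams_needed(n_bytes: float) -> float:
    """Mass of DNA for n_bytes at the quoted density (hypothetical helper)."""
    return n_bytes / dna_bytes_per_gram


# The roughly 760-kilobyte experiment at that density.
print(f"{grams_needed(760_000) * 10**9:.2f} nanograms")
```

This works out to roughly 667 MB per disc, in line with the capacity of an ordinary CD, and to well under a nanogram for the experiment's files, which fits the "speck of dust" description.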

The synthetic DNA can’t be incorporated accidentally into a genome. It uses a completely different code from the one that the cells of living bodies use, and even if someone put that DNA inside a living body, it would just be degraded and disposed of. It really has no place in a living being.

Advantages:

  • No electricity is required to store data in DNA.
  • Unlike other storage media presently in use, such as hard disk drives and magnetic tapes, the DNA “library” would not demand constant maintenance. Once data is encoded in DNA, it could be put away safely in a vault until it was needed.
  • If it is kept cold, dry and dark, DNA lasts for a very long time (as we saw in the mammoth case).

Disadvantages:

  • The cost of synthesizing the molecule in the lab makes this type of information storage “breathtakingly expensive” at the moment, although newer, faster technologies are expected to make it much more affordable in the near future, especially for long-term archiving.
  • It is not practical as an everyday means of portable data storage.

Despite these drawbacks, this breakthrough initiative, which could solve the problem of storing the ever-growing mountain of data, is laudable.

(Source: The New Indian Express (24/01/2013), Nature Journal, Wikipedia)