Will DNA storage technology be the future direction of sustainable storage?

Last Update Time: 2021-04-16 10:35:33

As humanity enters an era of explosive data growth, DNA storage, a medium with potentially near-unlimited capacity, may open a new chapter in data storage.

After years of mobile Internet development, the number of Internet users worldwide has passed the 4 billion mark, and services such as chat software, short videos, online shopping, and search engines generate large amounts of data every day. With the arrival of the 5G era, the spread of sensors and digital terminal devices will create a world of interconnected things and set off a new surge of data.

According to a forecast by the international research firm IDC, the volume of data generated worldwide will grow from 33 ZB (zettabytes) in 2018 to 175 ZB in 2025. 1 ZB is approximately 1 billion TB (terabytes). This growth will affect the construction of data centers worldwide and will challenge data storage itself. Based on the capacity of the largest single hard disk available, about 12 billion hard disks would be needed to store all 175 ZB. For comparison, a research report from IDEMA (International Disk Drive Equipment and Materials Association) put global solid-state drive shipments in 2018 at approximately 170 million.
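
The 12-billion-drive figure above can be checked with a quick back-of-envelope calculation. The sketch below assumes decimal units (1 ZB = 10^21 bytes, 1 TB = 10^12 bytes) and works backwards from the 12 billion figure to the per-drive capacity it implies. The function names and the 15 TB example drive are illustrative assumptions, not values from the IDC or IDEMA reports.

```python
ZB = 10**21  # bytes in one zettabyte (decimal definition)
TB = 10**12  # bytes in one terabyte (decimal definition)


def drives_needed(total_bytes: int, drive_bytes: int) -> int:
    """Number of whole drives needed to hold total_bytes (ceiling division)."""
    return -(-total_bytes // drive_bytes)


def implied_drive_capacity_tb(total_bytes: int, drive_count: int) -> float:
    """Per-drive capacity in TB implied by spreading total_bytes over drive_count drives."""
    return total_bytes / drive_count / TB


if __name__ == "__main__":
    total = 175 * ZB
    print(ZB // TB)                                       # TB per ZB
    print(implied_drive_capacity_tb(total, 12 * 10**9))   # TB per drive implied by 12 billion drives
    print(drives_needed(total, 15 * TB))                  # drive count at an assumed 15 TB per drive
```

Under these assumptions, 12 billion drives corresponds to a little under 15 TB per drive, which is in line with the "largest single hard disk" wording.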

At the current rate of data generation, by 2040 the world will need at least one million tons of silicon-based chips to store the data generated in that year alone. In less than 100 years, the magnetic and optical storage systems in use today will have reached their capacity limits. In the near term, therefore, demand for storage drives will grow rapidly, but in the long run global data storage faces a severe test that calls for sustainable storage media and new storage alternatives.

Io think tank recently released the "Technology Trend Report 2020". After screening technologies and evaluating key indicators, the report listed DNA storage, selected for its technical continuity, resource sustainability, and disruptive innovation, as a key development trend for the years after 2020. Compact, easy to access, and extremely dense, DNA storage may become the future of data storage.

DNA is a double-helix macromolecule composed of phosphate groups, deoxyribose sugars, and four bases. The four bases are A, T, C, and G, which pair with one another (A with T, C with G) to form double-stranded DNA. DNA is the oldest known information storage system and preserves biological information for longer than any other. The half-life of DNA is about 521 years: every 521 years, half of the chemical bonds between the nucleotides that make up the DNA backbone break.
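
The half-life figure above implies exponential decay of the intact backbone bonds. A minimal sketch, assuming simple first-order decay at the quoted 521-year half-life (the function name is illustrative, and real decay rates depend heavily on temperature and storage conditions):

```python
HALF_LIFE_YEARS = 521.0  # half-life quoted in the article


def fraction_intact(years: float, half_life: float = HALF_LIFE_YEARS) -> float:
    """Fraction of backbone bonds still intact after `years`, assuming first-order decay."""
    return 0.5 ** (years / half_life)


if __name__ == "__main__":
    for years in (0, 521, 1042, 5000):
        print(years, fraction_intact(years))
```

After one half-life half of the bonds remain, and after two half-lives a quarter remain.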




DNA storage encodes the binary data of a file using the four bases, then synthesizes long DNA strands with the corresponding base sequence to hold the data. According to recent research, a storage density of 215 PB (about 220,000 TB) per gram of DNA has been demonstrated, the theoretical maximum is 455 EB (about 470 million TB) per gram, and DNA kept as a storage medium at room temperature can have a half-life of thousands of years. With its high storage density, low energy consumption, and long retention period, DNA storage has gradually become a research hotspot in storage technology worldwide.
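
The idea of encoding binary data with four bases can be sketched as a direct two-bits-per-base mapping. The mapping table below (00 to A, 01 to C, 10 to G, 11 to T) is an assumption chosen for illustration, not a standard. Practical schemes also avoid long runs of the same base and balance GC content, which this sketch does not attempt.

```python
# Assumed 2-bit mapping for illustration only; real systems use constrained codes.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Convert bytes to a base sequence, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Convert a base sequence produced by encode() back to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # CAGACGGC
    print(decode(strand))  # b'Hi'
```

Each byte becomes four bases, so the strand is four times as long as the input in bytes.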

Technology giant Microsoft was one of the first companies to study DNA storage and has long held that DNA is the best medium for long-term data storage. In 2016, Microsoft announced that it would purchase 10 million long oligonucleotides from a San Francisco biotechnology company to explore how DNA molecules can store data. In March 2019, researchers from Microsoft and the University of Washington developed a fully automated system for writing, storing, and reading DNA-encoded data.

The DNA encoding process has three parts: compression, error correction, and conversion. After a long period of development, a variety of compression and coding methods are in use, represented by Huffman coding and fountain codes. Error correction methods such as Hamming codes and Reed-Solomon (RS) codes have improved the accuracy of data encoding and reading. The conversion of bits to bases has evolved from the original binary model into three common models: binary, ternary, and quaternary.
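
As an illustration of the error correction step, the sketch below implements the classic Hamming(7,4) code, which protects 4 data bits with 3 parity bits and can correct any single flipped bit. The article names Hamming codes without specifying a variant, so the choice of Hamming(7,4) and the function names are assumptions made for this example. It handles substitution errors only; the insertions and deletions that also occur in DNA synthesis and sequencing need other techniques.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits [d1, d2, d3, d4] as the codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = nibble
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3  # 1-based position; 0 means no error detected
    if error_position:
        c[error_position - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]]


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    print(word)                    # [0, 1, 1, 0, 0, 1, 1]
    word[4] ^= 1                   # simulate a single read error
    print(hamming74_decode(word))  # [1, 0, 1, 1]
```

In a full pipeline the corrected bits would then go through the conversion step to become bases.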

At present, data stored in DNA is read mainly with conventional sequencing methods. Researchers at Microsoft Research and the University of Washington have tested a scheme for random access to the data, but its addressing is still not precise enough and its efficiency is very low. Reading data by nanopore sequencing is still in the research and development stage, but as an emerging fourth-generation sequencing technology it may become a new breakthrough in reading technology.

In terms of technological maturity, DNA data storage still faces major technical challenges, chiefly the high cost and slow speed of artificial synthesis, long read times, and low accuracy. For example, DNA synthesis currently costs about 0.05 to 0.1 US dollars per base, so storing 200 MB of data costs millions of US dollars and takes at least two weeks. If technological progress can greatly reduce the cost of DNA synthesis and reading, the application prospects for DNA storage will be considerable.
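
The cost claim above can be reproduced as a rough estimate. The sketch below assumes an ideal density of 2 bits per base, decimal megabytes, and no redundancy, error correction overhead, or addressing primers, all of which are simplifying assumptions rather than figures from the article. Only the per-base price range is taken from the text.

```python
def synthesis_cost_usd(num_bytes: int, usd_per_base: float, bits_per_base: float = 2.0) -> float:
    """Estimated synthesis cost for num_bytes of data at a given price per base."""
    bases = num_bytes * 8 / bits_per_base  # bases needed at the assumed density
    return bases * usd_per_base


if __name__ == "__main__":
    payload = 200 * 10**6  # 200 MB, decimal
    low = synthesis_cost_usd(payload, 0.05)
    high = synthesis_cost_usd(payload, 0.10)
    print(f"{low:,.0f} to {high:,.0f} USD")
```

Under these assumptions 200 MB needs 800 million bases, which puts the estimate at roughly 40 to 80 million US dollars, so the article's "millions of US dollars" holds as a lower bound on the order of magnitude.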

DNA storage is well suited to information that is rarely accessed but must be preserved for a long time, such as government documents, patient clinical records, research data, historical archives, and video materials. As a brand-new storage method, it may also be used for specially encrypted data in the military and economic fields, and could offer unique advantages in front-end artificial intelligence applications and cloud storage.

Although DNA has natural advantages as a sustainable medium, a great deal of theoretical research and technical exploration is still needed before DNA storage can match the efficiency and convenience of existing hard disk storage systems and bring about a sustainable, disruptive transformation of data storage. As a potentially near-unlimited storage method for the future, DNA storage technology will develop alongside human progress.

 
