One Day a Forest Could Store All of Humanity’s Knowledge

Scientists are encoding data into the DNA of trees.

May 17 2014, 1:30pm

Every 10 minutes, humans are generating as much information as the entirety of recorded history combined. The question is: Where are we going to put all of it? In an attempt to answer that question, researchers have been trying to code data into the DNA of living things.

One of these projects made headlines in the New Yorker this week. Scientist and artist Joe Davis’ latest undertaking is to imbue apple trees with DNA that has been engineered to spell out the entirety of Wikipedia in genetic data: As, Gs, Cs, and Ts. He’ll use a unique strain of bacteria to carry the coded DNA through plant cell walls and then graft the modified saplings onto apple stock that will grow into adult trees. The data can then be read by decoding the nucleotide sequence.
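To make the idea of spelling out text in As, Gs, Cs, and Ts concrete, here is a minimal sketch in Python. The two-bits-per-base mapping (A=00, C=01, G=10, T=11) and the function names `encode` and `decode` are assumptions chosen for illustration; the article does not describe the scheme Davis actually uses, and real encodings typically add safeguards this sketch leaves out.

```python
# Hypothetical mapping for illustration only: two bits per base.
# A = 00, C = 01, G = 10, T = 11. Not the scheme used by Davis or Church.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Turn each byte into four bases, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode(strand: str) -> bytes:
    """Reverse encode(): read four bases at a time and rebuild each byte."""
    if len(strand) % 4:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # BASES.index raises ValueError on anything other than A, C, G, T.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = encode("Wiki".encode("utf-8"))
    print(strand)                          # CCCTCGGCCGGTCGGC
    print(decode(strand).decode("utf-8"))  # Wiki
```

Under this assumed mapping every byte costs four bases, so the length of the strand grows in direct proportion to the size of the text being stored.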

Davis has been at the forefront of DNA data storage technology since 1986, when he coded an image of the ancient Germanic rune for “Earth” into the building blocks of life. His apple tree project is a proof of concept, dangling the possibility that someday a forest could store the whole of human knowledge.

“To some extent, if [our data] is out there in the wild, it increases the possibility that it will survive through any kind of disaster,” said George Church, a longtime collaborator of Davis’ and the head of a Harvard research lab studying DNA data storage.

“The advantage of DNA is that it has a record of longevity,” Church told me in an interview. “You could store it, left in optimal conditions, for seven hundred thousand years. There’s no disk drive that has anything close to that record.”

While a disk drive’s components start to wear out after three years, it would take an asteroid or some other major destructive event to completely destroy data that’s been coded into the DNA of a forest.

Storing data in the wild has its drawbacks. For one, the natural evolution of living things could distort the encoded information over time as the host’s DNA changes. But biodiversity also means there’s built-in redundancy. If a bit of code is lost in one place, another copy is likely to be intact.

“If you have enough copies then you could reconstruct what the original [message] was,” Church said. His lab is also researching ways to stabilize the data using parts of the genome that remain unchanged over long periods of time, called ultra-conserved elements.
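Church's point about reconstructing a message from many copies can be sketched as a simple majority vote. This is an illustration under stated assumptions, not his lab's method: it assumes every copy has the same length and that damage only swaps one base for another, with no insertions or deletions. The function name `consensus` and the sample strands are made up for the example.

```python
from collections import Counter


def consensus(copies: list[str]) -> str:
    """Rebuild a strand by taking the most common base at each position.

    Assumes the copies are already aligned and of equal length, and that
    errors are substitutions only (a simplification for illustration).
    """
    if not copies:
        raise ValueError("need at least one copy")
    length = len(copies[0])
    if any(len(copy) != length for copy in copies):
        raise ValueError("all copies must have the same length")
    # zip(*copies) yields one tuple per position: that base from every copy.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*copies)
    )


if __name__ == "__main__":
    # Five hypothetical copies of "ACGTACGT"; three carry a single mutation.
    copies = [
        "ACGTACGT",
        "ACGAACGT",  # position 3 mutated
        "TCGTACGT",  # position 0 mutated
        "ACGTACCT",  # position 6 mutated
        "ACGTACGT",
    ]
    print(consensus(copies))  # ACGTACGT
```

The vote only works while errors at any given position stay in the minority, which is why the number of surviving copies matters.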

DNA could theoretically last longer if isolated and stored long-term in places that don’t grow and evolve, like a cave or outer space, Church said, but then it wouldn’t be easily accessible. “What does it mean to have our culture out there if there’s nobody to read it?” he said. Forest storage might be a slightly riskier proposition, but it would allow for greater access to the data than sticking it up among the stars.

Archiving humanity’s knowledge is the most pressing application of DNA coding technology, but it could one day be possible to program data forests to record information as well.

“It’s conceivable that we could make photosensitive recording devices,” Church said, “essentially like film in a camera, except instead of having just black and white or three colours, you could have the richness of DNA at every pixel position. In principle, the trees would not just keep one image forever, or one text, but they could continue updating it and become essentially a recording device as well as a storage device.”

That futuristic concept is still speculative, and at this point the Harvard lab is chiefly looking at the practical applications of DNA data storage. The team has been contacted by several archivists and companies interested in the technology.

To illustrate the potential commercial applications, Church coded seven billion copies of his book Regenesis into DNA. Those copies will last for millennia in the DNA archive, a potential cost-saving solution for organizations looking to back up vast amounts of data.

“You don’t want to overhype it, but you don’t want to keep it a secret either,” Church noted. “You just talk about it calmly and consider the positive and negative aspects. A negative right now is that we still need to bring the cost down even more. It’s very promising in terms of the cost of copying and storage, but the reading and writing could use some help. Fortunately, it’s what my lab does full-time.”

There’s still a lot of work to be done in the field before we see trees recording our actions for posterity, or even replacing sprawling, energy-guzzling data centers. But if we don’t find a fix for our accelerating data storage problems soon, we could be facing a genuine storage crisis, one with dire environmental repercussions. That makes data forests a very tempting solution for our info-saturated world.