The Largest Genome Ever Sequenced Belongs to a Tree

The loblolly pine has 22 billion base pairs, compared to 3 billion in the human genome.

Victoria Turk

Image: Wikimedia Commons/Dcrjsr

We humans tend to be pretty arrogant when it comes to placing ourselves in the biological pecking order; we think our genetic make-up is just so complicated. But while we’re congratulating ourselves for sequencing the human genome with increasing (if still imperfect) accuracy, there are in fact much larger genomes out there.

Today, researchers published their report on the largest genome yet to be sequenced: at 22 billion base pairs, it is more than seven times the size of the comparatively measly 3 billion in the human genome. What creature did the huge genome belong to? A conifer tree.

The delightfully named loblolly pine (Pinus taeda) was sequenced by a team of scientists led by the University of California-Davis, who published a report in Genome Biology. To be fair to our own species, genome size isn’t a marker of complexity, and the loblolly’s sequence contained a lot of repetition—82 percent of it was “repetitive in nature.” 

But that doesn’t make sequencing easy, and the researchers ended up applying a whole new method to cut down the time needed to analyse the DNA. As Motherboard’s Jason Koebler wrote recently, it still takes a long time to analyse a human genome, never mind anything larger.

The problem is that conventional genome sequencing yields only small fragments of the letters in DNA, and these all have to be assembled in the right order. The loblolly team used a new method, developed by scientists at the University of Maryland, which pre-processes some of this data to effectively compress the sequence data 100-fold. It’s the first time the novel approach has been tested.
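To give a feel for what "assembling fragments in order" means, here is a toy sketch in Python. It is not the University of Maryland method, and real assemblers are far more sophisticated; the function names, the greedy strategy, and the made-up reads are all assumptions chosen purely for illustration. The idea is simply to keep merging the two fragments whose ends overlap the most.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str]) -> str:
    """Toy assembler (illustrative only): repeatedly merge the pair of
    reads with the largest suffix/prefix overlap until one string remains."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    if not reads:
        return ""
    while len(reads) > 1:
        best_k, best_i, best_j = 0, 0, 1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        # Join the best pair, keeping the overlapping letters only once.
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for n, r in enumerate(reads) if n not in (best_i, best_j)]
        reads.append(merged)
    return reads[0]


# Three hypothetical overlapping fragments of a 13-letter sequence.
fragments = ["GCGTACC", "TACCTGA", "ATGGCGT"]
print(greedy_assemble(fragments))
```

Every round compares each fragment against every other, which is one reason the job gets slow as the number of fragments grows. A naive approach like this also tends to break down when the same stretch of letters appears in many places, because several different merges then look equally good.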

As well as being a test subject for the new technique, the loblolly genome is of considerable interest given the tree’s starring role in the US landscape, both natural and commercial. It’s the second most common tree species in the country, and it’s a major player in the lumber and paper industry.

Understanding its genome could therefore help develop more efficient crops; for instance, the researchers found genes associated with fighting pathogens such as fusiform rust, which is estimated to cost $28 million in losses every year.

On a research level, the loblolly’s genome might also shed some light on the evolution of other plants, as conifers’ roots go back to the dinosaur age. The researchers wrote that the tree belongs to “one of the oldest of the major plant clades, having arisen from ancestral seed plants some 300 million years ago,” and that conifers will therefore likely provide “many genome-level insights on the origins of genetic diversity in higher plants.”

As for the record genome sequencing, the loblolly might soon be knocked off the top spot. While it’s the largest to be sequenced, it’s not a patch on the largest we know to exist. That accolade belongs to a Japanese flower called Paris japonica, which has 149 billion base pairs—50 times more than the considerably humbled human genome.