For many people, the giant panda, Ailuropoda melanoleuca, is synonymous with conservation. This gentle, bamboo-munching animal is down to around 2,500 individuals, and a combination of small population size and man-made environmental change means that it is probably doomed to extinction in the wild. All is not gloom, however: the latest issue of Nature [subscription needed to get past abstract] announces the completion of the sequencing of the giant panda genome (or rather, a giant panda genome: the individual in question was a 3-year-old female, name unknown). No pandas were hurt in the making of this sequence.
This enormous undertaking, completed by a vast horde of Chinese researchers, reveals a number of fascinating things about the giant panda and its position on the evolutionary tree. First, the scientists investigated which genes are common to three mammals – panda, dog and human. This was a massive computing task – each of the genomes contains around 1.4 gigabases of non-repetitive sequence, or 1,400,000,000 “letters” (A, T, C or G). Stretched out as a single DNA molecule, at about 0.34 nanometres per letter, that much sequence would be roughly half a metre long.
It turns out that about 846 megabases (or roughly 30 centimetres) are common to all three species; of the remainder, more was common to the dog/panda pair than to either the human/panda or human/dog pairs. As you might expect, these shared areas tended to show high levels of synteny – they sit together in the same arrangement along the chromosomes, presumably because they are to do with fundamental biological processes, and have been passed down the eons without much physical or genetic alteration.
The one thing everyone knows about pandas is that they eat bamboo. However, it appears that, strictly speaking, they do not digest it – the genome contains no enzyme-producing genes that could help dissolve the hard plant tissue. That seems to be done by the bacteria that live in the animal’s gut. Unlike the cat – but like the dog and humans – the panda has several T2R genes, which code for the receptor for sweet tastes. On the other hand, the umami taste receptor, which enables animals to taste meat and protein, is not functional. This helps explain why this animal that is classified as a carnivore does not in fact eat meat. It may not even be able to taste it.
One of the most intriguing findings is that, despite the small population size, the panda genome they sequenced is highly heterozygous. Each gene is present in two copies, one inherited from each parent, and in this individual the frequency with which those two copies differed (“heterozygous”) was nearly twice that seen in humans. However, the authors note that the individual they studied was a cross between pandas from two regions – it may be that, within each region, pandas have less variability. This would be worrying because it would suggest that pandas in the wild are more inbred than this one genome implies, with associated problems for conservation.
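To make the idea of heterozygosity concrete, here is a minimal Python sketch that counts how often two aligned copies of a stretch of DNA differ. The function name and the two ten-letter sequences are invented for illustration; they are not data from the paper, and real analyses work on billions of letters and have to allow for sequencing errors.

```python
def heterozygosity(copy_a: str, copy_b: str) -> float:
    """Return the fraction of aligned positions at which the two copies differ."""
    if len(copy_a) != len(copy_b):
        raise ValueError("the two copies must be aligned to the same length")
    if not copy_a:
        raise ValueError("the sequences must not be empty")
    # Count positions where the letter from one copy differs from the other.
    differing = sum(1 for a, b in zip(copy_a, copy_b) if a != b)
    return differing / len(copy_a)


# Hypothetical toy sequences: the copies differ at 2 of 10 positions.
copy_from_mother = "ACGTACGTAC"
copy_from_father = "ACGTTCGTAA"
print(heterozygosity(copy_from_mother, copy_from_father))  # 0.2
```

A panda whose parents came from the same region would, on the authors' suggestion, be expected to give a lower number than the cross-bred individual that was sequenced.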
Finally, there is the possibility that direct help for panda conservation may come from the identification of what may be a non-functional copy of a hormone involved in stimulating egg production. It is possible that this may explain the notoriously low fecundity of the panda. Or not.
For scientists, probably the most important thing about the panda sequence – and this also explains why it was published in Nature – is the way they went about it. This is an incredibly technical issue, but basically, the authors have shown that it is possible to sequence whole genomes accurately and rapidly (and relatively cheaply) using a new wave of sequencing technology, which relies on sequencing lots of small bits of DNA and then assembling them like some massive jigsaw. Unlike previous efforts, the panda sequence was assembled from scratch (“de novo”), and has been completed. Other mammal sequences (e.g. the macaque or the cow) were done by less precise methods, with software to work out the gaps.
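The jigsaw idea can be sketched in a few lines of Python. This is a toy greedy merge of overlapping fragments, not the software the authors used; the function names, the minimum overlap of three letters and the example fragments are all assumptions made for illustration. Real assemblers have to cope with sequencing errors, repeats and billions of fragments.

```python
from typing import List


def overlap(left: str, right: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `left` that matches a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads: List[str], min_len: int = 3) -> List[str]:
    """Repeatedly join the two fragments with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_size, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # nothing left that overlaps
        # Glue the pair together, keeping the shared letters only once.
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical fragments cut from the made-up sequence ATGCGTACCTGA.
print(greedy_assemble(["ACCTGA", "ATGCGT", "CGTACC"]))  # ['ATGCGTACCTGA']
```

When fragments share no overlap, the function returns them as separate pieces rather than one sequence, which is the toy equivalent of an assembly that ends up in several chunks.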
By sequencing many, many small bits of DNA, the Chinese scientists ended up with a coverage that was about eight times as dense as that of previous mammalian sequences. However, the consequence of this approach is that these bits were assembled into chunks (“scaffolds”) that were smaller and more numerous than in previous sequences (there are over 3,800 panda scaffolds as against fewer than 100 in the dog). This means that some information may be missed when looking for particular genes, or for large-scale genomic organisation.
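Coverage is usually expressed as the average number of times each letter of the genome has been read, which is simply the total number of letters sequenced divided by the size of the genome. The Python sketch below shows that arithmetic; the function name and the numbers in the example are made up for illustration and are not the figures from the panda project.

```python
def coverage_depth(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average number of times each letter of the genome is read."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    total_letters_sequenced = num_reads * read_length
    return total_letters_sequenced / genome_size


# Hypothetical example: one million 50-letter fragments from a
# one-million-letter genome means each letter is read 50 times on average.
print(coverage_depth(num_reads=1_000_000, read_length=50, genome_size=1_000_000))  # 50.0
```

The same formula explains the trade-off in the approach: shorter fragments need many more reads to reach a given depth, and they are harder to join into long scaffolds.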
Most striking is the cost. A year ago, when the data were acquired, the sequencing cost about $900,000, compared with well over $10,000,000 for a genome sequenced using classic techniques. Twelve months on, prices of sequencers and computers have declined even further, making the possibility of sequencing many more genomes increasingly real.
Ruiqiang Li et al. (2010). The sequence and de novo assembly of the giant panda genome. Nature 463:311-318.