Let's Hear It For Those Other Genomes!

by Jay Withgott

Posted March 2, 2001 · Issue 97
Modified for teaching purposes by William S. Barnes, Clarion University of Pennsylvania, Jan. 7, 2002

Abstract

Several dozen organisms, in addition to humans, have been fully sequenced. Genomes of these organisms can help us locate human genes, identify their structure, delineate their function, and find regulatory regions. In this article, the author explores using comparative genomics as a tool for interpreting the human genome.

In addition, we present here a number of other important articles on the human genome, including selected BioMedNet commentaries and articles from various Trends journals, not yet in print or online, which we are happy to be "pre-publishing" on the Beagle.
 


Comparative Genomics is the Key to Interpreting the Human Genome

The race to decipher the human genome has often been presented (both by scientists and the media) as a quest for the Holy Grail, a single treasure chest of human secrets that, once discovered and unlocked, would quickly shower riches upon humankind. Indeed, the genome's publication spurred some striking conclusions - we possess many fewer genes than expected, hundreds of bacterial genes have jumped into our genome, and only 1 percent of our genome actually codes for proteins. Future analyses will doubtless give us further insights into the nature of our genetic heritage.

But those insights and the treasure chest's bounty of medical applications depend, critically, on our being able to interpret the bewildering strings of As, Gs, Cs, and Ts that have gushed from the sequencing machines. And it's nearly impossible to interpret that data without reference to the genomes of other organisms.

Luckily, the human genome hunt has not been conducted in isolation, despite what popular accounts might have us believe. Several dozen other organisms (mostly bacteria) have been fully sequenced as well, and they are now serving as models and guides for the exploration of our own genome.
 
Comparative genomics is the single most important tool in deciphering the human genome. 

"Comparative genomics," Celera Genomics' president J. Craig Venter says simply, "is going to be the single most important tool going forward."

In the spotlight now is the lab mouse (Mus musculus). Celera assembled its mouse genome in February, and the Human Genome Project's consortium says its sequence will be 95% complete by April 1, 2001. Comparing the genomes of humans and mice should spur substantial advances in understanding the genetic bases of disease in our species.

But while other organisms' genomes will help us develop medical applications, they will also bear rich rewards in basic science. For the field of biology as a whole, completion of the human genome is a lesser development than the fact that we are deciphering genomes of many diverse organisms, allowing us to pose and answer questions that were unthinkable previously.

This is the beginning and not the end of the genomics era, says Francis Collins, director of the Human Genome Project. "There is a lineup of organisms with their hands raised, saying 'Sequence me next!'" Collins says. As we do so, he adds, we should look to "some of the less-trammeled parts of the evolutionary tree."

Nonhuman Genomes as Tools for Interpreting the Human Genome

The functions of roughly 40 percent of human genes are completely unknown, and most genes whose functions we think we know await confirmation. Genomes of other organisms can help us locate human genes, identify their functions, delineate the structure of exons, and find regulatory regions.
 

Sequencing nonhuman genomes is "crucial, [because] our ability to find human genes blind is pretty crummy," says Eric Lander, director of the Whitehead Institute Center for Genome Research. With only 1 percent of our DNA coding for proteins, Lander points out, "we have a huge signal-to-noise problem. We need help to enrich the signal from the noise."

Soon after the human genome project was proposed in 1985, forward-thinking biologists realized we'd need to sequence other organisms. Indeed, says Lander, early proponents debated whether to name their effort the "Human Genome Project" or something else that would reflect taxonomic diversity. They decided on "Human Genome Project," Lander says, "because it would make Congress happy."
 
Forty nonhuman species - including 5 eukaryotes - have been sequenced already.

Nonetheless, the nuclear genomes of some 40 species have been fully sequenced so far. Most are bacteria, but the list includes five nonhuman eukaryotes - the mouse, the fruit fly, the nematode worm, the Arabidopsis mustard plant, and baker's yeast.

Comparisons among such organisms are meaningful because all organisms share similarity in their DNA due to common ancestry. The amount of similarity reflects organisms' relatedness, but it also gives information about the function of genes. Stretches of DNA that remain similar over the time since organisms diverged from one another suggest that natural selection maintains these areas because they are functionally important. Regions that are unimportant (and thus not under selective pressure) will diverge more quickly. Therefore, regions that are highly conserved among distant relatives - such as humans, Arabidopsis, and yeast - are probably most important.
 
Comparing genomes is meaningful because highly conserved regions may be functionally important. This is a way of enriching the "signal-to-noise ratio."

This line of reasoning enables the discovery of unknown genes or of regulatory regions. For example, comparisons of human and mouse genomes show surprising amounts of similarity just upstream of coding regions; these represent "tantalizing candidates for conserved regulatory regions," says Eric Green of the National Human Genome Research Institute.
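The conservation argument above can be made concrete with a small sketch. The Python below is only an illustration, not the method used by any of the groups quoted here: it assumes two sequences that have already been aligned to the same length, slides a fixed-size window along them, and reports the windows where the fraction of identical bases is high. The function names, the window size, the threshold, and the two toy sequences are all invented for the example.

```python
def window_identity(seq_a, seq_b, window=10, step=5):
    """Return (start, fraction_identical) for each window of two
    pre-aligned sequences of equal length. Gap characters ('-')
    never count as matches."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    scores = []
    for start in range(0, len(seq_a) - window + 1, step):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
        scores.append((start, matches / window))
    return scores


def conserved_windows(seq_a, seq_b, window=10, step=5, threshold=0.9):
    """Return the start positions of windows whose identity meets
    the (arbitrary, illustrative) threshold."""
    return [start
            for start, score in window_identity(seq_a, seq_b, window, step)
            if score >= threshold]


# Toy, made-up sequences: the first ten bases are identical,
# the last ten have diverged.
human = "ACGTACGTAC" + "TTGACCAGTA"
mouse = "ACGTACGTAC" + "AAGTCGTGCC"

for start, score in window_identity(human, mouse):
    print(start, score)

print(conserved_windows(human, mouse))
```

In this toy case only the first window passes the threshold, which is the "signal" a researcher would then examine as a candidate gene or regulatory region. Real comparisons would first require an alignment step and a threshold chosen with the evolutionary distance between the two species in mind.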

Depending on the question a researcher is asking, any one organism may be either too closely related (and, thus, too similar) or too distantly related (and, thus, too dissimilar) to be of help. That's why researchers agree that a wide diversity of organisms needs to be sequenced.
 

First-pass comparisons among human and nonhuman eukaryote genomes have shown humans to have greatly expanded gene families in several areas, including neural development, the acquired immune system, and alternative gene splicing and posttranslational modification. Such findings make biological sense; the first helps explain our elaborate mental abilities, the second reflects the fact that only vertebrates have acquired immune systems, and the third helps account for how something as complex as a human being could arise from only 30,000 genes.

Of course, we humans are not more complex in everything. The sense of smell is of paramount importance to a mouse, whereas in humans it has been relegated to something of a sensory afterthought, far behind vision and hearing. This is reflected in the mouse's genome, which shows expanded gene families involved in olfaction.
 
The mouse may provide the greatest benefits for human medicine.

The mouse genome may provide the greatest benefits for human medicine. The mouse, the first nonhuman mammal fully sequenced, has been a model organism in lab studies for a century. It can now continue to serve as an experimental testing ground for whatever genetic engineering or drug testing we might want to examine that arises from genome analysis.

Beyond the mouse, genome sequencers are pursuing other mammals (although, so far, this involves mostly mapping and not yet large-scale sequencing). These include pets (cats and dogs) and agricultural animals (cows, pigs, horses, and sheep). Because such domesticated animals are inbred, they possess many genetically based diseases and deficiencies. Cats are good models for certain acquired diseases, while dogs are good models for some inherited diseases, particularly cancers, says Leslie Lyons of the Population Health and Reproduction Department at the University of California at Davis. Canine genomic data has already helped identify the human proline oxidase gene, and cats are a better human model than mice for polycystic kidney disease.

Organisms beyond mammals may be helpful as well. The Arabidopsis genome contains homologs of a number of human disease genes. Across fly, mouse, and human, knocking out the PAX 6 gene produces an eyeless phenotype. Says Carol Bult of the Mouse Genome Informatics Group at Jackson Laboratory, Bar Harbor, Maine, "We never would have thought that the fly genome would give us anything medically relevant, but [it has]. So we've got to remain open to the fact that discovery comes from many different places and many different organisms, and if we focus narrowly too soon, we close ourselves off to a lot of possibilities."

Nonhuman Genomes and Evolutionary Biology

Comparing genomes is useful beyond its medical applications. Comparative genomics can improve our understanding of the relatedness and evolutionary history of organisms.

  • Biologists have long constructed phylogenies using phenotypes and gene sequences, but genomes provide desirable new characters, such as gene rearrangements. Already, genome comparisons have addressed the shape of the tree of life itself. They've clinched the notion that the Archaea are a group separate from the Bacteria, confirming that the tree of life is fundamentally three-limbed.
  • With phylogenetic knowledge, researchers can trace evolution across lineages at the level of the gene, gene cluster, and chromosome - tracking, for instance, the history of gene duplication events or horizontal transfers. Furthermore, integrating comparative genomics and phylogenetics can reveal origins of gene function in complex organisms. University of California at Berkeley botanist Brent Mishler gives this analogy: "If you want to see how an airplane works, you wouldn't look at a modern jet plane. You'd look at a glider, at the Wright brothers' plane, at a Cessna . . . some of the simpler antecedents."


Function and history - these are the two major elements that comparative biology tries to tease apart. In the human genome, traces of history are rampant. Seventy-five percent of our genome consists of intergenic DNA full of repeat elements - nonfunctional "junk" that is useless but for its recording of history. Phylogeny and comparative genomics - particularly with organisms like deer and bats that lack much of the junk DNA - can help separate history from current function.

So What's Next?
 

By Venter's estimate, the current 40 or so completed genomes may swell to 100 by the end of this year. Soon we may well see chickens and rats, rice and maize, zebrafish and rhesus macaques. We'll see a raft of pathogens like the malaria-causing Plasmodium falciparum, and vectors such as the malaria mosquito Anopheles gambiae. We may also see fish with very small genomes, such as Tetraodon and the Japanese puffer fish, which may tell us about the minimal genetic complement a vertebrate needs.

How to prioritize? Obviously, organisms of medical, agricultural, and economic interest will predominate. But Green says we will need to diversify our sampling, pointing out that nearly all mammals well sequenced so far come from only one of the four major mammal groups. Mishler proposes fully sequencing "landmark genomes" from organisms widely spaced throughout the tree of life, and then filling in the gaps with more targeted work.
 

Currently, funding is still a major obstacle. A once-over sequencing of a typical mammalian genome costs around $10-15 million, Green says. But researchers seem optimistic. Sequencing and computing technology should improve further. There is some trickle-down effect, with resources from human genome efforts now being adopted for other organisms. Celera's success may spur more private investment in genome sequencing. And governments seem supportive. "I have detected no resistance in Washington that we should not do other organisms," Lander says. The American people and the Congress understand, he says, that solutions to cancer come from yeast, and solutions to birth defects from flies.

And what of our closest relative, the chimpanzee? Svante Pääbo, of the Max Planck Institute for Evolutionary Anthropology in Leipzig, has lobbied for a chimp genome project, and labs in several countries are seeking funding. Critics say that chimp and human DNA is so similar that little useful information would come of it. However, as Pääbo has argued, the few differences we'd find might be of enormous interest, as they would represent the genetic component of what makes us human.

History may say that the Human Genome Project .... helped abate humanity's arrogant assumption of its central place in the universe.

Furthermore, our very similarity with chimps would make a strong statement about our true place in the world. Venter says he hopes that when history looks back upon the human genome's sequencing, it will note that, as with the Copernican and Galilean revolutions, we were humbled by its implications - that it helped abate humanity's arrogant assumption of its central place in the universe. "We're clearly part of a biological continuum," Venter says. Perhaps that realization is what comparative genomics will do for us, most of all.

 
Jay Withgott is a science writer in San Francisco. His interests range widely from evolution to ecology to behavior to natural history.


Endlinks

- an extensive review of mouse genomics. From Trends in Genetics. Full text available from BioMedNet.

- a review of sequence data and their applications. From Current Opinion in Plant Biology

The Human Genome - the February 16, 2001 issue of Science is devoted to the Human Genome. Free online access.

Genome Gateway - Nature's online publication and full analysis of the initial sequencing of the human genome. Free online access.

Online Mendelian Inheritance in Animals - offers an extensive database searchable by species, disease, or disorder.

Genome Programs - a collection of links to genome sequencing projects for humans and many other organisms.
