Abstract In addition, we present here a number of other important articles
on the human genome, including selected BioMedNet commentaries and articles
from various Trends journals, not yet in print or online, which
we are happy to be "pre-publishing" on the Beagle:
Comparative Genomics is the Key to Interpreting the Human Genome
The race to decipher the human genome has often been presented (both by scientists and the media) as a quest for the Holy Grail, a single treasure chest of human secrets that, once discovered and unlocked, would quickly shower riches upon humankind. Indeed, the genome's publication spurred some striking conclusions - we possess many fewer genes than expected, hundreds of bacterial genes have jumped into our genome, and only 1 percent of our genome actually codes for proteins. Future analyses will doubtless give us further insights into the nature of our genetic heritage. But those insights and the treasure chest's bounty of medical applications depend, critically, on our being able to interpret the bewildering strings of As, Gs, Cs, and Ts that have gushed from the sequencing machines. And it's nearly impossible to interpret that data without reference to the genomes of other organisms.
Luckily, the human genome hunt has not been conducted in isolation,
despite what popular accounts might have us believe. Several dozen other
organisms (mostly bacteria) have been fully sequenced as well, and they
are now serving as models and guides for the exploration of our own genome.
"Comparative genomics," Celera Genomics' president J. Craig Venter says simply, "is going to be the single most important tool going forward."
In the spotlight now is the lab mouse (Mus musculus). Celera assembled its mouse genome in February, and the Human Genome Project's consortium says its sequence will be 95 percent complete by April 1, 2001. Comparing the genomes of humans and mice should spur substantial advances in understanding the genetic bases of disease in our species.
But while other organisms' genomes will help us develop medical applications, they will also bear rich rewards in basic science. For the field of biology as a whole, completion of the human genome is a lesser development than the fact that we are deciphering genomes of many diverse organisms, allowing us to pose and answer questions that were unthinkable previously.
This is the beginning and not the end of the genomics era, says Francis Collins, director of the Human Genome Project. "There is a lineup of organisms with their hands raised, saying 'Sequence me next!'" Collins says. As we do so, he adds, we should look to "some of the less-trammeled parts of the evolutionary tree."
Nonhuman Genomes as Tools for Interpreting the Human Genome
The functions of roughly 40 percent of human genes are completely unknown,
and most genes whose functions we think we know await confirmation. Genomes
of other organisms can help us locate human genes, identify their functions,
delineate the structure of exons, and find regulatory regions.
Sequencing nonhuman genomes is "crucial, [because] our ability to find human genes blind is pretty crummy," says Eric Lander, director of the Whitehead Institute Center for Genome Research. With only 1 percent of our DNA coding for proteins, Lander points out, "we have a huge signal-to-noise problem. We need help to enrich the signal from the noise."
Soon after the human genome project was proposed in 1985, forward-thinking biologists realized we'd need to sequence other organisms. Indeed, says Lander, early proponents debated whether to name their effort the "Human Genome Project" or something else that would reflect taxonomic diversity. They decided on "Human Genome Project," Lander says, "because it would make Congress happy."
Nonetheless, the nuclear genomes of some 40 species have been fully sequenced so far. Most are bacteria, but the list includes five nonhuman eukaryotes - the mouse, the fruit fly, the nematode worm, the Arabidopsis mustard plant, and baker's yeast.
Comparisons among such organisms are meaningful because all organisms share similarity in their DNA due to common ancestry. The amount of similarity reflects organisms' relatedness, but it also gives information about the function of genes. Stretches of DNA that have remained similar since organisms diverged from one another suggest that natural selection maintains these regions because they are functionally important. Regions that are unimportant (and thus not under selective pressure) will diverge more quickly. Therefore, regions that are highly conserved among distant relatives - such as humans, Arabidopsis, and yeast - are probably the most important.
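The reasoning above, that conserved stretches point to functional regions, can be illustrated with a toy calculation. The sketch below is not how any genome center actually works: it assumes two sequences that have already been aligned to the same length, uses made-up sequences, and the function names, window size, and threshold are all hypothetical choices for illustration. Real analyses rely on alignment software and statistical models of sequence divergence.

```python
def window_identity(seq_a, seq_b, window=10):
    """Return (start, fraction_identical) for every window of two
    pre-aligned sequences. Gap characters ("-") never count as matches."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    scores = []
    for start in range(len(seq_a) - window + 1):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
        scores.append((start, matches / window))
    return scores


def conserved_windows(seq_a, seq_b, window=10, threshold=0.9):
    """Return start positions of windows whose identity meets the
    (arbitrary, illustrative) threshold."""
    return [start
            for start, frac in window_identity(seq_a, seq_b, window)
            if frac >= threshold]


# Made-up sequences: the first 10 bases are identical ("conserved"),
# the last 10 have diverged completely.
TOY_SPECIES_A = "ACGTACGTAC" + "TTTTTTTTTT"
TOY_SPECIES_B = "ACGTACGTAC" + "GGGGGGGGGG"

if __name__ == "__main__":
    for start in conserved_windows(TOY_SPECIES_A, TOY_SPECIES_B):
        print("candidate conserved window starting at position", start)
```

With these toy inputs, only the windows that lie almost entirely within the identical stretch pass the threshold; windows reaching into the diverged stretch fall below it. How strict the threshold should be depends on how distantly related the organisms are, which is the trade-off the article describes next.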
This line of reasoning enables the discovery of unknown genes or of regulatory regions. For example, comparisons of human and mouse genomes show surprising amounts of similarity just upstream of coding regions; these represent "tantalizing candidates for conserved regulatory regions," says Eric Green of the National Human Genome Research Institute.
Depending on the question a researcher is asking,
any one organism may be either too closely related (and, thus, too similar)
or too distantly related (and, thus, too dissimilar) to be of help. That's
why researchers agree that a wide diversity of organisms needs to be sequenced.
First-pass comparisons among human and nonhuman eukaryote genomes have shown humans to have greatly expanded gene families in several areas, including neural development, the acquired immune system, and alternative gene splicing and posttranslational modification. Such findings make biological sense: the first helps explain our elaborate mental abilities, the second reflects the fact that only vertebrates have acquired immune systems, and the third helps account for how something as complex as a human being could arise from only 30,000 genes.
Of course, we humans are not more complex in everything. The sense of
smell is of paramount importance to a mouse, whereas in humans it has been
relegated to something of a sensory afterthought, far behind vision and
hearing. This is reflected in the mouse's genome, which shows expanded
gene families involved in olfaction.
The mouse genome may provide the greatest benefits for human medicine. The mouse, the first nonhuman mammal fully sequenced, has been a model organism in lab studies for a century. It can now continue to serve as an experimental testing ground for whatever genetic engineering or drug testing arises from genome analysis.
Beyond the mouse, genome sequencers are pursuing other mammals (although, so far, this involves mostly mapping and not yet large-scale sequencing). These include pets (cats and dogs) and agricultural animals (cows, pigs, horses, and sheep). Because such domesticated animals are inbred, they possess many genetically based diseases and deficiencies. Cats are good models for certain acquired diseases, while dogs are good models for some inherited diseases, particularly cancers, says Leslie Lyons of the Population Health and Reproduction Department at the University of California at Davis. Canine genomic data have already helped identify the human proline oxidase gene, and cats are a better human model than mice for polycystic kidney disease.
Organisms beyond mammals may be helpful as well. The Arabidopsis genome contains homologs of a number of human disease genes. Across fly, mouse, and human, knocking out the PAX6 gene produces an eyeless phenotype. Says Carol Bult of the Mouse Genome Informatics Group at Jackson Laboratory, Bar Harbor, Maine, "We never would have thought that the fly genome would give us anything medically relevant, but [it has]. So we've got to remain open to the fact that discovery comes from many different places and many different organisms, and if we focus narrowly too soon, we close ourselves off to a lot of possibilities."
Nonhuman Genomes and Evolutionary Biology
Comparing genomes is useful beyond its medical applications. Comparative genomics can improve our understanding of the relatedness and evolutionary history of organisms.
So What's Next?
By Venter's estimate, the current 40 or so completed genomes may swell to 100 by the end of this year. Soon we may well see chickens and rats, rice and maize, zebrafish and rhesus macaques. We'll see a raft of pathogens like the malaria-causing Plasmodium falciparum, and vectors such as the malaria mosquito Anopheles gambiae. We may also see fish with very small genomes, such as Tetraodon and the Japanese puffer fish, which may tell us about the minimal genetic complement a vertebrate needs.
How to prioritize? Obviously, organisms of medical, agricultural, and
economic interest will predominate. But Green says we will need to diversify
our sampling, pointing out that nearly all mammals well sequenced so far
come from only one of the four major mammal groups. Mishler proposes fully
sequencing "landmark genomes" from organisms widely spaced throughout the
tree of life, and then filling in the gaps with more targeted work.
Currently, funding is still a major obstacle. A once-over sequencing of a typical mammalian genome costs around $10-15 million, Green says. But researchers seem optimistic. Sequencing and computing technology should improve further. There is some trickle-down effect, with resources from human genome efforts now being adopted for other organisms. Celera's success may spur more private investment in genome sequencing. And governments seem supportive. "I have detected no resistance in Washington that we should not do other organisms," Lander says. The American people and the Congress understand, he says, that solutions to cancer come from yeast, and solutions to birth defects from flies.
And what of our closest relative, the chimpanzee? Svante Pääbo, of the Max Planck Institute for Evolutionary Anthropology in Leipzig, has lobbied for a chimp genome project, and labs in several countries are seeking funding. Critics say that chimp and human DNA is so similar that little useful information would come of it. However, as Pääbo has argued, the few differences we'd find might be of enormous interest, as they would represent the genetic component of what makes us human.
Furthermore, our very similarity with chimps would make a strong statement
about our true place in the world. Venter says he hopes when history looks
back upon the human genome's sequencing, it will note that, as with the
Copernican and Galilean revolutions, we were humbled by its implications
- that it helped abate humanity's arrogant assumption of its central
place in the universe. "We're clearly part of a biological
continuum," Venter says. Perhaps that realization is what comparative genomics
will do for us, most of all.
Endlinks
- an extensive review of mouse genomics. From Trends in Genetics. Full text available from BioMedNet.
- a review of sequence data and their applications. From Current Opinion in Plant Biology
The Human Genome - the February 16, 2001 issue of Science is devoted to the Human Genome. Free online access.
Genome Gateway - Nature's online publication and full analysis of the initial sequencing of the human genome. Free online access.
Online Mendelian Inheritance in Animals - offers an extensive database searchable by species, disease, or disorder.
Genome Programs - a collection of links to genome sequencing projects for humans and many other organisms.