Gene Definitions
Evolving terminology for emerging technologies
Last revised December 27, 2001
Author: Mary Chitty

modified by W. Barnes
Clarion University of Pennsylvania
May 13, 2002
 

One of the most unfortunate legacies of Mendelian genetics is the lumping together of gene defects and genes. People with various genetic defects may or may not manifest a disease phenotype. As both Horace Freeland Judson and Sydney Brenner point out in the articles cited below, classical genetics was so firmly based on gene defects that only recently have we begun to determine what "normal" or wild-type genes really are. Careful reading or listening will often reveal that people use the word gene and a number of related words and phrases (mutations and other variants) very loosely and interchangeably.
 

How past history leads to present confusion

Horace Freeland Judson, writing in the Feb. 2001 human genome issue of Nature, notes problems with terminology. "The phrases current in genetics that most plainly do violence to understanding begin 'the gene for': the gene for breast cancer, the gene for hypercholesterolaemia, the gene for schizophrenia, the gene for homosexuality, and so on. We know of course that there are no single genes for such things. We need to revive and put into public use the term 'allele'. Thus, 'the gene for breast cancer' is rather the allele, the gene defect - one of several - that increases the odds that a woman will get breast cancer. 'The gene for' does, of course, have a real meaning: the enzyme or control element that the unmutated gene, the wild-type allele, specifies. But often, as yet, we do not know what the normal gene is for. ... Pleiotropy. Polygeny. Perhaps these terms will not easily become common parlance; but the critical point never to omit is that genes act in concert with one another - collectively with the environment. Again, all this has long been understood by biologists, when they break free of habitual careless words. We will not abandon the reductionist Mendelian programme for a hand-wringing holism: we cannot abandon the term gene and its allies. On the contrary, for ourselves, for the general public, what we require is to get more fully and precisely into the proper language of genetics." [Horace Freeland Judson "Talking about the genome" Nature 409: 769, 15 Feb. 2001]

Sydney Brenner, writing in the special Drosophila genome issue of Science, made a similar observation: "Old geneticists knew what they were talking about when they used the term 'gene', but it seems to have become corrupted by modern genomics to mean any piece of expressed sequence, just as the term algorithm has become corrupted in much the same way to mean any piece of a computer program. I suggest that we now use the term 'genetic locus' to mean the stretch of DNA that is characterized either by mapped mutations as in the old genetics or by finding a complete open reading frame as in the new genomics. In higher organisms, we often find closely related genes that subserve closely related, but subtly different, functions." [Sydney Brenner "The End of the beginning" Science 287 (5451): 2173, Mar. 24, 2000]
 

Does defining "gene" only get harder? Or are we making progress by recognizing how complicated it really is?

This is not a new problem. The report of the Invitational DOE Workshop on Genome Informatics (26-27 April 1993, Baltimore MD) pointed out: "The concept of 'gene' is perhaps even more resistant to unambiguous definition now than before the advent of molecular biology." http://www.ornl.gov/hgmis/publicat/miscpubs/bioinfo/inf_rep2.html

A tutorial, "Ontologies for Molecular Biology Workshop: Semantic Foundations for Molecular Biologies", at the Intelligent Systems for Molecular Biology Conference (June 27-28, 1998) in Montreal, Canada noted: "Molecular biology has a communication problem. Many researchers and databases use (at least partially) idiosyncratic terms and concepts for representing biological information. Often, terms and definitions differ between groups, with different groups not infrequently using identical terms with different meanings. The concept 'gene', for example, is used with different semantics by the major international genomic databases." http://www-lbit.iro.umontreal.ca/ISMB98/anglais/ontology.html
 

Definitions of gene

Gene is a good example of a word in the process of evolving away from its classical genetics meanings (fairly abstract concepts, rooted in the Mendelian model of monogenic diseases with high penetrance). The concept of "gene" has been changing so fast that most print resources (and some online ones) are out of date. The best source I've found is http://www.ergito.com/, a project of Benjamin Lewin and colleagues (free registration required), which offers the best-selling molecular biology textbook GENES online along with an extensive glossary.

The definition of gene is evolving (and lengthening) as we tease apart the incredible complexity of biological and molecular processes and discover that some "junk DNA" has important regulatory functions. Gene identification in prokaryotes is almost trivial, as their genomes consist largely of uninterrupted protein-coding sequence. In humans, however, protein-coding sequence accounts for only about 2% of total DNA, and the exons of a gene are widely separated by immense stretches of introns.
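The contrast above can be made concrete with a small sketch. When coding sequence is not interrupted by introns, even a naive scan for open reading frames (a start codon followed in frame by a stop codon) picks up many candidate genes. The function name `find_orfs`, the `min_codons` threshold, and the restriction to the forward strand are assumptions made for illustration; real gene finders also consider the reverse strand, alternative start codons, and statistical models of coding sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(seq, min_codons=3):
    """Return sorted (start, end) pairs of forward-strand ORFs.

    Coordinates are 0-based with an exclusive end. An ORF here runs from
    an ATG to the first in-frame stop codon and must contain at least
    min_codons codons, counting both the start and the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close the ORF and look for the next ATG
    return sorted(orfs)


# Hypothetical toy sequence: ATG AAA TAG read in frame 0.
print(find_orfs("ATGAAATAG"))  # [(0, 9)]
```

The same scan fails on a typical human gene, because introns break the reading frame between exons; that is one reason the two cases differ so much in difficulty.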

The concept of "gene" didn't come along until 1909, three years after the term genetics in 1906 (Evelyn Fox Keller, The century of the gene, Harvard University Press, 2000).  For some time it remained a quite abstract term.  With advances in molecular biology the definition is less and less abstract - but far from settled. Is a monolithic gene concept still valid?

Bioinformatics expert Nat Goodman, writing in the April 2001 issue of Genome Technology, states that gene "is a highly nuanced noun like 'truth'. Ten years ago, it commonly meant 'genetic locus' - a region of the genome linked to a disease or other phenotype. Over time biologists became more comfortable thinking of a gene as a transcribed region of the genome that results in functional molecular product."
 

Gene definitions

  1. Structurally, a basic unit of hereditary material; an ordered sequence of nucleotide bases that encodes one polypeptide chain (via mRNA). [IUPAC Compendium] The gene includes, however, regions preceding and following the coding region (leader and trailer) as well as (in eukaryotes) intervening sequences (introns) between individual coding segments (exons). Functionally, the gene is defined by the cis-trans test that determines whether independent mutations of the same phenotype occur within a single gene or in several genes involved in the same function.

  2. The functional and physical unit of heredity passed from parent to offspring. Genes are pieces of DNA, and most genes contain the information for making a specific protein. [NHGRI glossary] This definition doesn't specify that it applies only to humans, but by specifying "parents" it seems to rule out non-animal genes, and almost implies mammals, or at least warm-blooded organisms.

  3. A gene is a DNA segment that contributes to phenotype/function. In the absence of demonstrated function a gene may be characterized by sequence, transcription or homology. [HUGO, J.A. White et al. Guidelines for Human Gene Nomenclature HGNC Human Genome Nomenclature Committee, 1997] http://www.gene.ucl.ac.uk/nomenclature/guidelines.html#2.2

  4. A gene is a set of connected transcripts. A transcript is a set of exons via transcription followed (optionally) by pre-mRNA splicing. Two transcripts are connected if they share at least part of one exon in the genomic coordinates. At least one transcript must be expressed outside of the nucleus and one transcript must encode a protein (see Footnotes).

  5. "A gene can be defined as an abstraction that is useful for the purposes of nomenclature and for the assignment of a symbol. It was originally described as a 'unit of inheritance' and has since been described as a 'set of features on the genome that can produce a functional unit'." [An account of a Gene Nomenclature workshop held in conjunction with the annual American Society of Human Genetics meeting in Philadelphia PA, US, Oct. 2, 2000.] However, the term "functional unit" does not encompass all of those objects to which symbols are assigned. Designations in MGD [Jackson Lab's Mouse Genome Database] specify whether each object is a marker, gene, D segment etc., regardless of whether it actually represents a "functional unit".

  6. In its published human genome paper [Science Feb. 16, 2001] Celera defines a gene as "a locus of cotranscribed exons" in order to emphasize the importance of alternative splicing.
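
Of the definitions above, the "set of connected transcripts" one is the most directly operational, so a minimal sketch of it may help. It groups transcripts into candidate genes whenever they share at least part of an exon. Several things here are assumptions for illustration and are not part of the quoted definition: exons are modelled as half-open (start, end) intervals on a single chromosome, strand is ignored, "connected" is treated as transitive, and the names `exons_overlap`, `transcripts_connected` and `group_into_genes` are hypothetical. The requirements that one transcript be expressed outside the nucleus and that one encode a protein are not modelled.

```python
from itertools import combinations


def exons_overlap(a, b):
    """True if half-open intervals a=(start, end) and b share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def transcripts_connected(t1, t2):
    """True if any exon of t1 overlaps any exon of t2."""
    return any(exons_overlap(a, b) for a in t1 for b in t2)


def group_into_genes(transcripts):
    """Group transcript names into connected components (candidate genes).

    transcripts maps a transcript name to its list of exon intervals.
    """
    parent = {name: name for name in transcripts}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(transcripts, 2):
        if transcripts_connected(transcripts[a], transcripts[b]):
            parent[find(a)] = find(b)

    groups = {}
    for name in transcripts:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical coordinates: tA and tC share no exon directly,
# but both overlap tB, so all three fall into one group.
example = {
    "tA": [(100, 200), (300, 400)],
    "tB": [(350, 450), (600, 700)],
    "tC": [(650, 720)],
    "tD": [(1000, 1100)],
}
print(group_into_genes(example))  # [['tA', 'tB', 'tC'], ['tD']]
```

Under this reading, alternative splice forms collapse into one gene, which is also the point of Celera's "locus of cotranscribed exons" wording. Whether two transcripts linked only through a third should count as the same gene is exactly the kind of question the definition leaves open.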

Assessment of the method used to determine the gene will occur by voting at the Cold Spring Harbor Genome Meeting 2002.
Footnotes

Cambridge Healthtech Institute
1037 Chestnut Street
Newton Upper Falls, MA 02464
Phone: 617-630-1300
Fax:  617-630-1325
Email: chi@healthtech.com