|
TUTORIAL (20-30 minutes)
GenBank flat files contain a large amount of information, presented in a highly telegraphic style. The information itself is generally straightforward, but some background on the format helps in decoding it. To begin this tutorial, go to the NCBI home page. From the Search pull-down menu, select "Nucleotide" and click "Go". The "Entrez Nucleotide" page should appear. Type U49845 into the search box and click the "Go" button. The search results should include a link to "U49845", the TCP1-beta gene in Saccharomyces cerevisiae (baker's yeast). Click this link. The nucleotide record with Accession Number U49845 should appear.
Look at this sample file and its organization during the rest of this tutorial.
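For readers who would rather retrieve the same record from a script than through the web pages, the sketch below builds a request URL for the NCBI E-utilities EFetch service. It only constructs the URL and does not contact NCBI. The function name `genbank_flatfile_url` is a hypothetical helper introduced here, and the choice of the `nuccore` database with `rettype=gb` is an assumption about which view of the record is wanted.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities EFetch service.
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def genbank_flatfile_url(accession):
    """Return an EFetch URL asking for a nucleotide record as a GenBank flat file.

    Hypothetical helper for illustration; it performs no network access.
    """
    params = {
        "db": "nuccore",    # nucleotide database (assumed target)
        "id": accession,    # accession number, e.g. U49845
        "rettype": "gb",    # GenBank flat file format
        "retmode": "text",  # plain text rather than XML
    }
    return EFETCH_BASE + "?" + urlencode(params)


url = genbank_flatfile_url("U49845")
print(url)
```

Opening the printed URL in a browser, or passing it to any HTTP client, should return the flat file discussed in the rest of this tutorial.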
|
LOCUS Section |
GenBank Division (4)
The GenBank database is divided into 17 divisions:
1. PRI - primate sequences
Some of the divisions contain sequences from specific groups of organisms, while others (EST, GSS, HTG, etc.) contain data generated by specific sequencing technologies from many different organisms. The organismal divisions are historical and do not reflect the current NCBI Taxonomy. Instead, they merely serve as a convenient way to divide GenBank into smaller pieces for those who want to FTP the database. Because of this, and because sequences from a particular organism can exist in technology-based divisions such as EST, HTG, etc., the NCBI Taxonomy Browser should be used for retrieving all sequences from a particular organism.
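As a rough illustration of where the division code sits, the sketch below pulls it out of a LOCUS line and checks whether it is one of the technology-based divisions named above. The example line is illustrative text typed in for this sketch, not output copied from a live record, and the layout of the LOCUS line has changed between GenBank releases. The code assumes only that the line is whitespace-separated, that the date is the last field, and that the division code comes directly before it. `locus_division` and `is_technology_division` are hypothetical helper names.

```python
# Technology-based divisions named in the text; the text says "etc.",
# so this set is deliberately partial.
TECHNOLOGY_DIVISIONS = {"EST", "GSS", "HTG"}

# Illustrative LOCUS line (assumed layout, typed in for this example).
EXAMPLE_LOCUS = "LOCUS       SCU49845     5028 bp    DNA             PLN       21-JUN-1999"


def locus_division(locus_line):
    """Return the division code from a LOCUS line.

    Assumes the date is the last whitespace-separated field and the
    division code is the field immediately before it.
    """
    fields = locus_line.split()
    if len(fields) < 3 or fields[0] != "LOCUS":
        raise ValueError("not a LOCUS line: %r" % locus_line)
    return fields[-2]


def is_technology_division(code):
    """True if the code is one of the technology-based divisions listed above."""
    return code in TECHNOLOGY_DIVISIONS


division = locus_division(EXAMPLE_LOCUS)
print(division, is_technology_division(division))
```

Because an organism's sequences can sit in either kind of division, a check like this describes how a record was filed, not which organism it came from.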
|
SOURCE (11)
Free-format information including an abbreviated form of the organism name - this may be a common name or scientific name. It is sometimes followed by a molecule type.
|
Organism (12)
The formal scientific name for the source organism (genus and species,
where appropriate) and its lineage, based on the phylogenetic classification
scheme used in the NCBI
Taxonomy Database. If the complete lineage of an organism is very long,
an abbreviated lineage will be shown in the GenBank record and the complete
lineage will be available in the Taxonomy Database.
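The relationship between the SOURCE line, the ORGANISM name, and the lineage can be sketched with a small parser. The fragment below is illustrative text typed in for this example, and the lineage shown in a live record may differ. The parser assumes the usual flat-file layout in which values start at column 13, the organism name fits on a single line, and the lineage is a semicolon-separated list ending with a period. `parse_source_block` is a hypothetical helper name.

```python
# Illustrative SOURCE/ORGANISM fragment (assumed content and layout).
EXAMPLE_SOURCE = "\n".join([
    "SOURCE      Saccharomyces cerevisiae (baker's yeast)",
    "  ORGANISM  Saccharomyces cerevisiae",
    "            Eukaryota; Fungi; Ascomycota; Saccharomycotina;",
    "            Saccharomycetes; Saccharomycetales; Saccharomycetaceae;",
    "            Saccharomyces.",
    "REFERENCE   1  (bases 1 to 5028)",
])


def parse_source_block(text):
    """Extract the SOURCE text, organism name, and lineage from a record fragment."""
    source = None
    organism = None
    lineage_lines = []
    in_organism = False
    for line in text.splitlines():
        if line.startswith("SOURCE"):
            source = line[12:].strip()
            in_organism = False
        elif line.startswith("  ORGANISM"):
            organism = line[12:].strip()
            in_organism = True
        elif in_organism and line.startswith(" " * 12):
            # Continuation lines under ORGANISM carry the lineage.
            lineage_lines.append(line.strip())
        else:
            # Any other keyword ends the ORGANISM block.
            in_organism = False
    joined = " ".join(lineage_lines).rstrip(".")
    lineage = [taxon.strip() for taxon in joined.split(";") if taxon.strip()]
    return {"source": source, "organism": organism, "lineage": lineage}


parsed = parse_source_block(EXAMPLE_SOURCE)
print(parsed["organism"])
print(" > ".join(parsed["lineage"]))
```

Since the record may show only an abbreviated lineage, a list parsed this way should be treated as a summary, with the Taxonomy Database as the place to look for the complete lineage.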
|
REFERENCES Section |
FEATURES TABLE Section (19) |
Definitions of the elements of the Features Table:
|
BASE COUNT and ORIGIN Section |
|
|