You may use PyPop to analyze many different kinds of data, including
allele-level genotype data (as in Example 2.1, “Multi-locus allele-level
genotype data”), allele-level frequency data (as in Example 2.6, “Allele
count data”), microsatellite data, SNP data, and nucleotide and amino acid
sequence data.
There are two ways to run PyPop:
interactive mode (where the program prompts you to type the input it needs directly); and
batch mode (where you supply all the command line options the program needs).
For the most straightforward application of PyPop, where you wish to
analyze a single population, the interactive mode is the simplest to use.
We will describe this mode first, then batch mode.
To run PyPop, click the pypop.bat file (Windows) or type ./pypop at the
command prompt (GNU/Linux). You should see something like the following
output (this is also described in detail in the instructions in the
installation guide):
    PyPop: Python for Population Genomics (0.4.3)
    Copyright (C) 2003 Regents of the University of California
    This is free software. There is NO warranty; not even for
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

    You may redistribute copies of PyPop under the terms of the
    GNU General Public License. For more information about these
    matters, see the file named COPYING.

    To accept the default in brackets for each filename, simply press
    return for each prompt.
    Please enter config filename [config.ini]:sample.ini
    Please enter population filename [no default]:sample.pop
    PyPop is processing sample.pop

    (Note: some messages with the prefix "LOG:" may appear here. They
    are informational only and do not indicate improper operation of
    the program)

    PyPop run complete!
    XML output can be found in: sample-out.xml
    Plain text output can be found in: sample-out.txt
You should substitute the names of your own configuration file (e.g.,
config.ini) and population file (e.g., Guatemalan.pop) for sample.ini and
sample.pop. The formats for these files are described in Section 2.2, “The
data file” and Section 2.3, “The configuration file”, below.
To run PyPop in batch mode, start it from the command line (on Windows,
open a DOS shell; on GNU/Linux, open a terminal window), change to the
directory where you unpacked PyPop, and type:

    pypop-batch Guatemalan.pop
![]() | Note |
---|---|
If your system administrator has installed
|
Batch mode assumes two things: that you have a file called config.ini in
your current folder, and that your population file is also in the current
folder. You can specify a particular configuration file for PyPop to use
by supplying the -c option as follows:

    pypop-batch -c newconfig.ini Guatemalan.pop
You may also redirect the output to a different directory (which must
already exist) by using the -o option:

    pypop-batch -c newconfig.ini -o altdir Guatemalan.pop
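The batch invocations above can also be assembled from a script. The following is a minimal sketch, not part of PyPop itself: `build_pypop_command` and `prepare_outdir` are hypothetical helper names, and the sketch assumes only the `-c` and `-o` options shown above and that `pypop-batch` can be found from the current directory.

```python
from pathlib import Path


def build_pypop_command(pop_file, config=None, outdir=None):
    """Return the argument list for a pypop-batch run (hypothetical helper)."""
    cmd = ["pypop-batch"]
    if config is not None:
        # -c selects an alternative configuration file
        cmd += ["-c", str(config)]
    if outdir is not None:
        # -o puts the output in another directory, which must already exist
        cmd += ["-o", str(outdir)]
    # The population file is always the last argument
    cmd.append(str(pop_file))
    return cmd


def prepare_outdir(outdir):
    """Create the output directory if it is missing, since -o expects it to exist."""
    path = Path(outdir)
    path.mkdir(parents=True, exist_ok=True)
    return path


print(build_pypop_command("Guatemalan.pop", config="newconfig.ini", outdir="altdir"))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` after calling `prepare_outdir`, so that a missing output directory does not cause the run to fail.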
For a full list of options supported by PyPop, type pypop-batch --help.
You should receive a screen resembling the following:
    Usage: pypop [OPTION] INPUTFILE
    Process and run population genetics statistics on an INPUTFILE.
    Expects to find a configuration file called 'config.ini' in the
    current directory or in /usr/share/PyPop/config.ini.

      -l, --use-libxslt    filter XML via XSLT using libxslt (default)
      -s, --use-4suite     filter XML via XSLT using 4Suite
      -x, --xsl=FILE       use XSLT translation file FILE
      -h, --help           show this message
      -c, --config=FILE    select alternative config file
      -d, --debug          enable debugging output (overrides config file setting)
      -i, --interactive    run in interactive mode, prompting user for file names
      -g, --gui            run GUI (currently disabled)
      -o, --outputdir=DIR  put output in directory DIR
      -V, --version        print version of PyPop

      INPUTFILE   input text file
![]() | Warning |
---|---|
Documentation for these options is underway, but not currently available. |
The most common types of analysis will involve editing your config.ini
file to suit your data (see Section 2.3, “The configuration file”),
followed by selecting either the interactive or batch mode described
above. If your input configuration file is configfilename and your
population file name is popfilename.txt, the initial output will be
generated quickly, but the PyPop execution will not be finished until the
text output file named popfilename-out.txt has been created. A successful
run will produce two output files: popfilename-out.xml and
popfilename-out.txt. A third output file, popfilename-filter.xml, will be
created if you are using the Anthony Nolan HLA filter option for HLA data
to check your input for valid/known HLA alleles.
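The output naming pattern described here (the population file name without its extension, followed by -out.xml, -out.txt and, when the filter is used, -filter.xml) can be sketched in a few lines. This is an illustration only: `expected_outputs` and `run_finished` are hypothetical helpers, not PyPop functions, and the sketch assumes the extension is simply stripped from the input name, as in the sample.pop and popfilename.txt examples.

```python
from pathlib import Path


def expected_outputs(pop_file, outdir=".", with_filter=False):
    """List the output files a run on pop_file is expected to create."""
    stem = Path(pop_file).stem  # "sample.pop" -> "sample"
    names = [f"{stem}-out.xml", f"{stem}-out.txt"]
    if with_filter:
        # Only created when the Anthony Nolan HLA filter option is in use
        names.append(f"{stem}-filter.xml")
    return [Path(outdir) / name for name in names]


def run_finished(pop_file, outdir="."):
    """Treat the run as finished once the text output file exists."""
    text_output = expected_outputs(pop_file, outdir)[1]
    return text_output.exists()


for path in expected_outputs("sample.pop", with_filter=True):
    print(path)
```

Note that `run_finished` can report a stale result if a text output file from an earlier run is still present in the output directory, so removing old output before starting a new run is advisable.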
The popfilename-out.xml file is the primary output created by PyPop and
the human-readable popfilename-out.txt file is a summary of the complete
XML output. It is generated from the XML output via XSLT (eXtensible
Stylesheet Language for Transformations) using the default XSLT stylesheet
text.xsl, which is located in the xslt
directory. The XML output can be further transformed using
customized XSLT stylesheets into other formats for input to
statistical software (e.g., R/Splus, SAS) or other population
genetic software (e.g., PHYLIP). The popmeta
script (popmeta.bat
on Windows,
popmeta
on GNU/Linux) calls on other XSLT
stylesheets to aggregate results from a number of output XML files
from individual populations into a set of tab-separated (TSV)
files containing summary statistics. These TSV files can be
directly imported into a spreadsheet or statistical software. This
script will be further documented in the next release.
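As an illustration of importing such tab-separated files outside a spreadsheet, the following minimal sketch reads TSV text with the Python standard library. The column names (`pop`, `locus`, `value`) and the sample row are invented for the example; the actual columns written by popmeta are not documented here.

```python
import csv
import io


def read_tsv(handle):
    """Read tab-separated text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Hypothetical content standing in for a popmeta TSV file
sample = io.StringIO("pop\tlocus\tvalue\nGuatemalan\tA\t0.12\n")
rows = read_tsv(sample)
print(rows[0]["pop"], rows[0]["locus"], rows[0]["value"])
```

For a real file, the same function can be called on `open(filename, newline="")`; all values are returned as strings and need converting to numbers before analysis.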
A typical PyPop run might take anywhere from a few minutes to a few hours,
depending on how large your data set is and who else is using the system
at the same time. Note that performing the allPairwiseLDWithPermu test may
take several days if you have highly polymorphic loci in your data set.
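One reason an all-pairwise test scales poorly is that the number of locus pairs grows quadratically with the number of loci. The sketch below only counts pairs. It assumes that each unordered pair of loci is examined once, which is not stated above, and the locus names are placeholders. It says nothing about the cost of the permutations for each pair.

```python
from itertools import combinations


def locus_pairs(loci):
    """Return every unordered pair of loci (assumed to be tested once each)."""
    return list(combinations(loci, 2))


def pair_count(n_loci):
    """Number of unordered pairs among n_loci loci: n * (n - 1) / 2."""
    return n_loci * (n_loci - 1) // 2


loci = ["A", "B", "C", "DRB1"]  # placeholder locus names
print(locus_pairs(loci))
print(pair_count(len(loci)))
```

Doubling the number of loci roughly quadruples the number of pairs, so trimming the locus list before running the permutation test can shorten the run considerably.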