JSC-BIO-2710

Data Intensive Computing for Applied Bioinformatics

Week One 17-Jan-12

Tuesday

Note

Learning Goals:

  • Learn what UNIX is
  • Set up the class

Lecture

What is UNIX?

UNIX is an operating system, first developed at Bell Labs in 1969. It is multi-user (several people can be logged in at once) and multi-tasking (many programs can run at the same time).

There are many types of UNIX. Some examples are:

  • Linux
  • OS X
  • Sun Solaris

Why are we learning UNIX?

We need to do big computations on big data sets. This requires a big computer, such as the Lonestar cluster at TACC. Large compute systems almost exclusively use UNIX as the operating system.

How will we learn UNIX?

Practice, practice, practice.

Homework

Reading

http://www.ee.surrey.ac.uk/Teaching/Unix/unixintro.html

Exercises

Make sure your laptop has a network connection wherever you will use it.

Log in to your UNIX laptop. Open a terminal window.

Turn In

Send email to me: James.Vincent@jsc.edu

Include in your message:

  • Your preferred user ID, such as jjv01140
  • Your programming experience, if any
  • The computer system you will use for the class: either your own Mac or a laptop handed out in class


Thursday

Note

Learning Goals:

  • manipulate files and directories
  • transfer files from other computers
  • view files, search for text in files

Lecture

What will be covered in class:

Exercises from class: filesystem, part 1:

bash-3.2$ cd
bash-3.2$ pwd
/Users/jsc
bash-3.2$ mkdir projects
bash-3.2$ cd projects
bash-3.2$ mkdir data
bash-3.2$ mkdir programs
bash-3.2$ mkdir results
bash-3.2$ ls
data            programs        results
bash-3.2$ ls -l
total 0
drwxr-xr-x  2 jsc  staff  68 Jan 19 05:59 data
drwxr-xr-x  2 jsc  staff  68 Jan 19 06:00 programs
drwxr-xr-x  2 jsc  staff  68 Jan 19 06:00 results
bash-3.2$ ls -Rl
total 0
drwxr-xr-x  2 jsc  staff  68 Jan 19 05:59 data
drwxr-xr-x  2 jsc  staff  68 Jan 19 06:00 programs
drwxr-xr-x  2 jsc  staff  68 Jan 19 06:00 results

./data:

./programs:

./results:

Exercises from class: filesystem, part 1, again:

bash-3.2$ clear
bash-3.2$ cd
bash-3.2$ mkdir projects
bash-3.2$ cd projects/
bash-3.2$ mkdir data programs results papers
bash-3.2$ ls
data              papers          programs        results
bash-3.2$ ls -Rl
total 0
drwxr-xr-x  2 jsc  staff  68 Jan 19 06:09 data
drwxr-xr-x  2 jsc  staff  68 Jan 19 06:09 papers
drwxr-xr-x  2 jsc  staff  68 Jan 19 06:09 programs
drwxr-xr-x  2 jsc  staff  68 Jan 19 06:09 results

./data:

./papers:

./programs:

./results:
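
The two transcripts above build the same layout, first one directory at a time and then with several names on one mkdir line. As a minimal sketch, the whole layout can also be created without changing into projects first, using mkdir -p, which creates missing parent directories and does not complain if a directory already exists. The scratch directory made with mktemp is an assumption of this sketch, used so the example does not touch your real home directory.

```shell
# Work in a throwaway directory (an assumption for this sketch) instead of $HOME.
demo=$(mktemp -d)
cd "$demo"

# -p creates the parent directory (projects) as needed
# and is silent if a directory already exists.
mkdir -p projects/data projects/programs projects/results projects/papers

# Recursively list what was created.
ls -R projects
```

Running the mkdir -p line a second time is harmless, which makes it convenient when you are not sure whether a directory is already there.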

Exercises from class: filesystem, part 2:

bash-3.2$ clear
bash-3.2$ pwd
/Users/jsc/projects
bash-3.2$ cd
bash-3.2$ pwd
/Users/jsc
bash-3.2$ rmdir projects/
rmdir: projects/: Directory not empty
bash-3.2$ cd projects
bash-3.2$ ls
data              programs        results
bash-3.2$ rmdir data
bash-3.2$ rmdir programs
bash-3.2$ rmdir results
bash-3.2$ ls
bash-3.2$ cd ..
bash-3.2$ rmdir projects
bash-3.2$ ls
Desktop         Downloads       Movies          Pictures
Documents       Library         Music           Public
bash-3.2$
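
The transcript above shows that rmdir refuses to remove a directory that still has something in it. The following is a small sketch of the same idea written as a script rather than typed interactively. The scratch directory from mktemp and the messages printed by echo are assumptions of this example, not part of the class session.

```shell
# Throwaway working directory (an assumption for this sketch).
demo=$(mktemp -d)
cd "$demo"
mkdir -p projects/data projects/programs projects/results

# rmdir only removes empty directories, so this attempt fails
# while the three subdirectories are still present.
if rmdir projects 2>/dev/null; then
    echo "projects removed"
else
    echo "projects is not empty yet"
fi

# Remove the children first, then the parent.
rmdir projects/data projects/programs projects/results
rmdir projects
```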

Exercises from class: filesystem structure:

bash-3.2$ clear
bash-3.2$ cd /
bash-3.2$ ls
Applications                      etc
Developer                         home
Library                           mach_kernel
Network                           macqiime
System                            net
User Guides And Information       opt
Users                             private
Volumes                           sbin
bin                               tmp
cores                             usr
dev                               var
bash-3.2$ cd
bash-3.2$ ls /tmp
1709f4f237676                     launch-h3IKAT
CrashReportCopyLock-iPodTouch     launch-yOzZFT
launch-5ckara                     launchd-111.U1vJIU
launch-6cAiA3                     launchd-3130.lvurk2
launch-TdKNwo                     one
launch-Yijz8v
bash-3.2$ ls /home
bash-3.2$ ls /Users
Guest     Shared  jjv5    jsc
bash-3.2$ cd /
bash-3.2$ mkdir jim
mkdir: jim: Permission denied
bash-3.2$ cd
bash-3.2$ pwd
/Users/jsc
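
In the transcript above, mkdir jim fails in / because an ordinary user is not allowed to write there, while the home directory is writable. As a rough sketch, the shell's -w test reports whether the current user may write to a directory, and ls -ld shows the owner and permission bits of the directory itself. The scratch directory from mktemp stands in for a home directory here and is an assumption of this example. An administrator account would see / reported as writable.

```shell
# Throwaway directory standing in for a place you own (an assumption).
demo=$(mktemp -d)

# -w is true when the current user may write to the directory.
for dir in / "$demo"; do
    if [ -w "$dir" ]; then
        echo "$dir: writable, mkdir should succeed here"
    else
        echo "$dir: not writable, mkdir would report Permission denied"
    fi
done

# -d lists the directories themselves instead of their contents.
ls -ld / "$demo"
```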

Exercises from class: touch, copy, move:

bash-3.2$ cd
bash-3.2$ cd projects/data
bash-3.2$ touch taxonomy.sh
bash-3.2$ rm taxonomy.sh
bash-3.2$ touch taxonomy.txt
bash-3.2$ cd ../results/
bash-3.2$ touch results.txt
bash-3.2$ cd ../programs/
bash-3.2$ touch myJob.sh
bash-3.2$ cd ../results/
bash-3.2$ cp results.txt results-1.txt
bash-3.2$ mv results-1.txt final_taxonomy.csv
bash-3.2$ ls
final_taxonomy.csv        results.txt
bash-3.2$ cd
bash-3.2$ ls -RL projects/
data              papers          programs        results

projects//data:
taxonomy.txt

projects//papers:

projects//programs:
myJob.sh

projects//results:
final_taxonomy.csv        results.txt
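
The difference between cp and mv in the session above is easy to miss: cp leaves the original in place, while mv renames it so the old name disappears. Here is the results part of that session again as a commented sketch. It runs in a scratch directory from mktemp, which is an assumption made so it does not depend on an existing projects directory.

```shell
# Throwaway working directory (an assumption for this sketch).
demo=$(mktemp -d)
cd "$demo"
mkdir results
cd results

touch results.txt                      # create an empty file
cp results.txt results-1.txt           # copy: both names now exist
mv results-1.txt final_taxonomy.csv    # move: results-1.txt is gone

# Only results.txt and final_taxonomy.csv remain.
ls
```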

Homework

Reading

Exercises

  • Complete all of the exercises embedded in the UNIX tutorial reading material.

  • Log in to your UNIX computer, open a terminal window and delete all of the subdirectories within your home directory:

    jjv5$ rmdir Music
    jjv5$ rmdir Documents
    ... and more
  • Create new directories with the names of the odd numbers from one to nine:

    jjv5$ mkdir one
    jjv5$ mkdir three
    jjv5$ mkdir five
    jjv5$ mkdir seven
    jjv5$ mkdir nine
  • Inside each odd named directory create a new directory with the name data:

    jjv5$ cd one
    jjv5$ mkdir data
    jjv5$ cd ..
    ... repeat for all directories
  • Remove the directory named ‘one’. What happens? Read the man page on rmdir to find out what happened and why.

  • Try again to remove the directory named ‘one’. Delete the contents of the directory first.

  • Recursively list all files and directories in your home directory. Read the man page on ls to find out how.

  • Read the man page for the command wc
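
Typing the same two commands for every odd-numbered directory gets repetitive. Once you have done it by hand, a shell for loop can do the repetition for you. The sketch below is one possible way to write it. The loop variable name and the scratch directory from mktemp are assumptions of this example.

```shell
# Throwaway working directory (an assumption for this sketch).
demo=$(mktemp -d)
cd "$demo"

# Create each odd-numbered directory together with its data subdirectory.
for name in one three five seven nine; do
    mkdir -p "$name/data"
done

# Show the resulting tree.
ls -R
```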

Note

A step-by-step video shows how to use ftp. The video completes the steps given below.

  • Transfer a file from NCBI:

    Use an ftp (file transfer protocol) client to transfer a file from NCBI.
    When prompted, use ftp as the user ID and your email address as the password.
    
    The prompts and commands below do not show all of the screen
    output but rather just the commands I typed.
    
    
    jjv5$ ftp ftp.ncbi.nih.gov
    
    Name (ftp.ncbi.nih.gov:jjv5): ftp
    
    Password:
    
    ftp> cd genomes
    
    ftp> cd H_sapiens
    
    ftp> ls
    
    ftp> cd CHR_21
    
    ftp> ls
    
    ftp> get hs_alt_HuRef_chr21.fa.gz
    
    ftp> bye
    
    jjv5$ gunzip hs_alt_HuRef_chr21.fa.gz
    
    You should now have a file named hs_alt_HuRef_chr21.fa
    in your current directory.
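
If you want to see what gunzip does before working with the real chromosome file, you can try it on a tiny file first. The sketch below makes a small stand-in file, compresses it, and unpacks it again. The file name example.fa and its two short sequences are made up for this example, and the scratch directory from mktemp is likewise an assumption.

```shell
# Throwaway working directory (an assumption for this sketch).
demo=$(mktemp -d)
cd "$demo"

# Hypothetical two-record FASTA file standing in for the real download.
printf '>seq1\nACGT\n>seq2\nGGCC\n' > example.fa

# gzip replaces example.fa with example.fa.gz.
gzip example.fa
ls

# gunzip -c writes the unpacked text to the screen and leaves the .gz file alone.
gunzip -c example.fa.gz | head -n 2

# Plain gunzip replaces example.fa.gz with example.fa.
gunzip example.fa.gz
ls
```

Note that gzip and gunzip each replace the file they are given, so only one of example.fa and example.fa.gz exists at any time.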

Turn In

  • Log in to your UNIX computer, open a terminal window and create the directory structure shown in Figure 1 of the PLoS paper given in the reading list. Do not create all of the files shown - just the empty directories.
  • Change the name of the yeast directory to human
  • Download chromosome 21 from the NCBI ftp site into the ‘human’ directory
  • Unzip the chromosome file using gunzip
  • Run a command in your shell to count the number of lines in the unzipped human chromosome 21 file.
  • Bring your laptop to class with the above tasks completed. This will be checked in class and graded.