Instructor: James Vincent
Teaching Assistant: Colin Delaney, cdelaney2@mail.smcvt.edu
Computing Professional: Patrick Clemins
Class: T & Th 8:30-9:45
Room: WLLL 215
Office Hours: TBD
Office: Bentley 330
Email: James.Vincent@jsc.edu
There are no course prerequisites for this class. Students are assumed to have no prior experience with computing or bioinformatics.
It is expected that students will have general familiarity with the use of a personal computer, such as copying and pasting text, opening and closing windows and saving and moving files.
There are no printed textbooks for this course. We will use online materials as needed. Most material will be handed out in class.
Several of the resources we will use are:
UNIX Tutorial:
http://www.ee.surrey.ac.uk/Teaching/Unix/
Python programming:
http://www.openbookproject.net/thinkcs/python/english2e/
Office hours will be held online with the Teaching Assistant. Additional hours may be available in person. Details will be given in class.
Much of modern science is carried out with large data sets that require significant compute resources to analyze. Students will receive an introduction to this approach to scientific research. This course will introduce students to bioinformatics and the basics of using remote computers to carry out bioinformatics tasks. The skills acquired apply equally to many other areas of research and scientific fields.
This course is especially targeted to early undergraduates in order to introduce these aspects of modern science research at an early stage in the science education program.
During this course students will learn what the field of bioinformatics is, how it is integral to modern life sciences research and how bioinformatics research is performed. Students will complete basic bioinformatics tasks, first using web-based tools and then using remote computing resources.
The use of remote computers requires learning the basics of the Unix operating system. In addition, students will learn very basic programming skills using the Python programming language.
All subjects will be introduced assuming no prior knowledge. Successful learning will depend primarily on completing exercises that are designed to provide hands-on practice. These tasks will not be difficult but will require steady work and attention to detail. Completion of both in-class group exercises and assigned homework exercises will be critical to success in this course.
By the end of the course you should be able to:
* log in to a remote compute cluster,
* create, save, delete and move files on a remote cluster,
* write shell scripts to automate tasks on a remote computer,
* complete DNA sequence comparisons using BLAST,
* write basic Python programs,
* analyze the results of BLAST jobs using a Python program.
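As a rough illustration of the last goal, the sketch below reads BLAST results and keeps the best hit for each query sequence. It assumes the results were saved in the tab-separated format produced by the BLAST+ option -outfmt 6 with its default twelve columns; the function name best_hits and the sample rows are made up for this example and are not course materials.

```python
import csv
import io

# Column names for BLAST+ tabular output (-outfmt 6), in the default order.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle):
    """Return {query_id: (subject_id, bitscore)}, keeping the highest bit score per query."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        record = dict(zip(COLUMNS, row))
        score = float(record["bitscore"])
        query = record["qseqid"]
        if query not in best or score > best[query][1]:
            best[query] = (record["sseqid"], score)
    return best


# Made-up rows for illustration only (not real BLAST results).
SAMPLE = (
    "read1\tsubjA\t98.5\t200\t3\t0\t1\t200\t10\t209\t1e-100\t360\n"
    "read1\tsubjB\t91.0\t200\t18\t0\t1\t200\t5\t204\t1e-70\t250\n"
    "read2\tsubjB\t99.0\t150\t1\t0\t1\t150\t1\t150\t1e-75\t270\n"
)

hits = best_hits(io.StringIO(SAMPLE))
for query, (subject, score) in hits.items():
    print(query, subject, score)
```

With a real results file, the same function could be called as best_hits(open("results.tsv")), where results.tsv is a placeholder file name.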
The pace of material covered and the order of presentation will be set at the beginning of the course and adjusted throughout based on the ability and performance of the class.
Homework will consist of practice exercises and will be given every class period. Learning by doing is critical in this course. Assignments are designed to provide practice exercises for every day outside of class. Students are strongly encouraged to follow the schedule of exercises and practice each day.
Quizzes will be given once weekly and will account for 300 of the 1000 total class points (15 quizzes at 20 points each). Each quiz will be similar in nature to homework exercises assigned during the previous week. Students are encouraged to work together for better understanding of material but all quizzes and tests will require individual work.
A single class project will be given at the start of the course. Students will work in teams of two or three (depending on course enrollment). All students will complete the same project.
Students will determine the species composition of bacterial communities present in water samples taken from a blue-green algae bloom in Lake Champlain. Data will be provided in the form of DNA sequences from these samples. Using the skills learned in class, students will transfer the data sets to remote compute clusters, run DNA sequence analyses on these data sets and summarize the results using computer programs written for this task.
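The final summarizing step might look something like the sketch below, which turns one species label per DNA sequence into the fraction of the sample each species represents. The labels shown are placeholders, and how a label is assigned to each sequence (for example from its best BLAST hit) is assumed to have happened already; the function name composition is hypothetical.

```python
from collections import Counter


def composition(species_per_read):
    """Return {species: fraction of reads}, most abundant species first."""
    counts = Counter(species_per_read)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.most_common()}


# Placeholder labels, one per sequence read (not real project data).
labels = ["species_A", "species_B", "species_A", "unclassified", "species_A"]

summary = composition(labels)
for name, fraction in summary.items():
    print(f"{name}\t{fraction:.0%}")
```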
Final grades will be determined using the grading criteria outlined below.
Assignment          Explanation   Points   Total
15 Quizzes          Weekly        20       300
30 Homeworks        Each class    20       600
1 Team Project                             100
Total class points                         1000

Grading Scale

A+  98-100%     B-  80-82%     D   63-66%
A   93-97%      C+  77-79%     D-  60-62%
A-  90-92%      C   73-76%     F   below 60%
B+  87-89%      C-  70-72%
B   83-86%      D+  67-69%
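For anyone who wants to check their standing, the point totals and grading scale above can be expressed as a short Python function. This is only a sketch: the name letter_grade is made up, and it assumes percentages are not rounded, so a value that falls between two bands (such as 92.5%) receives the lower grade. The instructor's actual rounding policy is not stated here.

```python
# Lowest percentage for each letter grade, highest first, taken from the grading scale.
SCALE = [(98, "A+"), (93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
         (77, "C+"), (73, "C"), (70, "C-"), (67, "D+"), (63, "D"), (60, "D-")]

# 15 quizzes and 30 homeworks at 20 points each, plus the 100-point team project.
TOTAL_POINTS = 15 * 20 + 30 * 20 + 100


def letter_grade(points, total=TOTAL_POINTS):
    """Convert earned points to a letter grade (assumes no rounding of percentages)."""
    percent = 100 * points / total
    for cutoff, letter in SCALE:
        if percent >= cutoff:
            return letter
    return "F"


print(TOTAL_POINTS)
print(letter_grade(870))
```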