BCH394P BCH364C 2022
BCH394P/BCH364C Systems Biology & Bioinformatics
Course unique #: 54540/54450
Lectures: Tues/Thurs 11 AM – 12:30 PM on Zoom until Jan 27 (log in to Canvas for the link), then in WEL 2.110
Instructor: Edward Marcotte, marcotte @ utexas.edu
- Office hours: Wed 11 AM – 12 noon on Zoom
TA: Muyoung Lee, ml49649 @ utexas.edu
- TA Office hours: Mon 1 – 2 PM and Fri 11 AM – 12 noon on Zoom
Class Slack channel: ut-sp22-bioinfo.slack.com
Class Canvas site: https://utexas.instructure.com/courses/1325179
Lectures & Handouts
Jan 27, 2022 - Sequence Alignment I
Problem Set 1, due before midnight Feb. 7, 2022:
- Problem Set 1
- H. influenzae genome. Haemophilus influenzae was the first free-living organism to have its genome sequenced. NOTE: this file contains some additional characters from ambiguous sequence calls. For simplicity's sake, when calculating your nucleotide and dinucleotide frequencies, you can ignore anything other than A, C, T, and G.
- T. aquaticus genome. Thermus aquaticus helped spawn the genomic revolution as the source of heat-stable Taq polymerase for PCR.
- 3 mystery genes (for Problem 5): MysteryGene1, MysteryGene2, MysteryGene3
- *** HEADS UP FOR THE PROBLEM SET *** If you use the Python string.count function to count dinucleotides, be aware that it counts non-overlapping instances, not overlapping ones. So AAAA is counted as 2 dinucleotides, not 3. You want overlapping dinucleotides, so you will have to try something else, such as slicing the string with string[counter:counter+2], as explained in the Rosalind homework assignment on strings.
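To make the heads-up concrete, here is a minimal sketch of overlapping dinucleotide counting with a sliding two-character slice. The function name `count_dinucleotides` and the choice to skip any pair containing a character other than A, C, G, or T are assumptions made for illustration, not the required approach for the problem set.

```python
from collections import Counter

VALID = set("ACGT")  # assumption: everything else (e.g. N) is ignored

def count_dinucleotides(seq):
    """Count overlapping dinucleotides, skipping pairs with ambiguous bases."""
    seq = seq.upper()
    counts = Counter()
    # Slide a window of width 2 across the sequence, one position at a time.
    for counter in range(len(seq) - 1):
        pair = seq[counter:counter + 2]
        if pair[0] in VALID and pair[1] in VALID:
            counts[pair] += 1
    return counts

# str.count finds non-overlapping matches only.
print("AAAA".count("AA"))                 # 2
# The sliding window finds every overlapping match.
print(count_dinucleotides("AAAA")["AA"])  # 3
```

Dividing each count by the total number of counted pairs would turn these counts into frequencies. Whether to skip pairs that span an ambiguous character, as done here, or to remove ambiguous characters first is a judgment call; the two choices give slightly different counts.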