Difference between revisions of "CH391L/BLOSUM"

From Marcotte Lab
(Created page with "== Source Code == * http://www.marcottelab.org/users/CH391L/FromTA/BLOSUM.pl.txt * You also need BLOSUM50.txt file to run this script: http://www.marcottelab.org/users/CH391L/Fro...")
 
Latest revision as of 21:44, 9 February 2011

Source Code

How it works

BLOSUM_output.jpg

Comments

Two less obvious constructs are used to read the BLOSUM table in Perl.

my @aa1_list = split(/\s+/,$aa1_line);
  • This splits the line on whitespace (spaces, tabs, etc.). The pattern between the '/' signs is called a regular expression (I mentioned this in my lecture), and it matches a pattern in the string. For example, '\s+' means 'one or more (+) whitespace characters (\s)'. You can find more information about Perl regular expressions at http://perldoc.perl.org/perlretut.html .
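To make the behaviour of the split concrete, here is a minimal sketch. The header line in it is made up for illustration and is not read from BLOSUM50.txt. It also shows one detail that is easy to miss: when the line starts with whitespace, split(/\s+/, ...) returns an empty string as the first field.

```perl
use strict;
use warnings;

# Hypothetical header line, made up for illustration (an assumption,
# not the actual first line of BLOSUM50.txt). Note the mix of spaces and a tab.
my $aa1_line = "   A  R\tN  D";

# /\s+/ matches any run of one or more whitespace characters.
my @aa1_list = split(/\s+/, $aa1_line);

# The line starts with whitespace, so the first field is an empty string
# and the amino acid letters start at index 1.
print scalar(@aa1_list), " fields\n";    # 5 fields
print join("|", @aa1_list), "\n";        # |A|R|N|D
```

If the header row of the table really is indented like this, the empty first field keeps the column labels at the same index as the scores in each data row, whose field 0 is the row label.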
$score_map{$tmp[0]}->{$aa1_list[$i]} = $tmp[$i];
  • This is called a 'hash of hashes'. I use this data structure because (1) Perl has no built-in two-dimensional array (so it is a little more complex to handle the table in array form), and (2) all I need is to look up a value by two amino acids. $tmp[0] (row) and $aa1_list[$i] (column) are the two amino acids, and $tmp[$i] is the BLOSUM50 score for their substitution. As you will see later in the code, you can retrieve this score with $score_map{'A'}->{'S'}. See http://www.cs.mcgill.ca/~abatko/computers/programming/perl/howto/hash/#create_a_hash_of_hashes__via_references for more about 'hash of hashes'.
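The two lines above can be put together in a small, self-contained sketch. This is not the course script itself: the 3x3 table is a toy example for illustration only, and its layout (an indented header row of column labels, then one labelled row per amino acid) is an assumption about the input format.

```perl
use strict;
use warnings;

# Toy table for illustration only (an assumption, not read from BLOSUM50.txt).
my @lines = (
    "   A  R  N",
    "A  5 -2 -1",
    "R -2  7 -1",
    "N -1 -1  7",
);

# The header row gives the column labels; index 0 is an empty string
# because the line starts with whitespace.
my $aa1_line = shift @lines;
my @aa1_list = split(/\s+/, $aa1_line);    # ("", "A", "R", "N")

my %score_map;
for my $line (@lines) {
    my @tmp = split(/\s+/, $line);         # e.g. ("A", "5", "-2", "-1")
    # $tmp[0] is the row label; the scores start at index 1.
    for my $i (1 .. $#tmp) {
        $score_map{$tmp[0]}->{$aa1_list[$i]} = $tmp[$i];
    }
}

# Look up a score by its two amino acids.
print $score_map{'A'}->{'R'}, "\n";    # -2
print $score_map{'N'}{'N'}, "\n";      # 7 (the arrow between subscripts is optional)
```

Perl creates the inner hash automatically the first time a row label is used, so no explicit initialization of %score_map is needed.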