MSblender TACC

From Marcotte Lab
Revision as of 17:15, 2 March 2015 by TaejoonKwon (Talk | contribs)

Before you start

  • To use this setup, your TACC account must be allocated to our lab project ('A-cm10'). If you don't have an account, create one at https://portal.tacc.utexas.edu/, then ask Edward to add your account to the lab project.
  • This document is for 'stampede'.
  • Currently, in most cases I use three search engines: comet, X!Tandem, and MS-GF+.

Install MSblender (and comet, MSGF+, X!Tandem)

$ cd ~
$ mkdir git
$ cd git
$ git clone https://github.com/marcottelab/MSblender.git

Prepare a working space

$ module load python
$ cd $SCRATCH
$ mkdir myProject
$ cd myProject
$ mkdir mzXML
$ mkdir DB
$ mkdir comet
$ mkdir MSGF+
$ mkdir tandemK
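
The directory layout above can also be created in one step. The following is a minimal sketch and not part of MSblender: the function name make_workspace is made up for illustration, and the sub-directory names are the ones used on this page.

```shell
# Hypothetical helper: create the project sub-directories used on this page.
make_workspace() {
  root="$1"
  for sub in mzXML DB comet MSGF+ tandemK; do
    # -p creates missing parent directories and does not fail if the directory exists
    mkdir -p "$root/$sub" || return 1
  done
}

# Usage (on stampede): make_workspace "$SCRATCH/myProject"
```

Because of mkdir -p, running it a second time on an existing project directory is harmless.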

Prepare database

  • You can run this step on any computer. If it takes longer than a minute, run it somewhere other than a TACC login node (otherwise your account may be locked).
$ python $HOME/git/MSblender/pre/fasta-reverse.py my_seq.fa
$ cat my_seq.fa.* > my_seq.combined.fa
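
A quick sanity check after this step is to count the FASTA headers in each file. This sketch assumes that the combined file holds the original entries plus one reversed entry for each of them, so its count would be twice that of my_seq.fa. That ratio is an assumption about the output of fasta-reverse.py, so treat a different number as a reason to inspect the files rather than as an error. count_fasta_entries is a hypothetical helper name.

```shell
# Hypothetical helper: count the records in a FASTA file.
count_fasta_entries() {
  # Every FASTA record starts with a '>' header line.
  grep -c '^>' "$1"
}

# Usage:
#   count_fasta_entries my_seq.fa
#   count_fasta_entries my_seq.combined.fa
```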

DB setup for X!tandem

$ $HOME/git/MSblender/extern/fasta_pro.exe my_seq.combined.fa

You may see a message like the one below:

$ ~/git/MSblender/extern/fasta_pro.exe my_seq.combined.fa 
fasta_pro file conversion utility, v. 2006.09.15
 input path = my_seq.combined.fa
output path = my_seq.combined.fa.pro
db type = plain

DB setup for comet

You don't need to do anything for this.

DB setup for MSGF+

This step uses a significant amount of computing resources (i.e., memory), so it may not be suitable to run on a login node.

$ module load jdk64
$ java -Xmx4000M -cp $HOME/git/MSblender/extern/MSGFPlus.jar edu.ucsd.msjava.msdbsearch.BuildSA -d my_seq.combined.fa -tda 0
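
Since this step may be too heavy for a login node, one option is to run it as a batch job. The sketch below writes a job script modeled on the comet launcher further down this page. The script file name (msgf_buildsa.sh), the job name, the output file name, the single task, and the one-hour limit are all assumptions for illustration, not tested lab settings; the database name is this page's my_seq.combined.fa example.

```shell
# Write a SLURM job script that builds the MS-GF+ index on a compute node.
# File name, job name, task count and time limit are illustrative assumptions.
cat > msgf_buildsa.sh <<'EOF'
#!/bin/bash
#SBATCH -J msgf_db
#SBATCH -o msgf_db.o%j
#SBATCH -n 1
#SBATCH -p normal
#SBATCH -t 01:00:00

module load jdk64
MSGF_JAR="$HOME/git/MSblender/extern/MSGFPlus.jar"
DB="my_seq.combined.fa"
java -Xmx4000M -cp "$MSGF_JAR" edu.ucsd.msjava.msdbsearch.BuildSA -d "$DB" -tda 0
EOF

# Submit from the directory that holds the database:
#   sbatch msgf_buildsa.sh
```

The quoted 'EOF' keeps $HOME and the other variables unexpanded in the generated file, so they are resolved when the job runs.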

Prepare mzXML files

Copy your mzXML files into this directory ($SCRATCH/myProject/mzXML).

Run comet

$ ~/git/MSblender/extern/comet.linux.exe -p

Edit the 'comet.params.new' file. Typically, you need to change the following lines.

num_threads = 16
peptide_mass_tolerance = 20.0
peptide_mass_units = 2
search_enzyme_number = 2   ## See the end of param file for the type of enzymes

output_txtfile = 1
output_pepxmlfile = 0
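
If you prefer not to edit the file by hand, the same changes can be scripted. This is a minimal sketch that assumes each parameter sits on its own line in the form 'name = value', optionally followed by a '#' comment; set_comet_param is a hypothetical helper, not something shipped with comet or MSblender.

```shell
# Hypothetical helper: set one 'name = value' line in a comet params file.
set_comet_param() {
  # $1 = params file, $2 = parameter name, $3 = new value
  # Only the value token is replaced, so a trailing '# comment' is kept.
  sed "s|^$2 *= *[^ #]*|$2 = $3|" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Usage, with the values listed above:
#   set_comet_param comet.params.new num_threads 16
#   set_comet_param comet.params.new peptide_mass_tolerance 20.0
#   set_comet_param comet.params.new peptide_mass_units 2
#   set_comet_param comet.params.new search_enzyme_number 2
#   set_comet_param comet.params.new output_txtfile 1
#   set_comet_param comet.params.new output_pepxmlfile 0
```

It writes to a temporary file and then renames it, which avoids relying on the differing in-place options of sed across systems.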

Then create a launcher script like the one below and submit it with sbatch.

#!/bin/bash
#SBATCH -J "cmt"
#SBATCH -o cmt.o%j
#SBATCH -n 16
#SBATCH -p normal
#SBATCH -t 24:00:00

# All #SBATCH lines must come before the first command, or SLURM ignores them.
COMET="$HOME/git/MSblender/extern/comet.linux.exe"

DB="../DB/my_seq.combined.fa"
DBNAME=$(basename "$DB")
DBNAME=${DBNAME/.fa/}

PARAM="./comet.params.new"

# Search every mzXML file against the combined database.
for MZXML in ../mzXML/*mzXML
do
  OUT=$(basename "$MZXML")
  OUT=${OUT/.mzXML/}"."$DBNAME".comet"
  time "$COMET" -P"$PARAM" -D"$DB" -N"$OUT" "$MZXML"
done
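
To see which output name the launcher script passes to -N for a given input, the naming step can be tried on its own. This sketch uses the suffix argument of basename instead of the ${VAR/.ext/} expansions in the script; the two agree for file names that end in .mzXML and .fa. comet_out_name and sample1.mzXML are made-up names for illustration.

```shell
# Hypothetical helper: derive the comet output base name used by the launcher.
comet_out_name() {
  # $1 = path to an mzXML file, $2 = path to the FASTA database
  sample=$(basename "$1" .mzXML)   # strip directory and .mzXML suffix
  dbname=$(basename "$2" .fa)      # strip directory and .fa suffix
  echo "$sample.$dbname.comet"
}

comet_out_name ../mzXML/sample1.mzXML ../DB/my_seq.combined.fa
# prints: sample1.my_seq.combined.comet
```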

Run MSGF+

Run X!Tandem