MSblender TACC

From Marcotte Lab

Revision as of 18:21, 2 March 2015

Before you start

  • To use this setup, your TACC account needs to be added to our lab project ('A-cm10'). If you don't have an account, create one at https://portal.tacc.utexas.edu/. Then ask Edward to add your account as a member of the lab project.
  • This document is for 'stampede'.
  • Currently in most cases I use three search engines: comet, X!Tandem, and MS-GF+.
  • You don't need to run the MSblender modeling step on TACC, because it does not take that long. I normally run all searches on TACC, then transfer the output to my local machine to run MSblender. This document therefore covers only the search part. For running MSblender, please see the MSblender page.

Install MSblender (and comet, MSGF+, X!Tandem)

$ cd ~
$ mkdir git
$ cd git
$ git clone https://github.com/marcottelab/MSblender.git

Prepare a working space

$ module load python
$ cd $SCRATCH
$ mkdir myProject
$ cd myProject
$ mkdir mzXML
$ mkdir DB
$ mkdir comet
$ mkdir MSGF+
$ mkdir tandemK
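
The same layout can be created in one step. The following is only a convenience sketch; the function name make_workspace is made up for illustration and is not part of MSblender.

```shell
#!/bin/bash
# make_workspace ROOT (hypothetical helper): create the per-project
# directories used in this document under ROOT.
make_workspace() {
  local root="$1" d
  for d in mzXML DB comet MSGF+ tandemK; do
    mkdir -p "$root/$d"   # -p also creates ROOT and ignores existing dirs
  done
}

# Example on stampede:
#   make_workspace "$SCRATCH/myProject"
```

Because of 'mkdir -p', running it again on an existing project does no harm.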

Prepare database

  • You can run this step on any computer. If it takes longer than a minute, run it somewhere other than a TACC login node (long-running processes on a login node may get your account locked).
$ python $HOME/git/MSblender/pre/fasta-reverse.py my_seq.fa
$ cat my_seq.fa.* > my_seq.combined.fa
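
A quick sanity check after this step is to count the FASTA entries before and after. The helper below is a minimal sketch (count_fasta_entries is a made-up name); it only counts header lines and does not validate the sequences.

```shell
#!/bin/bash
# count_fasta_entries FILE (hypothetical helper): print the number of
# FASTA header lines, i.e. lines that start with '>'.
count_fasta_entries() {
  grep -c '^>' "$1"
}

# Example:
#   count_fasta_entries my_seq.fa
#   count_fasta_entries my_seq.combined.fa
```

The combined file should report more entries than the original. The exact ratio depends on which files fasta-reverse.py writes, which this page does not spell out, so treat the comparison as a rough check only.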

DB setup for X!Tandem

$ $HOME/git/MSblender/extern/fasta_pro.exe my_seq.combined.fa

You should see a message like the one below:

$ ~/git/MSblender/extern/fasta_pro.exe my_seq.combined.fa
fasta_pro file conversion utility, v. 2006.09.15
 input path = my_seq.combined.fa
output path = my_seq.combined.fa.pro
db type = plain

DB setup for comet

You don't need to do anything for this.

DB setup for MSGF+

This step uses a significant amount of computing resources (memory, in particular), so it may not be suitable to run on a login node.

$ module load jdk64
$ java -Xmx4000M -cp $HOME/git/MSblender/extern/MSGFPlus.jar edu.ucsd.msjava.msdbsearch.BuildSA -d my_seq.combined.fa -tda 0

Prepare mzXML files

Copy your mzXML files into this directory ($SCRATCH/myProject/mzXML).

Run comet

$ cd $SCRATCH/myProject/comet
$ ~/git/MSblender/extern/comet.linux.exe -p

Edit the 'comet.params.new' file. Typically, you need to change the following lines.

num_threads = 16

peptide_mass_tolerance = 20.0
peptide_mass_units = 2

search_enzyme_number = 2   ## See the end of param file for the type of enzymes

output_txtfile = 1
output_pepxmlfile = 0
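
If you prefer to apply these edits from the command line instead of a text editor, something like the following may work. It is a sketch only: set_param is a made-up helper, and it assumes each parameter sits on its own line in the form 'key = value', optionally followed by a '#' comment, as in the lines above.

```shell
#!/bin/bash
# set_param FILE KEY VALUE (hypothetical helper): replace the value on the
# "KEY = ..." line of a comet params file, keeping any trailing # comment.
# Assumes plain values (numbers, simple words) without '&' or backslashes.
set_param() {
  local file="$1" key="$2" value="$3" tmp
  tmp=$(mktemp)
  awk -v k="$key" -v v="$value" '
    $1 == k && $2 == "=" { sub(/=[ \t]*[^ \t#]*/, "= " v) }
    { print }
  ' "$file" > "$tmp" && cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Example, using the values listed above:
#   set_param comet.params.new num_threads 16
#   set_param comet.params.new peptide_mass_tolerance 20.0
#   set_param comet.params.new peptide_mass_units 2
#   set_param comet.params.new search_enzyme_number 2
#   set_param comet.params.new output_txtfile 1
#   set_param comet.params.new output_pepxmlfile 0
```

Check the result with 'grep' or a diff against the original file before submitting a job, since a key that is missing or spelled differently is silently left unchanged.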

Then, create the launcher script (called 'stampede-comet.sh') as below.

#!/bin/bash
#SBATCH -J cmt
#SBATCH -n 16
#SBATCH -p normal
#SBATCH -t 24:00:00
#SBATCH -o cmt.o%j
# Keep all #SBATCH lines above the first command; sbatch ignores any that come later.

COMET="$HOME/git/MSblender/extern/comet.linux.exe"

DB="../DB/my_seq.combined.fa"
DBNAME=$(basename "$DB" .fa)   # strip the directory and the .fa suffix

PARAM="./comet.params.new"

# Search each mzXML file in turn.
for MZXML in ../mzXML/*.mzXML
do
  OUT="$(basename "$MZXML" .mzXML).$DBNAME.comet"
  time "$COMET" -P"$PARAM" -D"$DB" -N"$OUT" "$MZXML"
done

Then submit the job with '$ sbatch stampede-comet.sh'.
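
While the job is queued or running, '$ squeue -u $USER' lists it. Once it has finished, you can check that every mzXML file produced an output. The helper below is a sketch under assumptions: list_missing_outputs is a made-up name, and the '.txt' suffix in the example assumes that comet appends '.txt' to the -N base name when output_txtfile = 1; adjust the suffix to whatever files you actually see.

```shell
#!/bin/bash
# list_missing_outputs MZXML_DIR OUT_DIR SUFFIX (hypothetical helper):
# print every mzXML file in MZXML_DIR for which OUT_DIR/<name>SUFFIX
# is missing or empty.
list_missing_outputs() {
  local mzxml_dir="$1" out_dir="$2" suffix="$3" f base
  for f in "$mzxml_dir"/*.mzXML; do
    [ -e "$f" ] || continue              # the glob matched nothing
    base=$(basename "$f" .mzXML)
    [ -s "$out_dir/$base$suffix" ] || echo "$f"
  done
}

# Example, run from $SCRATCH/myProject/comet:
#   list_missing_outputs ../mzXML . .my_seq.combined.comet.txt
```

An empty result means every input has a non-empty output file. It does not confirm that the search itself ran to completion, so also look at the 'cmt.o*' log.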

Run MSGF+

Create the 'stampede-MSGF+.sh' file as below.

#!/bin/bash
#SBATCH -J mg+
#SBATCH -n 16
#SBATCH -p normal
#SBATCH -t 24:00:00
#SBATCH -o mg+.o%j
# Keep all #SBATCH lines above the first command; sbatch ignores any that come later.

set -x   # echo each command into the job log

module load jdk64

MSGFplus_JAR="$HOME/git/MSblender/extern/MSGFPlus.jar"

DB="../DB/my_seq.combined.fa"
DBNAME=$(basename "$DB" .fa)   # strip the directory and the .fa suffix

for MZXML in ../mzXML/*.mzXML
do
  OUT="$(basename "$MZXML" .mzXML).$DBNAME.MSGF+.mzid"
  TBL="${OUT%.mzid}.tsv"
  # Run the search, then convert the mzid result to a tab-separated table.
  time java -Xmx20000M -jar "$MSGFplus_JAR" -d "$DB" -s "$MZXML" -o "$OUT" -t 20ppm -tda 0 -ntt 2 -e 1 -inst 3
  time java -Xmx20000M -cp "$MSGFplus_JAR" edu.ucsd.msjava.ui.MzIDToTsv -i "$OUT" -o "$TBL" -showQValue 1 -showDecoy 1 -unroll 1
done
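
To see which file names the loop above will produce, the naming logic can be pulled out on its own. This is only an illustration: msgf_output_names is a made-up helper and 'sample1.mzXML' is a hypothetical input file.

```shell
#!/bin/bash
# msgf_output_names MZXML DB (hypothetical helper): print the .mzid and
# .tsv file names that the MSGF+ launcher script derives for one input.
msgf_output_names() {
  local mzxml="$1" db="$2" dbname out
  dbname=$(basename "$db" .fa)                          # my_seq.combined
  out="$(basename "$mzxml" .mzXML).$dbname.MSGF+.mzid"
  echo "$out"
  echo "${out%.mzid}.tsv"
}

msgf_output_names ../mzXML/sample1.mzXML ../DB/my_seq.combined.fa
# prints:
#   sample1.my_seq.combined.MSGF+.mzid
#   sample1.my_seq.combined.MSGF+.tsv
```

Both files are written to the directory the job is submitted from, so submitting from $SCRATCH/myProject/MSGF+ keeps the MSGF+ results apart from the comet ones.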

Run X!Tandem