MSblender TACC: Difference between revisions
From Marcotte Lab
TaejoonKwon (talk | contribs) No edit summary |
Revision as of 18:00, 2 March 2015
Before you start
- To use this setup, your TACC account must be added to our lab project ('A-cm10'). If you don't have an account, create one at https://portal.tacc.utexas.edu/. Then ask Edward to add your account to the lab project.
- This document is for 'stampede'.
- Always work in your $SCRATCH directory, not in /corral or your $HOME.
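The last rule is easy to forget, so a small guard can help. The following is a minimal sketch, not part of MSblender: the function name is_under_dir is hypothetical, and it only compares path strings (it does not resolve symlinks).

```shell
# Hypothetical helper: succeed only if directory $1 is $2 or lies below it.
# Pure string comparison; symlinks are not resolved.
is_under_dir() {
    dir="$1"
    root="$2"
    # An empty root (e.g. $SCRATCH not set) never matches.
    [ -n "$root" ] || return 1
    case "$dir/" in
        "$root"/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A possible use before starting any work: is_under_dir "$PWD" "$SCRATCH" || echo "Not under \$SCRATCH"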
Install MSblender (and comet, MSGFDB, X!Tandem)
$ cd ~
$ mkdir git
$ cd git
$ git clone https://github.com/marcottelab/MSblender.git
Prepare a working space
$ module load python
$ cd $SCRATCH
$ mkdir myProject
$ cd myProject
$ mkdir mzXML
$ mkdir DB
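If you set up several projects, the same layout can be created in one step. This is only a sketch: make_workspace is a hypothetical name, and it assumes the mzXML and DB subdirectories used in this guide. Because it uses mkdir -p, running it twice on the same path is harmless.

```shell
# Hypothetical helper: create the project layout used in this guide.
# $1 is the project directory, e.g. "$SCRATCH/myProject".
make_workspace() {
    project="$1"
    # -p creates missing parents and does not fail if the directory exists.
    mkdir -p "$project/mzXML" "$project/DB"
}
```

For example: make_workspace "$SCRATCH/myProject"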
Prepare database
- You can run this process on any computer. If it takes longer than a minute, run it somewhere other than a TACC login node; long-running processes there may get your account locked.
$ python $HOME/git/MSblender/pre/fasta-reverse.py my_seq.fa
$ cat my_seq.fa.* > my_seq.combined.fa
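A quick sanity check on the combined file is to count its FASTA entries. The sketch below is an illustration only: count_fasta_entries is a hypothetical name, and it simply counts header lines (lines starting with '>'). It makes no assumption about what fasta-reverse.py writes.

```shell
# Hypothetical helper: print the number of FASTA entries in file $1.
# An entry is counted for every line that starts with '>'.
count_fasta_entries() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # "|| true" keeps the function from reporting that as a failure.
    grep -c '^>' "$1" || true
}
```

For example, compare count_fasta_entries my_seq.fa with count_fasta_entries my_seq.combined.fa to see whether the combined file looks plausible.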
DB setup for X!tandem
$ $HOME/git/MSblender/extern/fasta_pro.exe my_seq.combined.fa
You should see a message like the following:
$ ~/git/MSblender/extern/fasta_pro.exe my_seq.combined.fa
fasta_pro file conversion utility, v. 2006.09.15
input path = my_seq.combined.fa
output path = my_seq.combined.fa.pro
db type = plain
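To confirm that the conversion produced a file, you can check for a non-empty .pro file next to the FASTA file. This is a minimal sketch: check_pro_file is a hypothetical name, and the only assumption is the output naming shown in the message above (input path plus ".pro").

```shell
# Hypothetical helper: verify that "$1.pro" exists and is not empty.
# $1 is the FASTA file that was passed to fasta_pro.exe.
check_pro_file() {
    fasta="$1"
    if [ -s "$fasta.pro" ]; then
        echo "OK: $fasta.pro"
    else
        echo "Missing or empty: $fasta.pro" >&2
        return 1
    fi
}
```

For example: check_pro_file my_seq.combined.fa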
DB setup for comet
You don't need to do anything for this.
DB setup for MSGF+
This step uses a significant amount of computing resources (mainly memory), so it may not be suitable to run on a login node.
$ module load jdk64
$ java -Xmx4000M -cp /home1/00992/linusben/git/MSblender/extern/MSGFPlus.jar edu.ucsd.msjava.msdbsearch.BuildSA -d XenopusHybrid_xlJGIv16_xtJGIv83.combined.fa -tda 0
Prepare mzXML files
Copy your mzXML files into the mzXML directory ($SCRATCH/myProject/mzXML).
Prepare search
$ python ~/git/MS-toolbox/bin/prepare-tandemK.py
Create /scratch/00992/linusben/xenopus.prot/TXGP_XENLA_Prot_Kwon201109/tandemK.
Write /scratch/00992/linusben/xenopus.prot/TXGP_XENLA_Prot_Kwon201109/tandemK/tandem-taxonomy.xml.
Write /scratch/00992/linusben/xenopus.prot/TXGP_XENLA_Prot_Kwon201109/tandemK/20110713_XENLA_Egg1_1.tandemK.xml
...
TandemK is ready. Run /scratch/00992/linusben/xenopus.prot/TXGP_XENLA_Prot_Kwon201109/scripts/run-tandemK.sh.
$ python ~/git/MS-toolbox/bin/prepare-inspect.py
Create /scratch/00992/linusben/xenopus.prot/TXGP_XENLA_Prot_Kwon201109/inspect.
Write /scratch/00992/linusben/xenopus.prot/TXGP_XENLA_Prot_Kwon201109/inspect/20110713_XENLA_Egg1_1.inspect_in.
Write /scratch/00992/linusben/xenopus.prot/TXGP_XENLA_Prot_Kwon201109/inspect/20110713_XENLA_Egg1_2.inspect_in.
...
InsPecT is ready. Run /scratch/00992/linusben/xenopus.prot/TXGP_XENLA_Prot_Kwon201109/scripts/run-inspect.sh.
$ python ~/git/MS-toolbox/bin/prepare-MSGFDB.py
Create /scratch/00992/linusben/xenopus.prot/TXGP_XENLA_Prot_Kwon201109/MSGFDB.
20110713_XENLA_Egg1_1.mzXML
20110713_XENLA_Egg1_2.mzXML
....
MSGFDB is ready. Run /scratch/00992/linusben/xenopus.prot/TXGP_XENLA_Prot_Kwon201109/scripts/run-MSGFDB.sh.
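Before submitting jobs, it can be worth checking that every mzXML file got a search input file. The sketch below does this for X!Tandem only. It is an illustration under stated assumptions: missing_tandemk_inputs is a hypothetical name, and the naming (mzXML/NAME.mzXML paired with tandemK/NAME.tandemK.xml in the same project directory) is inferred from the log output above, not from the prepare scripts themselves.

```shell
# Hypothetical helper: print the base name of every mzXML file in
# "$1/mzXML" that has no matching "$1/tandemK/<name>.tandemK.xml".
# The file naming is an assumption taken from the log output in this guide.
missing_tandemk_inputs() {
    project="$1"
    for f in "$project"/mzXML/*.mzXML; do
        # If the glob matched nothing, $f is the literal pattern; skip it.
        [ -e "$f" ] || continue
        name=$(basename "$f" .mzXML)
        [ -e "$project/tandemK/$name.tandemK.xml" ] || echo "$name"
    done
}
```

For example: missing_tandemk_inputs "$SCRATCH/myProject" prints nothing when all input files are present.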
Run search
On a standalone workstation, you can run ./scripts/run-(search_engine).sh directly. Do not do this on a TACC login node. Instead, add the following parameters to each run-*.sh script, then submit it as a job with qsub.
- If you use lonestar, replace '4way 8' with '8way 24'. See the Lonestar user guide and Longhorn user guide for details.
- Don't forget to put your email address after -M.
- Use a short job name (-N) so that you can check the job status easily.
#!/bin/bash
#$ -V                      # Inherit the submission environment
#$ -cwd                    # Start job in submission directory
#$ -j y                    # Combine stderr and stdout
#$ -o $JOB_NAME.o$JOB_ID
#$ -pe 4way 8
#$ -q long
#$ -l h_rt=24:00:00        # Run time (hh:mm:ss)
#$ -M (your email)
#$ -m be                   # Email at Begin and End of job
#$ -P hpc
set -x
#$ -N (job name)
(put the remaining part of run-* script after #!/bin/bash line)
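Editing every run-*.sh script by hand is tedious, so the merge can be scripted. The following is a minimal sketch, not part of MSblender or MS-toolbox: wrap_for_qsub is a hypothetical name, and it assumes you have saved the '#$' lines above (with your email and job name filled in) to a separate header file.

```shell
# Hypothetical helper: print a job script made of
#   1. a "#!/bin/bash" line,
#   2. the scheduler header lines from file $1,
#   3. the body of run script $2 without its own "#!" line.
wrap_for_qsub() {
    header="$1"
    script="$2"
    echo '#!/bin/bash'
    cat "$header"
    # Skip the first line of the run script only if it is a shebang.
    awk 'NR == 1 && /^#!/ { next } { print }' "$script"
}
```

For example, with the header saved as sge_header.txt (a hypothetical file name): wrap_for_qsub sge_header.txt scripts/run-tandemK.sh > run-tandemK.job.sh, then submit run-tandemK.job.sh with qsub.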