TXGP RNAseq analysis

From Marcotte Lab
See also

  • https://github.com/MarcotteLabGit/HTseq-toolbox/ (GitHub repository for scripts used in TXGP)
  • http://linusben.net/wiki/index.php/BFAST (Taejoon's personal document for using BFAST)

Latest revision as of 12:15, 31 August 2011

Overview

  • We use bfast (http://bfast.sf.net) and mapreads (http://solidsoftwaretools.com/gf/project/mapreads/) as our main mappers because they are reasonably fast, have good sensitivity (they tolerate sequencing errors), and support color-space reads natively.
  • We map reads in two ways: (1) to the scaffolds of the JGI draft genome assembly, and (2) to RefSeq mRNA sequences from Xenbase. The two will be combined once we have enough gene models.
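
Both strategies can share one mapping script if each reference gets its own working directory. The sketch below is an assumption, not the lab's actual layout: the directory names map_genome and map_mrna and the FASTA file names are hypothetical placeholders, and it only prints the commands it would run. It relies on run-bfast-match.sh (see Scripts for BFAST) reading reads from ../fastq/ and writing its output to the current directory, so separate directories keep the two result sets from overwriting each other.

```shell
#!/bin/bash
# Dry run: print the commands that would map the same reads against two references.
# All directory and file names used here are hypothetical placeholders.
plan_mapping() {
  local workdir=$1 fasta=$2
  # Each strategy runs in its own directory next to ../fastq/.
  echo "(cd $workdir && run-bfast-match.sh $fasta)"
}

plan_mapping map_genome jgi_scaffolds.fa   # strategy 1: draft genome scaffolds
plan_mapping map_mrna refseq_mrna.fa       # strategy 2: RefSeq mRNA
```

Printing the commands first makes it easy to check the layout before committing to a long mapping run.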

Database

See TXGP_Xenbase_Data.

Scripts for BFAST

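  • Prepare csfastq files by merging each csfasta file with its quality file (csfasta + qual --> fastq):
$ solid2fastq -o reads foobar.csfasta foobar_QV.qual
  • Prepare database sequences: convert the reference in color space (-A 1) and in nucleotide space, then build a single color-space index. Multiple indexes are not used yet.
$ bfast fasta2brg -f mygenome.fa -A 1
$ bfast fasta2brg -f mygenome.fa
$ bfast index -f mygenome.fa -m 1111111111111111111111 -d 1 -w 14 -A 1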
  • run-bfast-match.sh: a script that maps the gzipped csfastq reads in ../fastq/ to a reference FASTA file.
#!/bin/bash
# Map every gzipped csfastq file in ../fastq/ to the given reference with BFAST.
FASTA=$1
if [[ -f "$FASTA" ]]
then
  echo "File: $FASTA"
  for FASTQ in ../fastq/*.fastq.gz
  do
    [[ -e "$FASTQ" ]] || continue   # skip if the glob matched no read files
    BASENAME=$(basename "$FASTQ" .fastq.gz)
    BMF="$BASENAME.bmf"
    BAF="$BASENAME.baf"
    SAM="$BASENAME.sam"
    echo "$FASTQ -- $FASTA --> $SAM"
    # -A 1: color space; -n 4: four threads
    bfast match -A 1 -n 4 -f "$FASTA" -r "$FASTQ" -z > "$BMF"
    bfast localalign -A 1 -n 4 -f "$FASTA" -m "$BMF" > "$BAF"
    bfast postprocess -A 1 -n 4 -f "$FASTA" -i "$BAF" > "$SAM"
  done
else
  echo "Usage: run-bfast-match.sh <DB fasta file>" >&2
  exit 1
fi
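
Before launching a long mapping run, it can help to see which files each read set will produce. The following is a minimal sketch, not part of the original script: the function name bfast_outputs and the sample file name are hypothetical, and the function only derives the intermediate and final file names in the same way as run-bfast-match.sh, without calling bfast.

```shell
#!/bin/bash
# Derive the output names run-bfast-match.sh would use for one read file.
# bfast_outputs is a hypothetical helper; it does not run bfast.
bfast_outputs() {
  local fastq=$1
  local base
  base=$(basename "$fastq" .fastq.gz)    # strip the directory and the .fastq.gz suffix
  echo "$base.bmf $base.baf $base.sam"   # outputs of match, localalign, postprocess
}

bfast_outputs ../fastq/sample1.fastq.gz
```

Because all three names are written to the current directory, two read files with the same base name in different input directories would collide.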

Differential Expression
