Difference between revisions of "TXGP RNAseq analysis"

From Marcotte Lab
Database (changes from line 6 of the wikitext)

Unchanged:
* [[xdata::xlaevisMRNAref.20110501.v1.fa]] (RefSeq mRNA FASTA file)
* [[xdata::xlaevisMRNAref.20110501.v1.tbl]] tbl (RefSeq mRNA annotation table)

Removed:
* [[xdata::xlaevisProteinsref.20110501.v1.fa]] (RefSeq protein FASTA file)
* [[xdata::xlaevisProteinsref.20110501.v1.tbl]] (RefSeq mRNA annotation table)

Added:
* [[xdata::xlaevisProteinref.20110501.v1.fa]] (RefSeq protein FASTA file)
* [[xdata::xlaevisProteinref.20110501.v1.tbl]] (RefSeq mRNA annotation table)

Revision as of 11:28, 17 June 2011

Overview

  • We use bfast and mapreads as the main mappers because they are reasonably fast, sensitive (error-tolerant), and support color space natively.
  • We map reads with two strategies: (1) against the JGI draft genome assembly scaffolds, and (2) against RefSeq mRNA sequences from XenBase. The two will be combined once enough gene models are available.

Database

Scripts for BFAST

  • Prepare csfastq files: csfasta + qual --> fastq
$ solid2fastq -o reads foobar.csfasta foobar_QV.qual
  • Prepare the reference database: build the color-space (-A 1) and nucleotide-space reference files, then one color-space index. Multiple indexes are not used yet.
$ bfast fasta2brg -f mygenome.fa -A 1
$ bfast fasta2brg -f mygenome.fa
$ bfast index -f mygenome.fa -m 1111111111111111111111 -d 1 -w 14 -A 1
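  • run-bfast-match.sh: a script that maps every csfastq file in ../fastq/ to a reference FASTA file.
#!/bin/bash
# Usage: run-bfast-match.sh <DB fasta file>
FASTA=$1
# Skip the loop cleanly when ../fastq/ holds no matching files.
shopt -s nullglob
if [[ -f "$FASTA" ]]; then
  echo "File: $FASTA"
  for FASTQ in ../fastq/*.fastq.gz
  do
    # Strip the directory and the .fastq.gz suffix to name the outputs.
    BASENAME=$(basename "$FASTQ" .fastq.gz)
    BMF="$BASENAME.bmf"
    BAF="$BASENAME.baf"
    SAM="$BASENAME.sam"
    echo "$FASTQ -- $FASTA --> $SAM"
    # match --> localalign --> postprocess, in color space (-A 1) with 4 threads (-n 4)
    bfast match -A 1 -n 4 -f "$FASTA" -r "$FASTQ" -z > "$BMF"
    bfast localalign -A 1 -n 4 -f "$FASTA" -m "$BMF" > "$BAF"
    bfast postprocess -A 1 -n 4 -f "$FASTA" -i "$BAF" > "$SAM"
  done
else
  echo "Usage: run-bfast-match.sh <DB fasta file>" >&2
  exit 1
fi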
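
Once the script has written its .sam files, a quick sanity check is to count the mapped reads per reference sequence. The sketch below is not part of BFAST or of the original scripts: count_mapped_per_ref is a hypothetical helper name, and it relies only on the standard SAM layout (header lines start with @, FLAG is column 2, reference name is column 3, FLAG bit 0x4 marks an unmapped read). It counts every mapped record, so a read reported at several locations would be counted more than once.

```shell
#!/bin/bash
# count_mapped_per_ref: hypothetical helper, not part of BFAST.
# Prints "<reference name><TAB><mapped read count>" for one SAM file.
count_mapped_per_ref() {
  awk -F'\t' '
    /^@/ { next }                      # skip SAM header lines
    {
      # FLAG bit 0x4 set means the read is unmapped; plain arithmetic
      # is used so the script does not depend on gawk-only functions.
      if (int($2 / 4) % 2 == 1 || $3 == "*") next
      count[$3]++
    }
    END { for (ref in count) printf "%s\t%d\n", ref, count[ref] }
  ' "$1" | sort
}

# Example call (foobar.sam is a placeholder file name):
if [[ -f foobar.sam ]]; then
  count_mapped_per_ref foobar.sam
fi
```

When mapping to the RefSeq mRNA FASTA file, the reference names are transcript IDs, so the output gives a rough per-transcript read count.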

Differential Expression

See also