TXGP RNAseq analysis

From Marcotte Lab
Revision as of 11:15, 31 August 2011 by Taejoon (Talk | contribs)


Overview

  • We use bfast and mapreads as our main mappers because they are reasonably fast, tolerate sequencing errors (good sensitivity), and support color space natively.
  • We use two mapping strategies: (1) mapping to the scaffolds of the JGI draft genome assembly, and (2) mapping to RefSeq mRNA sequences from XenBase. The two will be combined in the future, once we have enough gene models.

Database

See TXGP_Xenbase_Data.

Scripts for BFAST

  • Prepare csfastq files: csfasta + qual --> fastq
$ solid2fastq -o reads foobar.csfasta foobar_QV.qual
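  • Prepare database sequences. The reference is converted twice, once in color space (-A 1) and once in nucleotide space (the default), and then indexed in color space. Multiple indexes are not used yet.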
$ bfast fasta2brg -f mygenome.fa -A 1
$ bfast fasta2brg -f mygenome.fa
$ bfast index -f mygenome.fa -m 1111111111111111111111 -d 1 -w 14 -A 1
  • run-bfast-match.sh: a script that maps every csfastq file in ../fastq/ to the given FASTA file.
#!/bin/bash
# Map every gzipped csfastq file in ../fastq/ against one reference FASTA.
FASTA=$1
if [[ -f "$FASTA" ]]; then
  echo "File: $FASTA"
  for FASTQ in ../fastq/*.fastq.gz
  do
    [[ -e "$FASTQ" ]] || continue  # the glob matched nothing
    PREFIX=$(basename "$FASTQ" .fastq.gz)
    BMF="$PREFIX.bmf"
    BAF="$PREFIX.baf"
    SAM="$PREFIX.sam"
    echo "$FASTQ -- $FASTA --> $SAM"
    # match -> localalign -> postprocess, in color space (-A 1) with 4 threads (-n 4)
    bfast match -A 1 -n 4 -f "$FASTA" -r "$FASTQ" -z > "$BMF"
    bfast localalign -A 1 -n 4 -f "$FASTA" -m "$BMF" > "$BAF"
    bfast postprocess -A 1 -n 4 -f "$FASTA" -i "$BAF" > "$SAM"
  done
else
  echo "Usage: run-bfast-match.sh <DB fasta file>" >&2
  exit 1
fi
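
After run-bfast-match.sh finishes, a quick sanity check on each SAM file can be useful. The sketch below is not part of the original scripts: count_mapped is a hypothetical helper name, and it only assumes the standard SAM layout (header lines start with '@', column 2 is the FLAG field, and bit 0x4 of the FLAG marks an unmapped read). It counts SAM records, not unique reads, so a read reported at several locations would be counted more than once.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not from the original page):
# count mapped and unmapped alignment records in one SAM file.
count_mapped() {
  awk -F'\t' '
    /^@/ { next }                          # skip SAM header lines
    {
      # FLAG bit 0x4 set means "unmapped"; test it without bitwise operators
      if (int($2 / 4) % 2 == 1) unmapped++
      else mapped++
    }
    END { printf "mapped=%d unmapped=%d\n", mapped, unmapped }
  ' "$1"
}
```

For example, calling count_mapped on one of the .sam files written by the script prints a single line of the form "mapped=N unmapped=M", which makes it easy to compare mapping rates between the genome-scaffold and RefSeq mRNA strategies.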

Differential Expression

See also