Difference between revisions of "Xenopus reference"

From Marcotte Lab

Revision as of 12:16, 31 August 2011

All of these files are derived from XenBase (downloaded on May 1, 2011).

Version 1. RefSeq of cDNA & protein

  1. Read the gene name for each NCBI ID from the 'Ncbi...' file. Filter out genes with 'unnamed' in the gene name field.
  2. Read all sequences from the '.fasta' file. Convert all sequence characters to upper case.
  3. If a sequence has a '>gi|<gi number>|ref|<genbank accession>' header (meaning it is a RefSeq entry), write it to the output file.
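
The three steps above can be sketched in Python roughly as follows. This is an illustrative sketch, not the script that produced the files. Several details are assumptions: that the 'Ncbi...' file is tab-separated with the NCBI ID and gene name in the columns given by `id_col` and `name_col`, that the ID in that file matches either the accession or the gi number in the FASTA header, and that a sequence must pass both the 'unnamed' filter and the RefSeq header check to be kept. The function names are hypothetical.

```python
import re

# Header shape from step 3: >gi|<gi number>|ref|<genbank accession>
REFSEQ_HEADER = re.compile(r"^>gi\|(\d+)\|ref\|([^|\s]+)")


def read_gene_names(lines, id_col=0, name_col=1):
    """Step 1: map NCBI ID -> gene name, skipping 'unnamed' genes.

    The tab-separated layout and column positions are assumptions.
    """
    names = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= max(id_col, name_col):
            continue  # skip short or empty lines
        ncbi_id, gene = fields[id_col], fields[name_col]
        if "unnamed" in gene.lower():
            continue
        names[ncbi_id] = gene
    return names


def read_fasta(lines):
    """Step 2: yield (header, sequence) pairs with upper-cased sequences."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks).upper()
            header, chunks = line, []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks).upper()


def filter_refseq(fasta_lines, gene_names):
    """Step 3: keep records with a RefSeq-style header and a named gene."""
    kept = []
    for header, seq in read_fasta(fasta_lines):
        match = REFSEQ_HEADER.match(header)
        if not match:
            continue  # not a RefSeq entry
        gi, accession = match.group(1), match.group(2)
        # Assumption: the ID may be the accession (with or without
        # its version suffix) or the gi number.
        for key in (accession, accession.split(".")[0], gi):
            if key in gene_names:
                kept.append((header, gene_names[key], seq))
                break
    return kept
```

With real data, `read_gene_names` and `filter_refseq` would be given open file handles for the XenBase table and FASTA file, and the kept records written out in FASTA format.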

XENLA_cDNA_ref.v1.fasta (8,879 sequences)

  • Used XenBase files: NcbiMrnaXenbaseGene_laevis.txt, xlaevisMRNA.fasta

XENLA_prot_ref.v1.fasta (8,878 sequences; 'taf5' is not annotated as RefSeq in the protein data, although its corresponding mRNA sequence is annotated as RefSeq.)

  • Used XenBase files: NcbiProteinXenbaseGene_laevis.txt, xlaevisProtein.fasta
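
A one-sequence difference between the cDNA and protein sets, such as the 'taf5' case, can be located by comparing the gene names kept for each file. The sketch below assumes the gene names have already been collected into two Python collections. The function name and the example gene lists are hypothetical and used only for illustration.

```python
def genes_missing_from(reference_genes, other_genes):
    """Return gene names present in reference_genes but absent from other_genes."""
    return sorted(set(reference_genes) - set(other_genes))


# Toy example: gene names kept for the cDNA set and the protein set.
cdna_genes = ["actb", "taf5", "sox2"]
prot_genes = ["actb", "sox2"]
missing = genes_missing_from(cdna_genes, prot_genes)
```

Here `missing` holds the genes that have a RefSeq mRNA but no RefSeq protein in the toy input.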