Difference between revisions of "XENLA WorldCup"

From Marcotte Lab
* ChIP-seq data were kindly contributed by the following groups:
** H3K27ac, H3K4me1, H3K4me3 - [http://web.stanford.edu/group/bakerlab/Welcome.html Rakhi Gupta/Julie Baker lab, Stanford University, USA]
** H3K27ac, H3K4me3, H3K4me2, E2f4, E2f4+Mci - [http://www.salk.edu/faculty/kintner.html Ian Quigley/Chris Kintner lab, Salk Institute, USA]. E2f4 data is published [http://genesdev.cshlp.org/content/28/13/1461 here]
** H3K4me3 - [http://www.gurdon.cam.ac.uk/research/gurdon Marta Teperek/John Gurdon lab, Cambridge, UK]
** Rfx2 - [http://www.bio.utexas.edu/faculty/wallingford/ Mei-I Chung/John Wallingford lab, University of Texas at Austin, USA], published [http://elifesciences.org/content/3/e01439 here]
  
 

Revision as of 18:00, 31 July 2014

Information about the Xenopus laevis gene annotation released in July 2014.


Browser

Raw materials

  • Xenopus laevis Reference sequences - http://daudin.icmb.utexas.edu/xenopus-pub/ref/
    • mgEST_Xl4jul2012.fa - Michael Gilchrist's assembled transcripts (2012 July version)
    • XENLA_XBv5_cdna.fa - XenBase NCBI mRNA sequences (2012 June version)
    • XENLA_UG94.fa - X. laevis UniGene (version 94)
    • XENLA_xb201405_mrna.fa - XenBase NCBI mRNA sequences (2014 May version)
    • XGI_022511_TC.fa - John Quackenbush's assembled ESTs (XGI 022511 version)
    • XENTR_UG52_uniq.fa - X. tropicalis UniGene (version 52)
    • XENTR_xb201405_mrna.fa - XenBase NCBI mRNA sequences (2014 May version)
  • Reference species proteome sequences - http://daudin.icmb.utexas.edu/xenopus-pub/ens72/
    • CHICK_ens72_prot_annot_longest.fa - Chicken
    • DANRE_ens72_prot_annot_longest.fa - Zebrafish
    • MOUSE_ens72_prot_annot_longest.fa - Mouse
    • XENTR_ens72_prot_annot_longest.fa - X. tropicalis
    • HUMAN_ens72_prot_annot_longest.fa - Human

Merge

  1. Map all transcripts to the JGI ver. 7.1 genome with GMAP (default settings).
  2. Sort all transcripts by the CDS length identified by GMAP (from longest to shortest). For transcripts with identical CDS length, sort them by exon length, also identified by GMAP (from shortest to longest). When I did this second sorting the opposite way, so many fused genes were produced that I decided to sacrifice long UTRs instead.
  3. Choose the longest transcript for each genome scaffold region and direction of transcription.
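
The sorting and selection in steps 2 and 3 could be sketched roughly as below. This is only an illustration, not the script used for the release: the `Hit` record and its field names are hypothetical (they are not GMAP output columns), and treating a "region" as any overlapping interval on the same scaffold and strand is an assumption.

```python
from collections import namedtuple

# Hypothetical record for one GMAP alignment; field names are illustrative only.
Hit = namedtuple("Hit", "name scaffold strand start end cds_len exon_len")

def pick_representatives(hits):
    """Keep one transcript per scaffold region and strand."""
    # Longest CDS first; ties broken by the shortest total exon length.
    ranked = sorted(hits, key=lambda h: (-h.cds_len, h.exon_len))
    kept = []
    for h in ranked:
        # Assumption: two hits share a region if their intervals overlap
        # on the same scaffold and the same strand.
        clash = any(
            k.scaffold == h.scaffold
            and k.strand == h.strand
            and h.start <= k.end
            and k.start <= h.end
            for k in kept
        )
        if not clash:
            kept.append(h)
    return kept
```

Because the list is ranked before the greedy pass, the first transcript kept in any region is the one with the longest CDS, and among equal CDS lengths the one with the shortest exons.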

Translation

  1. Translate the non-redundant transcripts into all six possible reading frames, using the standard codon table.
  2. Search the translations against the reference species proteomes (human, mouse, zebrafish, chicken, X. tropicalis; EnsEMBL ver. 72).
  3. Determine the translation frame.
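
As an illustration of step 1, a six-frame translation with the standard codon table might look like the following. It is a minimal sketch, not the tool used for this release; the function names are made up for the example, and mapping codons with ambiguous bases to "X" is an assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def translate(seq):
    """Translate one frame; trailing bases that do not fill a codon are dropped."""
    # Assumption: codons containing ambiguous bases (e.g. N) become "X".
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )

def six_frames(seq):
    """Return translations keyed by frame label: +1, +2, +3, -1, -2, -3."""
    seq = seq.upper()
    rev = seq.translate(COMPLEMENT)[::-1]  # reverse complement
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rev[offset:])
    return frames
```

Each of the six translations can then be searched against the reference proteomes, and the frame is chosen from the search results.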

= Merge