Difference between revisions of "TXGP Data Description"
From Marcotte Lab
Revision as of 12:35, 4 October 2011
Naming convention
- Directory name: '(project group)_(species code)_(sample type)_(run ID)'
- File name: '(run ID)_(species code)_(description)_(sample prep ID,barcode,F3/F5/R3)'
- Species code
- XENLA (Xenopus laevis)
- XENTR (Xenopus tropicalis a.k.a. Silurana tropicalis)
- ENGPU (Engystomops pustulosus a.k.a. Túngara Frog or Physalaemus pustulosus)
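The naming convention above can be checked mechanically. The following is a minimal sketch, not a tool used by the lab: the function and class names (`parse_dir_name`, `parse_file_name`, `DirName`, `FileName`) are hypothetical, and it assumes that the four fields are separated by underscores and that the first three fields contain no underscore themselves.

```python
from typing import NamedTuple

# Species codes listed in the naming convention.
SPECIES = {
    "XENLA": "Xenopus laevis",
    "XENTR": "Xenopus tropicalis",
    "ENGPU": "Engystomops pustulosus",
}


class DirName(NamedTuple):
    project_group: str
    species_code: str
    sample_type: str
    run_id: str


class FileName(NamedTuple):
    run_id: str
    species_code: str
    description: str
    suffix: str  # sample prep ID, barcode, F3/F5/R3 (plus any file extension)


def _split_four(name: str) -> list:
    # Assumption: fields are underscore-separated; anything after the
    # third underscore is kept together as the last field.
    parts = name.split("_", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 underscore-separated fields: {name!r}")
    return parts


def parse_dir_name(name: str) -> DirName:
    """Split '(project group)_(species code)_(sample type)_(run ID)'."""
    parsed = DirName(*_split_four(name))
    if parsed.species_code not in SPECIES:
        raise ValueError(f"unknown species code: {parsed.species_code!r}")
    return parsed


def parse_file_name(name: str) -> FileName:
    """Split '(run ID)_(species code)_(description)_(prep ID,barcode,tag)'."""
    parsed = FileName(*_split_four(name))
    if parsed.species_code not in SPECIES:
        raise ValueError(f"unknown species code: {parsed.species_code!r}")
    return parsed
```

For example, `parse_dir_name("TXGP_XENLA_RNA_SA11017")` yields the project group `TXGP`, species code `XENLA`, sample type `RNA`, and run ID `SA11017`.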
Data pre-processing
- Remove reads containing any no-call ('N' in Illumina fastq files; '.' in SOLiD csfasta files).
- Remove low-complexity reads, i.e. reads with fewer than 4 distinct letters ('0123' for color space, 'ATGC' for base space).
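The two filters above can be sketched as a single predicate. This is an illustration under stated assumptions, not the script actually used: the function name `keep_read` is hypothetical, "less than 4 letters" is read as "fewer than 4 distinct symbols from the alphabet", and a color-space read is assumed to possibly start with a primer base (e.g. `T`) that is ignored.

```python
def keep_read(seq: str, color_space: bool = False) -> bool:
    """Return True if a read passes both pre-processing filters."""
    if color_space:
        no_call, alphabet = ".", "0123"
        # Assumption: a leading letter is the SOLiD primer base, not a color.
        if seq[:1].isalpha():
            seq = seq[1:]
    else:
        no_call, alphabet = "N", "ACGT"
        seq = seq.upper()

    # Filter 1: drop reads with any no-call.
    if no_call in seq:
        return False

    # Filter 2: drop low-complexity reads, i.e. reads that do not use
    # all 4 symbols of the alphabet at least once.
    return len(set(seq) & set(alphabet)) >= 4
```

With this reading, `keep_read("AAAACCCC")` is rejected because only two distinct bases occur, and `keep_read("T0123.01", color_space=True)` is rejected because of the no-call.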
TXGP X. laevis BAC data
One plate of CHORI-219 BAC library.
TXGP_XENLA_BAC2k_SA09023: Mate-pair(F3=50bp; R3=50bp; insert_size=2kbp), SOLiD v2
- SA09023_XENLA_96BAC2kb_F3.called.fastq.gz: read_count=35M, file_size=1.8GB
- SA09023_XENLA_96BAC2kb_R3.called.fastq.gz: read_count=35M, file_size=1.9GB
TXGP_XENLA_BAC5k_SA09023: Mate-pair(F3=50bp; R3=50bp; insert_size=5kbp), SOLiD v2
- SA09023_XENLA_96BAC5kb_F3.called.fastq.gz: read_count=28M, file_size=1.3GB
- SA09023_XENLA_96BAC5kb_R3.called.fastq.gz: read_count=28M, file_size=1.4GB
TXGP X. laevis whole genome data
- J-strain from Mustafa Khokha (Yale University).
- Mate-pair library with 1,500 bp insert size.
TXGP_XENLA_WG1500_SA10026 (SOLiDv3)
- SA10026_XENLA_WG1500_HiAmp1ManF3 (80.1M)
- SA10026_XENLA_WG1500_HiAmp1ManR3 (78.8M)
- SA10026_XENLA_WG1500_HiAmp2ManF3 (77.0M)
- SA10026_XENLA_WG1500_HiAmp2ManR3 (76.7M)
- SA10026_XENLA_WG1500_HiAmpEZF3 (83.3M)
- SA10026_XENLA_WG1500_HiAmpEZR3 (81.6M)
- SA10026_XENLA_WG1500_LoAmpManF3 (65.0M)
- SA10026_XENLA_WG1500_LoAmpManR3 (63.7M)
TXGP X. laevis RNA-seq data
TXGP_XENLA_RNA_SA11017 (SOLiDv3)
- SA11017_XENLA_Heart_JA11050v3BC10F3 (24.0M)
- SA11017_XENLA_Heart_JA11050v3BC10F5 (23.4M)
- SA11017_XENLA_Testis_JA11050v3BC04F3 (33.1M)
- SA11017_XENLA_Testis_JA11050v3BC04F5 (32.4M)
TXGP_XENLA_RNA_SA11022 (SOLiDv3)
- SA11022_XENLA_Egg_JA11015v4BC001F3 (19.3M)
- SA11022_XENLA_Egg_JA11015v4BC001F5 (19.4M)
- SA11022_XENLA_Stage24_JA11015v2BC13F3 (16.5M)
- SA11022_XENLA_Stage24_JA11015v2BC13F5 (16.6M)
TXGP_ENGPU_RNA_SA11022 (SOLiDv3)
- SA11022_ENGPU_Larnyx_JA11015v4BC002F3 (21.1M)
- SA11022_ENGPU_Larnyx_JA11015v4BC002F5 (21.1M)
TXGP_XENLA_RNA_SA11024 (SOLiDv3)
- SA11024_XENLA_Liver_JA11055v2BC12F3 (21.0M)
- SA11024_XENLA_Liver_JA11055v2BC12F5 (22.0M)
- SA11024_XENLA_Lung_JA11055v2BC11F3 (35.1M)
- SA11024_XENLA_Lung_JA11055v2BC11F5 (36.7M)
- SA11024_XENLA_Stomach_JA11055v4BC003F3 (27.8M)
- SA11024_XENLA_Stomach_JA11055v4BC003F5 (29.1M)
TXGP X. tropicalis data
TXGP_XENTR_WG5k_SA09023 (SOLiDv2)
Contributed X. laevis Data
We are looking for X. laevis RNA-seq data for building comprehensive gene models.
ConlonUNC_XENLA_RNA_Amin201106: Single-end (76bp), Illumina
Data from Frank Conlon lab at University of North Carolina at Chapel Hill.
- Amin201106_XENLA_Stage38Heart_MO.fastq.gz: read_count=31M, file_size=2.3GB
- Amin201106_XENLA_Stage38Heart_WT.fastq.gz: read_count=28M, file_size=2.0GB
- Amin201106_XENLA_Stage45Heart_CtrlMO.fastq.gz: read_count=33M, file_size=2.2GB
HarlandUBC_XENLA_RNA_Park201106: Single-end (50bp), Illumina
Data from Richard Harland lab at University of California, Berkeley.
- Park2011_XENLA_Arch1_WT.fastq.gz: read_count=101M, file_size=4.8GB
- Park2011_XENLA_Arch2_WT.fastq.gz: read_count=102M, file_size=4.8GB
- Park2011_XENLA_Arch3_WT.fastq.gz: read_count=96M, file_size=4.4GB
- Park2011_XENLA_ArchD_WT.fastq.gz: read_count=115M, file_size=5.5GB
- Park2011_XENLA_ArchV_WT.fastq.gz: read_count=103M, file_size=4.8GB
LauBrandeis_XENLA_RNA_Lau201109: Single-end (38bp), Illumina
Data from Nelson Lau lab at Brandeis University.
- Lau201109_XENLA_TadpoleBrain6.fastq.gz: read_count=25M, file_size=953MB
- Lau201109_XENLA_TadpoleBrain7.fastq.gz: read_count=22M, file_size=879MB
- Lau201109_XENLA_TadpoleBrain8.fastq.gz: read_count=27M, file_size=984MB
Contributed X. tropicalis Data
ConlonUNC_XENTR_RNA_Amin201106 (Illumina HiSeq)
Data from Frank Conlon lab at University of North Carolina at Chapel Hill.
- ConlonLab2011_XENTR_Heart_WT1
- ConlonLab2011_XENTR_Heart_WT2