Our group studies the large-scale organization of proteins, essentially trying to reconstruct the 'wiring diagrams' of cells by learning how all of the proteins encoded by a genome are associated into functional pathways, systems, and networks. We are interested both in discovering the functions of the proteins and in learning the underlying organizational principles of the networks. The work is evenly split between computational and experimental approaches, with the latter tending to be high-throughput functional genomics and proteomics approaches for studying thousands of genes/proteins in parallel.
Bioinformatics of protein function and interactions
We've discovered a number of features of genomes that allow us to predict functions for proteins that have never been experimentally characterized. Using these techniques and information from over 30 fully sequenced genomes, we were able to calculate the first genome-wide predictions of protein function, assigning very preliminary functions to over half of the 2,500 uncharacterized genes of yeast. Now, with hundreds of genomes in hand, we're extending these techniques, as well as asking fundamental questions about the evolution of protein interactions and the evolution of genomes.
Some of our recent papers on gene networks and the systematic discovery of gene function include:
Lee et al., A single gene network accurately predicts phenotypic effects of gene perturbation in Caenorhabditis elegans, Nature Genetics, 40(2):181-8 (2008) PubMed Link
Link to our large-scale gene networks for yeast, worms, mouse: http://www.functionalnet.org
Link to some of our public bioinformatics resources: http://bioinformatics.icmb.utexas.edu
Rational identification of genes affecting traits and diseases
Using the gene networks and other computational tools, we've now gained some ability to rationally predict the consequences to an organism of mutating or interrupting a specific gene. This means that by using these tools, we can often select a small set of candidate genes likely to be implicated in a particular disease or trait. We've now experimentally validated >100 such candidate genes for diverse traits in a wide range of organisms, including yeast, worms (C. elegans), Arabidopsis, frogs, mice, and humans. For example, in yeast we've used network models to discover a large number of new ribosome biogenesis genes, as well as genes controlling such features as cell size. In animals, e.g. using our worm gene network models, we could successfully identify new genes controlling longevity, as well as genes capable of suppressing the loss of the Retinoblastoma tumor suppressor, thus 'curing' worms of model tumors. In plants, with former postdoc Insuk Lee and collaborator Sue Rhee, we could rationally identify new genes regulating root growth, drought resistance, and seedling pigmentation. In vertebrates, working with the Wallingford and Finnell labs, we've been able to use gene network models to help assign functions to a birth defect gene, as well as to identify entirely new birth defect genes, confirming their roles in vivo.
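As a rough illustration of how a gene network can be used to shortlist candidate genes for a trait, here is a minimal sketch of generic "guilt-by-association" scoring: each gene not yet linked to the trait is scored by the summed weight of its network edges to genes already known to affect that trait. This is an assumed, simplified scheme for illustration only, not necessarily the scoring used in the papers listed here; the function name rank_candidates, the gene names, and the edge weights are all hypothetical.

```python
from collections import defaultdict


def rank_candidates(edges, seed_genes):
    """Rank non-seed genes by their summed edge weight to seed genes.

    edges: iterable of (gene_a, gene_b, weight) for an undirected network.
    seed_genes: genes already known to affect the trait of interest.
    Returns a list of (gene, score), highest score first.
    """
    seeds = set(seed_genes)
    scores = defaultdict(float)
    for gene_a, gene_b, weight in edges:
        # Credit a gene only when exactly one end of the edge is a seed.
        if gene_a in seeds and gene_b not in seeds:
            scores[gene_b] += weight
        elif gene_b in seeds and gene_a not in seeds:
            scores[gene_a] += weight
    # Sort by descending score; break ties alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy network with made-up genes and weights (hypothetical data).
edges = [
    ("geneA", "geneB", 2.0),
    ("geneA", "geneC", 1.0),
    ("geneB", "geneC", 1.5),
    ("geneB", "geneD", 0.5),
    ("geneD", "geneE", 3.0),
]

ranking = rank_candidates(edges, seed_genes=["geneA", "geneB"])
for gene, score in ranking:
    print(gene, score)
```

In this toy example geneC ranks first because it is connected to both seed genes, while geneE receives no score because it has no direct link to a seed. Top-ranked genes would then be the ones taken forward for experimental testing.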
Some of our recent papers on the rational association of genes with traits and diseases:
Li et al., Rational Extension of the Ribosome Biogenesis Pathway Using Network-Guided Genetics, PLoS Biology, (in press): (2009) PubMed
Gray et al., The planar cell polarity effector protein Fuzzy is essential for targeted membrane trafficking, ciliogenesis, and mouse embryonic development, Nature Cell Biology, (in press): (2009) PubMed
Read more about some of our computational approaches to developmental biology
Proteomics: High-throughput protein expression and interaction profiling
From our work and that of others, it is apparent that proteins in the cell participate in extended protein interaction networks involving thousands of proteins. By defining these networks, we can not only discover the functions of specific proteins based on their connections, but also use these networks as tools to predict the outcome of perturbing the cell. As part of our research efforts in this area, we have been developing high-throughput methods to measure protein abundances in complex biological samples (e.g., by quantitative shotgun proteomics mass spectrometry) and protein localization within cells (e.g., by high-throughput automated fluorescence microscopy, such as of cell microarrays). These sorts of data help us build a catalog of protein, mRNA, and metabolite expression from cells grown under many different conditions, forming a quantitative picture of these molecular events inside cells. We expect that data of these sorts will put us on the road to developing predictive, rather than merely descriptive, theories of biology.
Recent papers in this area include:
Narayanaswamy et al., Widespread reorganization of metabolic enzymes into reversible assemblies upon nutrient starvation, Proc Natl Acad Sci U S A, 106(25):10147-52 (2009) PubMed Link
Lu, Vogel, Wong et al., Absolute protein expression profiling estimates the relative contributions of transcriptional and translational regulation, Nature Biotechnology, 25(1):117-24 (2007) PubMed Link
Link to the Open Proteomics Database: http://bioinformatics.icmb.utexas.edu/OPD/
Link to our MS/MS data repository: http://www.marcottelab.org/MSdata/