Abstract. Motivation: Tandem mass spectrometry (MS/MS) offers fast and reliable characterization of complex protein mixtures, but suffers from low sensitivity in protein identification. In a typical shotgun-proteomics experiment, it is assumed that all proteins are equally likely to be present. However, there is often other information available, e.g. the probability of a protein's presence is likely to correlate with its mRNA concentration. Results: We develop a Bayesian score that estimates the posterior probability of a protein's presence in the sample given its identification in an MS/MS experiment and its mRNA concentration measured under similar experimental conditions. Our method, MSpresso, substantially increases the number of proteins identified in an MS/MS experiment at the same error rate, e.g. in yeast, MSpresso increases the number of proteins identified by ~40%. We apply MSpresso to data from different MS/MS instruments, experimental conditions, and organisms (E. coli, human), and predict 19 to 63% more proteins across the different datasets. MSpresso demonstrates that incorporating prior knowledge of protein presence into shotgun-proteomics experiments can substantially improve protein identification scores.
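The idea of combining MS/MS evidence with mRNA evidence can be illustrated with a minimal sketch. It is not the MSpresso implementation. It assumes that K denotes protein presence and M the mRNA concentration (as in the P(K|M) notation used on this page), introduces E as a hypothetical symbol for the MS/MS identification evidence, and assumes that E and M are conditionally independent given K. Under those assumptions, P(K|E,M) is proportional to P(K|E) P(K|M) / P(K). The function name, parameter names, and the prior value are illustrative only.

```python
def combine_posterior(p_k_given_e: float, p_k_given_m: float, prior_k: float) -> float:
    """Sketch of P(K=1 | E, M), assuming E and M are conditionally independent given K.

    p_k_given_e: P(K=1 | E), e.g. a protein identification probability from MS/MS
    p_k_given_m: P(K=1 | M), probability of presence given mRNA concentration
    prior_k:     P(K=1), overall prior probability of presence (assumed known)
    """
    if not 0.0 < prior_k < 1.0:
        raise ValueError("prior_k must lie strictly between 0 and 1")
    for p in (p_k_given_e, p_k_given_m):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")

    # Unnormalized weight for "present": P(K=1|E) * P(K=1|M) / P(K=1)
    present = p_k_given_e * p_k_given_m / prior_k
    # Unnormalized weight for "absent": P(K=0|E) * P(K=0|M) / P(K=0)
    absent = (1.0 - p_k_given_e) * (1.0 - p_k_given_m) / (1.0 - prior_k)

    total = present + absent
    if total == 0.0:
        # One source is certain of presence and the other certain of absence.
        raise ValueError("the two evidence sources are contradictory")
    return present / total
```

For example, with an uninformative MS/MS score of 0.5, an mRNA-based probability of 0.8, and a prior of 0.5, combine_posterior(0.5, 0.8, 0.5) gives about 0.8: the mRNA evidence raises the score of a protein that MS/MS alone could not call.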
This website provides supplementary postprocessed data used in MSpresso analyses.
Supplementary Notes (preliminary file) (PDF)
Main paper, Table 1: different experiments using self models for P(K|M), restricted to cytosolic proteins.
Description of directory content and file formats.
Gold standard of protein expression in yeast. We used the intersection of publicly available MS-based and non-MS-based experimental datasets as a reference set for the presence (expression) of proteins in wild-type yeast growing in rich medium in log phase. The file names list the first (and last) author, journal, and publication year.
Raw MS data is available at the MS Data Repository.
Please contact Smriti (smriti at cs utexas edu) for questions about software. Please contact Smriti (smriti at cs utexas edu) or Christine (cvogel at mail utexas edu) for further information on data, calculations, or results.