APEX Protocol

Calculating Absolute and Relative Protein Abundance from Mass Spectrometry-Based Protein Expression Data

Supplementary Material

Christine Vogel, Edward M. Marcotte
Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, 2500 Speedway, MBB 3.210, Austin, TX 78712
Email: cvogel at mail utexas edu, edward.marcotte at gmail com

Abstract. Mass spectrometry (MS)-based shotgun proteomics allows protein identifications even in complex biological samples. Protein abundances can then be estimated from the counts of MS/MS spectra attributable to each protein, provided one accounts for differential MS-detectability of contributing peptides. We developed a method, APEX, which calculates Absolute Protein EXpression levels based upon learned correction factors, MS/MS spectral counts, and each protein's probability of correct identification.

The APEX protocol describes APEX-based calculations in three parts:

1. A model is built from training data, peptide sequences and their sequence properties to estimate the MS-detectability (Oi) of any given protein.
2. Absolute protein abundances are calculated from spectral counts, identification probabilities and the learned Oi-values.
3. Simple statistics allow calculation of differential expression between two distinct biological samples, i.e. measurement of relative protein abundances.
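Step 2 above can be sketched in a few lines. This is a minimal illustration, not the Perl scripts provided on this site. It assumes the normalization described in the original citation: each protein's spectral count n_i is weighted by its identification probability p_i, divided by its Oi-value, normalized over all proteins in the sample, and scaled by a constant C (for example, an estimate of the total number of protein molecules per cell). The function name, argument names and the example numbers are hypothetical.

```python
def apex_abundances(spectral_counts, id_probabilities, oi_values, total_molecules=1.0):
    """Sketch of an APEX-style abundance estimate (illustrative assumption).

    APEX_i = C * (n_i * p_i / O_i) / sum_k (n_k * p_k / O_k)

    spectral_counts  -- n_i, MS/MS spectra attributed to each protein
    id_probabilities -- p_i, probability that protein i is correctly identified
    oi_values        -- O_i, learned MS-detectability of protein i (must be > 0)
    total_molecules  -- C, scaling constant; 1.0 returns fractional abundances
    """
    if not (len(spectral_counts) == len(id_probabilities) == len(oi_values)):
        raise ValueError("all input lists must have the same length")
    if any(o <= 0 for o in oi_values):
        raise ValueError("Oi-values must be positive")

    # Correct each spectral count for identification confidence and detectability.
    weighted = [n * p / o for n, p, o in zip(spectral_counts, id_probabilities, oi_values)]
    total = sum(weighted)
    if total <= 0:
        raise ValueError("no spectral counts to normalize")

    # Normalize across the sample and scale to the chosen total.
    return [total_molecules * w / total for w in weighted]


# Hypothetical example: two proteins, the second four times more detectable.
example = apex_abundances([10, 20], [1.0, 1.0], [1.0, 4.0], total_molecules=3e6)
```

In this example the second protein has twice the spectral count but four times the detectability, so its estimated abundance is half that of the first protein.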

APEX-based protein abundances span 3-4 orders of magnitude, and the method is applicable to mixtures of hundreds to thousands of proteins.

Protocol paper (Nature Protocols, 2008). Link to journal website.
Supplementary Notes (to Protocol)

Original citation: Lu, Vogel, Wang, Yao, Marcotte. Absolute protein expression profiling estimates the relative contributions of transcriptional and translational regulation. Nat Biotechnol. 2007 Jan;25(1):117-24.

This website provides scripts and data files to conduct analyses described in the APEX Protocol.


Perl Scripts




Ready-made Oi-values for both LCQ and Orbitrap (ORBI) instruments

Z-score calculation

Note (August 29, 2008): I am actively working on different ways to calculate the significance of differential protein expression. There will be an updated Perl script and a more detailed explanation soon. C.V.
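Until the updated script is available, the general idea of a Z-score on spectral counts can be sketched as follows. This is only an illustration of one common choice, a two-proportion test with a pooled variance estimate; it is an assumption that this matches the calculation in the Perl script, which may differ. Here n1 and n2 are the spectral counts of one protein in the two samples, and N1 and N2 are the total spectral counts of each sample; the function name and example numbers are hypothetical.

```python
import math


def spectral_count_z_score(n1, total1, n2, total2):
    """Two-proportion Z-score for one protein in two samples (illustrative sketch).

    f1 = n1 / N1, f2 = n2 / N2, f0 = (n1 + n2) / (N1 + N2)
    Z  = (f1 - f2) / sqrt(f0 * (1 - f0) * (1/N1 + 1/N2))
    """
    if total1 <= 0 or total2 <= 0:
        raise ValueError("total spectral counts must be positive")

    f1 = n1 / total1
    f2 = n2 / total2
    # Pooled fraction under the null hypothesis of no differential expression.
    f0 = (n1 + n2) / (total1 + total2)

    variance = f0 * (1.0 - f0) * (1.0 / total1 + 1.0 / total2)
    if variance == 0.0:
        # Protein absent from (or accounting for all of) both samples: no evidence of change.
        return 0.0
    return (f1 - f2) / math.sqrt(variance)


# Hypothetical example: 30 of 1000 spectra in sample 1, 10 of 1000 in sample 2.
z = spectral_count_z_score(30, 1000, 10, 1000)
```

A positive Z indicates a higher spectral-count fraction in the first sample. Under the usual normal approximation, |Z| > 1.96 corresponds to a two-sided p-value below 0.05, before any correction for multiple testing.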

Example: yeast grown in YMD and YPD (analyzed on LCQ); data from OPD (opd00038_YEAST...opd00042_YEAST and opd00047_YEAST...opd00098_YEAST)

Other links


C. Vogel, cvogel at mail utexas edu
August 2008