APEX Protocol Procedure

For a more detailed description of the procedure, see the protocol. The list below describes only the function of each individual Perl script.
  1. np_peptide_properties.pl: convert list of proteins and their peptides into list of peptides and peptide properties
    input: .txt file
    output: .arf file

  2. np_parse_ProteinProphet.pl: convert ProteinProphet output into tab-delimited list of proteins, their observed peptides, probabilities and spectral counts
    input: -prot.xml file, desired FDR (usually 0.05)
    output: .protlst file

  3. np_arf_to_arff_TRAINING.pl: convert peptide properties file and parsed ProteinProphet file (data file) into WEKA input file (.arff) for TRAINING
    input: .protlst file, .arf file
    output: .arff file and corresponding .names file

  4. WEKA: create .model from training data

  5. np_arf_to_arff_TEST.pl: convert peptide properties file into set of WEKA input files with x lines (.arff) for TESTING
    input: .arf file, desired number of lines (e.g. 100000)
    output: .arff files (in separate directory) and corresponding .names files

  6. WEKA: create .wekaout predictions (for each individual peptide) from testing data

  7. np_PeptidePredictions_to_ProteinOi.pl: convert saved WEKA output (predictions per peptide, .wekaout) into O_i values (predictions per protein)
    input: .wekaout files and .names files (with peptides in identical order in each of the two files)
    output: .oi file

  8. np_APEX_from_Oi_and_protlist.pl: convert MS output file (parsed from ProteinProphet, for example) and .oi file into predictions of absolute expression values
    input: .oi file, .protlst file, estimate of average number of molecules per protein (or constant: 1)
    output: .apex file

  9. np_two_files_Zscore.pl: compare two APEX files (.apex) and estimate significance of differential expression of a protein
    input: two .apex files (same organism, same protein IDs)
    output: .zscore file
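
The file flow through the steps above can be sketched as a small consistency check. This is only an illustrative sketch in Python, not part of the APEX scripts: the names STEPS, EXTERNAL_INPUTS and missing_inputs are hypothetical, the file extensions are copied from the list above, and the assumption that WEKA testing (step 6) reads the .model from step 4 is an inference not stated in the list. Non-file inputs (FDR, number of lines, molecule estimate) are left out.

```python
# Hypothetical summary of the steps above: (step, input file types, output file types).
# Extensions come from the list; the data structure itself is illustrative only.
STEPS = [
    ("np_peptide_properties.pl", [".txt"], [".arf"]),
    ("np_parse_ProteinProphet.pl", ["-prot.xml"], [".protlst"]),
    ("np_arf_to_arff_TRAINING.pl", [".protlst", ".arf"], [".arff", ".names"]),
    ("WEKA (training)", [".arff"], [".model"]),
    ("np_arf_to_arff_TEST.pl", [".arf"], [".arff", ".names"]),
    # Assumption: testing uses the .model produced by the training step.
    ("WEKA (testing)", [".arff", ".model"], [".wekaout"]),
    ("np_PeptidePredictions_to_ProteinOi.pl", [".wekaout", ".names"], [".oi"]),
    ("np_APEX_from_Oi_and_protlist.pl", [".oi", ".protlst"], [".apex"]),
    # Step 9 needs two .apex files, i.e. steps 2 and 8 run once per sample.
    ("np_two_files_Zscore.pl", [".apex"], [".zscore"]),
]

# File types the user must supply; everything else is produced by an earlier step.
EXTERNAL_INPUTS = {".txt", "-prot.xml"}


def missing_inputs(steps, external):
    """Return (step, file type) pairs whose input is neither external nor produced earlier."""
    available = set(external)
    problems = []
    for name, inputs, outputs in steps:
        for ext in inputs:
            if ext not in available:
                problems.append((name, ext))
        # Outputs become available to all later steps.
        available.update(outputs)
    return problems


if __name__ == "__main__":
    print(missing_inputs(STEPS, EXTERNAL_INPUTS))
```

An empty result means every step's input file types are either supplied by the user or produced by an earlier step, which is why the steps are run in the order listed.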

C. Vogel, cvogel at mail utexas edu
April 2008