Note that all scripts were given the extension .txt instead of .pl so that standard web servers display them as readable text. Before using the files, replace the .txt extension with .pl.

# np_peptide_properties.pl: convert a list of proteins and their peptides into a list of peptides and peptide properties
input: .txt file
output: .arf file

# np_parse_ProteinProphet.pl: convert ProteinProphet output into a tab-delimited list of proteins, their observed peptides, probabilities and spectral counts
input: -prot.xml file, desired FDR (usually 0.05)
output: .protlst file

# np_arf_to_arff_TRAINING.pl: convert the peptide properties file and the parsed ProteinProphet file (data file) into a WEKA input file (.arff) for TRAINING
input: .protlst file, .arf file
output: .arff file and corresponding .names file

# WEKA: create .model from training data

# np_arf_to_arff_TEST.pl: convert the peptide properties file into a set of WEKA input files (.arff) with x lines each for TESTING
input: .arf file, desired number of lines (e.g. 100000)
output: .arff files (in separate directory) and corresponding .names files

# WEKA: create .wekaout predictions (for each individual peptide) from testing data

# np_PeptidePredictions_to_ProteinOi.pl: convert saved WEKA output (predictions per peptide, .wekaout) into O_i values (predictions per protein)
input: .wekaout files and .names files (with peptides in identical order in each of the two files)
output: .oi file

# np_APEX_from_Oi_and_protlist.pl: convert an MS output file (parsed from ProteinProphet, for example) and an .oi file into predictions of absolute expression values
input: .oi file, .protlst file, estimate of average number of molecules per protein (or constant: 1)
output: .apex file

# np_two_files_Zscore.pl: compare two APEX files (.apex) and estimate the significance of differential expression of a protein
input: two .apex files (same organism, same protein IDs)
output: .zscore file
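
The renaming step from the opening note can be done by hand. As a convenience, here is a minimal Python sketch, not part of the original script set. The function name restore_pl_extension is hypothetical. It assumes the downloaded files are named either <script>.txt or <script>.pl.txt, which this README does not state, so check the actual file names first. It only touches the seven script names listed above, because input data files may also end in .txt.

```python
from pathlib import Path

# Script base names exactly as listed in this README.
SCRIPTS = [
    "np_peptide_properties",
    "np_parse_ProteinProphet",
    "np_arf_to_arff_TRAINING",
    "np_arf_to_arff_TEST",
    "np_PeptidePredictions_to_ProteinOi",
    "np_APEX_from_Oi_and_protlist",
    "np_two_files_Zscore",
]


def restore_pl_extension(directory="."):
    """Rename <name>.txt (or <name>.pl.txt) to <name>.pl for the listed scripts.

    Assumption: downloads use one of those two naming patterns.
    Returns the new file names, in README order.
    """
    root = Path(directory)
    renamed = []
    for base in SCRIPTS:
        target = root / f"{base}.pl"
        if target.exists():
            continue  # already restored; never overwrite an existing script
        for candidate in (root / f"{base}.txt", root / f"{base}.pl.txt"):
            if candidate.exists():
                candidate.rename(target)
                renamed.append(target.name)
                break
    return renamed


if __name__ == "__main__":
    for name in restore_pl_extension("."):
        print("restored", name)
```

Run it from the directory that holds the downloaded scripts. Files not in the list, such as .txt input data, are left untouched.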