The MSF Files node submits the selected input files to the Workflow Editor for processing in the consensus workflow. It is always the first node in the consensus workflow.

The MSF Files node also transfers all target and decoy PSMs that pass filtering by a peptide Delta Cn value from the selected result files to a report file. It also transfers the proteins, modifications, and spectrum information that belong to the transferred PSMs. It stores this information only for PSMs that are transferred.

You can also use the MSF Files node to determine how the application parses the FASTA title lines of found proteins. When the application parses these title lines, it applies a set of predefined parsing rules to extract the accession and description of the protein. If it finds a protein in more than one result file from which it generates a report, and if the FASTA files of these result files differ, the application displays the first available description and accession on the Proteins page. You can view all other accessions and descriptions by moving the cursor over the cells in the corresponding columns.
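
To illustrate what extracting an accession and a description from a FASTA title line involves, the following Python sketch applies one hypothetical rule to a UniProt-style title line. The regular expression and the function name `parse_title_line` are assumptions made for this example; they are not the application's predefined parsing rules.

```python
import re

# Hypothetical rule for UniProt-style title lines such as
# ">sp|P12345|AATM_RABIT Aspartate aminotransferase".
# This is an illustration only, not the application's actual rule set.
UNIPROT_RULE = re.compile(
    r"^>?(?:sp|tr)\|(?P<accession>[^|]+)\|(?P<entry>\S+)\s+(?P<description>.*)$"
)


def parse_title_line(title_line):
    """Return (accession, description), or None if the rule does not match."""
    match = UNIPROT_RULE.match(title_line.strip())
    if match is None:
        return None
    return match.group("accession"), match.group("description")
```

A title line that does not match the rule returns `None`, which is where a real rule set would try its next rule.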

Changing the parameter settings of the MSF Files node changes the display of protein accession and description information in the report. For information on this procedure, see Special Consensus Workflow Nodes and Node Settings.

For information on modifying the FASTA parsing rules and making them available in the MSF Files node, see Add or modify FASTA parsing rules.

The following table describes the parameters for the MSF Files node.

MSF Files node parameters

Parameter

Definition

Spectra to Store

Determines the type of spectra that the application stores in the result file.

  • All: Stores all spectra that were searched.
  • Identified: Stores only identified spectra.
  • (Default) Identified or Quantified: Stores only identified or quantified spectra.
  • None: Does not store any spectra.

Feature Traces to Store

Specifies the traces to store in the result file.

  • None: Stores no feature traces; specialized traces are still stored.
  • All: Stores all feature traces.

Merge Mode

Specifies the mode used to merge identification results.

  • (Default) Globally by Search Engine Type: Globally merges all identifications of the same search engine type and creates one column per search engine type and value reported by the search engine.
  • Per File and Search Engine Type: Merges all identifications of the same search engine type for every input file and creates one column per search engine type and value reported by the search engine for every input file.
  • Do Not Merge: Does not merge identifications. It creates one column per search engine node and value reported by the search engine for every input file.
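
The effect of the three merge modes on the number of result columns can be sketched in Python as follows. The dictionary keys (`file`, `node`, `engine`, `value`) and the function name `column_keys` are hypothetical and chosen only for this illustration; the application's internal data model is not described here.

```python
def column_keys(results, merge_mode):
    """Return the distinct column keys that a merge mode would produce.

    results: iterable of dicts with the hypothetical keys
    'file' (input file), 'node' (search engine node),
    'engine' (search engine type), and 'value' (reported value name).
    """
    keys = []
    for r in results:
        if merge_mode == "Globally by Search Engine Type":
            key = (r["engine"], r["value"])
        elif merge_mode == "Per File and Search Engine Type":
            key = (r["file"], r["engine"], r["value"])
        elif merge_mode == "Do Not Merge":
            key = (r["file"], r["node"], r["value"])
        else:
            raise ValueError(f"Unknown merge mode: {merge_mode}")
        if key not in keys:  # keep first-seen order, drop duplicates
            keys.append(key)
    return keys


# Two input files, each searched by two nodes of the same engine type.
example = [
    {"file": "a.msf", "node": "Sequest HT (1)", "engine": "Sequest HT", "value": "XCorr"},
    {"file": "a.msf", "node": "Sequest HT (2)", "engine": "Sequest HT", "value": "XCorr"},
    {"file": "b.msf", "node": "Sequest HT (1)", "engine": "Sequest HT", "value": "XCorr"},
    {"file": "b.msf", "node": "Sequest HT (2)", "engine": "Sequest HT", "value": "XCorr"},
]
```

For this example, the global mode yields one column, the per-file mode yields two, and Do Not Merge yields four.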

Reported FASTA Title Lines

Specifies whether to report the best matched or all FASTA title lines of a protein.

  • (Default) Best Match: Reports the best matched FASTA title lines of a protein.
  • All: Reports all FASTA title lines of a protein.

Title Line Rule

Determines the type of parsing rule to apply to extract the primary accession and description from a FASTA title line.

  • SGD: Applies a special rule for FASTA files downloaded from the repositories of the Saccharomyces Genome Database (SGD), a yeast database.
  • (Default) Standard: Applies the standard parsing rules.
  • PDBj: Applies a special rule for FASTA files from the Protein Data Bank Japan (PDBj).

Preferred Accession

Selects a parsing rule to extract the preferred protein accession from the FASTA entry. If the application finds a preferred accession, it displays this accession instead of the primary accession.

Preferred Taxonomy

Selects a parsing rule to extract the preferred taxonomy from the FASTA entry. If the application finds a preferred taxonomy, it displays the accession and description of that entry. The exception is an entry containing a preferred accession, which takes precedence over an entry that contains the preferred taxonomy but no preferred accession.

Avoid Expressions

Selects the terms to avoid when parsing the protein description. If more than one description is available, the application prefers the description containing none of the specified terms.

  • Predicted or Hypothetical Proteins: Matches words that are associated with protein descriptions that have not been validated, such as “predicted,” “putative,” “hypothetical,” and “highly similar.”
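
The preference for descriptions without avoided terms can be sketched as follows. The function name `pick_description`, the case-insensitive substring matching, and the fallback to the first description when every candidate contains an avoided term are assumptions for this illustration, not documented application behavior.

```python
# Terms taken from the parameter description above.
AVOID_TERMS = ("predicted", "putative", "hypothetical", "highly similar")


def pick_description(descriptions, avoid_terms=AVOID_TERMS):
    """Prefer the first description that contains none of the avoided terms.

    Assumption: if every description contains an avoided term,
    fall back to the first one. Returns None for an empty list.
    """
    for description in descriptions:
        lowered = description.lower()
        if not any(term in lowered for term in avoid_terms):
            return description
    return descriptions[0] if descriptions else None
```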

Maximum Delta Cn

Specifies a threshold that determines which PSMs the Proteome Discoverer application transfers to the report file. The application transfers only those PSMs with a Delta Cn value less than the specified threshold to the report file.

Range: 0–0.1; default: 0.05
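
As a minimal sketch of this filter, the following Python function keeps only PSMs whose Delta Cn value is below the threshold. Representing a PSM as a dictionary with a `delta_cn` key is an assumption made for the example.

```python
def filter_by_delta_cn(psms, max_delta_cn=0.05):
    """Keep PSMs with a Delta Cn value less than the threshold.

    psms: list of dicts with a hypothetical 'delta_cn' key.
    """
    return [psm for psm in psms if psm["delta_cn"] < max_delta_cn]
```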

Maximum Rank

Specifies a maximum search engine rank threshold value and filters out all PSMs with a search engine rank higher than this value.

Range: 0–no maximum; default: 0

If you set this value to 0, the application uses all available PSMs.
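
The rank filter, including the special meaning of 0, can be sketched like this. The `rank` key and the function name `filter_by_rank` are hypothetical.

```python
def filter_by_rank(psms, max_rank=0):
    """Drop PSMs whose search engine rank is higher than max_rank.

    A max_rank of 0 disables the filter, so all PSMs are kept.
    psms: list of dicts with a hypothetical 'rank' key.
    """
    if max_rank == 0:
        return list(psms)
    return [psm for psm in psms if psm["rank"] <= max_rank]
```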

Maximum Delta Mass

Specifies the maximum allowed mass difference between the theoretical and the found peptide m/z. The application excludes peptide matches with a larger mass difference from the final result.

If you set this parameter to zero, the application does not apply a delta mass filter.

Range: 0.0–5.0 Da or 0.0–5000 ppm; default: 0 ppm
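
The following sketch shows how a delta mass check in either unit might work. It assumes that the ppm deviation is computed relative to the theoretical m/z and that the filter compares the absolute deviation against the threshold; the function names are hypothetical.

```python
def delta_mass_ppm(theoretical_mz, measured_mz):
    """Deviation of the measured m/z from the theoretical m/z in ppm."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def passes_delta_mass(theoretical_mz, measured_mz, max_delta, unit="ppm"):
    """Return True if the match passes the delta mass filter.

    A max_delta of 0 means that no filter is applied.
    Assumption: the absolute deviation is compared against max_delta.
    """
    if max_delta == 0:
        return True
    if unit == "ppm":
        delta = delta_mass_ppm(theoretical_mz, measured_mz)
    elif unit == "Da":
        delta = measured_mz - theoretical_mz
    else:
        raise ValueError(f"Unknown unit: {unit}")
    return abs(delta) <= max_delta
```

For example, a theoretical m/z of 500.000 and a found m/z of 500.005 differ by about 10 ppm, so the match passes a 20 ppm threshold but fails a 5 ppm threshold.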

x. Score

Specifies the type of score to subject to filtering.

x. Threshold

Specifies the threshold below which the application excludes PSMs for the designated type of score.

Default: 0