De novo sequencing uses MS/MS spectra to identify unknown components. For an experiment with a single raw data file, the raw file must contain MS/MS data. For an experiment with multiple raw data files, the first raw data file must contain MS/MS data. The application does not perform de novo sequencing if the appropriate MS/MS data is not available.
You can perform de novo sequencing on only one component at a time. The de novo identification results automatically overwrite all previous results, including any identification results from the original Peptide Mapping Analysis.
NOTE
De novo sequencing cannot be used with electron-transfer dissociation (ETD) experiment data.
Prerequisites
- For a Peptide Mapping Analysis experiment, you have processed your experiment and are viewing the results on the Process and Review page.
Procedure
- On the Process and Review page, right-click the row for the component of interest in the Results table and select Run De Novo Processing.
- The De Novo Sequencing dialog box opens, displaying the de novo sequencing processing parameters.
Figure: De Novo Sequencing dialog box
- To specify the monoisotopic mass of the precursor ion, type a value in the Monoisotopic Mass box.
- For reliable sequencing, set the value within 0.5 Da of the actual mass. The application usually provides the value.
- When performing de novo sequencing to identify multiple peptides, the application uses this value to define the heaviest peptide for sequencing.
- To specify the charge of the peptide, type a value in the Charge box.
- To specify possible N-terminal and C-terminal residues, type them in the appropriate boxes.
- For example, if the peptide is generated from a tryptic digest of a protein, set the C-terminal as KR; otherwise, leave it blank.
- To specify how much effort to apply to de novo sequencing, type a value in the De Novo Sequencing Effort box.
TIP
A De Novo Sequencing Effort of 5 is typically a good starting point.
- To specify the maximum time you want to spend on each sequencing task, type a value in the Maximum Sequence Evaluation Time box.
TIP
Select a time of 60 seconds for most tasks.
For large peptides (greater than 1500 Da), you can set a longer time.
- In the Other Options area, select the checkboxes to specify if you want the algorithm to distinguish K/Q, I/L, or both.
- The algorithm can distinguish I/L amino acids to some extent, but not reliably. The distinction of K/Q amino acids is more reliable.
- To define the amino acids to include in the de novo sequencing, select Select Amino Acids, or to start de novo sequencing, select OK.
- See Define amino acids for de novo sequencing.
- The application displays a progress indicator because some searches might take a long time to complete. You can perform other actions while the search continues.
NOTE
To cancel the de novo processing while the search is in progress, right-click the Results table on the Process and Review page and select Cancel De Novo Processing.
The application cancels the search for the component that you previously selected for the de novo processing and does not save any de novo results.
- If the experiment contains data from multiple raw data files, the application uses the first raw data file (in the order listed in the Results table) that contains MS/MS data for the de novo search.
- The application searches for the best identification for the selected component and, if it is found, displays the results in the Results table in the following columns:
- Identification
- Peptide Sequence
- Delta (ppm)
- Confidence Score
- ID Type
- Mono Mass Exp.
- If the selected component is identified, the application overwrites previous data and displays "De Novo" in the Protein column of the Results table.
- The application also updates the Fragment Coverage Map and the predicted and experimental spectra to include the identified component information and saves all of the de novo results.
- When the search is completed, right-click the component row in the Results table and select Show Component Information.
- Other identification possibilities appear in a dialog box, listed in descending order of confidence score, with the best identification displayed at the top.
- The other possibilities have lower confidence scores than the identified component originally displayed in the Results table.
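The Monoisotopic Mass and Charge values, the K/Q and I/L options, and the Delta (ppm) column in the procedure above all rest on simple mass arithmetic. The following Python sketch illustrates that arithmetic only. It is not the application's sequencing algorithm, and the function names (`peptide_mono_mass`, `neutral_mass_from_mz`, `delta_ppm`) are hypothetical helpers introduced for this example. It assumes standard monoisotopic residue masses, unmodified residues, and positive-mode protonated ions.

```python
# Illustrative only: standard monoisotopic residue masses in Da.
# These helpers are hypothetical and are not part of the application.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056    # one water is added for the free N- and C-termini
PROTON = 1.007276   # mass of the charge-carrying proton


def peptide_mono_mass(sequence: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def neutral_mass_from_mz(mz: float, charge: int) -> float:
    """Neutral monoisotopic mass from an observed m/z and charge state."""
    return (mz - PROTON) * charge


def delta_ppm(experimental: float, theoretical: float) -> float:
    """Mass error of an identification in parts per million."""
    return (experimental - theoretical) / theoretical * 1e6


if __name__ == "__main__":
    # I and L are isomers, so swapping them leaves the mass unchanged.
    print(peptide_mono_mass("AIK") - peptide_mono_mass("ALK"))

    # K and Q differ by a small but nonzero mass.
    print(RESIDUE_MASS["K"] - RESIDUE_MASS["Q"])

    # Mass error for a doubly charged precursor matched to PEPTIDE.
    observed = neutral_mass_from_mz(400.687258, 2)
    print(delta_ppm(observed, peptide_mono_mass("PEPTIDE")))
```

Isoleucine and leucine have identical residue masses, so precursor mass alone cannot separate them, which is consistent with the I/L distinction being less reliable. Lysine and glutamine differ by about 0.036 Da, roughly 24 ppm for a 1500 Da peptide, so the K/Q distinction depends on the mass accuracy of the data.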