You can extract information from an existing FASTA file and place it into a new FASTA file, replace an existing FASTA file with the information, or append it to an existing FASTA file. Then you must compile the new or changed FASTA file to make it available in the Proteome Discoverer application.
Procedure
- From the application menu bar, select Tools > FASTA Database Utilities.
- In the FASTA Database Utilities dialog box, select the Compile FASTA Database tab.
- The Compile FASTA Database page opens.
- In the Original box, do one of the following:
- Select the Browse icon to locate the FASTA file that you are taking the information from.
- Enter its path and name.
- In the Please Select a FASTA Database dialog box, select a database and then select Open.
- In the Target box, do one of the following:
- Select the Browse icon to locate the FASTA file that you are placing the extracted information into.
- Enter its path and name.
- In the Save/Add to FASTA File dialog box, select the file, verify that the file extension is .fasta, and then select Save.
- In the Target Database Options area, do one of the following to indicate what you want to do with the extracted information:
- Select Create/Replace to create a new FASTA file for storing the information or to overwrite an existing FASTA file.
- Select Append to add the extracted information to an existing FASTA file.
- In the Search In area, specify whether the application should search for the search string in the protein references or sequences.
- References: Searches for the search string in the protein references.
- Sequences: Searches for the specified amino acid sequence within the protein sequences.
- To disregard the case of the information to be extracted, select the Ignore Case of Reference Strings checkbox.
- Specify the information to be extracted as follows:
- Select the icon above the Step 1: String(s) to Include box.
- A line where you can specify the first set of conditions appears in the box.
- Select the first line in the Select Operator column, and then select the operator to apply to the information to be extracted. You can select from the following:
- Starts With: Extracts information that begins with this string.
- Does Not Start With: Extracts information that does not begin with this string.
- Ends With: Extracts information that ends with this string.
- Does Not End With: Extracts information that does not end with this string.
- Contains: Extracts information that includes this string.
- Does Not Contain: Extracts information that does not include this string.
- Select the first line in the Condition column, and then enter the condition that the information must meet in order to be extracted.
- Repeat steps a through c to add more sets of conditions for the information to be extracted.
- To delete a set of conditions, in the Active column, select the line that you want to delete, select the delete icon, and then select Yes.
- Ensure that the Compile FASTA Database page resembles the following image.
- Select Compile Database.
- (Optional) To halt the compilation before it finishes, select Stop.
- After the compilation, select Start Search on the Find Protein References page to view the results of the extraction. For an example, see the following image.
- You do not have to enter information into the Search For box.
- (Optional) To specify any information that you want to exclude from the extracted results, follow these steps:
- Select the icon above the Step 2: String(s) to Exclude From the Results of Step 1 box on the Compile FASTA Database page.
- A line where you can specify the first set of conditions now appears in the box.
- Select the first line in the Select Operator column, and then select the operator to apply to the information to be excluded. You can select from the following:
- Starts With: Excludes information that begins with this string.
- Does Not Start With: Excludes information that does not begin with this string.
- Ends With: Excludes information that ends with this string.
- Does Not End With: Excludes information that does not end with this string.
- Contains: Excludes information that includes this string.
- Does Not Contain: Excludes information that does not include this string.
- Select the first line in the Condition column, and then enter the condition that the information must meet in order to be excluded.
- Repeat the steps as needed to add more sets of conditions for the information that you want to exclude.
- To delete a set of conditions, in the Active column, select the line that you want to delete, and then select the delete icon.
- Select Compile Database.
- Select Start Search on the Find Protein References page to view the results of the extraction.
- You do not have to enter information into the Search For box.