Nucleotide BLAST

Description

This tool runs a nucleotide BLAST search using the NCBI BLAST server. The query sequence file can contain up to 10 sequences.
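
As a rough illustration of the 10-sequence limit, the sketch below counts the records in a FASTA-formatted query before it is submitted. It assumes that every record starts with a ">" header line. The function names and the error messages are hypothetical and are not part of the tool itself.

```python
MAX_QUERY_SEQUENCES = 10  # limit stated in the tool description


def count_fasta_records(text):
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in text.splitlines() if line.startswith(">"))


def check_query(text, limit=MAX_QUERY_SEQUENCES):
    """Return the number of query sequences, or raise if the count is unusable."""
    n = count_fasta_records(text)
    if n == 0:
        raise ValueError("no FASTA records found in the query")
    if n > limit:
        raise ValueError(f"query has {n} sequences, but the limit is {limit}")
    return n


if __name__ == "__main__":
    example = ">seq1\nACGTACGT\n>seq2\nGGCCTTAA\n"
    print(check_query(example))
```

A check like this can catch an oversized query file locally, before a job is sent to the server.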

Parameters

  • Database This option selects the target protein database in the NCBI BLAST server. The available databases are: NCBI non-redundant proteins, PDB, UniProt/SwissProt, RefSeq reference proteins, Patented protein sequences, Metagenomic proteins, Transcriptome Shotgun Assembly proteins. By default, SwissProt is used as the target database.
  • BLAST program to use BLAST algorithm to use. Megablast is intended for comparing a query to closely related sequences. It is very fast, but works best if the target percent identity is 95% or more. Discontiguous megablast uses an initial seed that ignores some bases (allowing mismatches) and is intended for cross-species comparisons. BlastN is slow, but allows a word size down to seven bases.
  • Expectation threshold for saving hits The E-value specifies the statistical significance threshold for reporting matches against database sequences. The default value of 10 means that 10 such matches are expected to be found merely by chance. Lower thresholds are more stringent, leading to fewer chance matches being reported.
  • Word size The length of the seed that initiates an alignment. BLAST works by finding word matches between the query and database sequences. One may think of this process as finding hot spots that BLAST can then use to initiate extensions that might eventually lead to full-blown alignments. In nucleotide searches an exact match of the entire word is required before an extension is initiated, so the word size regulates the sensitivity and speed of the search. In BLASTP searches, non-exact word matches are also taken into account, based on the similarity between words.
  • Maximum number of hits to collect per sequence This parameter limits the number of hit sequences reported for one query sequence. By default, up to 100 hits are reported. If you wish to collect all the hits, and not just the best ones, you should in many cases increase this value significantly.
  • Output format type The BLAST results can be presented in many different formats. The classical BLAST report is not optimal for large query sets or for cases where the results will be analyzed with other tools. In addition to the text-based BLAST reports, the results can be presented as a table or as an XML file. You can also produce a FASTA-formatted sequence file containing the matching hit sequence regions, or a list of hit sequence names.
  • Filter low complexity regions Use the DUST program to filter low-complexity regions in the query sequence.
  • Entrez query to limit search You can use Entrez query syntax to search a subset of the selected BLAST database. This can be helpful for limiting searches by molecule type or sequence length, or for excluding organisms.
  • Location on the query sequence Location of the search region in the query sequence, for example: 23-66.
  • Reward for match Reward for a nucleotide match. The scoring system consists of a reward for a match and a penalty for a mismatch. The absolute reward/penalty ratio should be increased as one looks at more divergent sequences. A ratio of 0.33 (1/-3) is appropriate for sequences that are about 99% conserved; a ratio of 0.5 (1/-2) is best for sequences that are 95% conserved; a ratio of about one (1/-1) is best for sequences that are 75% conserved.
  • Penalty for a mismatch Penalty for a nucleotide mismatch. The scoring system consists of a reward for a match and a penalty for a mismatch. The absolute reward/penalty ratio should be increased as one looks at more divergent sequences. A ratio of 0.33 (1/-3) is appropriate for sequences that are about 99% conserved; a ratio of 0.5 (1/-2) is best for sequences that are 95% conserved; a ratio of about one (1/-1) is best for sequences that are 75% conserved.
  • Gap opening penalty Cost to open a gap. Integer value from 6 to 25. The default value of this parameter depends on the selected scoring matrix. Note that if you assign this value, you must also define the gap extension penalty.
  • Gap extension penalty Cost to extend a gap. Integer value from 1 to 3. The default value of this parameter depends on the selected scoring matrix. Note that if you assign this value, you must also define the gap opening penalty.
  • Output a log file Collect a log file for the BLAST run.
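
The reward/penalty guidance and the query location format described above can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and choosing the pair whose listed conservation level is nearest to the expected identity is an assumption made here, not a rule of the tool or of BLAST.

```python
# Reward/penalty pairs quoted in the parameter descriptions, keyed by the
# approximate percent conservation they are recommended for.
SCORING_BY_CONSERVATION = {99: (1, -3), 95: (1, -2), 75: (1, -1)}


def reward_penalty_ratio(reward, penalty):
    """Absolute reward/penalty ratio, e.g. 1/-3 gives about 0.33."""
    return abs(reward / penalty)


def suggest_scoring(percent_identity):
    """Pick the listed pair whose conservation level is nearest.

    Nearest-level selection is an assumption made for this sketch.
    """
    nearest = min(SCORING_BY_CONSERVATION,
                  key=lambda level: abs(level - percent_identity))
    return SCORING_BY_CONSERVATION[nearest]


def parse_location(location):
    """Parse a query location such as '23-66' into (start, end)."""
    start_text, end_text = location.split("-")
    start, end = int(start_text), int(end_text)
    if start < 1 or end < start:
        raise ValueError(f"invalid query location: {location!r}")
    return start, end


if __name__ == "__main__":
    reward, penalty = suggest_scoring(90)
    print(reward, penalty, reward_penalty_ratio(reward, penalty))
    print(parse_location("23-66"))
```

For an expected identity of about 90%, the sketch picks 1/-2 (ratio 0.5), because 95% is the nearest level listed in the descriptions.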