Virus Detect
Description
This tool runs the VirusDetect pipeline, which identifies viruses from sRNA sequencing data.
Given a FASTQ file, it assembles the sRNA reads both de novo and with reference guidance,
the latter by aligning the reads to a reference database of known viruses. The assembled contigs are
then compared to the reference virus sequences for virus identification.
A more detailed description of the VirusDetect pipeline can be found on the home page of VirusDetect.
Input data
Input data (the reads) should be given as a FASTQ-formatted sequence file. If several FASTQ files are provided,
a separate VirusDetect analysis will be done for each file.
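Before submitting a file, it can be useful to check that it really is well-formed FASTQ and to look at the read lengths. The following is a minimal sketch, assuming the common four-lines-per-record FASTQ layout; the function names read_fastq and read_length_counts are illustrative and are not part of VirusDetect or Chipster.

```python
from typing import Dict, Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\n")
        plus = handle.readline()
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"Malformed FASTQ record near: {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"Sequence/quality length mismatch in {header!r}")
        parts = header[1:].split()
        yield (parts[0] if parts else ""), sequence, quality


def read_length_counts(handle: TextIO) -> Dict[int, int]:
    """Count how many reads there are of each length."""
    counts: Dict[int, int] = {}
    for _, sequence, _ in read_fastq(handle):
        counts[len(sequence)] = counts.get(len(sequence), 0) + 1
    return counts
```

For a local file, the functions could be called with an open text handle, for example read_length_counts(open("reads.fastq")), where reads.fastq is a placeholder name.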
Parameters
- Reference virus database: VirusDetect is mainly used for detecting plant viruses,
but you can use it for other viruses too. Use this parameter to select a virus reference database
matching your virus type.
- Host Organism: If possible, sequences originating from the genome of the host organism should be removed from the input data.
This can be done by mapping the query sequences against the genome of the host organism and selecting only those reads that do not match
the host genome.
- If the host genome is not available, you should set this parameter to the value none.
- From the drop-down list you can select, as the host organism, one of the organisms for which there
is a pre-calculated index.
- If the genome of the host organism is not available in Chipster, but you have the host genome
as a FASTA-formatted sequence file, you should use the tool VirusDetect with own genome instead of this tool.
- Reference virus coverage cutoff: Coverage cutoff of a reported virus contig by reference virus sequences.
- Assembled virus contig cutoff: Coverage cutoff of a reported virus reference sequence by assembled virus contigs.
- Depth cutoff: Depth cutoff of a reported virus reference.
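The host-read removal described under Host Organism amounts to keeping only the reads that did not align to the host genome. The following is a minimal sketch of that selection step only, assuming the set of host-matching read IDs has already been obtained from some aligner; the names Read, remove_host_reads and host_read_ids are illustrative, and this is not a description of how the tool implements the step internally.

```python
from typing import Iterable, Iterator, Set, Tuple

Read = Tuple[str, str, str]  # (read_id, sequence, quality)


def remove_host_reads(reads: Iterable[Read], host_read_ids: Set[str]) -> Iterator[Read]:
    """Yield only the reads whose ID is not among those that aligned to the host."""
    for read in reads:
        if read[0] not in host_read_ids:
            yield read


def to_fastq(read: Read) -> str:
    """Format one read as a four-line FASTQ record."""
    read_id, sequence, quality = read
    return f"@{read_id}\n{sequence}\n+\n{quality}\n"
```

Using a set for host_read_ids keeps each membership test at roughly constant time, which matters when the input contains millions of short reads.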
Output
VirusDetect produces a large number of different report files. The output-related options are used to
select which data is returned. By default, VirusDetect returns the following files:
- virusdetect_contigs.fa: Sequences of non-redundant contigs derived through reference-guided and de novo assemblies.
- contig_sequences.undetermined.fa: Sequences of contigs that do not match virus references.
- blastn_matching_references.html: File listing the reference viruses that have corresponding virus contigs identified by BLASTN. A PDF-formatted report file is returned for each match.
- blastn_matches.tsv: A table of blastn matches to the reference virus database.
- blastx_matching_references.html: File listing the reference viruses that have corresponding virus contigs identified by BLASTX. A PDF-formatted report file is returned for each match.
- blastx_matches.tsv: A table of blastx matches to the reference virus database.
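The contig files in this list are FASTA files, so they can be summarized with a few lines of code after downloading. The following is a minimal sketch, assuming standard FASTA formatting (header lines starting with ">"); the function names read_fasta and contig_lengths are illustrative.

```python
from typing import Dict, Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from a FASTA stream."""
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""  # keep only the ID part of the header
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def contig_lengths(handle: TextIO) -> Dict[str, int]:
    """Map each contig name to its sequence length."""
    return {name: len(sequence) for name, sequence in read_fasta(handle)}
```

For example, contig_lengths(open("virusdetect_contigs.fa")) would give the length of every assembled contig in a locally saved copy of the file.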
If the parameter Return matching reference sequences is turned on, the following files are also returned:
- blastn_matching_references.fa and .fai: Virus reference sequences that produced hits in the blastn search with the potential virus contigs.
- blastx_matching_references.fa and .fai: Virus reference sequences that produced hits in the blastx search with the potential virus contigs.
If the output option for BAM files is turned on, the following files are also returned:
- blastn_matches.bam and .bai: BAM file containing the blastn alignment of each contig to its corresponding virus reference sequences.
- blastx_matches.bam and .bai: BAM file containing the blastx alignment of each contig to its corresponding virus reference sequences.
Note: If you select both blastn_matching_references.fa + .fai and blastn_matches.bam + .bai
(or the corresponding blastx files), you can use the Genome Browser to visualize the BLAST results.
In the Genome Browser, blastn_matching_references.fa should be assigned as the genome.
Each reference virus sequence is then listed in the Chromosome pull-down menu.
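The sequence names offered as chromosomes come from the reference FASTA file, and the accompanying .fai index lists them together with their lengths. The following is a minimal sketch for listing them outside the Genome Browser, assuming the standard tab-separated .fai layout in which the first column is the sequence name and the second its length; the function name read_fai is illustrative.

```python
from typing import List, TextIO, Tuple


def read_fai(handle: TextIO) -> List[Tuple[str, int]]:
    """Return (sequence_name, length) pairs from a FASTA index (.fai) stream."""
    entries = []
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        entries.append((fields[0], int(fields[1])))
    return entries
```

For example, read_fai(open("blastn_matching_references.fa.fai")) would list the matched reference sequences; the exact name of the index file on disk is an assumption here.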
If the parameter Return results in one archive file is selected, all the output files are stored in
a single tar-formatted output file. This feature is useful if you run VirusDetect on several input files at
the same time. The tar-formatted output file can be expanded with the tool Extract .tar.gz file.
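If the archive is downloaded instead, it can also be unpacked locally. The following is a minimal sketch using the tarfile module of the Python standard library; the function name extract_results and any file names passed to it are illustrative, and the internal layout of the archive is not assumed.

```python
import os
import tarfile
from typing import List


def extract_results(archive_path: str, target_dir: str) -> List[str]:
    """Unpack a tar archive (plain or compressed) and return the names of its files."""
    target_root = os.path.realpath(target_dir)
    # Mode "r:*" lets tarfile detect the compression (none, gzip, ...) automatically.
    with tarfile.open(archive_path, "r:*") as archive:
        members = archive.getmembers()
        for member in members:
            destination = os.path.realpath(os.path.join(target_root, member.name))
            # Refuse members that would be written outside target_dir.
            if os.path.commonpath([target_root, destination]) != target_root:
                raise ValueError(f"Unsafe path in archive: {member.name}")
        archive.extractall(target_root)
        return [member.name for member in members if member.isfile()]
```

A call such as extract_results("virusdetect_results.tar", "results") would unpack the archive into the results directory and return the list of extracted files; both names are placeholders.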