RNA-seq / Differential expression analysis using DESeq

Description

Differential expression analysis using the exact test of the DESeq Bioconductor package ("nbinomTest"). Please note that this tool is suitable only for two-group comparisons. For multifactor experiments you can use the tool "Differential expression using edgeR for multivariate experiments", which uses statistical methods based on generalized linear models ("glm edgeR").

Parameters

Details


This tool takes as input a table of raw counts from the different samples. The count file has to be associated with a phenodata file describing the experimental groups. These files are best created with the tool "Utilities / Define NGS experiment", which combines the count files of the different samples into one table and creates a phenodata file for it.

When normalization is enabled, size factors are either estimated from the count data by DESeq (as the median, over genes, of the ratio of each sample's count to the gene's geometric mean across samples) or derived from the library size given by the user in phenodata.tsv. The former makes it possible to correct for RNA composition bias (which can arise, for example, when only a small number of genes are very highly expressed in one experimental condition but not in the other).
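To illustrate the RNA composition bias mentioned above, the following minimal Python sketch compares simple total-count scaling with the median-of-ratios size factors described by Anders and Huber (2010). It is not the tool's own code: the function names and the toy count table are hypothetical, and genes with a zero count in any sample are simply skipped.

```python
import math
from statistics import median


def median_of_ratios_size_factors(counts):
    """counts: list of genes, each a list of raw counts (one per sample)."""
    n_samples = len(counts[0])
    # The geometric mean is only defined for genes with nonzero counts everywhere.
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Per-gene log geometric mean across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def total_count_size_factors(counts):
    """Scale each sample by its total count relative to the mean total."""
    n_samples = len(counts[0])
    totals = [sum(row[j] for row in counts) for j in range(n_samples)]
    mean_total = sum(totals) / n_samples
    return [t / mean_total for t in totals]


# Hypothetical toy data: 5 genes x 2 samples. The first four genes are unchanged;
# the last gene is very highly expressed in the second sample only.
toy_counts = [
    [100, 100],
    [200, 200],
    [50, 50],
    [400, 400],
    [10, 5000],
]

mor_factors = median_of_ratios_size_factors(toy_counts)
total_factors = total_count_size_factors(toy_counts)
print("median of ratios:", mor_factors)
print("total counts:   ", total_factors)
```

In this toy table the single highly expressed gene dominates the total of the second sample, so total-count scaling would make the four unchanged genes look down-regulated there, whereas the median-based factors stay equal for both samples.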

A dispersion value is estimated for each gene through a model fit procedure, which can be performed in a "local" or "parametric" mode. The former is more robust, but users are encouraged to experiment with the setting to optimize results. Users can choose to always replace the original dispersion values with the fitted ones, or to replace them only when the fitted value is higher than the original one (the more conservative option).
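The two replacement options can be sketched as a simple selection rule. The Python snippet below is an illustration only: the function names and the numbers are hypothetical, the mode labels "fit-only" and "maximum" are borrowed from the sharingMode argument of DESeq's estimateDispersions, and the trend function assumes the parametric form documented for DESeq (dispersion = asymptDisp + extraPois / mean) with made-up coefficients.

```python
def parametric_trend(mean, asympt_disp, extra_pois):
    """Fitted dispersion for a gene with the given mean normalized count."""
    return asympt_disp + extra_pois / mean


def shared_dispersion(gene_estimate, fitted, mode="maximum"):
    """Pick the dispersion value used for testing one gene."""
    if mode == "fit-only":
        # Always use the fitted value.
        return fitted
    if mode == "maximum":
        # Use the fitted value only when it exceeds the per-gene estimate,
        # which is the more conservative choice.
        return max(gene_estimate, fitted)
    raise ValueError("unknown mode: " + mode)


# Hypothetical coefficients and per-gene estimates, for illustration only.
fitted = parametric_trend(mean=100.0, asympt_disp=0.05, extra_pois=2.0)
low_variance_gene = shared_dispersion(0.02, fitted, mode="maximum")
high_variance_gene = shared_dispersion(0.30, fitted, mode="maximum")
always_fitted = shared_dispersion(0.30, fitted, mode="fit-only")
print(fitted, low_variance_gene, high_variance_gene, always_fitted)
```

With the "maximum" rule, a gene whose own estimate is above the fitted trend keeps its larger dispersion, so highly variable genes are less likely to be called differentially expressed.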

You need to have biological replicates of each experimental condition in order to estimate dispersion properly. If you have biological replicates for only one condition, DESeq will estimate dispersion using the replicates of that single condition. If there are no replicates at all, DESeq will estimate dispersion by treating the samples from the different conditions as replicates.
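The three situations described above can be told apart from the group column of the phenodata alone. The following Python sketch shows that decision logic under simple assumptions; the function name and the returned labels are hypothetical and do not correspond to DESeq arguments.

```python
from collections import Counter


def dispersion_strategy(groups):
    """groups: one group label per sample, as listed in the phenodata."""
    samples_per_group = Counter(groups)
    replicated = [g for g, n in samples_per_group.items() if n >= 2]
    if len(replicated) == len(samples_per_group):
        # Every condition has at least two samples.
        return "replicates in all conditions"
    if replicated:
        # Only some conditions have replicates; use those.
        return "replicates in some conditions only"
    # No condition has replicates; samples from different
    # conditions have to stand in for replicates.
    return "no replicates"


print(dispersion_strategy([1, 1, 1, 2, 2, 2]))
print(dispersion_strategy([1, 1, 1, 2]))
print(dispersion_strategy([1, 2]))
```

The last case gives the least reliable dispersion estimates, because true differences between the conditions are counted as variability.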

Statistical testing is performed using a negative binomial test.
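As a rough illustration of the idea behind the test, the Python sketch below follows the formulation in Anders and Huber (2010): conditional on the observed total count for a gene, the p-value is the summed probability of all count splits that are no more likely than the observed one. This is a simplified sketch and not the package's implementation: it assumes one count per condition, a single shared dispersion, and expected counts supplied by the caller, and all function and parameter names are hypothetical.

```python
import math


def nb_log_pmf(k, mu, alpha):
    """Log probability of count k under a negative binomial with mean mu
    and dispersion alpha (variance = mu + alpha * mu**2)."""
    r = 1.0 / alpha
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(r / (r + mu)) + k * math.log(mu / (r + mu)))


def nb_exact_test(k_a, k_b, mu_a, mu_b, alpha):
    """P-value for observing counts (k_a, k_b) given their sum."""
    total = k_a + k_b
    # Log probability of every split (a, total - a) under the null hypothesis.
    log_probs = [nb_log_pmf(a, mu_a, alpha) + nb_log_pmf(total - a, mu_b, alpha)
                 for a in range(total + 1)]
    # Shift by the maximum before exponentiating to avoid underflow.
    shift = max(log_probs)
    probs = [math.exp(lp - shift) for lp in log_probs]
    observed = probs[k_a]
    # Small relative tolerance so that ties with the observed split are included.
    as_extreme = sum(p for p in probs if p <= observed * (1 + 1e-9))
    return as_extreme / sum(probs)


# Hypothetical examples: under the null, both conditions share the same mean.
p_equal = nb_exact_test(50, 50, mu_a=50.0, mu_b=50.0, alpha=0.1)
p_different = nb_exact_test(10, 200, mu_a=105.0, mu_b=105.0, alpha=0.1)
print(p_equal, p_different)
```

A perfectly balanced split is the most likely outcome when both expected counts are equal, so its p-value is 1, while a strongly unbalanced split yields a small p-value.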

Output

The analysis output consists of the following files:


References

This tool uses the DESeq package for statistical analysis. Please read the following article for more detailed information:

S Anders and W Huber: Differential expression analysis for sequence count data. Genome Biology 2010, 11:R106.