Metagenomics / Statistical analysis for marker gene studies
Description
Compares the diversity or abundance between groups of samples using several ANOVA-type analyses.
Parameters
- Method for standardizing species abundance values (total, normalize, pa, chi.square, hellinger, log) [hellinger]
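As a rough illustration of what some of these standardization options do, the following Python sketch applies four of them to a small count table. It assumes that the method names follow the conventions of the decostand function in the R package vegan, applied per sample (row). The function name standardize and the example counts are hypothetical, and the chi.square and log options are not covered.

```python
import math

def standardize(rows, method="hellinger"):
    """Standardize a samples-by-species count table, one sample (row) at a time.

    Illustrative sketch only; assumes vegan::decostand-style definitions.
    """
    out = []
    for row in rows:
        total = sum(row)
        if method == "total":
            # Relative abundance: each count divided by the sample total.
            new = [x / total if total else 0.0 for x in row]
        elif method == "normalize":
            # Scale the sample so that its squared values sum to one.
            norm = math.sqrt(sum(x * x for x in row))
            new = [x / norm if norm else 0.0 for x in row]
        elif method == "pa":
            # Presence/absence: 1 if the species was observed, otherwise 0.
            new = [1 if x > 0 else 0 for x in row]
        elif method == "hellinger":
            # Square root of the relative abundance.
            new = [math.sqrt(x / total) if total else 0.0 for x in row]
        else:
            raise ValueError(f"method not covered by this sketch: {method}")
        out.append(new)
    return out

# Hypothetical table: 2 samples (rows) x 3 species (columns).
counts = [[9, 16, 0], [4, 0, 4]]
hellinger = standardize(counts, "hellinger")
```

The Hellinger option damps the influence of very abundant species, which is one common reason for choosing it before an ordination such as RDA.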
Details
This tool takes two input files: a count table that contains the counts of the identified species or operational taxonomic units (OTUs) in each sample, and a phenodata table that specifies the grouping of the samples. Statistical tests only work for a dataset that contains 2-3 groups.
The analyses are performed using the functionality offered by the R packages vegan, rich, BiodiversityR, pegas and labdsv.
The tool produces rank abundance curves and rarefaction curves for all groups, and an RDA ordination plot.
- A rank abundance curve displays the relative species abundances or species evenness across the groups.
- A rarefaction curve allows one to assess how much the sampling efficiency affects the number of observed taxa or OTUs.
- The RDA ordination plot shows how certain environmental factors (here, the groups) affect the species abundance. Note that RDA might not be suitable for your particular dataset, but at the moment it is the only ordination technique the tool offers.
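The first two curve types can be sketched in a few lines of Python. This is a minimal illustration, not the tool's own code: it assumes the standard expected-richness formula for subsampling without replacement, and the function names and the example counts are hypothetical.

```python
from math import comb

def rank_abundance(counts):
    """Relative abundances of the observed taxa, from most to least abundant."""
    total = sum(counts)
    return sorted((c / total for c in counts if c > 0), reverse=True)

def rarefied_richness(counts, n):
    """Expected number of taxa in a random subsample of n individuals."""
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the total count")
    # A taxon with count c is missed with probability C(total - c, n) / C(total, n).
    return sum(1 - comb(total - c, n) / comb(total, n) for c in counts if c > 0)

# Hypothetical counts for five OTUs in one sample.
sample = [50, 30, 15, 4, 1]
ranks = rank_abundance(sample)
# Points of a rarefaction curve: (subsample size, expected richness).
curve = [(n, rarefied_richness(sample, n)) for n in range(0, 101, 20)]
```

A rarefaction curve that is still rising steeply at the full sample size suggests that deeper sampling would reveal additional taxa.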
Before running the RDA analysis, the species abundance values are standardized using the method specified by the parameter. The standardized values are also used for running the following ANOVA-type statistical analyses, which compare the diversity or abundance between the groups:
- Permutational Multivariate Analysis of Variance Using Distance Matrices
- Analysis of Molecular Variance
- Multivariate homogeneity of groups dispersions (variances)
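To give an idea of how the first of these tests works, the sketch below computes a pseudo-F statistic from pairwise distances and estimates a p-value by shuffling the group labels. It is a simplified illustration rather than the tool's implementation: it uses squared Euclidean distances for brevity, handles a single grouping factor only, and all function names and example values are hypothetical.

```python
import random

def sq_euclidean(a, b):
    """Squared Euclidean distance between two samples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pseudo_f(samples, groups):
    """Pseudo-F statistic from pairwise squared distances (needs >= 2 groups)."""
    n = len(samples)
    labels = sorted(set(groups))
    d2 = {(i, j): sq_euclidean(samples[i], samples[j])
          for i in range(n) for j in range(i + 1, n)}
    ss_total = sum(d2.values()) / n
    ss_within = 0.0
    for g in labels:
        size = groups.count(g)
        ss_within += sum(v for (i, j), v in d2.items()
                         if groups[i] == g and groups[j] == g) / size
    ss_between = ss_total - ss_within
    return (ss_between / (len(labels) - 1)) / (ss_within / (n - len(labels)))

def permanova(samples, groups, permutations=999, seed=1):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    observed = pseudo_f(samples, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(samples, shuffled) >= observed:
            hits += 1
    # The observed labelling counts as one of the permutations.
    return observed, (hits + 1) / (permutations + 1)

# Hypothetical standardized abundances: 4 samples x 2 species, 2 groups.
samples = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
groups = ["A", "A", "B", "B"]
f_stat, p_value = permanova(samples, groups)
```

With only four samples there are very few distinct relabellings, so the p-value cannot become small even though the groups are clearly separated. This illustrates why such tests need a reasonable number of samples per group.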
Contributed diversity analysis splits the observed "species" (taxa or OTUs) diversity into three components:
- alpha = diversity inside one ecosystem (one group of samples)
- beta = change of diversity between two ecosystems (two groups of samples)
- gamma = overall diversity of the whole region across ecosystems (all samples in the experiment)
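The three components can be illustrated with a short Python sketch. It rests on two assumptions that the tool may not share: species richness is used as the diversity measure, and beta is obtained by additive partitioning (gamma minus the mean alpha). The function names and the example table are hypothetical.

```python
def richness(counts):
    """Number of taxa with a non-zero count."""
    return sum(1 for c in counts if c > 0)

def partition_diversity(table, groups):
    """table: per-sample count lists; groups: one group label per sample."""
    pooled = {}
    for row, g in zip(table, groups):
        acc = pooled.setdefault(g, [0] * len(row))
        for k, c in enumerate(row):
            acc[k] += c
    # Alpha: richness within each group (samples of the group pooled).
    alpha = {g: richness(v) for g, v in pooled.items()}
    # Gamma: richness across all samples in the experiment.
    gamma = richness([sum(col) for col in zip(*table)])
    # Beta (assumed additive): richness not explained by the average group.
    beta = gamma - sum(alpha.values()) / len(alpha)
    return alpha, beta, gamma

# Hypothetical table: 4 samples x 4 OTUs, two groups.
table = [[5, 0, 2, 0], [3, 1, 0, 0], [0, 0, 4, 6], [0, 0, 1, 2]]
groups = ["A", "A", "B", "B"]
alpha, beta, gamma = partition_diversity(table, groups)
```

Here group A contains three OTUs and group B two, while four OTUs are seen overall, so on average 1.5 OTUs of the total richness come from differences between the groups.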
Indicator species analysis using the Dufrêne-Legendre method tries to find the "species" (taxa or OTUs) that are highly specific to a single group.
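The indicator value at the heart of this method is the product of a specificity term and a fidelity term. The Python sketch below computes it for one species. It is a minimal illustration with hypothetical names and data, and it omits the permutation test that is normally used to assess significance.

```python
def indval(abundances, groups):
    """Indicator value (0 to 1) of one species for each group."""
    labels = sorted(set(groups))
    mean_abund = {}
    frequency = {}
    for g in labels:
        values = [a for a, lab in zip(abundances, groups) if lab == g]
        # Mean abundance of the species in the samples of this group.
        mean_abund[g] = sum(values) / len(values)
        # Fidelity: fraction of the group's samples where the species occurs.
        frequency[g] = sum(1 for v in values if v > 0) / len(values)
    total_mean = sum(mean_abund.values())
    result = {}
    for g in labels:
        # Specificity: share of the species' mean abundance found in this group.
        specificity = mean_abund[g] / total_mean if total_mean else 0.0
        result[g] = specificity * frequency[g]
    return result

# Hypothetical abundances of one OTU in four samples from two groups.
abundances = [10, 8, 0, 2]
groups = ["A", "A", "B", "B"]
values = indval(abundances, groups)
```

In this example the OTU is abundant in every sample of group A and rare in group B, so its indicator value is high for A and close to zero for B.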
Indicator Species Analysis Minimizing Intermediate Occurrences calculates the degree to which a species is either consistently present or consistently absent within a group of samples.
Output
The analysis output consists of the following:
- rank-abundance_rarefaction_RDA.pdf: plots
- stat-results.txt: test results
References
Consult the documentation of the R packages vegan, rich, BiodiversityR, pegas and labdsv for more details about the methods.