NGS ChIP-seq / Find common motifs and match to JASPAR

Description

This tool scans a set of genomic regions for consensus sequence motifs, scores the alignment of each motif against the transcription factor binding profiles in the JASPAR database, and reports the 10 highest-ranking transcription factors for each motif.

Parameters


Details

For a more thorough description of the technical details please consult the original publications cited in the References section below. Briefly, the analysis proceeds through the following steps:

- Given a set of genomic regions, the algorithm first performs de novo motif discovery in an unseeded fashion. The starting set of position weight matrices (PWMs) is obtained through a combination of spaced dyads and an expectation-maximization procedure. Further optimization is achieved using a genetic algorithm.

- In the second phase of the analysis, the PWMs of known transcription factors collected in the JASPAR database are matched to the set of consensus motifs discovered in the previous step.

- Finally, the ten best-matching transcription factors are gathered for each consensus motif, logo plots are created, and E-values scoring the match strength are calculated.
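
The matching and ranking steps above can be illustrated with a small, self-contained Python sketch. It is not the code this tool runs: it assumes an ungapped alignment scored by the summed Pearson correlation of PWM columns, skips E-value calculation entirely, and uses made-up binding sites in place of JASPAR profiles. All function names (`pwm_from_sites`, `best_ungapped_score`, `rank_matches`) and the library entries are hypothetical.

```python
import math

BASES = "ACGT"

def pwm_from_sites(sites):
    """Build a frequency matrix from equal-length aligned sites (A/C/G/T only)."""
    pwm = []
    for pos in range(len(sites[0])):
        counts = [sum(1 for s in sites if s[pos] == b) for b in BASES]
        total = sum(counts)
        pwm.append([c / total for c in counts])
    return pwm

def pearson(col_a, col_b):
    """Pearson correlation between two PWM columns of four base frequencies."""
    mean_a = sum(col_a) / 4
    mean_b = sum(col_b) / 4
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(col_a, col_b))
    var_a = sum((a - mean_a) ** 2 for a in col_a)
    var_b = sum((b - mean_b) ** 2 for b in col_b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a uniform column carries no information
    return cov / math.sqrt(var_a * var_b)

def reverse_complement(pwm):
    """Reverse the column order and swap A<->T, C<->G within each column."""
    return [col[::-1] for col in reversed(pwm)]

def best_ungapped_score(query, target, min_overlap=4):
    """Slide query along target and return the best summed column correlation."""
    best = float("-inf")
    for offset in range(-(len(query) - min_overlap), len(target) - min_overlap + 1):
        score, overlap = 0.0, 0
        for i, col in enumerate(query):
            j = i + offset
            if 0 <= j < len(target):
                score += pearson(col, target[j])
                overlap += 1
        if overlap >= min_overlap:
            best = max(best, score)
    return best

def rank_matches(query, library, top_n=10):
    """Score query against every library PWM (both strands); keep the top_n."""
    scored = []
    for name, target in library.items():
        forward = best_ungapped_score(query, target)
        reverse = best_ungapped_score(reverse_complement(query), target)
        scored.append((name, max(forward, reverse)))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]

# Illustrative data only: a "discovered" motif and a toy library of known factors.
query = pwm_from_sites(["TGACTCA", "TGAGTCA", "TGACTCA"])
library = {
    "AP1-like": pwm_from_sites(["TGACTCA", "TGAGTCA"]),
    "GATA-like": pwm_from_sites(["AGATAAG", "TGATAAG"]),
    "CREB-like": pwm_from_sites(["TGACGTCA", "TGACGTCA"]),
}

for name, score in rank_matches(query, library, top_n=10):
    print(f"{name}\t{score:.3f}")
```

Summing the column scores favors longer overlaps, which is why a minimum overlap is enforced. A production tool would also convert raw scores into E-values against a background distribution, which this sketch does not attempt.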

Output

The analysis output consists of the following:


References

This tool uses the following Bioconductor packages:


For more details refer to these publications:

S. Mahony, P.V. Benos "STAMP: a web tool for exploring DNA-binding motif similarities." Nucl Acids Res, (2007) 35:W253-258

S. Mahony, P.E. Auron, P.V. Benos "DNA familial binding profiles made easy: comparison of various motif alignment and clustering strategies." PLoS Computational Biology, (2007) 3(3):e61

L. Li "GADEM: a genetic algorithm guided formation of spaced dyads coupled with an EM algorithm for motif discovery." J Comput Biol, (2009) 16(2):317-329