MEME Suite Motif File Formats
MEME results are recorded in three file formats: plain text, HTML, and XML. The MEME XML format is completely specified by the Document Type Definition (DTD) found at the start of the MEME XML output. The MEME plain text and HTML formats contain much explanatory text and are thus self-documenting. The XML format was added for MEME 4.0. The plain text and HTML formats have been supported in all versions of MEME.
MAST will accept the plain text and HTML forms of MEME output, and also several other formats described in the MAST documentation.
FIMO will accept the plain text, HTML, and XML forms of MEME output, and the minimal motif format.
GOMO will accept the plain text, HTML, and XML forms of MEME output, and the minimal motif format.
GLAM2 provides plain text and HTML output. The format is described in the Output format section of the GLAM2 Tutorial. It also provides output in the MEME plain text format.
GLAM2SCAN will accept the plain text and HTML forms of GLAM2 output.
Minimal Motif Format
Users may create motif files in a simplified format for use by the MEME Suite programs (except MAST, which uses a different format described in the MAST documentation). A sample DNA motif file and a sample protein motif file are available as examples. The format is as follows:
- The MEME version number line.
MEME version 3.0
- The alphabet line.
For DNA motif files the line
ALPHABET= ACGT
or for protein motif files
ALPHABET= ACDEFGHIKLMNPQRSTVWY
must be present. Ambiguous characters are not listed, but may be used by applications.
- Strand information line. (DNA motif files only.)
If both DNA strands are included in the motif:
strands: + -
or if only one strand is included:
strands: +
- The background distribution lines.
The background must start on a new line with the string:
Background letter frequencies (from
This is followed, on the next line(s), by a list of characters and their associated frequencies, delimited by white space.
- The motifs.
There may be one or more motifs. Each motif contains a "MOTIF" line, a "BL" line, a "letter-probability matrix" line and one line of letter frequencies for each position in the motif (the "letter frequency matrix").
MOTIF crp
BL MOTIF crp width=0 seqs=0
letter-probability matrix: alength= 4 w= 22 nsites= 49 E= 0
The first two lines should be exactly as shown, except the motif name "crp" may be replaced by any word (without spaces). The third line should be exactly as shown, except use "20" in place of "4" for protein motifs, and "w" and "nsites" should be set to the width of the motif and the number of sites used in creating the motif frequency matrix, respectively. After these three lines, there must be one line of letter frequencies (between 0 and 1, summing to 1) for each position in the motif. The number of lines must equal w, the width of the motif.
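The rules above can be checked mechanically. The following Python sketch is not part of the MEME Suite; it is a minimal illustration of reading a minimal-format file and verifying that each motif has w rows of frequencies that sum to 1. The function name parse_minimal_motifs, the tolerance tol, and the SAMPLE text are assumptions made for this example. In particular, the text after "Background letter frequencies (from" in SAMPLE is a placeholder, since only the start of that line is specified above.

```python
import math
import re

# Only this prefix of the background header is specified by the format.
BACKGROUND_PREFIX = "Background letter frequencies (from"
MATRIX_RE = re.compile(
    r"letter-probability matrix:\s*alength=\s*(\d+)\s+w=\s*(\d+)\s+nsites=\s*(\d+)"
)


def parse_minimal_motifs(text, tol=1e-6):
    """Return (alphabet, background, motifs) from minimal-format text.

    Hypothetical helper: raises ValueError when a rule described above
    is violated (missing version line, wrong row count, bad row sum).
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith("MEME version"):
        raise ValueError("first line must be the MEME version line")

    alphabet, background, motifs = None, {}, []
    i = 1
    while i < len(lines):
        line = lines[i]
        if line.startswith("ALPHABET="):
            alphabet = line.split("=", 1)[1].strip()
        elif line.startswith(BACKGROUND_PREFIX):
            # Letter/frequency pairs follow on the next line(s).
            i += 1
            while i < len(lines) and not lines[i].startswith("MOTIF"):
                tokens = lines[i].split()
                for letter, freq in zip(tokens[0::2], tokens[1::2]):
                    background[letter] = float(freq)
                i += 1
            continue
        elif line.startswith("MOTIF"):
            name = line.split()[1]
            # Skip the "BL" line and find the matrix header line.
            i += 1
            while i < len(lines) and not MATRIX_RE.match(lines[i]):
                i += 1
            if i == len(lines):
                raise ValueError(f"motif {name}: no letter-probability matrix line")
            alength, width, nsites = map(int, MATRIX_RE.match(lines[i]).groups())
            rows = [[float(x) for x in ln.split()] for ln in lines[i + 1:i + 1 + width]]
            if len(rows) != width:
                raise ValueError(f"motif {name}: expected {width} rows, got {len(rows)}")
            for row in rows:
                if len(row) != alength:
                    raise ValueError(f"motif {name}: row does not have {alength} entries")
                if not math.isclose(sum(row), 1.0, abs_tol=tol):
                    raise ValueError(f"motif {name}: row does not sum to 1")
            motifs.append({"name": name, "width": width, "nsites": nsites, "rows": rows})
            i += 1 + width
            continue
        i += 1
    return alphabet, background, motifs


# Illustrative two-position DNA motif; the values are made up for the example.
SAMPLE = """MEME version 3.0
ALPHABET= ACGT
strands: + -
Background letter frequencies (from uniform background):
A 0.25 C 0.25 G 0.25 T 0.25
MOTIF crp
BL MOTIF crp width=0 seqs=0
letter-probability matrix: alength= 4 w= 2 nsites= 49 E= 0
0.1 0.2 0.3 0.4
0.25 0.25 0.25 0.25
"""
```

Calling parse_minimal_motifs(SAMPLE) returns the alphabet string, a dictionary of background frequencies, and a list with one entry per motif. A file whose rows do not match w, or whose rows do not sum to 1 within the chosen tolerance, is rejected with a ValueError.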