Self-Organizing Map for Biological Regulatory Element Recognition and Ordering
SOMBRERO finds regulatory binding sites by using
a neural network algorithm called the "Self-Organizing Map" (SOM)
to find overrepresented motifs in a set of DNA sequences. The most
popular current methods for finding overrepresented motifs use techniques
from probability theory and statistical physics, such as
Expectation Maximization or Gibbs Sampling. Our application of
the Self-Organizing Map instead frames motif identification as a
clustering problem, an approach that appears to perform well when
applied to real genomic problems.
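To illustrate what framing motif identification as a clustering problem can look like, here is a minimal sketch in Python. It is not SOMBRERO's implementation: it assumes one-hot encoded k-mers, a Euclidean best-matching-unit rule, and a Gaussian neighbourhood on a small rectangular grid. The function names (one_hot, extract_kmers, train_som, assign) and the toy sequences are hypothetical and chosen only for this example.

```python
import numpy as np

ALPHABET = "ACGT"

def one_hot(kmer):
    """Encode a k-mer as a flat vector of length 4*k (one-hot per position)."""
    vec = np.zeros((len(kmer), 4))
    for i, base in enumerate(kmer):
        vec[i, ALPHABET.index(base)] = 1.0
    return vec.ravel()

def extract_kmers(sequences, k):
    """Slide a window of width k over every input sequence."""
    return [s[i:i + k] for s in sequences for i in range(len(s) - k + 1)]

def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=1.0, seed=0):
    """Train a rows x cols SOM with the standard online update rule."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows * cols, data.shape[1]))
    # Grid coordinates of each node, used for the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        lr = lr0 * decay                 # learning rate shrinks over time
        sigma = sigma0 * decay + 1e-3    # neighbourhood radius shrinks too
        for x in data[rng.permutation(len(data))]:
            # Best-matching unit: the node whose weights are closest to x.
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
            grid_dist2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-grid_dist2 / (2 * sigma ** 2))
            # Pull the BMU and its grid neighbours towards x.
            weights += lr * h[:, None] * (x - weights)
    return weights

def assign(data, weights):
    """Return the index of the best-matching node for every k-mer."""
    return np.array([np.argmin(((weights - x) ** 2).sum(axis=1)) for x in data])

# Toy input (hypothetical): the 6-mer TTGACA occurs once in each sequence.
sequences = ["TTGACAGCTA", "CCTTGACATG", "GATTGACAAC"]
kmers = extract_kmers(sequences, 6)
data = np.array([one_hot(k) for k in kmers])
weights = train_som(data, rows=2, cols=2)
clusters = assign(data, weights)
```

In this sketch each node's weight vector can be reshaped to (k, 4) and read as a rough position-specific profile of the k-mers assigned to it. A node that collects many similar k-mers from different sequences would be a candidate overrepresented motif.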
Most recently, SOMBRERO was extended so that it can be
initialised using a SOM that has been previously
trained on a set of known transcription factor binding matrices.
Initialising SOMBRERO using such prior knowledge allows the software
program to be biased towards finding known transcription factor
binding motifs. We have recently shown that the use of prior knowledge
in SOMBRERO's initialisation significantly improves accuracy when
known motifs are present in the input data, while accuracy is not
negatively affected for the discovery of novel motifs. SOMBRERO
is the only existing motif-finder that allows the incorporation
of entire transcription factor binding matrix databases as prior
knowledge.
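The idea of initialising from a previously trained SOM can be sketched as a two-step procedure. The Python example below is an illustration under stated assumptions, not SOMBRERO's code: the three width-4 matrices are invented toy data standing in for a database of known binding matrices, and train_som and one_hot are hypothetical helper names. A first SOM is trained on the known matrices, and its weights then replace the random starting values of the SOM that is trained on the input k-mers.

```python
import numpy as np

def train_som(data, init_weights, grid, epochs=20, lr0=0.5, sigma0=1.0, seed=0):
    """Standard online SOM training, starting from the supplied weights."""
    rng = np.random.default_rng(seed)
    weights = init_weights.copy()        # leave the caller's array untouched
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        lr, sigma = lr0 * decay, sigma0 * decay + 1e-3
        for x in data[rng.permutation(len(data))]:
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
            h = np.exp(-((grid - grid[bmu]) ** 2).sum(axis=1) / (2 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
    return weights

def one_hot(kmer):
    """Encode a k-mer as a flat vector of length 4*k (columns A, C, G, T)."""
    vec = np.zeros((len(kmer), 4))
    for i, base in enumerate(kmer):
        vec[i, "ACGT".index(base)] = 1.0
    return vec.ravel()

# Hypothetical toy "database": three width-4 frequency matrices,
# one row per motif position, columns for A, C, G, T.
known_matrices = np.array([
    [[0.8, 0.1, 0.05, 0.05], [0.1, 0.1, 0.1, 0.7],
     [0.05, 0.05, 0.8, 0.1], [0.7, 0.1, 0.1, 0.1]],
    [[0.1, 0.7, 0.1, 0.1], [0.1, 0.1, 0.7, 0.1],
     [0.1, 0.7, 0.1, 0.1], [0.1, 0.1, 0.7, 0.1]],
    [[0.1, 0.1, 0.1, 0.7], [0.7, 0.1, 0.1, 0.1],
     [0.1, 0.1, 0.1, 0.7], [0.7, 0.1, 0.1, 0.1]],
])
prior_data = known_matrices.reshape(len(known_matrices), -1)

# A 2 x 2 grid of nodes, shared by both SOMs.
grid = np.array([(r, c) for r in range(2) for c in range(2)], dtype=float)
rng = np.random.default_rng(1)

# Step 1: train the prior SOM on the known matrices (random start).
prior_weights = train_som(prior_data, rng.random((4, 16)), grid)

# Step 2: train the motif-finding SOM on input 4-mers, starting from
# the prior SOM's weights instead of random values.
sequences = ["ATGACGCG", "TTATGATA"]
kmers = [s[i:i + 4] for s in sequences for i in range(len(s) - 3)]
data = np.array([one_hot(k) for k in kmers])
motif_weights = train_som(data, prior_weights, grid)
```

Because the second SOM starts near profiles that resemble known motifs, k-mers matching those motifs have a nearby node to attach to from the first update onward. The nodes are still free to move, so input patterns unlike any known matrix can still reshape them.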
Availability: SOMBRERO is freely
available from this link.
Citing SOMBRERO: Please cite SOMBRERO
using either (or both) of the following citations:
- S Mahony, A Golden, TJ Smith, PV Benos: "Improved detection
of DNA motifs using a self-organized clustering of familial binding
profiles." (2005) Bioinformatics 21(Suppl 1):i283-i291
(Proc. ISMB).
- S Mahony, D Hendrix, A Golden, TJ Smith, DS Rokhsar: "Transcription
factor binding site identification using the Self-Organizing Map."
(2005) Bioinformatics 21(9):1807-1814.
SOMBRERO is the result of a collaboration between
Shaun Mahony (NUI Galway) and Dave Hendrix
(UC Berkeley). The project was carried out in Berkeley under the
supervision of Prof. Dan Rokhsar, in NUI Galway under the supervision
of Prof. Terry Smith and Dr. Aaron Golden, and in the University
of Pittsburgh in collaboration with Dr. Takis Benos.