Most codons (with the exception of those coding for
methionine and tryptophan) have at least one synonymous alternative.
Within the protein-coding regions of most sequenced genomes, the
occurrence of synonymous codons does not appear to be random. In
other words, genes seem to display a clear preference for one codon
over a synonymous alternative. This preference is known as the synonymous
codon usage pattern of a gene, and such patterns have been extensively
studied.
Studies on codon usage pattern conservation and
variation have been used for some time in generating hypotheses
of evolutionary relationships, predicting the expression levels
of a gene, and in genome annotation. However, codon usage patterns
vary not only from organism to organism, but also between genes
in the same genome. Predicting how much variation in codon usage
exists in a genome can therefore be difficult.
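To make the idea of a synonymous codon usage pattern concrete, the
following minimal sketch counts the codons in a single coding sequence
and reports each codon as a fraction of all codons for the same amino
acid. It is an illustration only and not part of RescueNet: the function
name synonymous_codon_usage and the short example sequence are
assumptions, and the standard genetic code is assumed throughout.

```python
from collections import Counter

# Standard genetic code, built from the conventional TCAG ordering
# (assumption: the organism uses the standard code).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def synonymous_codon_usage(cds):
    """Return {codon: share of that codon among its synonymous codons}."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # Skip codons containing ambiguous bases such as N.
    counts = Counter(c for c in codons if c in CODON_TABLE)
    per_amino_acid = Counter()
    for codon, n in counts.items():
        per_amino_acid[CODON_TABLE[codon]] += n
    return {c: n / per_amino_acid[CODON_TABLE[c]] for c, n in counts.items()}


# Hypothetical toy sequence: Met, Leu, Leu, Leu, Lys, Lys, stop.
usage = synonymous_codon_usage("ATGCTGCTGCTAAAAAAGTAA")
for codon in sorted(usage):
    print(codon, CODON_TABLE[codon], round(usage[codon], 2))
```

In this toy sequence, leucine is encoded twice by CTG and once by CTA,
so CTG accounts for two thirds of the leucine codons. A preference of
this kind, measured across all amino acids, is what is meant by a
codon usage pattern.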
Current methods of studying codon usage variation
take the form of multivariate analysis (9-11). This method represents
codon usage information as points in multidimensional space. Codon
usage trends in a genome are examined by reducing the dimensionality
of the points and plotting the resulting cloud of two-dimensional
points. However, dimensionality reduction discards information, so
multivariate analysis methods are only really effective at identifying
broad trends in codon usage.
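The dimensionality reduction step can be sketched as follows. This
example assumes principal component analysis as the reduction technique,
which may differ from the specific multivariate methods cited above, and
it uses four made-up genes described by only four codon frequencies each.
The helper names (centre, leading_axis, project_2d) and the data are
hypothetical.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def centre(rows):
    """Subtract the column means so the point cloud is centred on the origin."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def leading_axis(rows, iterations=200, seed=0):
    """Power iteration: unit vector along which the rows vary the most."""
    rng = random.Random(seed)
    axis = [rng.uniform(-1.0, 1.0) for _ in rows[0]]
    for _ in range(iterations):
        scores = [dot(row, axis) for row in rows]
        new_axis = [sum(s * row[j] for s, row in zip(scores, rows))
                    for j in range(len(axis))]
        norm = dot(new_axis, new_axis) ** 0.5
        if norm < 1e-12:  # no variation left to capture
            break
        axis = [x / norm for x in new_axis]
    return axis


def project_2d(rows):
    """Project each row onto the two directions of greatest variance."""
    centred = centre(rows)
    axis1 = leading_axis(centred, seed=0)
    # Remove the first axis from the data before searching for the second.
    residual = [[x - dot(row, axis1) * a for x, a in zip(row, axis1)]
                for row in centred]
    axis2 = leading_axis(residual, seed=1)
    return [(dot(c, axis1), dot(r, axis2)) for c, r in zip(centred, residual)]


# Hypothetical codon frequency vectors: two genes of each "type".
toy_genes = [
    [0.9, 0.1, 0.8, 0.2],
    [0.8, 0.2, 0.9, 0.1],
    [0.1, 0.9, 0.2, 0.8],
    [0.2, 0.8, 0.1, 0.9],
]
points = project_2d(toy_genes)
for point in points:
    print(point)
```

The two gene types fall on opposite sides of the first axis, which is
the kind of broad trend such plots reveal. Any variation not aligned
with the two retained axes is lost in the projection.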
In order to identify more subtle codon usage patterns,
we have implemented a method based on a variant of the Self-Organizing
Map (SOM) neural network algorithm (12). A SOM can automatically
recognise common or repeated patterns in a dataset.
The SOM architecture consists of a two-dimensional output “lattice”
of weight vectors. During training of the SOM, the weight vectors
come to represent frequently occurring patterns in the dataset, and
similar patterns are clustered together in neighbouring areas of
the output lattice. The result is that we can easily visualise variation
within a dataset of patterns.
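The training process described above can be sketched with a basic
Kohonen-style SOM. This is a generic textbook formulation, not the
variant implemented in RescueNet; the lattice size, learning rate
schedule, neighbourhood function and toy data are all assumptions
chosen for illustration.

```python
import math
import random


def best_matching_unit(lattice, vec):
    """Lattice position whose weight vector is closest to vec."""
    return min(
        lattice,
        key=lambda node: sum((w - x) ** 2 for w, x in zip(lattice[node], vec)),
    )


def train_som(data, rows=3, cols=3, epochs=200, seed=0):
    """Train a rows x cols lattice of weight vectors on the input vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    lattice = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        remaining = 1.0 - epoch / epochs
        rate = 0.5 * remaining                            # learning rate decays
        radius = max(rows, cols) / 2 * remaining + 0.5    # neighbourhood shrinks
        for vec in rng.sample(data, len(data)):           # shuffled presentation
            bmu = best_matching_unit(lattice, vec)
            for node, weights in lattice.items():
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                influence = math.exp(-d2 / (2 * radius ** 2))
                # Pull the winner and its lattice neighbours towards the input.
                for i in range(dim):
                    weights[i] += rate * influence * (vec[i] - weights[i])
    return lattice


# Hypothetical codon frequency vectors: two genes of each "type".
toy_genes = [
    [0.9, 0.1, 0.8, 0.2],
    [0.8, 0.2, 0.9, 0.1],
    [0.1, 0.9, 0.2, 0.8],
    [0.2, 0.8, 0.1, 0.9],
]
lattice = train_som(toy_genes)
for gene in toy_genes:
    print(gene, "->", best_matching_unit(lattice, gene))
```

After training, each gene is assigned to the lattice position of its
best matching weight vector. Genes with similar patterns tend to land
on the same or neighbouring positions, while dissimilar genes land
further apart, which is what makes the lattice useful for visualising
variation.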
Because of its unique implementation, RescueNet
can be put to the following uses, which are difficult to achieve
with traditional codon usage analysis methods:
· Identifying groups of genes that have similar codon usage
· Identifying genes that have atypical codon usage patterns
in a genome
· Identifying regions of a contiguous genomic sequence that
have similar codon usage patterns to major patterns in that genome.
In this manual, Sections 7 & 8 give practical examples of using
RescueNet in the above contexts.