
1. Introduction

Most codons (with the exception of those coding for methionine and tryptophan) have at least one synonymous alternative. Within the protein-coding regions of most sequenced genomes, synonymous codons do not appear to occur at random: genes tend to prefer one codon over its synonymous alternatives. This preference is known as the synonymous codon usage pattern of a gene, and such patterns have been studied extensively (1-8).
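To make the idea of a synonymous codon usage pattern concrete, the sketch below counts the codons in one coding sequence and computes relative synonymous codon usage (RSCU), a standard measure in which 1.0 means a codon is used exactly as often as expected if all synonyms were used equally. It is only an illustration, not the calculation RescueNet itself performs; the function name `rscu` and the toy sequence are assumptions made for this example, and the standard genetic code is assumed.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1); codons ordered TTT, TTC, TTA, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Group codons into synonymous families, keyed by amino acid.
FAMILIES = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    FAMILIES[aa].append(codon)


def rscu(cds):
    """Return RSCU values for one in-frame coding sequence (hypothetical helper)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for aa, family in FAMILIES.items():
        total = sum(counts[c] for c in family)
        # Skip stop codons, single-codon amino acids (Met, Trp) and unused families.
        if aa == "*" or len(family) == 1 or total == 0:
            continue
        for c in family:
            # Observed count divided by the count expected under equal usage.
            values[c] = counts[c] * len(family) / total
    return values


# Toy sequence: three CTG and one CTT (leucine), one AAA and one AAG (lysine).
example = rscu("CTGCTGCTGCTTAAAAAG")
print(example["CTG"], example["CTT"], example["AAA"])  # prints 4.5 1.5 1.0
```

In this toy sequence leucine is encoded by CTG three times out of four, so CTG scores well above 1.0 while the four unused leucine codons score 0.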

Studies of the conservation and variation of codon usage patterns have been used for some time to generate hypotheses about evolutionary relationships, to predict the expression level of a gene, and to assist genome annotation. However, codon usage patterns vary not only from organism to organism, but also between genes in the same genome. Predicting how much variation in codon usage exists in a genome can therefore be quite a difficult task.

Current methods of studying codon usage variation generally take the form of multivariate analysis (9-11). These methods represent codon usage information as points in multidimensional space. Codon usage trends in a genome are examined by reducing the dimensionality of the points and plotting the resulting cloud of two-dimensional points. However, dimensional reduction involves the loss of information, so multivariate analysis methods are really only effective at identifying broad trends in codon usage.
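The dimensional reduction step described above can be sketched as follows. Principal component analysis is used here purely as a familiar stand-in; the studies cited may rely on other multivariate techniques, such as correspondence analysis. The function names and the toy three-dimensional vectors are assumptions for illustration (real codon usage vectors would have one dimension per codon), and the eigenvectors are found with a simple power iteration rather than a numerical library.

```python
import math


def center(rows):
    """Subtract the per-column mean from every row."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centred rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def principal_axes(cov, k=2, iterations=500):
    """Leading k eigenvectors of cov by power iteration with deflation."""
    d = len(cov)
    cov = [row[:] for row in cov]  # work on a copy
    axes = []
    for _ in range(k):
        # Deterministic, non-uniform start vector (a uniform one can lie in the
        # null space when every data row sums to the same value).
        v = [float(i + 1) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iterations):
            w = matvec(cov, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        eigenvalue = sum(a * b for a, b in zip(v, matvec(cov, v)))
        axes.append(v)
        # Deflate: remove the component just found before seeking the next one.
        for i in range(d):
            for j in range(d):
                cov[i][j] -= eigenvalue * v[i] * v[j]
    return axes


def project(centered, axes):
    """Coordinates of each centred row along the chosen axes."""
    return [[sum(a * b for a, b in zip(row, axis)) for axis in axes]
            for row in centered]


# Toy data: four "genes" described by three numbers each (not real frequencies).
genes = [[0.0, 0.0, 0.0], [4.0, 0.0, 0.0], [0.0, 1.0, 0.0], [4.0, 1.0, 0.0]]
centered = center(genes)
axes = principal_axes(covariance(centered))
points = project(centered, axes)  # one 2-d point per gene, ready for plotting
```

Anything that varies along directions other than the two retained axes is simply dropped from `points`, which is the loss of information referred to above.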

In order to identify more subtle codon usage patterns, we have implemented a method, RescueNet, based on a variant of the Self-Organizing Map (SOM) neural network algorithm (12). A SOM has the ability to automatically recognise common or repeated patterns in a dataset. Its architecture consists of a two-dimensional output “lattice” of weight vectors. During training of the SOM, the weight vectors come to represent the common patterns in the dataset, and similar patterns are clustered together in neighbouring areas of the output lattice. The result is that we can easily visualise variation within a dataset of patterns.
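The training process described above can be sketched with a generic, textbook SOM. This is not the variant used by RescueNet, whose details are not given in this section; the function names, lattice size, learning rate, neighbourhood radius, and toy input patterns are all assumptions chosen only to keep the example small.

```python
import math
import random


def best_matching_unit(weights, x):
    """Lattice position whose weight vector is closest to pattern x."""
    return min(weights,
               key=lambda node: sum((a - b) ** 2 for a, b in zip(weights[node], x)))


def train_som(data, rows=4, cols=4, epochs=200, lr0=0.5, radius0=2.0, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Start with random weight vectors on a rows x cols output lattice.
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                     # learning rate decays
        radius = max(radius0 * (1.0 - frac), 0.5)   # neighbourhood shrinks
        for x in rng.sample(data, len(data)):       # shuffled pass over the data
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                # Nodes near the winner on the lattice are pulled more strongly.
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * radius * radius))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


# Toy input: four patterns forming two loose groups (not real codon usage data).
patterns = [[0.10, 0.10, 0.80], [0.12, 0.08, 0.80],
            [0.70, 0.20, 0.10], [0.72, 0.18, 0.10]]
som = train_som(patterns, rows=3, cols=3, epochs=100)
for p in patterns:
    print(p, "->", best_matching_unit(som, p))
```

Because neighbouring lattice nodes are updated together, similar input patterns tend to be assigned to the same or adjacent positions, which is what makes the lattice useful for visualising variation.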

Because of its unique implementation, RescueNet can be put to the following uses, which are difficult to achieve with traditional codon usage analysis methods:
· Identifying groups of genes that have similar codon usage patterns
· Identifying genes that have atypical codon usage patterns in a genome
· Identifying regions of a contiguous genomic sequence whose codon usage patterns are similar to the major patterns in that genome

In this manual, Sections 7 & 8 give practical examples of using RescueNet in the above contexts.


Department of Information Technology,
National University of Ireland, Galway, University Road, Galway, Ireland.
Phone: +353 (0)91 524411 ext 3549, E-mail: aaron.golden(AT)