Essential Gene Prediction

A System for Mutation Prediction Based on Noise Trimming and Positional Significance of Transposon Insertion; Application to Identify Essential Genes in Yersinia pestis

Authors: Zheng Rong Yang, Helen L Bullifent, Karen Moore, Konrad Paszkiewicz, Richard J. Saint, Stephanie J. Southern, Olivia L Champion, Nicola J Senior, Mitali Sarkar-Tyson, Petra C. F. Oyston, Timothy P. Atkins, Richard W Titball


Massively parallel sequencing technology coupled with saturation mutagenesis has provided new and global insights into gene functions and roles. At a simplistic level, the frequency of mutations within genes can indicate the degree of essentiality. However, this approach neglects the positional significance of mutations: the function of a gene is less likely to be disrupted by a mutation close to the distal ends. Therefore, a systemic bioinformatics approach to improve the reliability of essential gene identification is desirable. We report here a parametric model which introduces a novel mutation feature together with a noise trimming approach to predict the biological significance of Tn5 mutations. We show improved performance of essential gene prediction in the bacterium Yersinia pestis, the causative agent of plague. This method should be broadly applicable to other organisms and to the identification of genes which are essential for competitiveness or survival under a broad range of stresses.
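The contrast between a raw per-gene insertion frequency and a position-aware count can be illustrated with a minimal sketch. This is not the parametric model reported here; the function name `count_insertions`, the `end_trim` fraction of 10%, and the coordinates are all assumptions made up for illustration.

```python
def count_insertions(positions, gene_start, gene_end, end_trim=0.0):
    """Count insertion sites inside a gene, optionally ignoring both ends.

    end_trim is the fraction of the gene length skipped at each end.
    It is a hypothetical parameter, not a value taken from the paper.
    """
    length = gene_end - gene_start + 1
    margin = int(length * end_trim)
    lo = gene_start + margin
    hi = gene_end - margin
    return sum(1 for p in positions if lo <= p <= hi)


# Made-up insertion coordinates for a gene spanning positions 1000..1999.
insertions = [1005, 1010, 1500, 1990, 1995]

# Raw frequency: every insertion inside the gene counts equally.
raw = count_insertions(insertions, 1000, 1999)

# Position-aware count: insertions in the outer 10% at each end are ignored.
central = count_insertions(insertions, 1000, 1999, end_trim=0.1)

print(raw, central)
```

In this toy example the raw count suggests a heavily disrupted gene, while the position-aware count shows that only one insertion falls in the central region, where disruption of function is more likely.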

Acknowledgments: This work was supported by the Defence Science and Technology Laboratory under contract DSTLX-1000060221 (WP1).

The executable C script for scanning a SAM file
Click here to download

The executable C script for annotation
Click here to download

The R script (packed)
Click here to download

The README file (packed)
Click here to download

The SAM files (compressed)
Input 1
Input 2
Input 3

The PTT annotation file

The Prediction List file
Prediction List