Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes

A Krogh, B Larsson, G Von Heijne… - Journal of molecular …, 2001 - Elsevier
Journal of molecular biology, 2001 (Elsevier)
We describe and validate a new membrane protein topology prediction method, TMHMM, based on a hidden Markov model. We present a detailed analysis of TMHMM’s performance, and show that it correctly predicts 97–98% of the transmembrane helices. Additionally, TMHMM can discriminate between soluble and membrane proteins with both specificity and sensitivity better than 99%, although the accuracy drops when signal peptides are present. This high degree of accuracy allowed us to reliably predict integral membrane proteins in a large collection of genomes. Based on these predictions, we estimate that 20–30% of all genes in most genomes encode membrane proteins, which is in agreement with previous estimates. We further discovered that proteins with N_in-C_in topologies are strongly preferred in all examined organisms, except Caenorhabditis elegans, where the large number of 7TM receptors increases the counts for N_out-C_in topologies. We discuss the possible relevance of this finding for our understanding of membrane protein assembly mechanisms. A TMHMM prediction service is available at http://www.cbs.dtu.dk/services/TMHMM/.
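To illustrate the general idea of labelling residues with a hidden Markov model, the following is a minimal sketch of Viterbi decoding over three states: inside loop (i), membrane helix (M), and outside loop (o). It is not the TMHMM model. The state set, the three residue classes, and every probability below are hypothetical values chosen for illustration, and this toy version does not force successive helices to alternate between the two sides of the membrane, which a real topology model would need to do.

```python
import math

# Hypothetical toy model for illustration only; NOT the trained TMHMM parameters.
STATES = ("i", "M", "o")          # inside loop, membrane helix, outside loop
HYDROPHOBIC = set("AILMFVWC")     # assumed residue grouping
POSITIVE = set("KR")              # assumed residue grouping

START = {"i": 0.5, "M": 0.0, "o": 0.5}
TRANS = {                         # loops can only switch sides by crossing a helix
    "i": {"i": 0.90, "M": 0.10, "o": 0.00},
    "M": {"i": 0.05, "M": 0.90, "o": 0.05},
    "o": {"i": 0.00, "M": 0.10, "o": 0.90},
}
# Emission probabilities per residue class: h = hydrophobic, p = positive, x = other.
EMIT = {
    "i": {"h": 0.20, "p": 0.30, "x": 0.50},
    "M": {"h": 0.85, "p": 0.02, "x": 0.13},
    "o": {"h": 0.20, "p": 0.10, "x": 0.70},
}


def residue_class(aa):
    """Map an amino acid letter to one of the three toy emission classes."""
    if aa in HYDROPHOBIC:
        return "h"
    if aa in POSITIVE:
        return "p"
    return "x"


def safe_log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(seq):
    """Return the most probable state label (i, M or o) for each residue."""
    if not seq:
        return ""
    obs = [residue_class(aa) for aa in seq]
    score = {s: safe_log(START[s]) + safe_log(EMIT[s][obs[0]]) for s in STATES}
    back = []  # back[t][s] = best predecessor of state s at position t + 1
    for o in obs[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda r: score[r] + safe_log(TRANS[r][s]))
            new_score[s] = score[prev] + safe_log(TRANS[prev][s]) + safe_log(EMIT[s][o])
            pointers[s] = prev
        back.append(pointers)
        score = new_score
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))


if __name__ == "__main__":
    print(viterbi("KRKRSG" + "LIVLALLAVILLAVLIVALL" + "SGNTDSQE"))
```

In this sketch, a positively charged stretch before a long hydrophobic segment pulls the N-terminal loop toward the inside state, and the hydrophobic segment itself is labelled as a helix. The parameters are tuned by hand for the example; a usable predictor would estimate them from proteins of known topology.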