How to catch flu (Wellcome Images)

I’ve talked several times (for example, here, here, and here) about predicting cytotoxic T lymphocyte (CTL) epitopes, and emphasized how hard it is (or, at least, how poor the tools are). Here’s an example of why it’s difficult.

(Quick review: CTL recognize virus-infected cells by screening small peptides that are bound to the class I major histocompatibility complex [MHC class I]. The peptides are created by destruction of proteins in the target cell. There’s a handy guide to antigen presentation here, if that helps put things into context.)

In my previous post on the subject, I listed a bunch of different factors that need to be incorporated in the predictions. Number 7 was “Binding to the MHC complex in the ER”, and I commented that peptide binding to MHC class I is probably the second-best understood step in the pathway (behind TAP transport, if you’re keeping score at home).

A paper from earlier this year [1] tried to identify CTL epitopes in influenza viruses. Lots of papers do this, but most don’t follow up with actual, complete tests, which are expensive and difficult. Wang et al. did follow through.

They started by looking simply at binding to MHC class I alleles. Without going into details (they were looking for conserved epitopes that matched HLA supertypes, if anyone cares), they identified 167 peptides that they predicted should bind to the various MHC class I alleles, and then they tested them to see if they actually did bind. (They used NetMHC 3.0 [2] to predict binding.)

Of the 167 predicted binders, 39 failed to bind altogether, and another 39 only bound very weakly. That leaves 89 peptides (just 53% of their tested pool) that were authentic binders.
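For anyone who wants to check the arithmetic, here is a minimal sketch of the tally above. The variable names are mine, not from the paper; the counts are the ones quoted in this post.

```python
# Counts as quoted above for Wang et al.; variable names are illustrative only.
predicted = 167      # peptides predicted to bind
non_binders = 39     # failed to bind altogether
weak_binders = 39    # bound only very weakly

# Whatever is left counts as an authentic binder.
authentic = predicted - non_binders - weak_binders
fraction = authentic / predicted

print(f"{authentic} of {predicted} predicted peptides bound ({fraction:.0%})")
```

So nearly half of the predicted binders dropped out at the very first experimental check.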

Influenza viruses infecting cells of the trachea

Then, they tested to see if their peptides actually reacted with CTL from healthy donors. (They assumed that their healthy donors were immune to influenza A, which is reasonable but not guaranteed, so this is a particularly conservative test, I think.) Just 13 of their peptides were positive by this test (7.8% of their total predicted pool). Unexpectedly, two peptides that were non-binders triggered a response. Wang et al. speculated that the very low affinity binding was enough for the CTL, but I wonder if this was a contamination issue: CTL are famously sensitive, and it’s well known that tiny amounts of contaminating peptides in a synthetic prep are enough to trigger CTL, even if they’re barely detectable by other means.


The paper I’ve thought of as the record-holder for accuracy (if I’m being generous with their denominator) is Kotturi et al. [3], whose prediction was correct for 25 of 160 potential peptides (about 16%), roughly twice as good as the influenza predictions here. But Kotturi et al. were dealing with just two MHC class I alleles, H-2Db and H-2Kb, and those are very intensively studied alleles. Wang et al. were not only looking at multiple alleles, they were also using supertype approaches that let them cover almost all (>99%) of the population, which is a much more difficult prediction. To me, then, their predictions are remarkably successful.
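A quick sketch of that comparison, using only the counts quoted in this post. The `hit_rate` helper and the variable names are my own, and the two denominators are not strictly equivalent, so treat the ratio as a rough guide rather than a head-to-head benchmark.

```python
def hit_rate(confirmed: int, predicted: int) -> float:
    """Fraction of predicted peptides that turned out to be real epitopes."""
    return confirmed / predicted

# Counts as quoted in this post; names are illustrative only.
wang = hit_rate(13, 167)       # influenza, many alleles via HLA supertypes
kotturi = hit_rate(25, 160)    # LCMV, two mouse alleles (H-2Db and H-2Kb)

ratio = kotturi / wang

print(f"Wang et al.:    {wang:.1%}")
print(f"Kotturi et al.: {kotturi:.1%}")
print(f"Ratio:          {ratio:.1f}x")
```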

But still: just under 8% of their predictions were correct. And even limiting the prediction to a single step in the complex pathway (just looking at MHC class I binding of the peptides), they’re barely above 50% accuracy.

It’s a hard job. But I have to say that the field is progressing with impressive speed; these predictions are much more accurate than I would have expected five years ago.

  1. Wang, M., Lamberth, K., Harndahl, M., Roder, G., Stryhn, A., Larsen, M. V., Nielsen, M., Lundegaard, C., Tang, S. T., Dziegiel, M. H., Rosenkvist, J., Pedersen, A. E., Buus, S., Claesson, M. H., and Lund, O. (2007). CTL epitopes for influenza A including the H5N1 bird flu; genome-, pathogen-, and HLA-wide screening. Vaccine 25, 2823-2831.
  2. NetMHC is based on these three references, which I’m including as a note to myself: (1) Nielsen, M., Lundegaard, C., Worning, P., Hvid, C. S., Lamberth, K., Buus, S., Brunak, S., and Lund, O. (2004). Improved prediction of MHC class I and class II epitopes using a novel Gibbs sampling approach. Bioinformatics 20, 1388-1397. (2) Nielsen, M., Lundegaard, C., Worning, P., Lauemoller, S. L., Lamberth, K., Buus, S., Brunak, S., and Lund, O. (2003). Reliable prediction of T-cell epitopes using neural networks with novel sequence representations. Protein Sci 12, 1007-1017. (3) Buus, S., Lauemoller, S. L., Worning, P., Kesmir, C., Frimurer, T., Corbet, S., Fomsgaard, A., Hilden, J., Holm, A., and Brunak, S. (2003). Sensitive quantitative predictions of peptide-MHC binding by a ‘Query by Committee’ artificial neural network approach. Tissue Antigens 62, 378-384.
  3. Kotturi, M. F., Peters, B., Buendia-Laysa, F., Jr., Sidney, J., Oseroff, C., Botten, J., Grey, H., Buchmeier, M. J., and Sette, A. (2007). The CD8 T-cell response to lymphocytic choriomeningitis virus involves the L antigen: uncovering new tricks for an old virus. Journal of Virology, May 2007, 4928-4940.