The folks associated with the IEDB (Immune Epitope Database) have published a very nice and useful guide to all the serious contenders in the immune database field.1 If you have a particular need, this is an excellent place to start when choosing the appropriate database. (It’s an open access article, too.)

They’ve obviously looked in a lot more depth than I have, but they make a few comments that my more limited assessment strongly supports:

… our survey highlights clear shortcomings in the predictive tools available. Namely, MHC class II and B cell epitope predictive tools merit improvement, both in terms of predictive performance and, for MHC class II, in terms of coverage of species and alleles currently available.1

They comment that most (80%) of citations of the databases are attributable to “practical applications”, which I take to mean direct use of the prediction tools (identification of epitopes in new flu strains, for example), construction of new tools (e.g. better prediction of epitopes), and maybe papers that review the databases (which is rather circular, I think).

Hearn et al. 2009, Figure 8
FIGURE 8. Aminopeptidases influence amino acid frequency N-terminal of naturally presented MHC I epitopes. Regions N-terminal to naturally processed MHC I epitopes, or selected randomly from protein precursors, were identified as described in Materials and Methods. A, Probability of divergence occurring randomly (Chi2 test) vs position relative to epitope start site. B, Observed amino acid frequencies at position 1 (P1) relative to epitope start vs no divergence from background (45-degree line). The amino acids that diverge +/-2 SDs from background frequency are indicated.2

The other 20% of citations are, I guess, using the databases to generate and test hypotheses. This seems high to me; I don’t think I’ve seen very much basic science in immunology that builds on this sort of resource. I think we’re reaching the point where these databases are usable to test and develop new hypotheses, though, and I hope to see more of this in the near future.

One example is our recent paper,2 where I used the IEDB to ask what influence ER aminopeptidases have on MHC class I epitopes (see the figure above). (If you care, we concluded that aminopeptidases were probably most important for trimming N-terminal extensions of up to three residues, and that there was a global preference for a half-dozen amino acids and a bias against valine and, of course, proline. Proline is resistant to aminopeptidase trimming in general, so that finding supported the approach.)
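For the curious, the general idea (tally the residues just N-terminal of known epitopes, then compare them with background amino acid frequencies) can be sketched in a few lines of Python. This is only a toy illustration and not the code used for the paper: the function names, the example sequences, and the choice of whole-protein composition as the background are all assumptions I've made up here, and it only computes the chi-squared statistic rather than a P value.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def flanking_residues(protein, epitope, offset=1):
    """Return the residue `offset` positions N-terminal of each
    occurrence of `epitope` in `protein` (offset=1 is P1)."""
    residues = []
    start = protein.find(epitope)
    while start != -1:
        pos = start - offset
        if pos >= 0:  # skip epitopes too close to the N-terminus
            residues.append(protein[pos])
        start = protein.find(epitope, start + 1)
    return residues


def frequencies(residues):
    """Fraction of each standard amino acid among `residues`."""
    counts = Counter(r for r in residues if r in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def chi_square_statistic(observed_counts, background_freqs):
    """Pearson chi-squared statistic of observed counts against the
    counts expected from background frequencies. Amino acids with
    zero background frequency are ignored."""
    total = sum(observed_counts.values())
    stat = 0.0
    for aa, p in background_freqs.items():
        expected = p * total
        if expected > 0:
            stat += (observed_counts.get(aa, 0) - expected) ** 2 / expected
    return stat


if __name__ == "__main__":
    # Hypothetical toy data: made-up precursor sequences and "epitopes".
    proteins = ["MKTAYIAKQRQISFVKSHFSRQ", "MSLLTEVETYVLSIIPSGPLK"]
    epitopes = ["YIAKQRQIS", "EVETYVLSI"]

    p1 = []
    for protein in proteins:
        for epitope in epitopes:
            p1.extend(flanking_residues(protein, epitope, offset=1))

    # Assumption: background is the overall composition of the precursors.
    background = frequencies("".join(proteins))
    print("P1 residues:", p1)
    print("chi-squared:", chi_square_statistic(Counter(p1), background))
```

With real data you would repeat this for each offset upstream of the epitope start and use a proper statistics library to turn the statistic into a probability.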

We weren’t the first to use this general approach (Schatz et al.3 came up with the same idea independently and published before we did), but we used the IEDB instead of the SYFPEITHI database, and were able to identify many more epitopes. (My last run at the database coincided with a revision of the database, and half the search tools I needed stopped working, which was annoying, but the manager [Randi Vita] was very helpful and we managed to grind through the queries, albeit in slow motion compared to earlier runs.)

  1. Salimi, N., Fleri, W., Peters, B., & Sette, A. (2010). Design and utilization of epitope-based databases and predictive tools. Immunogenetics, 62 (4), 185-196. DOI: 10.1007/s00251-010-0435-2
  2. Hearn, A., York, I., & Rock, K. (2009). The Specificity of Trimming of MHC Class I-Presented Peptides in the Endoplasmic Reticulum. The Journal of Immunology, 183 (9), 5526-5536. DOI: 10.4049/jimmunol.0803663
  3. Schatz, M. M., Peters, B., Akkad, N., Ullrich, N., Martinez, A. N., Carroll, O., Bulik, S., Rammensee, H. G., van Endert, P., Holzhütter, H. G., Tenzer, S., & Schild, H. (2008). Characterizing the N-terminal processing motif of MHC class I ligands. The Journal of Immunology, 180 (5), 3210-3217. PMID: 18292545