Mystery Rays from Outer Space

Meddling with things mankind is not meant to understand. Also, pictures of my kids

June 29th, 2007

Yet another new trick: A new E1

The new tricks for the ubiquitin/proteasome system are coming thick and fast these days. Hot on the heels of the discovery of non-lysine ubiquitination and thymus-specific proteasome subunits, today’s issue of Nature reports that there’s a second E1! That may not be as startling to everyone as it is to me, but it’s yet another example of well-established observations being overturned.

Ubiquitin is attached to its substrate proteins through a multi-enzyme cascade. First, ubiquitin is “activated” by a ubiquitin-activating enzyme; then the ubiquitin is transferred to a ubiquitin-conjugating enzyme which, in combination with a ubiquitin ligase, transfers the ubiquitin to a specific substrate. The ubiquitinated substrate then does whatever it’s supposed to do when it’s ubiquitinated: it gets degraded by the proteasome, perhaps, or trundles off to a new spot in the cell.

The ubiquitin ligase (an “E3” enzyme) is mainly responsible for the specificity of the reaction; there are thousands of ubiquitin ligases in the human genome, and probably each interacts with a small number of specific substrates.

Ubiquitin-conjugating enzymes (“E2” enzymes, or UBCs), the second link in the chain, are less abundant and less specific; there are a couple dozen of them. Each UBC interacts with a number of ubiquitin ligases, though the relationships here are not well understood in general.

The first link in the chain is the ubiquitin activating enzyme, the E1 (the gene in humans is “Ube1”). There’s only one of them — say all the reviews.1 If you knock out the single E1, as in some temperature-sensitive cell lines, then the cells die in a hurry.

The new discovery is described in:

Nature 447, 1135-1138 (28 June 2007) (doi:10.1038/nature05902)

Dual E1 activation systems for ubiquitin differentially regulate E2 enzyme charging

Jianping Jin, Xue Li, Steven P. Gygi & J. Wade Harper

This is a gene called “Uba6”. It has the same three domains that Ube1 does, it’s about 40% identical to Ube1, and it’s found in vertebrates and sea urchins but not in other invertebrates or in fungi (both of which do, of course, have their own versions of Ube1). (The figure on the right is from the paper, showing the relationships between some E1-related genes of various species; note that the Uba6 genes are most closely related to Ube1 genes, even from different species.)

Uba6 is found throughout the body (so unlike the mouse testis-specific version of E1, it’s not tissue-specific), though at much lower levels than E1 Classic. It acts like an authentic E1, in that it charges ubiquitin, but (and this is a cool and critical point) it seems to be specific for different ubiquitin-conjugating enzymes; two of the three UBCs they tested were strictly dependent on Ube1, while the third was strictly dependent on Uba6. It’s not merely a redundant, backup E1.

What are the implications? Once again there’s a technical point: The authors point out that “it is conceivable that certain pathways that were previously thought to be independent of ubiquitin on this basis may nevertheless require ubiquitin by means of a Uba6-dependent pathway.” But the bigger question is why vertebrates need two E1s, where invertebrates get along fine with just one.2 The authors propose (a little feebly, I think) that this may allow differential regulation: “One possibility is that Uba6 and Ube1 are differentially regulated by upstream signalling pathways to enhance flux through a specific conjugating pathway under particular circumstances.” That’s a pretty vague suggestion that covers a host of possibilities.

Still, being able to tease apart different pathways should be a very useful way of tracking down their function, and also may help actually understand what some of the different UBCs are doing. Might this be a way to organize some of the myriad ubiquitin functions? Could Uba6 lead to a different set of ubiquitin reactions — say, trafficking instead of degradation?

  1. Mice, but not most other species (including humans) express a second E1 in their testes. Also, there are a couple of transcriptional variants of human E1, but there doesn’t seem to be much functional significance of that, as far as I know.
  2. Unless, of course, there’s another E1 hiding in an invertebrate genome that we haven’t found yet.

June 27th, 2007

More new tricks: Thymus-specific proteasomes

It seems like a theme here has been old dogs/new tricks, what with peptide splicing by proteasomes, new ubiquitin bonds, and new LCMV epitopes. Proteasomes, which have been studied intensively for well over 20 years, also showed a new trick recently.1

Proteasomes chop up proteins into peptides. Immunologists tend to think this is to make peptides that cytotoxic T lymphocytes (CTL) can recognize, but that’s a tiny, tiny fraction of a proteasome’s output. Mostly, proteasomes are there to destroy proteins that have reached the end of their useful life span. In fact, proteasomes are really designed to be (from the immunologists’ viewpoint) inefficient at producing CTL peptides.2 Almost all cells, almost all the time, are not infected by any viruses, and will never make use of the CTL recognition system, so any peptides that system draws away from the cell are a strain on the natural recycling pathway.

Proteasome beta subunits

But what happens when a neighbouring cell is infected by a virus? That changes the whole equation. Now, local cells are much more likely to be infected, and it would behoove3 these neighbouring cells to divert more resources to antiviral defense systems. So, in the presence of interferon (which is produced in the presence of a viral infection), cells do many, many immune-related things, and one of those things is to switch off one set of proteasomes and turn on a different set, evocatively known as “immunoproteasomes”.4 Immunoproteasomes differ from constitutive proteasomes in that they have a different set of catalytic subunits, so they have different cleavage preferences and they’re much more likely to make peptides that bind to MHC class I molecules and can therefore be examined by CTL. (Image on the left is of the three constitutive beta (catalytic) subunits; taken from: Chemistry & Biology 9:655-662 (May 2002). Probing Structural Determinants Distal to the Site of Hydrolysis that Control Substrate Specificity of the 20S Proteasome. Michael Groll, Tamim Nazif, Robert Huber and Matthew Bogyo)

This much has been known for 15 years or more. There are knockout mice, biochemical studies of the different kinds of proteasomes, calculations of their peptides’ lengths, and so on, and really the glitter has long since worn off proteasome subunits.

Except that Keiji Tanaka’s lab just turned up a new one.5 While rummaging through the genome, they noticed (right next door to one of the constitutive catalytic subunits of the proteasome) something that looked a lot like another, undescribed, proteasome catalytic subunit. On testing, sure enough, that’s exactly what it is. It incorporates into proteasomes, changes the catalytic activity of the proteasome, replaces its constitutive version … looks just like an interferon-inducible subunit. But it’s not interferon inducible. Instead, it’s tissue-specific; it’s only found in the thymus. (Image below is a small part of human chromosome 14, showing the new ß5 subunit, in blue, next door to its constitutive version, PSMB5, in red. Drawn with XPlasMap v.0.95, using the GenBank genomic sequence.)

Human chromosome 14

Why a thymus-specific proteasome subunit? The thymus is where T cells (including CTL) grow up; it’s where they learn how to recognize peptides, how not to recognize peptides that are part of the normal self, how to react only to abnormal (viral, tumour) peptides.6 So an obvious question was: Is this new subunit required for T cells to mature? And it is; quite dramatically so. Without this subunit, numbers of CTL exiting the thymus are reduced by maybe 75%. CD4 T cells, which are not believed to be dependent on proteasome-derived peptides for peptide recognition, weren’t affected. By comparison, knocking out the interferon-inducible catalytic subunits makes a small, often barely-detectable difference in CTL numbers and target recognition.7

Why is this one little subunit so important for CTL generation? Tanaka’s group proposed that this change in catalytic subunits makes thymic proteasomes generate a different set of peptides: peptides that are the opposite of immunoproteasome-generated peptides. Instead of being customized for MHC class I binding, these peptides are customized to be poor binders. By having low-affinity peptides, T cells would have their positive selection enhanced:

Considering that proteasomes are essential for the production of MHC class I ligands and that ß5t specifically attenuates the peptidase activities that cleave peptide bonds after hydrophobic amino acid residues, it is possible that thymoproteasomes predominantly produce low-affinity MHC class I ligands rather than high-affinity ligands in cTECs, as compared with constitutive- and immunoproteasomes, thereby supporting positive selection.

OK, if I’m interpreting this correctly, I’m not buying it. Clearly ß5t has a strong effect on CTL generation in the thymus; but I don’t see how simply generating low-affinity peptides can be the cause.

First of all, the number of CTL coming out is reduced to 25% of normal. But they show in their Supplemental Figures that only about 20% of proteasomes in the thymic cells have the thymus-specific subunit. The vast majority of proteasomes in these cells are either constitutive or immunoproteasomes, not thymoproteasomes. Four times as many high-affinity peptides (80% of proteasomes versus 20%) will beat out low-affinity peptides every time.

Second, it’s well known that when MHC class I is associated with low-affinity peptides, the peptide falls off more readily, and the MHC class I is no longer recognized by most antibodies. Therefore, if ß5t forces MHC class I to have low-affinity peptides, eliminating ß5t should increase detectable surface levels of MHC class I. But Murata et al. actually measured cell-surface MHC class I levels in wild-type and knockout mice, and there’s no difference (again this is in their Supplemental Figures). So there’s no indication that there is a significant amount of low-affinity peptide involved. (Image on the right is from Murata et al’s paper, showing the distribution of ß5t in the thymic cortex vs. medulla)

So I don’t think that thymic epithelial cells are normally coated with low-affinity peptides that are important for CTL positive selection. Michael Bevan has a commentary8 on Murata et al, and he has a more plausible spin on the same question:

However, if the thymus cortical epithelium expresses a unique range of self peptides as Murata et al. suggest, this raises the possibility that positive selection may be mediated by self antigens that are not seen outside the thymus. Such sequestration of the positively selecting peptide may provide a greater safety window between high and low affinity to better guard against activated T cells cross-reacting on self antigens and causing autoimmunity.

I think this is a subtly but critically different suggestion. (Or, I could be misunderstanding either Bevan’s or Tanaka’s argument.) Tanaka’s group seem to be proposing that because of ß5t, the peptides associated with thymic epithelial cells are low-affinity. Bevan seems to be suggesting that there is a specific low-affinity peptide (or a small number of them) that are thymus-specific, and that are especially designed for T cell selection. This starts to sound a little reminiscent of CLIP for the MHC class II system,9 though not identical. I find this a more attractive model, and it’s testable (as is Tanaka’s model, of course). It makes me wonder — if it is CLIP-like — what the CLIP analogue peptide is. It would have to bind to a wide range of MHC class I alleles that have very different binding motifs … It occurs to me that it wouldn’t have to be a low-affinity binding at all, nor for that matter would it even have to bind entirely to the peptide-binding groove of MHC class I (maybe do something analogous to superantigens). That would get around the motif problem.

Should be interesting to see what comes up in the next year or two.

  1. Well, they’ve had the trick for the past 400 million years, but we just found out about it.
  2. Of course, it’s the other way around: CTL are designed to be inefficient at using proteasomes’ peptides, because proteasomes are phylogenetically much older than CTL; proteasomes are present in Archaebacteria, plants, yeast, things that have never even come close to a CTL. When CTL and the class I MHC system arose, they were optimized to not suck too many peptides out of the protein recycling pathway.
  3. This footnote exists solely to give me another chance to say “behoove”.
  4. A term coined by Keiji Tanaka, from whose lab today’s paper comes: J Leukoc Biol. 1994 Nov;56(5):571-5.
  5. Regulation of CD8+ T Cell Development by Thymus-Specific Proteasomes. Shigeo Murata, Katsuhiro Sasaki, Toshihiko Kishimoto, Shin-ichiro Niwa, Hidemi Hayashi, Yousuke Takahama, Keiji Tanaka. Science 316:1349-1353 (DOI: 10.1126/science.1141915)
  6. Less delicately, they never actually learn, but they’re killed if they fail the test.
  7. Immunity. 1994 Oct;1(7):533-41; Science. 1994 Aug 26;265(5176):1234-7; and J Immunol. 2006 Jun 1;176(11):6665-72.
  8. Science 316:1291-1292 (June 2007)
  9. Immunity. 1999 Jan;10(1):83-92. Thymic selection by a single MHC/peptide ligand: autoreactive T cells are low-affinity cells. Lee DS, Ahn C, Ernst B, Sprent J, Surh CD. And Eur J Immunol. 2000 Dec;30(12):3542-51. CLIP-derived self peptides bound to MHC class II molecules of medullary thymic epithelial cells differ from those of cortical thymic epithelial cells in their diversity, length, and C-terminal processing. Kasai M, Kropshofer H, Vogt AB, Kominami E, Mizuochi T.

June 25th, 2007

Immunodominance: Part I (Some background)

Cytotoxic T lymphocytes (CTL) recognize peptides that are about 9 amino acids long. There are lots of constraints on which peptides can possibly be presented; the most important factor is whether the peptide can bind to one of the MHC class I alleles that the host expresses. Still, a generic virus will have hundreds or more likely thousands of peptides that are reasonable CTL targets. Of those peptides, how many are actually recognized by CTL? Of those that are recognized by CTL, how many are recognized effectively (enough to trigger a detectable response)? Does it make any difference which, and how many, are recognized? And, most interestingly, why are so few peptides recognized?

There are technical problems with this question. One huge problem is just how to identify the peptides that are recognized. Typically, you’d have to synthesize peptides from the viral genome, mix them with CTL from an immune host, and figure out which of the peptides activate the CTL. However, if you try to synthesize all the possible peptides from a viral genome, you’ll have many thousands of peptides: Expensive, to say nothing of the work involved in screening.

People have tried to get around this in two ways. One is to use longer peptides. Traditionally, screening has used 15mers rather than 9mers. Using overlapping 15mers instead of every possible 9mer can cut your screening down into a relatively manageable range — a couple thousand or fewer. Still a big job, but practical. One problem with this, of course, is that 15mers shouldn’t work at all for MHC class I! MHC class I alleles (in contrast to MHC class II) rarely bind peptides anywhere near that long; rarely much more than 11 or so amino acids long. So what you’re counting on, with your 15mers, is that either they’re contaminated with incomplete synthesis products (a common situation), or that they’re partially degraded in the medium when you add them to your cells. In either case, you really don’t have a good idea what your actual coverage of the viral proteome is.
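To put rough numbers on that saving, here is a minimal Python sketch that counts every possible 9mer in a protein against a tiled set of 15mers. The 10-residue overlap is the figure Kotturi et al. report for their library (quoted further down); the function names and the 500-residue toy "protein" are hypothetical, made up purely for illustration.

```python
def all_kmers(seq, k=9):
    """Every contiguous k-residue window in a protein sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def tiled_peptides(seq, length=15, overlap=10):
    """Fixed-length peptides, each sharing `overlap` residues with the next."""
    step = length - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than peptide length")
    if len(seq) <= length:
        return [seq]
    starts = list(range(0, len(seq) - length + 1, step))
    # Add a final window so the C-terminal residues are never left out.
    if starts[-1] != len(seq) - length:
        starts.append(len(seq) - length)
    return [seq[s:s + length] for s in starts]


# Toy 500-residue sequence (hypothetical: just the 20 amino acids repeated).
protein = "ACDEFGHIKLMNPQRSTVWY" * 25
ninemers = all_kmers(protein, 9)
fifteenmers = tiled_peptides(protein, 15, 10)
print(len(ninemers), "9mers versus", len(fifteenmers), "overlapping 15mers")
```

With a step of 5 residues, every 9mer (in fact anything up to 11 residues) sits inside at least one of the 15mers, which is why the tiled set can stand in for the exhaustive one, provided the long peptides actually get trimmed down somewhere along the way.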

Another approach is to try to cut down your required peptides, by trying to predict which ones could possibly bind to your MHC class I and only (or mainly) synthesizing those. The problem here is that for all the progress in understanding MHC class I binding motifs, there are lots of high-affinity peptides for various MHC class I alleles that don’t even come close to matching the putative binding motif. Your coverage is only as good as your predictions, and your predictions will miss some genuine epitopes.
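A toy example shows how a motif filter can miss a real epitope. The motif below is a deliberately simplified textbook version of the H-2Db anchor preferences for 9mers (Asn at position 5, Met/Ile/Leu at the C-terminus); it is an assumption for illustration, real prediction programs use scoring matrices rather than a yes/no rule, and the function name is hypothetical. The two peptides are well-known H-2Db-presented LCMV epitopes, NP396-404 and GP33-41.

```python
# Simplified H-2Db 9mer motif (an assumption for illustration only):
# position 5 must be Asn; position 9 must be Met, Ile or Leu.
DB_MOTIF = {5: {"N"}, 9: {"M", "I", "L"}}


def matches_motif(peptide, motif=DB_MOTIF, length=9):
    """True if the peptide has the right length and an allowed residue at each anchor."""
    if len(peptide) != length:
        return False
    return all(peptide[pos - 1] in allowed for pos, allowed in motif.items())


known_db_epitopes = {
    "NP396-404": "FQPQNGQFI",
    "GP33-41": "KAVYNFATC",
}

for name, peptide in known_db_epitopes.items():
    verdict = "predicted" if matches_motif(peptide) else "missed"
    print(f"{name} {peptide}: {verdict}")
```

GP33-41 ends in Cys rather than one of the preferred C-terminal residues, so a strict filter like this throws it out even though it is a genuine epitope.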

(Another possible problem with both of these approaches is that they’ll miss peptides that are not part of the viral proteome. That includes things like spliced peptides (see my previous post on that), out-of-frame peptides,1 and post-translationally modified peptides that don’t match the encoded sequence — the most famous example probably being glycosylation sites where the carbohydrate is stripped off the Asn in the cytosol to leave a non-templated Asp.2 )

This brings me to Kotturi et al, a paper I’ve mentioned here before:

The CD8 T-Cell Response to Lymphocytic Choriomeningitis Virus Involves the L Antigen: Uncovering New Tricks for an Old Virus

Maya F. Kotturi, Bjoern Peters, Fernando Buendia-Laysa, Jr., John Sidney, Carla Oseroff, Jason Botten, Howard Grey, Michael J. Buchmeier, and Alessandro Sette

Journal of Virology, May 2007, p. 4928–4940 (doi:10.1128/JVI.02632-06)


Lymphocytic choriomeningitis virus (invariably abbreviated to LCMV for obvious reasons) is one of the classic models of viral immunity. One of its many nice qualities3 is that it induces a tremendous (i.e. easily measured) immune response. At the peak of the immune response, 6 to 8 days after infection, some 80 to 95% of a mouse’s CD8 +ve T cells4 may be reactive with LCMV. That makes it relatively easy to detect individual components of the response. In other words, you can readily define individual peptide epitopes within the CTL response to LCMV. Another nice thing about the virus is that it’s usually cleared, if you infect an adult mouse, so you can then move on to analyze memory responses, but I won’t get into that today. (The image on the left is of an arenavirus [LCMV is in the arenavirus family] from Michael Buchmeier’s lab at Scripps.)

LCMV peptides

Because LCMV has been studied for a while, and because the CTL response is so large, there have been a bunch of viral epitopes defined; in the commonly-used C57BL/6 mouse, 7 peptides have been known to induce CTL reactivity since 1998 [5]. Seven epitopes is actually a fair number (most viruses don’t have that many defined epitopes for just two MHC class I alleles), but three more epitopes were added earlier this year [6], bringing the total to 10 defined epitopes that bind to the B6 mouse MHC class I alleles. (The image on the right shows two of the best-recognized peptides from LCMV glycoprotein, in the shape they assume when bound to particular MHC class I alleles. Taken from: A structural basis for LCMV immune evasion: subversion of H-2D(b) and H-2K(b) presentation of gp33 revealed by comparative crystal structure analyses. Achour A, Michaëlsson J, Harris RA, Odeberg J, Grufman P, Sandberg JK, Levitsky V, Kärre K, Sandalova T, Schneider G. Immunity. 2002 Dec;17(6):757-68.)

However, these 10 epitopes only account for around 80% of the CTL response to LCMV — that is, if you take all the CTL that light up in response to an authentic LCMV-infected cell, about a fifth of those will not light up in response to any of the known epitopes. What are those remaining guys reacting to? Kotturi et al went looking for the missing triggers.

They used both of the approaches I’ve mentioned here. They not only screened with overlapping 15mers covering much of the LCMV proteome, they used MHC prediction programs to identify particular candidates for CTL epitopes and screened those particularly. All in all, they looked at 1064 peptides: “A total of 400 Kb and Db algorithm-selected peptides, along with a set of 664 15-mer peptides, overlapping by 10 amino acids, spanning the entire LCMV proteome, were synthesized.”

Now, remembering that this is an intensively-studied virus, one that’s been a workhorse of immunology for decades, how many new epitopes do you think they turned up? Ten are already known. Kotturi et al turned up another 19 — they nearly tripled the number of MHC class I epitopes for LCMV. That’s the first remarkable thing; it suggests that probably most claims for the number of viral peptides that are recognized are drastic underestimates. (It also suggests that cross-reactive T cells are not common, but that’s another story.)

The next interesting point about their paper is where they got their hits — from their predicted epitopes, or from their 15mers? Well, the predictions did pretty well:

The 15-mer approach including truncated peptide sets required synthesis and testing of 1,214 [7] peptides and identified approximately 65.2% of the overall response. By contrast, the predictive approach required synthesis and testing of 400 peptides (or 160 if only the top 1.2% [8] from each allele would have been synthesized) and identified approximately 88.9% of the total response.
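For a back-of-the-envelope comparison, the figures in that quote can be turned into coverage per 100 peptides synthesized. This is nothing more than arithmetic on the two pairs of numbers quoted from Kotturi et al. (1,214 peptides for 65.2%, 400 peptides for 88.9%); the helper name is hypothetical, and the 160-peptide scenario is left out because the quote gives no separate coverage figure for it.

```python
def coverage_per_100(percent_of_response, peptides_synthesized):
    """Percentage points of the CTL response identified per 100 peptides made."""
    return 100 * percent_of_response / peptides_synthesized


# Numbers as quoted from Kotturi et al.
fifteenmer = coverage_per_100(65.2, 1214)  # overlapping 15mers plus truncations
predicted = coverage_per_100(88.9, 400)    # algorithm-selected peptides

print(f"15mer screen:     {fifteenmer:.1f} points per 100 peptides")
print(f"predicted screen: {predicted:.1f} points per 100 peptides")
print(f"ratio: {predicted / fifteenmer:.1f}x")
```

On these numbers the predicted set returns roughly four times as much of the response per peptide synthesized, which is the sense in which the predictions "did pretty well".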

But the predictions did miss several true epitopes; some of the genuine MHC class I epitopes just don’t look like things that are supposed to bind to H-2Kb. If you want to pick up on things that are not, as yet, predictable, you still need a brute-force approach.

So of the hundreds or thousands of potential LCMV epitopes, there are 29 that actually get recognized.9 That’s a fair number of epitopes. But here’s the next part (in fact, this is the whole point of this post). Look at the distribution of CTL responses to each peptide. Here’s what it looks like as a fraction of the total CTL response to LCMV:


The top 2 peptides of the 29 cover 25% of the response; the top 4, 50%. You need to put the bottom 18 peptides together to catch up to the first two and make up 25% of the response! [10] This, ladies and gentlemen, is what we call immunodominance. The top handful of peptides are immunodominant: in a C57BL/6 mouse, those peptides will invariably be the targets of the vast majority of the CTL response. [11] The other peptides will cause a response that, while detectable, is much lower than that to the dominant peptides.


Well, we don’t know, but at least we think we know some of the possible explanations. More in a later post.

  1. Nilabh Shastri has probably been the strongest supporter of this concept. See, for example, Constitutive display of cryptic translation products by MHC class I molecules. Schwab SR, Li KC, Kang C, Shastri N. Science. 2003 Sep 5;301(5638):1367-71. I’m not yet convinced that this is as common as he argues, but it clearly can happen.
  2. There are a number of examples of this now. The first demonstration that it can happen was: An HLA-A2-restricted tyrosinase antigen on melanoma cells results from posttranslational modification and suggests a novel pathway for processing of membrane proteins. Skipper JC, Hendrickson RC, Gulden PH, Brichard V, Van Pel A, Chen Y, Shabanowitz J, Wolfel T, Slingluff CL, Boon T, Hunt DF, Engelhard VH. J Exp Med. 1996 Feb 1;183(2):527-34.
  3. For an immunologist, anyway.
  4. Quantitating the magnitude of the lymphocytic choriomeningitis virus-specific CD8 T-cell response: it is even bigger than we thought. J Virol. 2007 Feb;81(4):2002-11. Masopust D, Murali-Krishna K, Ahmed R.
  5. van der Most, R. G., K. Murali-Krishna, J. L. Whitton, C. Oseroff, J. Alexander, S. Southwood, J. Sidney, R. W. Chesnut, A. Sette, and R. Ahmed. 1998. Identification of Db- and Kb-restricted subdominant cytotoxic T-cell responses in lymphocytic choriomeningitis virus-infected mice. Virology 240: 158–167.
  6. Masopust, D., K. Murali-Krishna, and R. Ahmed. 2007. Quantitating the magnitude of the lymphocytic choriomeningitis virus-specific CD8 T-cell response: it is even bigger than we thought. J. Virol. 81:2002–2011. Yes, same reference as before, but I can’t bear to struggle with these footnotes any more.
  7. The 664 was their starting pool of 15mers; to actually find the epitopes, they had to synthesize sub-peptides from within the positive 15mers.
  8. They broke down the success rate by the rank of the prediction and found that in fact they could have covered most of their hits by using fewer peptides from the most confident predictions.
  9. There may even be a handful of others; Kotturi et al. don’t account for a few percent of CTL responses even with all the known epitopes. But that may be a sensitivity issue, so let’s assume that the 29 cover everything.
  10. So, even if the missing few percent of responses are real, one would expect that it would be divided up among many (dozens? hundreds?) individual peptides, perhaps all below the limits of sensitivity for present assays.
  11. As a side note, even though the predictions did reasonably well (surprisingly well, to me), within their predictions the rank wasn’t a good predictor of immunodominance. For example, the most dominant peptides (50% of the total response) ranked 2, 25, 14, and 28 as predicted epitopes, whereas the three peptides with the highest prediction rank only covered 3.5% of the total response all together.

June 21st, 2007

Non-lysine ubiquitination

In a new and dynamic field, everything-you-know-is-wrong papers appear regularly, and no one is too surprised. Usually, once a field of study has been around for a while (twenty years or more, say) most of the basics are settled in, and when an e-y-k-i-w paper comes along there’s either great skepticism or great angst or both. But there are also some long-established fields where paradigms seem to be shattered on a weekly basis. Ubiquitin is one of those. Another universal rule of ubiquitin was disproven recently, and I for one just nodded thoughtfully, unsurprised.

The paper is:

Ubiquitination of serine, threonine, or lysine residues on the cytoplasmic tail can induce ERAD of MHC-I by viral E3 ligase mK3

Xiaoli Wang, Roger A. Herr, Wei-Jen Chua, Lonnie Lybarger, Emmanuel J.H.J. Wiertz, and Ted H. Hansen

The Journal of Cell Biology, Vol. 177, No. 4, May 21, 2007, 613-624 [1]

It’s particularly interesting to me because it’s yet another example of the important insights into cell biology that arise from antigen presentation in general and viral immune evasion in particular.

The paradigm that was overthrown is that “poly-ubiquitination, the process in which a chain of at least four ubiquitin peptides are attached to a lysine on a substrate protein, most commonly results in the degradation of the substrate protein via the proteasome.” (That’s from Wikipedia, my first source for oversimplified summaries that miss important advances and misinterpret what they do find. But other articles on ubiquitin include similar statements.) It’s the “lysine” bit I’m taking issue with in this case.

Ubiquitin was identified 30-odd years ago (picture on the left from the Nobel Prize web site). It’s a small, abundant protein that’s found in all eukaryotes, and it’s involved in protein destruction. That’s the last of the firm statements: For the rest of this paragraph, you should imagine every statement to be footnoted or qualified in some way, because throughout the past 30 years ubiquitin has made a habit of constantly revealing unexpected functions and new aspects. The simplest pattern is the one you’ll find in innumerable posters and illustrations (the one on the right is from Sigma-Aldrich, but there are scores of virtually-identical ones out there). In this pathway, ubiquitin is covalently attached to proteins, new ubiquitins are attached to the original one, a polyubiquitin chain forms, the proteasome recognizes the polyubiquitin chain, and the tagged protein is destroyed, releasing ubiquitin to kill again. It’s one way to put the regulation in your regulated proteolysis.

Poly-ubiquitin chain on Src

The canonical linkage for ubiquitin in this targeting to the proteasome is between a lysine on a substrate protein, and the terminal glycine on ubiquitin; followed by a chain of ubiquitins tagging onto the preceding ubiquitin’s lysine 48. The beautiful picture on the right is taken from the PDB’s Molecule of the Month from 2004, and shows “a string of ubiquitin molecules (colored pink and tan here, from PDB entry 1ubq) attached to old proteins, such as the src protein shown here (colored blue, from PDB entry 2src).”

There’s no room here to talk about all of the myriad other functions for ubiquitin that have been discovered over the years, but I want to highlight one in particular. In the mid 1990s [2] it was discovered that ubiquitination of cell-surface molecules didn’t necessarily lead to destruction by the proteasome, but rather to internalization and in some cases destruction by the lysosome. What’s more, this receptor targeting mode of ubiquitin often involves polyubiquitin chains extending from ubiquitin’s lysine 63, not 48 [3]. So already there was precedent for flexibility in ubiquitin linkages.

A more recent observation came in the late 1990s and early 2000s, with the unexpected discovery4 that ubiquitin doesn’t even need lysines on its substrate protein; instead, ubiquitin can link up with the amino terminal residue of the protein and form a polyubiquitin chain there.

(I’m surprised that this doesn’t seem to be more widely known. I’ve talked to several people who have come to me, scratching their heads, because they’ve mutated all the lysines in their protein and still see it being ubiquitinated and destroyed — they were quite amazed when I pointed this phenomenon out to them.)

Wang et al, in the paper I’m highlighting here, take this one step further. They were looking at the way mK3 (a viral immune evasion molecule that causes class I major histocompatibility complexes to be rapidly degraded by the proteasome) causes degradation of MHC class I molecules. To make a long story short (hey, you should read the paper yourself!), they mutated all the lysines in the substrate protein and still saw polyubiquitination and degradation. But when they also removed the threonines and serines (amino acids that are supposedly inaccessible to ubiquitin tagging), the protein was no longer polyubiquitinated, and was no longer degraded. This seems to be a novel chemical process for ubiquitin, too, not involving the usual amide linkage but instead involving an ester bond.

What are the implications of this? On a purely technical basis, of course, it means that all the people who have decided ubiquitin can’t be important for their protein because there are no lysines available, have to go back and actually test directly. A more interesting question is whether non-lysine ubiquitination is a normal cellular process that the virus is just piggy-backing on, or whether this doesn’t occur normally and mK3 somehow forces the system in a new and bizarre direction. My guess is that this is in fact a normal cellular capability (there are hints in the paper and from previous literature that this may not be an abnormal event, but as yet they’re only hints). If so, the next question is whether this is a normal function that’s specific for the particular form of degradation here: that is, ER-associated degradation (ERAD), which is the process by which secreted or transmembrane proteins get destroyed during their maturation in the endoplasmic reticulum. ERAD is a fairly new and active field, and there’s a lot that’s not understood about it yet. If non-lysine ubiquitination is ERAD-specific, or especially if it’s actually a marker for ERAD, that would be really interesting and might offer a handle for manipulating ERAD. Wang et al conclude:

It will be interesting to determine whether other ERAD pathways involving transmembrane protein substrates might also involve tail ubiquitination using non-K residues. Furthermore, the fact that mK3 has numerous viral (including MIR1) and cellular homologues makes it attractive to speculate that other ubiquitination-regulated processes use similar nonconventional methods of Ub conjugation.

  1. doi:10.1083/jcb.200611063
  2. I think the first papers were Hicke L and Riezman H (1996) Ubiquitination of a yeast plasma membrane receptor signals its ligand-stimulated endocytosis. Cell, 84, 277-287. and Strous GJ, Vankerkhof P, Govers R, Ciechanover A and Schwartz AL (1996) The ubiquitin conjugation system is required for ligand-induced endocytosis and degradation of the growth hormone receptor. EMBO J, 15, 3806-3812.
  3. Nice if now dated review: Dubiel W, Gordon C. Ubiquitin pathway: another link in the polyubiquitin chain? Curr Biol. 1999 Jul 29-Aug 12;9(15):R554-7.
  4. The EMBO Journal (1998) 17, 5964-5973. A novel site for ubiquitination: the N-terminal residue, and not internal lysines of MyoD, is essential for conjugation and degradation of the protein. Kristin Breitschopf, Eyal Bengal, Tamar Ziv, Arie Admon and Aaron Ciechanover. (doi:10.1093/emboj/17.20.5964); also see Ciechanover’s review in Trends Cell Biol. 2004 Mar;14(3):103-6 (doi:10.1016/j.tcb.2004.01.004)
June 18th, 2007

Looking under the streetlamps: NK cells and viral immune evasion

The immune system has lots of tentacles, and viruses have to avoid all of them if they’re going to successfully infect you. Antibodies, complement, cytotoxic T lymphocytes, innate factors, interferons, interleukins, tumor necrosis factor, toll-like receptors, natural killer cells … it’s a harsh world in there for a virus, but they’re up to the task. We still have lots to learn1 about how they do it.

In the last month or so there have been three nice papers talking about viral immune evasion of natural killer cells:

MCMV m157

Structural elucidation of the m157 mouse cytomegalovirus ligand for Ly49 natural killer cell receptors

Erin J. Adams, Z. Sean Juo, Rayna Takaki Venook, Martin J. Boulanger, Hisashi Arase, Lewis L. Lanier, and K. Christopher Garcia

PNAS 104;10128-10133 (2007)

Zoonotic orthopoxviruses encode a high-affinity antagonist of NKG2D

Jessica A. Campbell, David S. Trossman, Wayne M. Yokoyama, and Leonidas N. Carayannopoulos

J Exp Med 204:1311–1317 (2007)

Cytomegalovirus Evasion of Innate Immunity by Subversion of the NKR-P1B:Clr-b Missing-Self Axis

Sebastian Voigt, Aruz Mesci, Jakob Ettinger, Jason H. Fine, Peter Chen, Wayne Chou, and James R. Carlyle

Immunity 26:1–11 (2007)

(The figure on the left is the mouse cytomegalovirus immune evasion protein m157, from the pdb file from the Adams et al. paper. You can see the family relationship to class I major histocompatibility complexes.)

Immune evasion references

Before I talk about any of the papers, though, I want to say how cool it is to have multiple NK immune evasion papers all coming out around the same time. Our understanding of NK immune evasion has lagged well behind the T cell side (the figure on the right shows the cumulative number of papers2 on NK vs. cytotoxic T cell immune evasion). You can see how T cell research is maybe 5 years ahead of NK, which has really nothing earlier than 1993.3

Of course, the reason is that research generally works best when you’re looking under the streetlamps. Until we had some idea about how NK cells recognize their targets, it was pretty tough to figure out how viruses could block that unknown pathway. For cytotoxic T cells, the recognition mechanism (MHC class I) was reasonably well worked out by the early 1990s, and it wasn’t long after that that the explosion in research on immune evasion of that pathway followed.4 But Klas Kärre didn’t propose the “missing self” hypothesis until around 19905, and the activating receptors for NK cells weren’t at all well understood until well into the mid-1990s.

The NK immune evasion findings have pretty much paralleled the state of NK cell research. The first specific NK evasion protein to be identified (that I’m aware of) was the UL18 protein from human cytomegalovirus, in 1997 or so.6 It’s an MHC class I homologue encoded by the viral genome, and after the missing-self hypothesis was widely accepted a fairly obvious question was whether UL18 could replace the self that was missing.7 Early answers were confusing, probably because of incomplete understanding of the NK recognition system, but quite recently a fairly confident and mostly positive answer8 appeared. Shortly afterward, as the understanding of NK recognition progressed to finding the activating receptors, other systems of immune evasion popped up, at first mostly relatively generic9 and then progressing to more and more specific and precise targeting, as the tools to specifically measure the individual NK cell ligands became available10.

It’s not just NK cells, of course. As we11 begin to understand toll-like receptors, for example, we’ll start to find viral interference with them. As we identify small interfering RNAs that are involved in immunity, we’ll start to find viral immune evasion mechanisms for them (of course, in plants, where the understanding of small interfering RNAs is much more advanced, viruses that block small interfering RNAs are well known). And as we find … whatever field comes up next in immunity to viruses … we’ll surely find that the viruses themselves got there a hundred million years ago, and have already stamped down the grass and settled in for a comfortable nap.

A long shot, but interesting, approach might be a reverse immune evasion approach. Find some mystery gene in a virus (there’s no shortage of mysteries); find out what it targets; and work under the assumption that it’s part of the immune system. You’d get lots of misses, but the nice thing would be that even the misses would be interesting. As a fishing expedition, though, it would be hard to fund nowadays.

  1. “The stupidest virus is smarter than the smartest virologist” (Matthias Reddehase, quoting someone else whose name I didn’t catch)
  2. Found on a quick and simple-minded PubMed search; I’m not claiming this is all-inclusive
  3. A paper by me, during my PhD! Was I really the first person, by 4 years, to mention viral immune evasion of NK cells?
  4. And even well before that, in the mid-1980s, the broad strokes and molecules involved were understood, so that the interactions between adenovirus proteins and MHC class I could (with some difficulty) be put into the correct context.
  5. I think the article was Immunol Today. 1990 Jul;11(7):237-44, but I may have missed an earlier version of the hypothesis
  6. Reyburn, H.T. et al. The class I MHC homologue of human cytomegalovirus inhibits attack by natural killer cells. Nature 386, 514−517 (1997)
  7. In case you’re not familiar with the missing self hypothesis: the concept is that natural killer cells survey potential target cells for evidence that they are “self”, meaning that they have the correct allele of MHC class I on their surface. If they do have the right MHC, the NK cell is inhibited. If not (if the cell comes from another individual and has the wrong MHC alleles, or if a virus has infected the cell and caused it to down-regulate its MHC through the viral immune evasion of T cell recognition), then the NK cell will destroy the target.
  8. The human cytomegalovirus MHC class I homolog UL18 inhibits LIR-1+ but activates LIR-1- NK cells. J Immunol. 2007 Apr 1;178(7):4473-81
  9. For example, interference with LFA-3 (J. Immunol. 161, 2365−2374 (1998))
  10. Such as Lodoen et al., The cytomegalovirus m155 gene product subverts natural killer cell antiviral protection by disruption of H60-NKG2D interactions. J Exp Med. 2004 Oct 18;200(8):1075-8
  11. Where I say “we”, by the way, I mean scientists as a group, not me and my tapeworm
June 15th, 2007

Peptide splicing, proteasomes, and immunity

Here I’m picking up on a throwaway comment I made in a thread on Larry Moran’s “Sandwalk” blog. Larry wrote about protein turnover in the cell, a favourite topic of mine to start with, especially when proteasomes come into play, as they so often do.

In the comments, daedalus2u observed “Proteases only hydrolyze peptides when the equilibrium favors it. Under conditions of dehydration, the equilibrium favors the making of peptides.” He made this point in the context of lysosomes (and frankly his train of thought seems to increasingly run off the rails as the comment progresses) but it prompted Ryan to say that “I doubt proteasomes could ever act in reverse.”

Just about everyone else doubted it, too, until a few years ago, when some really cool evidence for just that happening came out of immunology. As it turns out, though, proteasomes almost certainly can act in reverse and splice peptides. For a while it even seemed possible that this could be a common event, but I think it’s becoming increasingly likely that it’s actually a very rare event, one that’s usually only detectable by the exquisitely-sensitive T cell recognition system.1

Ryan’s reasoning wasn’t bad. He argued that “dehydrating the proteasome would change it’s structure and probably eliminate any catalytic activity.” That makes sense, but it misses something unusual (though not unique) about the proteasome.

Proteasomes have been in the news quite a bit since they won the Nobel in 20042 and there are lots of friendly introductions to proteasome-mediated protein degradation around. The Nobel Foundation has a fairly friendly “Information for the public” thing, and a less friendly but more complete PDF. For the purpose of peptide splicing, though, you only need to know the basics.

Here’s the basics: Proteasomes are multi-catalytic proteases, and they’re very abundant throughout the cytoplasm and nucleus of most cells. From this, you can work out why peptide splicing works. Not that anyone actually did work it out, but in hindsight there’s a definite logic to it. Follow closely here:

Proteasomes are multicatalytic. That is, they can chop up many different peptide bonds. That’s in contrast to many proteases, that only cleave when a very precise sequence of amino acids line up. Proteasomes do have their preferences, sure; there are sequences they don’t like — but if you feed a protein to a purified proteasome you’ll find that virtually every possible amino acid pair has been cleaved (if only very rarely).

If they’re multicatalytic, and they’re abundant, then they’re a potential hazard to normal cell function. You can’t have a protease indiscriminately chewing up cellular proteins. So proteasomes are regulated proteases (the regulation part is what the Nobel was for). If they’re regulated, you have to have a way to shield the catalytic sites so they only attack what they’re supposed to. Proteasomes do this by hiding their active sites on the inside of a hollow cylinder.

Proteasome end view

Proteasome side view

Here I get to throw in a couple of images of the proteasome, which is something I do at every opportunity anyway.3 There’s an end view and a side view.4 In fact in a real cell, you probably wouldn’t see the end view like this, because this is the central core of a larger particle that has caps over the open ends. But it makes the point that this is a hollow, barrel-shaped structure. The catalytic sites are on the inside, the caps normally prevent access to the inside, and the regulatory machinery ends up selecting proteins that feed into the open chamber for destruction.

A couple of other proteases follow this pattern, by the way — tricorn protease is a huge, hollow icosahedral particle, for example. Tripeptidyl peptidase II is also a gigantic particle, and I wonder if there’s some kind of regulatory aspect to its size, even though as far as I know from relatively crude evidence, the catalytic sites of TPPII are more or less exposed.

Anyway, the hollow barrel of a proteasome is probably the key to its ability to do peptide splicing. As daedalus2u pointed out, enzymes run both ways. Proteases in general act through hydrolysis, which requires, of course, water. If there’s no water, the reaction can run backwards. In the old days, I’m told, that was how you synthesized peptides: you took the appropriate enzyme and ran the reaction in a non-aqueous system. Normally, of course, there is water inside a proteasome, or it wouldn’t work. But it’s not hard to picture a scenario where peptides are being rapidly generated, and before they have a chance to diffuse out of the proteasome they’re squeezing away water molecules. There you have a high concentration of reactive peptide ends, crowded together in the absence of a water molecule and bumping up against a promiscuous active site. When that happens, you can get peptide splicing.

As I said, this was detected using T cells, which are very sensitive to peptides — recognizing fewer than ten per cell, perhaps. In 2004, Benoit van den Eynde showed that a peptide that was a T cell epitope was in fact generated by peptide splicing in the proteasome5 and later, he showed that you can even swap position, demonstrating this with a T cell epitope that was generated by splicing two peptides in the reverse order.6
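
As a toy illustration of what splicing means at the sequence level (and nothing more: this is not a model of proteasome chemistry), here is a minimal Python sketch. The protein string, the cut positions, and the function names are all made up for illustration. The only point is that a spliced peptide is built from two non-adjacent fragments, joined in either the original or the swapped order, so it never appears as a contiguous stretch of the parent protein.

```python
def fragments(protein, cut_sites):
    """Cut a sequence after each 0-based index in cut_sites; return the pieces in order."""
    pieces, start = [], 0
    for site in sorted(cut_sites):
        pieces.append(protein[start:site + 1])
        start = site + 1
    pieces.append(protein[start:])
    return pieces


def splice(pieces, first, second):
    """Join two fragments (chosen by index) into one peptide, dropping whatever lay between."""
    return pieces[first] + pieces[second]


# Hypothetical 20-residue sequence and cut sites, chosen only for illustration.
protein = "MKTAYIAKQRQISFVKSHFS"
pieces = fragments(protein, cut_sites=[4, 9, 14])

forward = splice(pieces, 0, 2)   # fragments joined in their original order
reverse = splice(pieces, 2, 0)   # the same two fragments joined in the swapped order

for peptide in (forward, reverse):
    print(peptide, "contiguous in parent:", peptide in protein)
```

This is presumably also why a spliced epitope is hard to identify after the fact: a plain substring search against the parent protein (or a database) will not find it.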

How common is this? After the first paper or two, we really didn’t know. When you look at peptide epitopes associated with a cell, I’m told, there are often a significant number that can’t be identified by blasting through databases. Were all of these unidentified because they were peptide splices? That was Benoit’s original idea, I think, and I wouldn’t have been at all surprised to see a small flood of papers triumphantly identifying as spliced those pesky holdout peptides from previous work.

Hasn’t happened, though. It’s negative evidence, but for the most part peptide splicing doesn’t seem to have fixed the problem of the unidentified peptide.7 Perhaps there will still be a herd of peptide splicing examples popping up any day now, but for now I’m leaning to the idea that this really is a very rare event.

Too bad, because it’s pretty cool.

  1. But I’m not going to be dogmatic about it. It’s an open possibility that this is a common event that’s just very hard to detect
  2. At any event, they’ve been in the news more often, even if they haven’t caught up with Paris Hilton yet
  3. Just one of the many things that make me the life of any party. I wonder why I’m not invited to more?
  4. This is the mammalian 20S proteasome, ref. Unno et al., Structure 2002 May; 10(5):609-18. I made the images with iMol from the pdb files.
  5. Science. 2004 Apr 23;304(5670):587-90
  6. Science. 2006 Sep 8;313(5792):1444-7
  7. They’re probably allelic variants, or maybe sequencing errors, or something like that, is my guess now
June 14th, 2007

Epitopes and Microsoft Computational Biology

Microsoft has released as open-source some code for analysis of antiviral immunity. They offer 4 tools: PhyloD, Epitope Predictor, HLA Completion, and HLA Assignment. The first two are particularly interesting to me.

PhyloD is

a statistical tool that can identify HIV mutations that defeat the function of the HLA proteins in certain patients, thereby allowing the virus to escape elimination by the immune system. By applying this tool to large studies of infected patients, researchers are now able to start decoding the complex rules that govern the HIV mutations, in the hope of one day creating a vaccine to which the virus is unable to develop resistance.

The reference is to Bhattacharya et al., Science 16 March 2007: Vol. 315. no. 5818, pp. 1583 – 1586. It’s work that arises directly out of Bruce Walker’s (and others, but mostly Walker’s) work on HIV immune escape variants, which dates back to the late 1990s. I want to talk about immune escape in HIV some time, but that’s going to be a long post and I have a grant due, so I’m just going to move on to the second interesting tool, the Epitope Predictor. “This tool computes the probability that a given kmer is a T-cell epitope restricted to a given HLA allele”; the reference is Heckerman et al., RECOMB 2006, which I haven’t read yet.

This is interesting to me because it’s something I’m working on directly as well. Epitope prediction is a remarkably difficult job to do well — it’s easy to take a first pass and drastically narrow down your possibilities, but getting an accurate end product is hard.

Epitopes, in this case, are sequences of amino acids that are cut out of the full-length protein and recognized by the T cells. A full-length protein might be 500 or 1000 or more amino acids long, whereas epitopes are typically 9 amino acids long. A generic virus, say HIV, will have thousands, tens of thousands, of peptides of the appropriate length. There are moderate constraints on what can be turned into epitopes, because the peptides have to bind to HLA molecules. (HLA, human leukocyte antigen, is the species-specific term for MHC, major histocompatibility complex. I tend to use MHC, but to avoid, or at least reduce, confusion, I’ll stick to HLA here.) HLA molecules have binding rules: “Anchor” positions of the peptide must fit certain patterns. For example, a peptide that binds to one particular human MHC allele (HLA-A3) will usually have a leucine, valine, or methionine at position 2, a lysine, tyrosine, or phenylalanine at the last position, and is fairly likely to have one of two amino acids at position 3, one of five at position 6, and one of four at position 7. So still fairly broad, but much narrower than the 20 to the 9th possibilities with no restrictions at all.
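
To make the narrowing concrete, here is a rough Python sketch of anchor-motif filtering. It encodes only the two HLA-A3 anchor rules described above (position 2 and the last position); the preferences at positions 3, 6 and 7 are left out because the specific amino acids aren't listed here. The function names and the demo sequence are hypothetical, and a real predictor would score peptides rather than apply a yes/no filter.

```python
# Anchor rules as described in the text for HLA-A3 (1-based peptide positions).
POSITION_2 = set("LVM")      # leucine, valine, methionine
LAST_POSITION = set("KYF")   # lysine, tyrosine, phenylalanine
PEPTIDE_LENGTH = 9


def all_kmers(protein, k=PEPTIDE_LENGTH):
    """Every overlapping window of length k in the protein."""
    return [protein[i:i + k] for i in range(len(protein) - k + 1)]


def fits_anchors(peptide):
    """True if the peptide satisfies both anchor rules."""
    return peptide[1] in POSITION_2 and peptide[-1] in LAST_POSITION


# With no restrictions there are 20**9 possible 9-mers.
unrestricted = 20 ** PEPTIDE_LENGTH
# Fixing two positions to 3 choices each leaves 20**7 * 3 * 3 of them.
anchored = 20 ** (PEPTIDE_LENGTH - 2) * len(POSITION_2) * len(LAST_POSITION)
print(f"unrestricted 9-mers: {unrestricted:,}")
print(f"fraction passing the two anchors: {anchored / unrestricted:.4f}")

# Hypothetical sequence, made up for the demo.
protein = "SLAGTREDKVMNPQRSTWY"
candidates = [p for p in all_kmers(protein) if fits_anchors(p)]
print(f"{len(candidates)} of {len(all_kmers(protein))} windows pass:", candidates)
```

Two anchors alone throw out most random 9-mers, but on a whole viral proteome that still leaves plenty of candidates, which is where the trouble starts.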

Humans, like almost all vertebrates, are wildly complex at the MHC genes — you don’t have the same HLA type as your neighbour, and probably don’t even have exactly the same type as your sister. But let’s just focus for now on one HLA type, HLA-A2 (the most common HLA-A allele in North American caucasians), because I want to see how good the Microsoft epitope prediction is.

There are several other on-line epitope prediction tools, and I haven’t tried all of them. Two that I try below are SYFPEITHI and the IEDB tool. I’ve also written a couple of my own, just for fun, that are very simple-minded and crude. My own, which I’ve tested more extensively than any others, tend to catch “real” epitopes (i.e. those that occur naturally) as one of the top ten or twenty possibilities; rarely are my best scores the real epitopes, but it’s also rare to have a complete miss that doesn’t catch one in the top twenty or so.

A recent paper (Kotturi et al., Journal of Virology, May 2007, p. 4928–4940) looked at epitope prediction quite exhaustively — again this is something I want to talk about more extensively at a future date — and the bottom line was that epitope prediction was really helpful; it narrowed their search from thousands of peptides (that only caught two-thirds of the real epitopes) to a couple hundred (that caught more like 90% — but still missed a significant number of real epitopes, and still had around 90% false positives).

So, and this isn’t a careful test, let’s throw a few examples at the predictions and see how we do. I used an HIV nef protein that has at least 7 known epitopes that bind to HLA-A2 (if you’re playing along at home, the epitopes are ILKEPVHGV, VIYQYMDDL, VLDVGDAYFSV, ALQDSGLEV, IYQYMDDLYV, ELVNQIIEQL, and KYTAFTIPSI).

SYFPEITHI’s prediction does pretty well, catching 5 of the 7 in their top 25 scores; their first and third best were both true hits, and the other three were lower down in their ranking.

The IEDB tool did poorly, only finding one of the true epitopes in its top 25 (though it did give that one its highest score). To be fair, this prediction site needs a lot more fiddling than the others, and I didn’t spend much time tweaking it.

My own script catches 3 of the 7 in my top 25 scores, but none are in the top ten.

By comparison, the Epitope Predictor (remember the Epitope Predictor? This here’s a post about the Epitope Predictor) catches 2 of the 7 correctly, ranking them number 1 and 3.
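
The comparisons above all boil down to the same bookkeeping: rank every candidate peptide by score, take the top 25, and count how many known epitopes show up. A minimal Python sketch of that tally follows. The function name, the scores, and the decoy peptides are invented placeholders, not output from any of the tools; only the two epitope sequences come from the list above.

```python
def hits_in_top_k(scores, known_epitopes, k=25):
    """Return (rank, peptide) for each known epitope among the k best-scoring peptides.

    scores maps peptide -> score, with higher meaning a stronger prediction.
    Ranks are 1-based.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    return [(rank, pep) for rank, pep in enumerate(ranked, start=1)
            if pep in known_epitopes]


# Two of the known epitopes from the text; the scores and decoys are made up.
known = {"ILKEPVHGV", "VIYQYMDDL"}
toy_scores = {
    "ILKEPVHGV": 30.0,
    "AAAAAAAAA": 27.5,   # placeholder decoy
    "VIYQYMDDL": 24.0,
    "CCCCCCCCC": 11.0,   # placeholder decoy
}

found = hits_in_top_k(toy_scores, known, k=3)
print(f"{len(found)} of {len(known)} known epitopes in the top 3: {found}")
```

Note that each tool scores on its own scale, so only the ranks, not the raw scores, are comparable across tools.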

So the bottom line, I think, is not that Microsoft sucks, but rather that epitope prediction is hard. There’s plenty of room for improvement (that’s part of the grant I’m working on). From this single example, SYFPEITHI — the granddaddy of epitope prediction — is pretty good, but even a very crude approach (mine) isn’t all that much worse.

Potentially, pooling approaches could be useful. Only one of the seven epitopes here was not predicted by any of the systems I tried here; three were only predicted by one of the systems (SYFPEITHI caught two, I caught the other); and only one epitope was predicted by all four systems. On the other hand, there would be a lot more noise, too.
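
A rough sketch of what pooling could look like, in Python: take the set of known epitopes each tool recovered and count, per epitope, how many tools agree. The tool names are the ones compared above, but the labels e1 to e7 and the assignment of hits to tools are hypothetical, chosen only to match the counts given in this post (5, 1, 3 and 2 hits). Which actual epitope each tool caught isn't recorded here.

```python
from collections import Counter

# Hypothetical hit sets: labels e1..e7 stand in for the seven known epitopes.
# The assignment is invented; only the per-tool totals follow the text.
hits = {
    "SYFPEITHI": {"e1", "e2", "e3", "e4", "e5"},
    "IEDB": {"e1"},
    "my script": {"e1", "e4", "e6"},
    "Epitope Predictor": {"e1", "e5"},
}
all_epitopes = {f"e{i}" for i in range(1, 8)}

# How many tools recovered each epitope.
votes = Counter(ep for found in hits.values() for ep in found)

pooled = set().union(*hits.values())    # caught by at least one tool
missed = all_epitopes - pooled          # caught by none
unanimous = {ep for ep, n in votes.items() if n == len(hits)}
single_tool = {ep for ep, n in votes.items() if n == 1}

print("pooled:", len(pooled), "missed:", len(missed))
print("all tools agree:", sorted(unanimous))
print("only one tool:", sorted(single_tool))
```

Pooling by union catches more of the real epitopes but drags in every tool's false positives as well, which is the noise problem; requiring two or more votes would be one way to trade some of that back.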

So how come epitope prediction is so hard?

More about that later.