Mystery Rays from Outer Space

Meddling with things mankind is not meant to understand. Also, pictures of my kids

August 10th, 2010

DNA virus quasispecies? (Probably not.)

I’ve talked about quasispecies several times, and emphasized that RNA viruses, with their high replication error rates, are most prone to forming quasispecies.

I’ve also pointed out, though, that actually measuring quasispecies is technically difficult, and measuring it for the larger DNA viruses would be even harder. You’d need to run sequences on many viral genomes, to see how much variation develops over time; and it’s only recently that sequencing tech has approached the point where it’s even thinkable, let alone affordable, to do that:

While large DNA viruses are thought to have low mutation rates, only a small fraction of their genomes have been analyzed at the single-nucleotide level.1

So maybe DNA viruses might actually form quasispecies, and we don’t know it?

In fact, even for DNA viruses, mutants appear fairly quickly, given the appropriate selection pressure. In principle, these might not even be new mutations, but simply expansion of a particular part of a quasispecies that was already pre-existing. (It’s probably a fairly obvious point, but it’s important to remember that the introduction of new mutants, and their selection and expansion, are completely different processes. Some viruses throw out incredible numbers of mutants, but almost all of them are dead ends that are actively selected against, or at the least not selected for. Other viruses may make far fewer mutants, but given strong enough selection pressure some of these might rapidly take over the population. It’s very tricky to use observed mutations as a measure of mutation frequency, because observation often depends on selection to build up the numbers of the mutant before you can see it.)

At any rate, it’s a fair enough question,2 but recently there has been some evidence that supports the concept of DNA virus genome stability. Wayne Yokoyama’s lab has actually sequenced multiple genomes of mouse cytomegalovirus (a large DNA virus — a member of the herpesvirus family) to look at quasispecies.1

One of the things they did find was a significant number of variations in their stock, compared to the stock they had got it from years before:

… our laboratory’s Smith strain MCMV differed from the previously published Smith strain … There were 452 differences, including 50 insertion/deletions (indels) and 402 single-bp substitutions. … this high number of differences suggested that MCMV mutated in vivo, as we had previously maintained our MCMV stock by in vivo passages. 1

In other words, the standard methods of maintaining a virus — repeatedly growing new stocks in cells, and using those new stocks to make yet more — allow the accumulation of variations in the genome — which is already well known, of course, but often neglected in lab experiments.  Again, as I point out above, the number of observed mutations we see here doesn’t tell us much about the actual mutation frequency.

How often do mutations arise? By running the virus through cells repeatedly (in vitro, that is) and then seeing how individual clones differed, they determined that there are very, very few mutations per replication. What’s more, and even more impressively, very few mutations appeared after passages through mice (in vivo):

… the remaining 9 mutations allowed us to estimate the mutation rate of MCMV as 1.0 x 10^-7 mutations per bp per day after in vivo passage, very similar to the mutation rate calculated for in vitro passage.1

(My emphasis) For comparison, the MCMV genome is not quite 250,000 bp long, so we’re looking at around one mutation per 40-50 genomes per day (if I’m dividing right). That’s hundreds of times more stable than most RNA viruses (see the table here3 for some RNA virus error rates).  Still, there’s plenty of room for natural selection in there, because of course there are hundreds or thousands of new MCMV genomes being made per day even in the most conservative estimate (and maybe more like millions or hundreds of millions), so dozens to thousands of them are mutants.
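The division in the paragraph above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the rate is the figure quoted from Cheng et al., the genome length is rounded up to 250,000 bp as in the text, and the daily genome counts passed to `expected_mutants` are illustrative guesses, not measurements.

```python
# Back-of-the-envelope check of the per-genome mutation rate for MCMV.
# RATE_PER_BP_PER_DAY is the value quoted from Cheng et al.; GENOME_BP is the
# rounded figure used in the post ("not quite 250,000 bp").
RATE_PER_BP_PER_DAY = 1.0e-7
GENOME_BP = 250_000

# Expected new mutations in one genome over one day.
mutations_per_genome_per_day = RATE_PER_BP_PER_DAY * GENOME_BP

# Same thing, turned around: genomes needed to expect one mutation per day.
genomes_per_mutation = 1.0 / mutations_per_genome_per_day


def expected_mutants(new_genomes_per_day: float) -> float:
    """Expected number of mutations among the genomes made in one day.

    The genome count is a hypothetical input; the post only gives a loose
    range (thousands up to hundreds of millions).
    """
    return new_genomes_per_day * mutations_per_genome_per_day


if __name__ == "__main__":
    print(f"mutations per genome per day: {mutations_per_genome_per_day:.3f}")
    print(f"roughly one mutation per {genomes_per_mutation:.0f} genomes per day")
    for n in (1_000, 1_000_000):  # illustrative daily genome counts
        print(f"{n:>9,} new genomes/day -> ~{expected_mutants(n):,.0f} mutations")
```

A genome a little under 250,000 bp pushes the figure slightly above one in 40, which is consistent with the 40-50 range in the text.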

So, not surprisingly, Yokoyama’s group was able to detect a cluster of mutations that were almost certainly selected in the mouse; without going into detail, these mutations were in a viral gene (m157) that’s known to be recognized by the (laboratory) mouse immune system, so it wasn’t surprising that mutants were selected. And such mutants did not appear in mice without the appropriate immune component, demonstrating the role of natural selection in this cluster.

They offer a number of cautions, including one that’s raised in almost all such studies:

One caveat to our mutation analysis is that lethal mutations were probably underrepresented in the final DNA pool since, by definition, they did not propagate. Nonetheless, this limitation is intrinsic to all mutation analysis.1

Still, the results are solid and reassuring, supporting a basic concept in viral evolution.


  1. Cheng, T., Valentine, M., Gao, J., Pingel, J., & Yokoyama, W. (2009). Stability of Murine Cytomegalovirus Genome after In Vitro and In Vivo Passage. Journal of Virology, 84 (5), 2623-2628 DOI: 10.1128/JVI.02142-09
  2. Well asked, invisible non-existent person!
  3. Castro, C., Arnold, J., & Cameron, C. (2005). Incorporation fidelity of the viral RNA-dependent RNA polymerase: a kinetic, thermodynamic and structural perspective. Virus Research, 107 (2), 141-149 DOI: 10.1016/j.virusres.2004.11.004
July 26th, 2010

Quasispecies thoughts

Quasispecies theory predicts that slower replicators will be favored if they give rise to progeny that are on average more fit; these populations occupy short, flat regions of the fitness landscape … Flat quasispecies accept mutation without a corresponding effect on fitness … A flat quasispecies with an expansive mutant repertoire can explore vast regions of sequence space without consequence and is poised to adapt to rapid environmental change.

Lauring, A., & Andino, R. (2010). Quasispecies Theory and the Behavior of RNA Viruses PLoS Pathogens, 6 (7) DOI: 10.1371/journal.ppat.1001005

(My emphasis)  RNA viruses in general form quasispecies because they have such high mutation rates.  Many (though by no means all) emerging infections are the result of RNA viruses. I’ve pointed before to aspects of viruses that might help them jump from species to species (for example, here — though this is a DNA virus, not an RNA virus, and I don’t think it runs in quasispecies — and the string of posts I link to inside that one).

One example Lauring and Andino point to is influenza virus hemagglutinin (HA). Influenza mutates very rapidly, of course, but most of the changes are harmful to the virus. But changes in HA seem to be very well tolerated. Since HA is a major target of the immune system, this property allows influenza to avoid the immune system without getting hit by defects in fitness associated with the immune evasion.  This is a contrast to, say, HIV (generalizing here! This isn’t always true). HIV within a patient undergoes constant changes to avoid the immune response, but many of these changes reduce the overall viral fitness. If you take the mutated HIV into an environment without that particular immune response, the virus quickly mutates back to its original, more-fit, form.

I’m not sure how we would assess, in advance, which viruses are more “flat” than others and are therefore more able to adapt to new species, but it’s something to think about as we look at new viruses and new viral variants. I would be interested in SARS, for example: what happened to mutation tolerance as the virus adapted to humans? Was the virus that originally jumped into humans different in this way from the ones that normally infect bats? Not easy to measure, though.

November 4th, 2010

Mutation rates in man and virus

John Hawks1 has a long and very interesting post on the human mutation rate — not just the actual number (which turns out to be less well documented and much more slippery than I had realized), but the techniques used to calculate the rate, and difficulties therein.

So much of the literature in this area is ultimately circular, I’m pulling out my sparse hair reading through it. By the time we get back to the mid-1990’s, the sequence data are even sparser than my hair by today’s standards — only a few hundred base pairs, or a sampling of restriction sites. But the divergence time estimates have propagated forward from that time to today, recycled through the assumptions of papers in the intervening time. It’s like the genetic equivalent of money laundering!

Conceptually, it’s very reminiscent of the questions about viral mutation rates, although the technical barriers are quite different and (especially for RNA viruses!) the mutation rates are vastly different. For example, Hawks’ post talks about which edge of a two-fold range the human mutation rate falls on (between 2.5 x 10^-8 and 1.1 x 10^-8 mutations per site); in a table I’ve used before we see a ten-thousand-fold range for poliovirus error rate estimates.

Virus polymerase error rates
RNA virus mutation rates 2

I have to get my kids ready for school now, so I don’t have time to talk about the techniques here. It’s notable that sequencing, though much easier on the tiny viral genomes than on the much vaster human scale, hasn’t completely resolved the issue, though the variation gets smaller as sequencing technology gets better.

Here are some of my previous posts that mention replication error and mutation rates …


  1. Whose blog you should all be reading
  2. Castro, C., Arnold, J., & Cameron, C. (2005). Incorporation fidelity of the viral RNA-dependent RNA polymerase: a kinetic, thermodynamic and structural perspective. Virus Research, 107 (2), 141-149 DOI: 10.1016/j.virusres.2004.11.004
July 12th, 2010

Short takes: Deep sequencing and HIV drug resistance

Short comments about what I’ve been reading (besides several hundred influenza articles):

Hedskog, C., Mild, M., Jernberg, J., Sherwood, E., Bratt, G., Leitner, T., Lundeberg, J., Andersson, B., & Albert, J. (2010). Dynamics of HIV-1 Quasispecies during Antiviral Treatment Dissected Using Ultra-Deep Pyrosequencing PLoS ONE, 5 (7) DOI: 10.1371/journal.pone.0011345

The whole deep sequencing thing is going to profoundly change our knowledge of viral pathogenesis, as well as viral ecology.

With highly mutation-prone viruses like HIV, hepatitis C virus, or influenza, our understanding of genome sequences has been based on the overall average genome — the average of a vast and diverse population. That average, that we’ve been calling the genome of these viruses, may not even exist as such, and certainly the minor variants that have been missed by traditional methods are also critically important, because they can explode out within a few days to take over the entire population, given the right set of circumstances. For example, if among those minor variants there are a few drug-resistant strains, then as soon as you treat the host, those variants may be able to take over.

In this paper, deep sequencing of people with HIV shows that drug-resistant variants do exist even before treatment, but they are normally very rare. They can take over during treatment with the particular drug, but when treatment is stopped they rapidly regress to rarity. This is presumably because the drug resistance makes the virus globally less fit (in the natural selection meaning of the term). When their more-fit brethren are destroyed by a drug these crippled, but drug-resistant, variants can grow out, but remove that selective pressure and the more wild-type versions take over once again.
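The rise and fall of a rare resistant variant can be sketched with the textbook deterministic model of selection between two competing types. This is not the analysis from the paper: the starting frequency and the relative fitness values below are made-up numbers chosen only to show the shape of the dynamics, and the model ignores new mutation, latent reservoirs and chance.

```python
def variant_trajectory(p0: float, relative_fitness: float, generations: int) -> list[float]:
    """Frequency of a variant over successive generations.

    Standard two-type selection model: each generation the variant leaves
    `relative_fitness` times as many progeny as the wild type, so
    p' = p*w / (p*w + (1 - p)).
    """
    freqs = [p0]
    p = p0
    for _ in range(generations):
        p = p * relative_fitness / (p * relative_fitness + (1.0 - p))
        freqs.append(p)
    return freqs


# Hypothetical numbers, for illustration only.
RARE_START = 1e-4        # resistant variant at 1 in 10,000 before treatment
FITNESS_ON_DRUG = 5.0    # resistant variant out-replicates wild type under drug
FITNESS_OFF_DRUG = 0.5   # resistant variant is the less fit one without drug

on_drug = variant_trajectory(RARE_START, FITNESS_ON_DRUG, 10)
off_drug = variant_trajectory(0.99, FITNESS_OFF_DRUG, 10)

if __name__ == "__main__":
    for gen, (up, down) in enumerate(zip(on_drug, off_drug)):
        print(f"gen {gen:2d}  on drug: {up:.4f}   drug stopped: {down:.4f}")
```

With these invented values the variant goes from 1 in 10,000 to a majority within six generations of treatment, and drops back below half within seven generations once the advantage is reversed.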

As well as implications for treatment, this tells us something about viral reserves:

In most patients, drug resistant variants were replaced by wild-type variants identical to those present before treatment, suggesting rebound from latent reservoirs. 1


  1. Hedskog, C., Mild, M., Jernberg, J., Sherwood, E., Bratt, G., Leitner, T., Lundeberg, J., Andersson, B., & Albert, J. (2010). Dynamics of HIV-1 Quasispecies during Antiviral Treatment Dissected Using Ultra-Deep Pyrosequencing. PLoS ONE, 5 (7) DOI: 10.1371/journal.pone.0011345
April 29th, 2010

Influenza variations, part II

Mutation Nation

About 15 minutes after I wrote my last article on influenza variation, I was reading the Journal of Virology and ran across another paper1 on the same thing, that at least partly addresses some of the missing points in the earlier ones.

To brutally truncate my earlier comments: influenza should generate a huge number of mutants as it replicates; but in the few studies that have been done, not all that many variants have actually been detected.

One of the points I raised was that the influenza variation was sampled at the end-point of the infection — after the patient had died, in the paper I talked about the other day.2 Even though the virus had been through the maximum number of replication cycles, it had also experienced the maximal selection pressure, potentially reducing the number of surviving mutants. Is it possible that more variants arose earlier in the infection, but died off before they were detected?

This new paper1  actually looked at exactly that. They used canine influenza as their model, so they could deliberately infect their patients and track through the infections from the beginning through the end. Even though they used a technique that is much less sensitive to mutations (and is probably more error-prone as well) they found tons of variation, and the pattern they found is fascinating:

Mutations arose readily in the infected animals and reached high frequencies in some vaccinated dogs, but they were mostly transient and often were not detected on subsequent days. Hence, CIV populations are highly dynamic and characterized by a rapid turnover of likely deleterious mutations.1

(My emphasis) This (assuming it holds true in other studies) beautifully resolves much of the difference between the expected level of variation, and the level that’s observed at any one time point of infection. The explanation is that the variation does indeed appear, but it doesn’t persist. There is variation over time as well as at any one time point.

Hoelzer 2010 Fig 1
Figure 1. Variation between challenge influenza virus (yellow) and virus isolated from two naïve dogs 2 to 4 days after infection 1  

There are a lot of very cool things about this study that I’m not going to talk about (differences between vaccinated and unvaccinated animals, evidence for antigenic escape) but there are two things that I thought were particularly exciting.

First is the question of why the mutations seem to be so transient. Part of that could just be chance, part of it is probably selection against deleterious mutants.

But it’s also worth keeping in mind that the viruses are replicating in a dynamic, rapidly-changing environment. The virus enters a host whose immune system is at rest but that immediately recognizes viral infection and ramps up interferons,  then other cytokines, then innate antiviral systems that build up and spill over into an adaptive immune response  …  a whole range of inflammation whose mediators and effectors change from hour to hour. Is this changing environment selecting for mutations that are briefly beneficial, and that then become deleterious as the situation changes a few hours later?

Second – when we think about viruses that are able to jump from one species to another, we usually think of mutants, viruses that may be less fit in their “proper” hosts but adequately fit in some other species. (In fact canine influenza itself is a great example of this, a virus that jumped from horses into dogs six or seven years ago.  It is essentially equine influenza, but compared to the equine version it has a half-dozen variants that make it more suitable for replication in dogs.)

If we look at any particular time point we may not find any of these potential emergent mutants. But if we look at all the time points, as in this study, perhaps these potential species-jumping mutants are popping up all the time, but only for a few hours at a time:

This observation suggests that mutations that facilitate adaptation to a new host species might occur transiently in the donor host despite any associated fitness costs and provide a transient reservoir of preadapted mutations1

(My emphasis) There’s also theoretical and experimental work that probably addresses how this sort of pressure could drive population-level robustness.  For example, while heterogeneity is linked to fitness in HIV,3 Claus Wilke says:

Virus strains with a history of repeated genetic bottlenecks frequently show a diminished ability to adapt compared to strains that do not have such a history.4

I don’t know that work as well as I’d like to, but I think it’s probably relevant when considering local and global evolutionary pressures on the virus.


  1. Hoelzer, K., Murcia, P., Baillie, G., Wood, J., Metzger, S., Osterrieder, N., Dubovi, E., Holmes, E., & Parrish, C. (2010). Intrahost Evolutionary Dynamics of Canine Influenza Virus in Naive and Partially Immune Dogs. Journal of Virology, 84 (10), 5329-5335 DOI: 10.1128/JVI.02469-09
  2. Kuroda, M., Katano, H., Nakajima, N., Tobiume, M., Ainai, A., Sekizuka, T., Hasegawa, H., Tashiro, M., Sasaki, Y., Arakawa, Y., Hata, S., Watanabe, M., & Sata, T. (2010). Characterization of Quasispecies of Pandemic 2009 Influenza A Virus (A/H1N1/2009) by De Novo Sequencing Using a Next-Generation DNA Sequencer. PLoS ONE, 5 (4) DOI: 10.1371/journal.pone.0010256
  3. Bordería AV, Lorenzo-Redondo R, Pernas M, Casado C, Alvaro T, et al. (2010) Initial Fitness Recovery of HIV-1 Is Associated with Quasispecies Heterogeneity and Can Occur without Modifications in the Consensus Sequence. PLoS ONE 5(4): e10319. doi:10.1371/journal.pone.0010319
  4. Novella, I., Presloid, J., Zhou, T., Smith-Tsurkan, S., Ebendick-Corpus, B., Dutta, R., Lust, K., & Wilke, C. (2010). Genomic Evolution of Vesicular Stomatitis Virus Strains with Differences in Adaptability. Journal of Virology, 84 (10), 4960-4968 DOI: 10.1128/JVI.00710-09
April 27th, 2010

Influenza variations

Mutation comic

Indeed, the amount of HIV diversity within a single infected individual can exceed the variability generated over the course of a global influenza epidemic, the latter of which results in the need for a new vaccine each year. 1

That was said as part of a discussion on HIV vaccines, but let’s think about it from the influenza side.  Why is it true? Why doesn’t influenza have as many variants as HIV?

(Update: Another paper also looks at this question and points to some interesting explanations; I talk about that paper here.)

We know that influenza, like other RNA viruses, is prone to mutation (that is, it has an error-prone polymerase). Depending on how you measure it, it’s likely that almost every new influenza genome has at least one mutation in it, meaning that every newly infected animal or person should be generating thousands upon thousands of new influenza variants.

Globally, of course we do see thousands of new flu variants each year.2 But, based on replication fidelity, you’d expect to see a lot more — maybe not quite as many as HIV, but not far from it.

This is also true on a much smaller scale, looking within infected individuals (animals or people).  Even using modern deep-sequencing techniques (like those used in some of the HIV analyses) that should in theory be able to detect large numbers of mutations, there are fewer than you might expect based on the known replication fidelity — far fewer variants than in HIV:

Inasmuch as the mutation rate for type A influenza viruses is estimated at one nucleotide change per 10,000 nucleotide during replication and most infections are caused by as many as 10 to 1000 virions which likely possess varying numbers of nucleotide differences in their genomes, one can expect that each influenza A virion is possibly a quasispecies. However, we identified relatively few quasispecies – probably because the currently available sequence analysis software do not allow robust quasispecies analysis and extensive manual curation is necessary. We believe that with the help of improved bioinformatic tools we would detect more quasispecies populations in our sample sets.  3

H1N1 (swine-origin influenza virus)

I don’t know enough about the computational side to comment on their bioinformatics point.  Another recent paper4 uses a similar approach and (at least at first) seems to reach a more conservative conclusion.  They talk about “quasispecies”, but they seem to be using the term rather loosely, to describe just a handful of distinct genomic sequences. These sequences differ by, for example, a single base (and a single amino acid) in the HA, where one of the sequences made up about 75% of the sequences, and the other about 25%. To me that’s not really a “quasispecies”: a quasispecies is something that needs to be defined by an average sequence even though the vast majority of the genomes are different from that average. (Here and here are Vincent Racaniello’s explanations at The Virology Blog.)  Two sequences is just two sequences.

However! The authors do make their data available. I don’t have time to do a detailed look, but from what I think is a very conservative analysis, in one stretch of just 25-40 bases some 5-10% of the genomes have at least one mutation.5 If that’s roughly true across the whole genome, then each genome would have, what, maybe a half-dozen mutations on average. That, to me, really is a quasispecies.

(Do note that this is not the mutation frequency for any individual residue. No single point [with the two or three exceptions that the authors focused on] is mutated at much more than one in a thousand, and most probably more like one in many thousand, which is about what you’d expect. )
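The read counts behind the 5-10% figure are given in the footnote (about 2,050 BLAST hits, about 120 with at least one internal mismatch). As a rough sketch of what those counts imply, the snippet below computes the fraction and a simple normal-approximation sampling interval. The interval is an added illustration, not part of the original analysis; it reflects only sampling noise and says nothing about sequencing error, which these counts cannot separate from real mutations.

```python
import math

# Approximate counts reported in the footnote to the post.
TOTAL_HITS = 2050          # short reads matching the 40-base PB2 window
HITS_WITH_MISMATCH = 120   # reads with at least one internal mismatch

# Fraction of reads carrying at least one mismatch in the window.
fraction = HITS_WITH_MISMATCH / TOTAL_HITS

# Normal-approximation 95% interval for a proportion (an illustrative
# assumption; it treats the reads as independent draws).
std_err = math.sqrt(fraction * (1.0 - fraction) / TOTAL_HITS)
low = fraction - 1.96 * std_err
high = fraction + 1.96 * std_err

if __name__ == "__main__":
    print(f"reads with >=1 mismatch: {fraction:.1%}")
    print(f"approx. 95% sampling interval: {low:.1%} to {high:.1%}")
```

The point estimate sits at the low end of the 5-10% range quoted above, which fits the description of the analysis as conservative.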

There are a myriad of complicating factors separating the error frequency in these genomes from the raw error rate of the viral polymerase. A couple of huge ones: These viruses had undergone a bunch of replications in the host; this isn’t the error rate per replication cycle, it’s the cumulative error rate after many cycles. The virus was from a patient who had died with (and probably of) the virus, and though we don’t know how many times the original infecting virus had replicated, it was at least a half dozen cycles, perhaps two or three times that.

Influenza virion

On the other hand, during those replication cycles, many mutations (quite likely even the great majority of them) would have been deleterious or outright defective, so most of the mutations would have never propagated but would have just silently disappeared and not been counted at the end.

The most interesting point is that these mutations aren’t arising in a vacuum. Thinking now about which mutations survive and get detected, not the baseline rate of mutation formation: The variants are forming in an environment that’s designed to be very hostile to viruses. Mutations are going to undergo selection by the immune system.

This is one place where influenza is going to experience a very different set of pressures than HIV. HIV persists in the presence of the adaptive (T cell and antibody-based) immune response, whereas once the adaptive response kicks in for flu, the virus gets evicted. HIV therefore not only has a much longer period (years instead of days) to throw out mutations, it also is shaped by the immune response. By comparison flu would probably only have a couple of replication cycles in the presence of an adaptive response.

Changes in the virus that accumulate over the handful of replication cycles would reflect a strong selection pressure. The vast majority of mutations, even those that aren’t completely defective, are going to be less fit than the original virus and won’t accumulate. Knowing which mutations do accumulate should be very interesting because it may tell us what the virus is going through in the host.  That’s what the authors of this paper focused on — the one particular site that had a much, much higher variant  frequency, more like 25% of the genomes.  The assumption is that this arose during the infection and was positively selected for. 6

The variants that replicate best in a host may be quite different from those that are effectively transmitted. That is, there may be multiple sources of selective pressure, of which we have previously mainly only seen transmission pressure (because that’s the main one that will accumulate in a population, because transmission represents a bottleneck in the virus’s evolution [link to The Virology Blog]).  The particular HA variant that was picked up here (that apparently accumulated during the infection) is rare globally. Is that a version of the HA that’s more efficient within a host, but that doesn’t transmit as well?

I think a major reason for the difference between HIV and influenza variant accumulation is the difference between within-host and between-host (transmission) selection.  HIV spends long, long periods within a single host, thousands of replication cycles, accumulating mutations.  The transmission bottlenecks come at much longer intervals and have a much larger accumulated population to work with.  Influenza has a comparatively brief period within the host, only a handful of replications before a new transmission bottleneck hits. 7

This sort of deep sequencing experiment on influenza will probably be improved over the next few years, and I’ll be very interested to see just how much variation there really is within one flu-infected host.


  1. Walker, B., & Burton, D. (2008). Toward an AIDS Vaccine. Science, 320 (5877), 760-764 DOI: 10.1126/science.1152622
  2. More correctly, I suppose, we infer the presence of thousands of new variants based on the hundreds of them that we see, and knowing that we are only examining a tiny fraction of all the flu cases that are out there.
  3. Ramakrishnan, M., Tu, Z., Singh, S., Chockalingam, A., Gramer, M., Wang, P., Goyal, S., Yang, M., Halvorson, D., & Sreevatsan, S. (2009). The Feasibility of Using High Resolution Genome Sequencing of Influenza A Viruses to Detect Mixed Infections and Quasispecies. PLoS ONE, 4 (9) DOI: 10.1371/journal.pone.0007105
  4. Kuroda, M., Katano, H., Nakajima, N., Tobiume, M., Ainai, A., Sekizuka, T., Hasegawa, H., Tashiro, M., Sasaki, Y., Arakawa, Y., Hata, S., Watanabe, M., & Sata, T. (2010). Characterization of Quasispecies of Pandemic 2009 Influenza A Virus (A/H1N1/2009) by De Novo Sequencing Using a Next-Generation DNA Sequencer. PLoS ONE, 5 (4) DOI: 10.1371/journal.pone.0010256
  5. I extracted the FASTQ data containing the short sequence reads matching influenza sequences from the supplemental PDF, converted it to FASTA, and used xdformat to move it into a BLAST database. Then I grabbed 40 bases from the GenBank sequence CY045951.1, the PB2 segment of the closest-match influenza strain, choosing a region (positions 2151-2190) with very high coverage, and BLASTed this sequence against the short sequence data, using parameters such that I retrieved sequences that match at least 25 of 40 positions. Of the ~2050 hits I retrieved, about 120 had at least one internal mismatch. I can’t distinguish these from sequencing errors, but I think it’s much higher than you’d expect from sequencing error. And I hope that my conservative approach (for example, I would have discarded mismatches at the ends of the hits) would balance out that source of confusion.
  6. One point, by the way, that the authors didn’t cover was the possibility that this patient had actually been initially infected with more than one viral sequence. We do know that a significant number of flu cases are doubly infected. The fact that the minor variant is a very unusual strain makes this less likely, but not impossible.
  7. And I think it’s fair to say that the global population-based HIV variation (the transmission-selected amount of variation, as opposed to the vast within-individual variation) is rather more comparable to that of influenza.
April 27th, 2008

Elementary Dr Watson

Foot-and-mouth disease virus

We’ve been promised that as genome sequencing becomes faster and simpler, we’ll start seeing practical dividends as well as parlour tricks like sequencing Watson’s genome. Some of the dividends are already paying out, as a paper in the latest PLoS Pathogens1 shows.

Probably most of you remember the outbreaks of foot-and-mouth disease in Britain in 2001, and again last year. FMD is a viral disease that affects many hooved animals; it’s not usually fatal, but causes productivity loss. FMD outbreaks are economically devastating, because aside from the productivity loss, many countries that are free of the disease will refuse to take meat or other agricultural products from outbreak areas. The goal of FMD management, then, is to keep it away, and if it ever hits, to contain it and slaughter all infected and potentially-infected animals.

The 2001 outbreak in Great Britain came from outside the country. The 2007 outbreak, though, was clearly from a local source: The FMD research lab in the Institute for Animal Health (IAH), Pirbright, Surrey. The latest paper discusses the epidemiology of that outbreak, and how they used whole-genome sequencing to track and predict sites of FMD.

Samuel & Knowles, 2001, Fig 2

(This is timely, because the US is planning to move the sole American FMD research center, now on Plum Island, to the mainland. There’s obvious concern that the virus could escape from containment within research labs and infect neighboring animals, causing the first American FMD outbreak since 1929. I am not particularly knowledgeable about the field, but I have to think that, at best, the timing of the planned move is unfortunate.)

FMD is caused by a picornavirus, the same broad family as polio and cold viruses. Like those viruses, FMD mutates rapidly, traveling around as a quasispecies cloud. The clouds can be easily divided into 7 broad groups, and within the most common serotype (O) there are 8 distinct subgroups (see the map2 to the right for their geographical distribution).

The FMD genome is 8134 nucleotides long, and the sequence analysis that has been used for epidemiology like the 7 different topotypes has been based on no more than 8% of that length — the VP1 gene, usually. That’s enough to track high-level changes, because of FMD’s rapid mutation rate:2

the rate of evolution is approximately 1% per year …. If the concept of a constant evolutionary rate is accepted and there are no constraints on virus evolution then it would expected that new topotypes could arise in approximately 15 years. In reality, this extent of evolution probably takes much longer. For example, FMD viruses belonging to the Asia 1 serotype, first identified in samples from Pakistan in 1954 … have not yet exceeded 15% nucleotide difference …

But 8% of the genome is not nearly enough to track changes within a single epidemic, like the one in Surrey last year; it simply isn’t long enough to pick up the handful of variations. It was known in the previous outbreak, in 2001, that the information was there in the genome (“virus recovered from closely housed animals can differ by 1 to 2 nucleotides and is likely to pass through a “bottleneck” on passage between farms”).3 The issue was a practical, technological one — being able to sequence entire virus genomes quickly enough to pass back information to people in the field.
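A quick calculation shows why a VP1-sized slice is too short for within-outbreak tracking. The sketch below assumes, purely for illustration, that the differences between two genomes fall uniformly and independently along the 8134-nucleotide genome; real substitutions are not spread that evenly, so treat the numbers as a rough guide.

```python
GENOME_NT = 8134            # FMD genome length given in the post
FRACTION_SEQUENCED = 0.08   # upper bound on the fraction used for typing (VP1)

# Approximate length of the partial sequence, in nucleotides.
region_nt = round(GENOME_NT * FRACTION_SEQUENCED)


def chance_region_sees_a_difference(genome_wide_differences: int) -> float:
    """Probability that at least one of k genome-wide differences lies
    inside the sequenced region.

    Assumes (hypothetically) that each difference lands at a uniformly
    random, independent position along the genome.
    """
    return 1.0 - (1.0 - FRACTION_SEQUENCED) ** genome_wide_differences


if __name__ == "__main__":
    print(f"partial sequence covers about {region_nt} nt of {GENOME_NT}")
    for k in (1, 2):  # the 1 to 2 nucleotide differences quoted in the post
        p = chance_region_sees_a_difference(k)
        print(f"{k} difference(s) genome-wide -> {p:.1%} chance the region shows any")
```

Under that assumption, one or two differences between neighbouring animals would show up in the partial sequence well under a fifth of the time, so most transmission links would look identical.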

Cottam 2008 Fig 2

By 2007, the technology was there. The people at the IAH were able to sequence genomes from viruses isolated in the outbreak with a fine enough comb to track changes throughout the spread, and fast enough to pass information back to the field within 24-48 hours. Their sequencing confirmed that the virus was in fact a lab escapee, because it was almost identical to a couple of lab strains but was different from circulating viruses.4

The 40-odd viral genomes yielded a fair bit of useful information (see the figure to the left for a summary). For example,

The small number of nucleotide substitutions observed between viruses from source and recipient IP suggests that there has been direct transmission without the involvement of other susceptible species, e.g. sheep or deer.

It’s obviously useful to know if there’s a wild-animal reservoir of disease, but an even more important insight came from this work as well.

the virus from IP3b was nine nucleotides different from the virus from IP1b … This is a high number of changes for a single farm-to-farm transmission … and we predicted that there were likely to be intermediate undetected infected premises between the first outbreaks in August and IP3b. … Serosurveillance of all sheep within 3 km of the September outbreaks revealed another infected premises (IP5), on which it was estimated that disease had been present for at least two, and possibly up to five weeks. As Figure 2B shows, IP5 is a likely link between the August and September outbreaks.

I would be interested in hearing from the people on the ground just how useful this information was — for example, were they impelled to search more for an intermediate source based on this information, or did they already suspect it from other, classical ways? But in any case, it’s clear that genomics is capable of pushing epidemiology a lot further in the future.
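The logic behind the IP3b prediction is simple enough to sketch in a few lines: count the nucleotide differences between aligned genomes from a presumed source and recipient, and flag any link with more differences than a single transmission would normally produce. Everything in this sketch is hypothetical: the farm names, the toy 20-base sequences, and the threshold of 2 (borrowed from the "1 to 2 nucleotides" remark about the 2001 outbreak). It is not the method Cottam et al. actually used, just the idea.

```python
def nucleotide_differences(seq_a, seq_b):
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def flag_possible_gaps(links, sequences, max_expected=2):
    """Return (source, recipient, differences) for links with more differences
    than expected for one direct transmission (max_expected is an assumption)."""
    flagged = []
    for source, recipient in links:
        diff = nucleotide_differences(sequences[source], sequences[recipient])
        if diff > max_expected:
            flagged.append((source, recipient, diff))
    return flagged


# Toy data, invented for illustration only.
sequences = {
    "farm_A": "ACGTACGTACGTACGTACGT",
    "farm_B": "ACGTACGTACGTACGTACGA",  # 1 difference from farm_A
    "farm_C": "TGCATGCATCGTACGTACGT",  # 9 differences from farm_A
}
links = [("farm_A", "farm_B"), ("farm_A", "farm_C")]

flagged = flag_possible_gaps(links, sequences)
for source, recipient, diff in flagged:
    print(f"{source} -> {recipient}: {diff} differences, look for an intermediate")
```

With the toy data only the farm_A to farm_C link is flagged, which mirrors the way nine differences between IP1b and IP3b pointed to a missing premises in between.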


  1. Cottam, E.M., Wadsworth, J., Shaw, A.E., Rowlands, R.J., Goatley, L., Maan, S., Maan, N.S., Mertens, P.P., Ebert, K., Li, Y., Ryan, E.D., Juleff, N., Ferris, N.P., Wilesmith, J.W., Haydon, D.T., King, D.P., Paton, D.J., Knowles, N.J. (2008). Transmission Pathways of Foot-and-Mouth Disease Virus in the United Kingdom in 2007. PLoS Pathogens, 4(4), e1000050. DOI: 10.1371/journal.ppat.1000050
  2. Samuel, A. R., and Knowles, N. J. (2001). Foot-and-mouth disease type O viruses exhibit genetically and geographically distinct evolutionary lineages (topotypes). J Gen Virol 82, 609-621.
  3. Cottam, E. M., Haydon, D. T., Paton, D. J., Gloster, J., Wilesmith, J. W., Ferris, N. P., Hutchings, G. H., and King, D. P. (2006). Molecular epidemiology of the foot-and-mouth disease virus outbreak in the United Kingdom in 2001. J Virol 80, 11274-11282.
  4. As far as I know, it’s not yet known how exactly the virus escaped from the IAH. I’ve read what seems to be informed speculation that it may have come from the drains, as decontamination systems designed to prevent that weren’t properly maintained; but I don’t know if that’s true, an educated guess, or mere rumor and guesswork.
January 2nd, 2008

Antibody-based vaccines

Broadly neutralizing anti-HIV antibody

Viruses replicate inside cells, which shields them from some components of the immune system. In particular, antibodies can’t penetrate inside a cell1 to bind to a virus there, so antibodies are not much use for eliminating a viral infection.2 For some viruses that have to exit the cell to spread to a new target cell, antibody may help limit spread, but many viruses can spread directly from one cell to its neighbor without ever being exposed to antibodies. So once a virus has entered a host’s cells, you probably want mostly cell-mediated immune responses, such as T helper cells and cytotoxic T lymphocytes, to get rid of the virus.

That’s for eliminating viral infections. What antibodies are often extremely good at is blocking infections–stopping the virus from ever getting a foothold. A virus that enters your body has to be exposed to extracellular components at least briefly before it can burrow into its protective cell. During that phase the virus is vulnerable to antibody-mediated inhibition. Antibodies therefore may be relatively unhelpful for getting rid of an ongoing infection, but they can be very good at protecting against new infections.

Not surprisingly, then, most3 antiviral vaccines depend on inducing a strong and specific antibody response. That also means you can often get away with killed virus vaccines like the Salk polio vaccine, or even subunit vaccines like Hepatitis B vaccine; these are very poor at inducing cytotoxic T lymphocyte (CTL) responses, but they don’t have to. Killed vaccines are, in principle, intrinsically safer than the attenuated viruses, or even vector-based recombinant vaccines.

Why are researchers looking for alternatives?

So why is there so much interest in developing vaccines that stimulate CTL? Why are so many groups working on vector-based vaccines or attenuating viruses? One reason is that these vaccines are (again, in principle) intrinsically more immunogenic than killed vaccines. If you can give one dose of vaccine, and then have your antigens stick around for a couple of weeks, or even amplify themselves as they replicate in situ, then you may not need to give a second (booster) dose of the vaccine. That’s a moderate advantage in the first world, and potentially a huge advantage in the third world, where you may only have one chance to visit your patients.

Another reason is that to a large extent we’ve already nailed the simple problems. If a killed vaccine can protect against a major pathogen, we probably already have that vaccine up and running. We’re left with those virus diseases that, for one reason or another, are not easily prevented by antibody-type responses, and so cellular CTL-type responses are the most promising next step.

HIV

What keeps a virus from being blocked by antibodies? There are a number of reasons, but the most obvious is that the virus offers a moving target to antibodies. HIV is probably the most famous example of this approach. The HIV surface is dominated by glycoproteins that are enormously variable; an antibody that blocks one particular HIV strain does nothing against a different strain. Hepatitis C virus (HCV) is another virus with highly variable surface proteins. Malaria, a parasite rather than a virus, has enough room in its genome to take this strategy even further. As well as using strain variation, individual parasites can dynamically change their surface proteins, stepping methodically through some 60 variants.4

Surface antigens in these pathogens have probably evolved to be variable; there’s been selective pressure for a pathogen to be different from the majority, since that way it’s less likely to run into a host that is already immune.5 Internal antigens–those that are not exposed to antibodies–tend to be more highly conserved, and they’re perfectly good targets for CTL. This is one reason for the interest in developing vaccines that induce good CTL responses.

Back to the future: Workarounds for antibody-based vaccines

There’s another approach, though. We have a lot of experience with antibody-based vaccines. It would be nice if there were a way to use them against HIV and HCV. Are there sections of the virus surface that are not variable? If so, then designing a vaccine that raises antibodies against these regions might be effective against many different strains. That’s been a hot topic for quite a while, and in fact there have been some steps forward on this front for HIV.6 More recently, in the latest issue of Nature Medicine,7 there’s an article suggesting that some antibodies may be able to neutralize many hepatitis C strains.

The results provide evidence that broadly neutralizing antibodies to HCV protect against heterologous viral infection and suggest that a prophylactic vaccine against HCV may be achievable.


  1. Yes, I know that some forms of antibody routinely penetrate cells as they’re pumped into the gut, for example, but let’s stay relevant.
  2. Perhaps antibodies are important in some cases for triggering antibody-dependent cell-mediated cytotoxicity (ADCC) by NK cells, but again let’s not get sidetracked.
  3. If not all. I can’t think of a counterexample offhand.
  4. Developmental selection of var gene expression in Plasmodium falciparum. Qijun Chen, Victor Fernandez, Annika Sundström, Martha Schlichtherle, Santanu Datta, Per Hagblom & Mats Wahlgren. Nature 394, 392-395 (23 July 1998)
  5. That being said, I don’t know that this has been formally shown for any of these agents; and in fact the only study I know of off the top of my head specifically did not find evidence for frequency-dependent selection in malaria surface protein alleles: Sequence Variation in the T-Cell Epitopes of the Plasmodium falciparum Circumsporozoite Protein among Field Isolates Is Temporally Stable: a 5-Year Longitudinal Study in Southern Vietnam. Amadu Jalloh, Huynh van Thien, Marcelo U. Ferreira, Jun Ohashi, Hiroyuki Matsuoka, Toshio Kanbe, Akihiko Kikuchi, and Fumihiko Kawamoto. Journal of Clinical Microbiology, April 2006, p. 1229-1235, Vol. 44, No. 4
  6. For example, Structural definition of a conserved neutralization epitope on HIV-1 gp120. Tongqing Zhou, Ling Xu, Barna Dey, Ann J. Hessell, Donald Van Ryk, Shi-Hua Xiang, Xinzhen Yang, Mei-Yun Zhang, Michael B. Zwick, James Arthos, Dennis R. Burton, Dimiter S. Dimitrov, Joseph Sodroski, Richard Wyatt, Gary J. Nabel & Peter D. Kwong. Nature 445, 732-737 (15 February 2007)–the source of the image at top here
  7. Broadly neutralizing antibodies protect against hepatitis C virus quasispecies challenge. Mansun Law, Toshiaki Maruyama, Jamie Lewis, Erick Giang, Alexander W Tarr, Zania Stamataki, Pablo Gastaminza, Francis V Chisari, Ian M Jones, Robert I Fox, Jonathan K Ball, Jane A McKeating, Norman M Kneteman & Dennis R Burton. Nature Medicine Published online: 6 December 2007
August 16th, 2007

HIV mutation: Does the world revolve around me?

Marras 2002

HIV is a genetically unstable virus, and exists as a “quasispecies”,1 a cloud of variations surrounding a platonic ideal virus. Over time, selection pushes the cloud in various directions. What’s the main push behind that movement?

Because I’m interested in T cell immunity I tend to think of HIV mutation as being driven by, well, T cell immunity. This is the CTL escape2 I’ve mentioned before, and the paper that most dramatically reinforced that viewpoint for me was:
Constraints on HIV-1 evolution and immunodominance revealed in monozygotic adult twins infected with the same virus.
Draenert R, Allen TM, Liu Y, Wrin T, Chappey C, Verrill CL, Sirera G, Eldridge RL, Lahaie MP, Ruiz L, Clotet B, Petropoulos CJ, Walker BD, Martinez-Picado J.
J Exp Med. 2006 Mar 20;203(3):529-39.

This study found a pair of identical twins, infected with the same HIV strain at the same time, and tracked the appearance of new variants of HIV that popped up over time, correlating with immune responses. Remarkably, the mutations of HIV that appeared to be CTL escape variants were almost identical:

Of four responses that declined in both twins, three demonstrated mutations at the same residue. In addition, the evolving antibody responses cross-neutralized the other twin’s virus, with similar changes in the pattern of evolution in the envelope gene. These results reveal considerable concordance of adaptive cellular and humoral immune responses and HIV evolution in the same genetic environment, suggesting constraints on mutational pathways to HIV immune escape.

The conclusion I drew from this paper is that, in a particular genetic environment, the immune system shepherds HIV along a particular trail.

Now, this paper only tracked the twins for 3 years. Another paper3 had tracked a similar pair of twins over a much longer time, 17 years, and their findings were rather different:

Seventeen years after infection, their CTL targeting of HIV-1 was remarkably similar. In contrast, their overall TCR profiles were highly dissimilar, and a dominant epitope was recognized by distinctly different TCR in each twin. Furthermore, their viral epitopes had diverged, and there was ongoing viral phylogenetic divergence between the twins between 12 and 17 years after infection. These results indicate that while CTL targeting is predominately genetically determined, stochastic influences render the interaction of HIV-1 and host immunity, and therefore viral escape and CTL efficacy, unpredictable.

Yang et al 2005

(The figure at right is the concluding figure from Yang et al., showing quite dramatically how each twin’s HIV had moved in different directions: “Phylogenetic relationships between pol (A), env (B), and nef (C) sequences from 1995 and 2000 are shown. Open and closed circles represent twin 1-05 sequences from 1995 and 2000, respectively; open and closed triangles represent twin 1-06 sequences from 1995 and 2000, respectively.”)

Still, even the Yang et al. paper doesn’t change my overall impression that HIV mutation is mainly CTL-driven; even if the viruses trotted down different paths, they were both (probably) being chivvied along those paths by CTL pressure.

What made me think about this today is an interesting variant on CTL escape variants, described in the latest issue of Journal of Virology:
A Rapid Progressor-Specific Variant Clone of Simian Immunodeficiency Virus Replicates Efficiently In Vivo Only in the Absence of Immune Responses.
Takeo Kuwata, Russell Byrum, Sonya Whitted, Robert Goeken, Alicia Buckler-White, Ronald Plishka, Ranjini Iyengar, and Vanessa M. Hirsch.
Journal of Virology, Sept. 2007, p. 8891-8904 Vol. 81, No. 17

These guys were looking at monkeys infected with SIV, a subset of which were “rapid progressors” (RP). These monkeys show an early immune response, but lose their anti-SIV immunity very quickly, within 4 weeks. Worse than that, they also lose their ability to mount any new immune response, even to unrelated antigens like tetanus toxoid. It turns out that these RP monkeys also contain a unique variant of SIV, with specific mutations in the env gene. Is this mutant virus responsible for the rapid progression of disease?

In fact, when they tried infecting monkeys with the new variant SIV, the recipients did not progress rapidly. If anything, the new variant virus was actually worse than wild-type virus at causing disease. And what’s more, when they sampled the virus circulating in the newly infected monkeys, what they found was that as immune responses to the virus developed, the RP variant disappeared and was replaced by … wild-type SIV, the parent of the RP variant. The only way the SIV could survive in its new hosts, in the face of an immune response, was to mutate back to the original wild-type sequence. It looks as if the causality was backwards; the RP variant didn’t cause the rapid progression, rather the rapid progression permitted these new viruses (which are very sensitive to immune responses) to replicate.

These studies suggest that the SIV variants commonly selected in RP macaques are not the direct cause of rapid disease de novo in naive macaques. The evolution of RP-specific variants appears to be the result of replication in a severely immunocompromised host.

So perhaps this is an exception that proves the rule, and the major force behind HIV mutation and selection really is immune pressure: The virus doesn’t develop other variants until the immune system is completely screwed.

There’s at least one obvious exception to this: The change in receptor usage that HIV shows after infection. According to my primitive perception of this, the HIV types that spread most readily between individuals (those that use the CCR5 receptor) are not the same as the types that spread most efficiently within an individual (those that can also use CXCR4); so the former are more likely to establish an infection, but mutants of the latter type arise afterwards. This is selection that’s not CTL-based. What other selection pressures on HIV, within a single individual, have been shown? I don’t know, I’m asking.


  1. The figure at the top of this post is “Quasispecies complexity of kidney and PBMC-derived from 2 patients with HIVAN” from: Replication and compartmentalization of HIV-1 in kidney epithelium of patients with HIV-associated nephropathy. Daniele Marras, Leslie A. Bruggeman, Feng Gao, Nozomu Tanji, Mahesh M. Mansukhani, Andrea Cara, Michael D. Ross, G Luca Gusella, Gary Benson, Vivette D. D’Agati, Beatrice H. Hahn, Mary E. Klotman & Paul E. Klotman. Nature Medicine 8, 522 – 526 (2002)
  2. That is, as the host’s T cells target specific regions of the virus, any new versions of the virus that mutate the targets are more likely to thrive than the wild-type sequence. Antibodies also probably select HIV mutants, through the same mechanism, i.e. escape from neutralizing antibodies. I don’t know the relative importance of CTL and antibody selection, but I suspect that CTL are more important because antibodies mainly target a limited region of a limited number of proteins, whereas CTL can target peptides from any HIV protein.
  3. Genetic and Stochastic Influences on the Interaction of Human Immunodeficiency Virus Type 1 and Cytotoxic T Lymphocytes in Identical Twins. Otto O. Yang, Joseph Church, Christina M. R. Kitchen, Ryan Kilpatrick, Ayub Ali, Yongzhi Geng, M. Scott Killian, Rachel Lubong Sabado, Hwee Ng, Jeffrey Suen, Yvonne Bryson, Beth D. Jamieson, and Paul Krogstad. Journal of Virology, December 2005, p. 15368-15375, Vol. 79, No. 24
August 2nd, 2007

Immunodominance, Part II: Why care?

HIV budding from a lymphocyte

HIV and hepatitis C virus (HCV) are the two best-known chronic infections of humans. Both of them seem to persist at least partly by throwing out immune escape variants.

To expand that a bit: These are viruses that continue to infect people in spite of a specific immune response: People infected with either virus generate cytotoxic T lymphocytes (CTL) that recognize and destroy infected cells. CTL recognize short peptides, say 9 amino acids long, that are derived from viral proteins. If you monitor which viral peptides CTL are recognizing, and track those peptides over time, what you often (but not invariably) find is that the peptides in the dominant virus in the body change sequence over time. As a result, CTL regularly lose the ability to recognize the virus. Each time (at least for a while) the virus mutates away from the CTL, new CTL pop up that recognize the new version of the virus, but each time the virus has a window to bump up its replication for a while as CTL control is reduced.1

This sort of immune escape occurs in HCV infections as well2 although it’s not as clear that it’s critical to HCV persistence:3

Although it is clear that CTL escape mutations occur in HCV genomes, the relevance of this mechanism to viral persistence is an open question. Mutations usually occur within the first 3-4 months of infection …. Such observations are compatible with release from early immune selection pressure as viral escape is established, and perhaps suggest a role for CTL escape mutations in the genesis of chronic infection.

Boat wake

Picture the virus as a motorboat, roaring through the T cell ocean, leaving behind it a wake of failed CTL that can no longer recognize the viral epitopes. The problem with this image is that to keep ahead, the virus has to continually change its sequence,4 and changing a protein’s sequence usually means losing some functionality. It’s been shown that immune escape is often associated with a reduction in viral fitness.5 From any particular viral sequence, there are probably a limited number of directions the virus can move without losing its ability to replicate effectively: “The stereotypic nature of acquired mutations provides support for biochemical constraints limiting HIV-1 evolution and for the impact of CD8 escape mutations on viral fitness.”6

So it’s not as effortless as it seems for the persistent virus to keep on mutating away from the controlling T cells; the virus takes a pretty big hit to do so. The amount of fitness the virus is “willing” to lose in order to escape from CTL recognition tells us just how effective CTL must be in controlling the virus, so CTL must be pretty good at the job. How can we help CTL control the virus? How can we keep the virus from escaping from CTL control?

This is where the concept of immunodominance comes in (see? I had a point after all!). Immunodominance, if you missed the last post on the subject, is the observation that (for reasons that are not well understood) immune responses often focus on a very limited number of epitopes; there may be many peptides that are recognized to some extent, but the vast majority of CTL recognize only two or three of those peptides. If a CTL response is “broad”, meaning that many viral epitopes are recognized well (with no clear immunodominant epitope), then to escape from CTL control the virus quasispecies must throw out multiple mutations at the same time. That’s much harder (less likely) than throwing out a single mutation; and it’s much harder than sequentially throwing out single escape mutants, with periods in between of efficient replication (unchecked by CTL) in which the quasispecies can establish compensatory mutations and become set for a new mutation.
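A toy calculation shows how steep the "multiple mutations at the same time" penalty is. Both numbers below are illustrative assumptions, not measurements from any of the papers cited here, and the model treats escape at each epitope as independent and ignores fitness costs entirely. Take it as a sketch of the argument rather than a model of HIV.

```python
# Illustrative values only; neither number comes from the post or its references.
ESCAPE_PROB_PER_EPITOPE = 1e-4   # assumed chance a new genome carries an escape mutation in one given epitope
GENOMES_PER_DAY = 1e9            # assumed number of new viral genomes produced per day


def simultaneous_escape_probability(p_single, n_epitopes):
    """Probability that one genome escapes at all n epitopes at once,
    assuming escape at each epitope is an independent event."""
    return p_single ** n_epitopes


def expected_escape_genomes_per_day(p_single, n_epitopes, genomes_per_day):
    """Expected number of fully escaped genomes appearing per day."""
    return genomes_per_day * simultaneous_escape_probability(p_single, n_epitopes)


# Narrow response: one immunodominant epitope to escape from.
narrow = expected_escape_genomes_per_day(ESCAPE_PROB_PER_EPITOPE, 1, GENOMES_PER_DAY)
# Broad response: three well-recognized epitopes to escape from simultaneously.
broad = expected_escape_genomes_per_day(ESCAPE_PROB_PER_EPITOPE, 3, GENOMES_PER_DAY)

print(f"1 epitope:  ~{narrow:.3g} escape genomes per day")
print(f"3 epitopes: ~{broad:.3g} escape genomes per day")
```

With these made-up inputs, a single dominant epitope gives on the order of a hundred thousand escape genomes a day, while three epitopes at once gives about one every thousand days. The sequential route (escape, recover, escape again) sits in between, which is exactly why a narrow response is the easier one to slip.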

In this context, then, immunodominance may be a bad thing. It’s been suggested7 that some individuals who can control HIV for a long time, do so at least partially because of their subdominant CTL response. If we could manipulate the CTL response during vaccination or initial infection, then, perhaps we could reduce the response to an immunodominant epitope and increase the responses to multiple subdominant epitopes, and perhaps this would help control HIV infection.

Is there a context in which immunodominant responses are good things?

More later.


  1. I think the first paper showing evidence for HIV immune escape was Human immunodeficiency virus genetic variation that can escape cytotoxic T cell recognition. Rodney E. Phillips, Sarah Rowland-Jones, Douglas F. Nixon, Frances M. Gotch, Jon P. Edwards, Afolabi O. Ogunlesi, John G. Elvin, Jonathan A. Rothbard, Charles R. M. Bangham, Charles R. Rizza & Andrew J. Mcmichael. Nature 354, 453 – 459 (12 December 1991)
  2. The outcome of hepatitis C virus infection is predicted by escape mutations in epitopes targeted by cytotoxic T lymphocytes. Erickson AL, Kimura Y, Igarashi S, Eichelberger J, Houghton M, Sidney J, McKinney D, Sette A, Hughes AL, Walker CM. Immunity. 2001 Dec;15(6):883-95.
  3. Mutational escape from CD8+ T cell immunity: HCV evolution, from chimpanzees to man. David G. Bowen and Christopher M. Walker. J Exp Med 201: 1709-1714 (6 June 2005)
  4. To be a little more accurate, there’s no single “virus”, but rather a cloud of viruses with slightly varying sequences (a quasispecies); within that cloud, the majority may have the immune-escape sequence.
  5. For example: Rapid viral escape at an immunodominant simian-human immunodeficiency virus cytotoxic T-lymphocyte epitope exacts a dramatic fitness cost. Fernandez CS, Stratov I, De Rose R, Walsh K, Dale CJ, Smith MZ, Agy MB, Hu SL, Krebs K, Watkins DI, O’connor DH, Davenport MP, Kent SJ. J Virol. 2005 May;79(9):5721-31.
  6. Selective escape from CD8+ T-cell responses represents a major driving force of human immunodeficiency virus type 1 (HIV-1) sequence diversity and reveals constraints on HIV-1 evolution. Allen TM, Altfeld M, Geer SC, Kalife ET, Moore C, O’sullivan KM, Desouza I, Feeney ME, Eldridge RL, Maier EL, Kaufmann DE, Lahaie MP, Reyor L, Tanzi G, Johnston MN, Brander C, Draenert R, Rockstroh JK, Jessen H, Rosenberg ES, Mallal SA, Walker BD. J Virol. 2005 Nov;79(21):13239-49.
  7. For example, Subdominant CD8 T-Cell Responses Are Involved in Durable Control of AIDS Virus Replication. Thomas C. Friedrich, Laura E. Valentine, Levi J. Yant, Eva G. Rakasz, Shari M. Piaskowski, Jessica R. Furlott, Kimberly L. Weisgrau, Benjamin Burwitz, Gemma E. May, Enrique J. Leon, Taeko Soma, Gnankang Napoe, Saverio V. Capuano III, Nancy A. Wilson, and David I. Watkins. J Virol, Apr. 2007, p. 3465-3476 Vol. 81, No. 7 doi:10.1128/JVI.02392-06; and Control of human immunodeficiency virus replication by cytotoxic T lymphocytes targeting subdominant epitopes. Frahm N, Kiepiela P, Adams S, Linde CH, Hewitt HS, Sango K, Feeney ME, Addo MM, Lichterfeld M, Lahaie MP, Pae E, Wurcel AG, Roach T, St John MA, Altfeld M, Marincola FM, Moore C, Mallal S, Carrington M, Heckerman D, Allen TM, Mullins JI, Korber BT, Goulder PJ, Walker BD, Brander C. Nat Immunol. 2006 Feb;7(2):173-8.