Mystery Rays from Outer Space

Meddling with things mankind is not meant to understand. Also, pictures of my kids

August 17th, 2010

Pigs (and their viruses) fly

Type II PRRSV

An emerging disease that I just missed directly seeing emerge is PRRS.

PRRS is “porcine reproductive and respiratory syndrome”, which pretty much sums up the disease. It’s caused by — you’ll never guess — Porcine reproductive and respiratory syndrome virus (PRRSV), an arterivirus that emerged in 1987. That was the year I left large animal veterinary practice, so I never had a chance to deal with PRRS clinically.

Twenty-three years may not seem like all that long a time, but if you’re an RNA virus that’s a lot of generation times and a whole lot of time for mutations and evolution, and PRRS viruses are an evolutionary mess.1 There are North American type PRRSV viruses and European type viruses, there are mysterious clusters of related viruses, there are clusters of related diseases, there are thousands of sequences, and it’s just kind of baffling what’s gone on with the whole schtick.

A new paper2 has tried to sort out part of the mess by analyzing some 8624 North American-type PRRSV sequences, from nearly a dozen countries, and working out evolutionary relationships between them all. 3 (The focus on the North American series — the Type II PRRSV — is because this group seems to be a more common source of disease; although the European strains are far from rare themselves.)

There were a couple of interesting points that parallel some other viruses:

1. Feral vaccines. It’s already known, or at least strongly suspected, that some of the modified-live PRRSV vaccines have started to go feral on a small scale (not nearly as dramatically as the vaccinia virus I mentioned a while ago), and that’s supported by this genetic analysis:

In the vaccine-associated sublineage phylogenies (data not shown), there were a number of well-supported small clusters that might reflect the small-scale transmission of the vaccine viruses in the field … 2

As well as vaccinia, there are other live vaccines that are known to spread into the population. The sort of limited transmission that seems to be showing up here is more typical of this sort of thing than are the vaccinia instances I talked about before.

2. The amazing flying pigs. Even though this is just one of the two major sub-groups of PRRSV and it’s less than 25 years since it emerged, they came up with nine fairly distinct lineages of the virus (see the figure to the right). As you’d expect the lineages speak to the history of the virus — which is to a large extent the history of the pigs that carried the virus.4

This version of the virus probably started out in North America (though how it got there … ?) and then got introduced into other countries on several independent occasions. Two of these introductions were in the late 1980s, shortly after the North American emergence. Aside from that there’s evidence of a bunch of smaller introductions:

… lineage 1 had several Thai sequences clustered with early Canadian sequences … ; lineage 8 contained highly pathogenic Chinese strains and their relatives … ; and lineages 8 and 9 had several Italian isolates which were distributed separately along the phylogeny …, indicating independent introductions of PRRSV from the United States to Italy. 2

They were even able to identify smaller-scale travel patterns, between individual states in the USA:

Iowa plays a central role because its viruses were introduced recurrently to all nine other states (Fig. 5B). The remaining states were not just receiving sites. Their local strains also were transmitted to other states repeatedly, but within a narrower range. … Our phylogeographic analyses reveal, for the first time, an interstate PRRSV traffic network in the United States. … The result also indicates that long-distance spread is a frequent process for PRRSV … 2

This is reminiscent of the history of the pandemic H1N1 influenza virus, when it was still in swine. (Remember that pandemic H1N1 is genetically a mixture of a North American swine influenza strain and a Eurasian strain.) There’s a large national and global traffic in pigs, and even though most countries are reasonably careful in the way they handle incoming pigs, that’s no guarantee against virus introduction. I’m not singling out pigs, either: other kinds of livestock are also global travellers, and obviously so are humans. But it’s a reminder that it isn’t just humans and their viruses that can quickly travel and spread around the globe.


  1. More correctly, our understanding of their evolution is a mess. The viruses are doing just fine.
  2. Shi, M., Lam, T., Hon, C., Murtaugh, M., Davies, P., Hui, R., Li, J., Wong, L., Yip, C., Jiang, J., & Leung, F. (2010). Phylogeny-Based Evolutionary, Demographical, and Geographical Dissection of North American Type 2 Porcine Reproductive and Respiratory Syndrome Viruses. Journal of Virology, 84 (17), 8700-8711. DOI: 10.1128/JVI.02551-09
  3. That’s a lot of viruses, but the sampling is heavily biased to a limited number of places, especially the USA [and especially a few regions within the USA] so it’s probably an underestimate, and maybe a severe underestimate, of the global diversity.

    I didn’t know, by the way, that there’s a PRRSV Database: http://prrsvdb.org/

  4. Or of the pig’s fluids. I think that especially in the early days of the emergence, the virus was spread by the boar semen used for artificial insemination.
August 10th, 2010

DNA virus quasispecies? (Probably not.)

I’ve talked about quasispecies several times, and emphasized that RNA viruses, with their high replication error rates, are most prone to forming quasispecies.

I’ve also pointed out, though, that actually measuring quasispecies is technically difficult, and measuring it for the larger DNA viruses would be even harder. You’d need to run sequences on many viral genomes, to see how much variation develops over time; and it’s only recently that sequencing tech has approached the point where it’s even thinkable, let alone affordable, to do that:

While large DNA viruses are thought to have low mutation rates, only a small fraction of their genomes have been analyzed at the single-nucleotide level.1

So maybe DNA viruses might actually form quasispecies, and we don’t know it?

In fact, even for DNA viruses, mutants appear fairly quickly, given the appropriate selection pressure. In principle, these might not even be new mutations, but simply expansion of a particular part of a quasispecies that was already pre-existing. (It’s probably a fairly obvious point, but it’s important to remember that the introduction of new mutants, and their selection and expansion, are completely different processes. Some viruses throw out incredible numbers of mutants, but almost all of them are dead ends that are actively selected against, or at the least not selected for. Other viruses may make far fewer mutants, but given strong enough selection pressure some of these might rapidly take over the population. It’s very tricky to use observed mutations as a measure of mutation frequency, because observation often depends on selection to build up the numbers of the mutant before you can see it.)

At any rate, it’s a fair enough question,2 but recently there has been some evidence that supports the concept of DNA virus genome stability. Wayne Yokoyama’s lab has actually sequenced multiple genomes of mouse cytomegalovirus (a large DNA virus — a member of the herpesvirus family) to look at quasispecies.1

One of the things they did find was a significant number of variations in their stock, compared to the stock they had got it from years before:

… our laboratory’s Smith strain MCMV differed from the previously published Smith strain … There were 452 differences, including 50 insertion/deletions (indels) and 402 single-bp substitutions. … this high number of differences suggested that MCMV mutated in vivo, as we had previously maintained our MCMV stock by in vivo passages. 1

In other words, the standard methods of maintaining a virus — repeatedly growing new stocks in cells, and using those new stocks to make yet more — allow the accumulation of variations in the genome — which is already well known, of course, but often neglected in lab experiments.  Again, as I point out above, the number of observed mutations we see here doesn’t tell us much about the actual mutation frequency.

How often do mutations arise? By running the virus through cells repeatedly (in vitro, that is) and then seeing how individual clones differed, they determined that there are very, very few mutations per replication. What’s more, and even more impressively, very few mutations appeared after passages through mice (in vivo):

… the remaining 9 mutations allowed us to estimate the mutation rate of MCMV as 1.0 × 10⁻⁷ mutations per bp per day after in vivo passage, very similar to the mutation rate calculated for in vitro passage.1

(My emphasis) For comparison, the MCMV genome is not quite 250,000 bp long, so we’re looking at around one mutation per 40-50 genomes per day (if I’m dividing right). That’s hundreds of times more stable than most RNA viruses (see the table here3 for some RNA virus error rates).  Still, there’s plenty of room for natural selection in there, because of course there are hundreds or thousands of new MCMV genomes being made per day even in the most conservative estimate (and maybe more like millions or hundreds of millions), so dozens to thousands of them are mutants.
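For anyone who wants to check my division, here is a minimal sketch of the arithmetic. The rate is the one quoted above; the two genome lengths are my assumptions (the post only says "not quite 250,000 bp", and 230,000 bp is an illustrative lower figure), and the function name is made up for this example.

```python
# Back-of-envelope check of "one mutation per 40-50 genomes per day".
MUTATION_RATE = 1.0e-7  # mutations per bp per day, from the quoted in vivo estimate

# Assumed genome lengths: the text only says "not quite 250,000 bp",
# so 230,000 bp is an illustrative lower figure, not a measured value.
GENOME_LENGTHS_BP = (230_000, 250_000)


def genomes_per_mutation(rate_per_bp_per_day: float, genome_length_bp: int) -> float:
    """Return how many genomes, on average, carry one new mutation per day."""
    mutations_per_genome_per_day = rate_per_bp_per_day * genome_length_bp
    return 1.0 / mutations_per_genome_per_day


for length in GENOME_LENGTHS_BP:
    per_genome = MUTATION_RATE * length
    print(f"{length:,} bp: {per_genome:.3f} mutations per genome per day, "
          f"about 1 per {genomes_per_mutation(MUTATION_RATE, length):.0f} genomes")
```

Both assumed lengths land inside the 40-50 range, so the estimate does not depend much on the exact genome size.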

So, not surprisingly, Yokoyama’s group was able to detect a cluster of mutations that were almost certainly selected in the mouse; without going into detail, these mutations were in a viral gene (m157) that’s known to be recognized by the (laboratory) mouse immune system, so it wasn’t surprising that mutants were selected. And such mutants did not appear in mice without the appropriate immune component, demonstrating the role of natural selection in this cluster.

They offer a number of cautions, including one that’s raised in almost all such studies:

One caveat to our mutation analysis is that lethal mutations were probably underrepresented in the final DNA pool since, by definition, they did not propagate. Nonetheless, this limitation is intrinsic to all mutation analysis.1

Still, the results are solid and reassuring, supporting a basic concept in viral evolution.


  1. Cheng, T., Valentine, M., Gao, J., Pingel, J., & Yokoyama, W. (2009). Stability of Murine Cytomegalovirus Genome after In Vitro and In Vivo Passage. Journal of Virology, 84 (5), 2623-2628. DOI: 10.1128/JVI.02142-09
  2. Well asked, invisible non-existent person!
  3. Castro, C., Arnold, J., & Cameron, C. (2005). Incorporation fidelity of the viral RNA-dependent RNA polymerase: a kinetic, thermodynamic and structural perspective. Virus Research, 107 (2), 141-149. DOI: 10.1016/j.virusres.2004.11.004
August 3rd, 2010

Lamprey immunity, again

The Lamprey (Yarrell 1835)
From A History of British Fish (William Yarrell, 1835)

I’ve talked about lamprey immune systems several times (here, here, and here). I find them fascinating because they show both how our own immune system developed and the alternate routes that can lead to a pretty good, but very different, immune system.

Quick background: In order of evolutionary appearance you have sea urchins, lampreys, sharks, reptiles, mammals. (Note that this is not true, it’s no more than a sloppy shorthand for common ancestry, but it’s a handy shorthand for this purpose.  See a phylogenetic tree here.) Mammals have a form of adaptive immune system that includes T lymphocytes and antibodies, and at first glance this whole complex system arose, almost fully-formed, in sharks.1

This has always amazed me, because an adaptive immune system doesn’t work in isolation; the pieces don’t work alone. You need all kinds of moving parts — all the complex molecular pieces that chop and snip DNA to form T cell receptors and antibodies, all the multiple parts of a thymus that screen T cells for functional and safe receptors, the MHC molecules that the receptors see and all the pieces that snip and shuffle around peptides for that system, the spleen and lymph nodes that let lymphocytes interact with other cells, — and it seemed that all these pieces abruptly appeared and put themselves together, like a fine watch, in one evolutionary blink.

When I first learned about this, some 15 or 20 years ago, I told myself that this was an illusion, that once more species were looked at we’d see the history of these moving parts in other common ancestors. Of course, this is exactly what’s happened since then. We see accidental, random parts in sea urchin genomes (I talk about that here) and we see other bits and pieces arising in lampreys and hagfish (in the links at the top).

So in reality, the adaptive immune system didn’t arise all that suddenly after all; the pieces gradually were added over a hundred million years or more, sometimes purely by chance, sometimes for other purposes altogether, and sometimes as components of a prototypic immune system that acted as a foundation for the whole shark thing.2

So that’s the first part of the background: In lampreys, which diverged from the mammalian lineage maybe 450 million years ago, we see many of the pieces of a mammalian adaptive immune system. There are cells that look a lot like lymphocytes, there is something that looks like a spleen. But, as I say, there are none of the familiar pieces that we think of as an adaptive immune system. Lampreys flatly do not have our adaptive immune system:

Nevertheless, the cardinal elements of adaptive immunity, namely Ig, TCR, RAG1 and 2, and MHC class I and II, were conspicuously absent.3

Lamprey "antibody"
Lamprey variable receptor with bound antigen4

But step back a little, and look a little deeper, and we see some familiar parts. Lampreys do, in fact, have variable receptors, just like T cell receptors and antibodies, and those receptors are made by chopping and shuffling genome DNA, just like TcR and antibodies, and are expressed in their lymphocyte-like cells, and some are secreted (like antibodies and B cells) and some are cell-associated (like T cell receptors).

And here’s the other amazing thing: At the molecular level, the lamprey receptors are completely unlike T and B cell receptors. The lamprey lineage came up with a completely different system that allows them to do pretty much the same thing as the shark lineage. Their receptors are different kinds of molecules, and the system that shuffles the genomic DNA is different. 5 Yet, the functional end product is the same — a system that has immunological memory. An adaptive immune response, that’s quite alien to our own, but that works pretty damn well.

Although the Ig-based and VLR-based adaptive immune systems in jawed and jawless vertebrates use different genes and assembly mechanisms, both systems generate diverse repertoires of anticipatory receptors capable of recognizing almost any Ag through the combinatorial assembly of large arrays of partial gene segments. The development of clonally diverse lymphocytes allows for Ag-specific responses and memory, which are lacking in innate immunity.3

There is still a lot we don’t know about lamprey immunity (how does it prevent self-reactive receptors, with no thymus?) but what we do know is just so amazing, I’m completely fascinated by it. It beautifully illustrates two of the basic features of evolution: building on previous structures, whether related or not; and alternate solutions to the same problem. Herrin and Cooper have a short and dense, but very interesting, review3 that prompted this particular post.


  1. That is, in the common ancestor of sharks and mammals, to use a slightly less-sloppy terminology.
  2. And of course, the system has continued to evolve. The mammalian system is remarkably similar to the shark in broad strokes, but it’s also very different in many ways.
  3. Herrin, B., & Cooper, M. (2010). Alternative Adaptive Immunity in Jawless Vertebrates. The Journal of Immunology, 185 (3), 1367-1374. DOI: 10.4049/jimmunol.0903128
  4. Han, B. W., Herrin, B. R., Cooper, M. D., & Wilson, I. A. (2008). Antigen Recognition by Variable Lymphocyte Receptors. Science, 321 (5897), 1834-1837. DOI: 10.1126/science.1162484
  5. Though there are some common pieces that hint at a common ancestor of the two systems, maybe.
July 29th, 2010

Genetic ironies: Retrovirus version

I’ve mentioned the APOBEC family before (for example, here and here). They’re a group of mammalian genes that (among other things) protect against retrovirus infection.

Different strains of mice have different resistance to retrovirus infection. Some strains are highly resistant, others quite susceptible. At least some of this difference in susceptibility comes down to different expression levels of mouse APOBEC3: High expression of the gene gives good resistance to some retroviruses, low expression gives less resistance.

How come some strains have higher expression than others? Turns out that it’s because a retrovirus inserted in the APOBEC3 region of the genome of certain mouse strains, and that insertion cranks up expression of the APOBEC3.

We discovered that the mA3 allele in virus resistant mice is disrupted by insertion of the regulatory sequences of a mouse leukemia virus, and this insertion is associated with enhanced mA3 expression. (Sanville, B., Dolan, M., Wollenberg, K., Yan, Y., Martin, C., Yeung, M., Strebel, K., Buckler-White, A., & Kozak, C. (2010). Adaptive Evolution of Mus Apobec3 Includes Retroviral Insertion and Positive Selection at Two Clusters of Residues Flanking the Substrate Groove. PLoS Pathogens, 6 (7). DOI: 10.1371/journal.ppat.1000974)

So perhaps low APOBEC expression allowed retrovirus infection, which led to insertion of the retrovirus genome, which increased APOBEC3 expression and provided resistance to further retrovirus infection.

July 26th, 2010

Quasispecies thoughts

Quasispecies theory predicts that slower replicators will be favored if they give rise to progeny that are on average more fit; these populations occupy short, flat regions of the fitness landscape … Flat quasispecies accept mutation without a corresponding effect on fitness … A flat quasispecies with an expansive mutant repertoire can explore vast regions of sequence space without consequence and is poised to adapt to rapid environmental change.

Lauring, A., & Andino, R. (2010). Quasispecies Theory and the Behavior of RNA Viruses PLoS Pathogens, 6 (7) DOI: 10.1371/journal.ppat.1001005

(My emphasis)  RNA viruses in general form quasispecies because they have such high mutation rates.  Many (though by no means all) emerging infections are the result of RNA viruses. I’ve pointed before to aspects of viruses that might help them jump from species to species (for example, here — though this is a DNA virus, not an RNA virus, and I don’t think it runs in quasispecies — and the string of posts I link to inside that one).

One example Lauring and Andino point to is influenza virus hemagglutinin (HA). Influenza mutates very rapidly, of course, but most of the changes are harmful to the virus. But changes in HA seem to be very well tolerated. Since HA is a major target of the immune system, this property allows influenza to avoid the immune system without getting hit by defects in fitness associated with the immune evasion.  This is a contrast to, say, HIV (generalizing here! This isn’t always true). HIV within a patient undergoes constant changes to avoid the immune response, but many of these changes reduce the overall viral fitness. If you take the mutated HIV into an environment without that particular immune response, the virus quickly mutates back to its original, more-fit, form.
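The "slower but flatter can win" idea in the Lauring and Andino quote can be made concrete with a toy calculation. This is only a sketch under assumptions of my own, not anything from their paper: mutations per genome are treated as Poisson-distributed, any harmful mutation is assumed to remove the offspring, and the two genotypes and their numbers are hypothetical.

```python
import math


def effective_growth(replication_rate: float, deleterious_fraction: float,
                     mutations_per_genome: float) -> float:
    """Replication rate discounted by the share of offspring lost to harmful mutations.

    Assumes Poisson-distributed mutations and that any harmful mutation
    removes the offspring, so the surviving share is
    exp(-mutations_per_genome * deleterious_fraction).
    """
    surviving_share = math.exp(-mutations_per_genome * deleterious_fraction)
    return replication_rate * surviving_share


# Hypothetical genotypes: (replication rate, fraction of mutations that are harmful)
PEAKED = (2.0, 0.9)  # fast replicator sitting on a sharp fitness peak
FLAT = (1.5, 0.2)    # slower replicator on a flat, mutation-tolerant region

for u in (0.1, 0.5, 1.0):  # mutations per genome per replication
    peaked = effective_growth(*PEAKED, u)
    flat = effective_growth(*FLAT, u)
    winner = "flat" if flat > peaked else "peaked"
    print(f"U={u}: peaked={peaked:.2f}, flat={flat:.2f} -> {winner} wins")
```

With these made-up numbers the fast replicator wins when mutations are rare, and the slower, more tolerant one wins once the mutation rate approaches the roughly-one-per-genome range typical of RNA viruses.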

I’m not sure how we would assess, in advance, which viruses are more “flat” than others and are therefore more able to adapt to new species, but it’s something to think about as we look at new viruses and new viral variants. I would be interested in SARS, for example: what happened to mutation tolerance as the virus adapted to humans? Was the virus that originally jumped into humans different in this way from the ones that normally infect bats? Not easy to measure, though.

July 12th, 2010

Short takes: Deep sequencing and HIV drug resistance

Short comments about what I’ve been reading (besides several hundred influenza articles):

Hedskog, C., Mild, M., Jernberg, J., Sherwood, E., Bratt, G., Leitner, T., Lundeberg, J., Andersson, B., & Albert, J. (2010). Dynamics of HIV-1 Quasispecies during Antiviral Treatment Dissected Using Ultra-Deep Pyrosequencing PLoS ONE, 5 (7) DOI: 10.1371/journal.pone.0011345

The whole deep sequencing thing is going to profoundly change our knowledge of viral pathogenesis, as well as of viral ecology.

With highly mutation-prone viruses like HIV, hepatitis C virus, or influenza, our understanding of genome sequences has been based on the overall average genome — the average of a vast and diverse population. That average, that we’ve been calling the genome of these viruses, may not even exist as such, and certainly the minor variants that have been missed by traditional methods are also critically important, because they can explode out within a few days to take over the entire population, given the right set of circumstances. For example, if among those minor variants there are a few drug-resistant strains, then as soon as you treat the host, those variants may be able to take over.

In this paper, deep sequencing of people with HIV shows that drug-resistant variants do exist even before treatment, but they are normally very rare. They can take over during treatment with the particular drug, but when treatment is stopped they rapidly regress to rarity. This is presumably because the drug resistance makes the virus globally less fit (in the natural selection meaning of the term). When their more-fit brethren are destroyed by a drug these crippled, but drug-resistant, variants can grow out, but remove that selective pressure and the more wild-type versions take over once again.
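The rise and fall of a resistant variant can be sketched with a simple two-variant selection model. The fitness values, generation counts, and starting frequency below are hypothetical numbers chosen for illustration, not data from the paper, and the model deliberately leaves out latent reservoirs and new mutation.

```python
def next_frequency(resistant_freq: float, resistant_fitness: float,
                   wild_type_fitness: float) -> float:
    """One generation of selection between two competing variants."""
    resistant = resistant_freq * resistant_fitness
    wild_type = (1.0 - resistant_freq) * wild_type_fitness
    return resistant / (resistant + wild_type)


def simulate(start_freq: float,
             phases: list[tuple[int, float, float]]) -> list[float]:
    """Track resistant-variant frequency through successive phases.

    Each phase is (generations, resistant_fitness, wild_type_fitness).
    """
    freq = start_freq
    history = [freq]
    for generations, resistant_fitness, wild_type_fitness in phases:
        for _ in range(generations):
            freq = next_frequency(freq, resistant_fitness, wild_type_fitness)
            history.append(freq)
    return history


# Hypothetical relative fitness values, not taken from the paper.
ON_DRUG = (15, 0.8, 0.4)   # the drug hits wild type harder than the resistant variant
OFF_DRUG = (60, 0.8, 1.0)  # without the drug, resistance carries a fitness cost

history = simulate(start_freq=0.001, phases=[ON_DRUG, OFF_DRUG])
print(f"before treatment: {history[0]:.4f}")
print(f"end of treatment: {history[15]:.4f}")
print(f"after stopping:   {history[-1]:.6f}")
```

Under these assumptions the resistant variant goes from rare to dominant during treatment and back to rare once the drug is withdrawn, which is the qualitative pattern the paper describes.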

As well as implications for treatment, this tells us something about viral reserves:

In most patients, drug resistant variants were replaced by wild-type variants identical to those present before treatment, suggesting rebound from latent reservoirs. 1


  1. Hedskog, C., Mild, M., Jernberg, J., Sherwood, E., Bratt, G., Leitner, T., Lundeberg, J., Andersson, B., & Albert, J. (2010). Dynamics of HIV-1 Quasispecies during Antiviral Treatment Dissected Using Ultra-Deep Pyrosequencing. PLoS ONE, 5 (7). DOI: 10.1371/journal.pone.0011345
April 29th, 2010

Influenza variations, part II

Mutation Nation

About 15 minutes after I wrote my last article on influenza variation, I was reading the Journal of Virology and ran across another paper1 on the same thing, one that at least partly addresses some of the missing points in the earlier ones.

To brutally truncate my earlier comments: influenza should generate a huge number of mutants as it replicates; but in the few studies that have been done, not all that many variants have actually been detected.

One of the points I raised was that the influenza variation was sampled at the end-point of the infection — after the patient had died, in the paper I talked about the other day.2 Even though the virus had been through the maximum number of replication cycles, it had also experienced the maximal selection pressure, potentially reducing the number of surviving mutants. Is it possible that more variants arose earlier in the infection, but died off before they were detected?

This new paper1  actually looked at exactly that. They used canine influenza as their model, so they could deliberately infect their patients and track through the infections from the beginning through the end. Even though they used a technique that is much less sensitive to mutations (and is probably more error-prone as well) they found tons of variation, and the pattern they found is fascinating:

Mutations arose readily in the infected animals and reached high frequencies in some vaccinated dogs, but they were mostly transient and often were not detected on subsequent days. Hence, CIV populations are highly dynamic and characterized by a rapid turnover of likely deleterious mutations.1

(My emphasis) This (assuming it holds true in other studies) beautifully resolves much of the difference between the expected level of variation, and the level that’s observed at any one time point of infection. The explanation is that the variation does indeed appear, but it doesn’t persist. There is variation over time as well as at any one time point.

Hoelzer 2010 Fig 1
Figure 1. Variation between challenge influenza virus (yellow) and virus isolated from two naïve dogs 2 to 4 days after infection 1  

There are a lot of very cool things about this study that I’m not going to talk about (differences between vaccinated and unvaccinated animals, evidence for antigenic escape) but there are two things that I thought were particularly exciting.

First is the question of why the mutations seem to be so transient. Part of that could just be chance, part of it is probably selection against deleterious mutants.

But it’s also worth keeping in mind that the viruses are replicating in a dynamic, rapidly-changing environment. The virus enters a host whose immune system is at rest but that immediately recognizes viral infection and ramps up interferons,  then other cytokines, then innate antiviral systems that build up and spill over into an adaptive immune response  …  a whole range of inflammation whose mediators and effectors change from hour to hour. Is this changing environment selecting for mutations that are briefly beneficial, and that then become deleterious as the situation changes a few hours later?

Second: when we think about viruses that are able to jump from one species to another, we usually think of mutants, viruses that may be less fit in their “proper” hosts but adequately fit in some other species. (In fact canine influenza itself is a great example of this, a virus that jumped from horses into dogs six or seven years ago. It is essentially equine influenza, but compared to the equine version it has a half-dozen variants that make it more suitable for replication in dogs.)

If we look at any particular time point we may not find any of these potential emergent mutants. But if we look at all the time points, as in this study, perhaps these potential species-jumping mutants are popping up all the time, but only for a few hours at a time:

This observation suggests that mutations that facilitate adaptation to a new host species might occur transiently in the donor host despite any associated fitness costs and provide a transient reservoir of preadapted mutations1

(My emphasis) There’s also theoretical and experimental work that probably addresses how this sort of pressure could drive population-level robustness.  For example, while heterogeneity is linked to fitness in HIV,3 Claus Wilke says:

Virus strains with a history of repeated genetic bottlenecks frequently show a diminished ability to adapt compared to strains that do not have such a history.4

I don’t know that work as well as I’d like to, but I think it’s probably relevant when considering local and global evolutionary pressures on the virus.


  1. Hoelzer, K., Murcia, P., Baillie, G., Wood, J., Metzger, S., Osterrieder, N., Dubovi, E., Holmes, E., & Parrish, C. (2010). Intrahost Evolutionary Dynamics of Canine Influenza Virus in Naive and Partially Immune Dogs. Journal of Virology, 84 (10), 5329-5335. DOI: 10.1128/JVI.02469-09
  2. Kuroda, M., Katano, H., Nakajima, N., Tobiume, M., Ainai, A., Sekizuka, T., Hasegawa, H., Tashiro, M., Sasaki, Y., Arakawa, Y., Hata, S., Watanabe, M., & Sata, T. (2010). Characterization of Quasispecies of Pandemic 2009 Influenza A Virus (A/H1N1/2009) by De Novo Sequencing Using a Next-Generation DNA Sequencer. PLoS ONE, 5 (4). DOI: 10.1371/journal.pone.0010256
  3. Bordería AV, Lorenzo-Redondo R, Pernas M, Casado C, Alvaro T, et al. (2010) Initial Fitness Recovery of HIV-1 Is Associated with Quasispecies Heterogeneity and Can Occur without Modifications in the Consensus Sequence. PLoS ONE 5(4): e10319. doi:10.1371/journal.pone.0010319
  4. Novella, I., Presloid, J., Zhou, T., Smith-Tsurkan, S., Ebendick-Corpus, B., Dutta, R., Lust, K., & Wilke, C. (2010). Genomic Evolution of Vesicular Stomatitis Virus Strains with Differences in Adaptability. Journal of Virology, 84 (10), 4960-4968. DOI: 10.1128/JVI.00710-09
April 27th, 2010

Influenza variations

Mutation comic

Indeed, the amount of HIV diversity within a single infected individual can exceed the variability generated over the course of a global influenza epidemic, the latter of which results in the need for a new vaccine each year. 1

That was said as part of a discussion on HIV vaccines, but let’s think about it from the influenza side.  Why is it true? Why doesn’t influenza have as many variants as HIV?

(Update: Another paper also looks at this question and points to some interesting explanations; I talk about that paper here.)

We know that influenza, like other RNA viruses, is prone to mutation (that is, it has an error-prone polymerase). Depending how you measure it, it’s likely that almost every new influenza genome has at least one mutation in it, meaning that every new infected animal or person should be generating thousands upon thousands of new influenza variants.

Globally, of course we do see thousands of new flu variants each year.2 But, based on replication fidelity, you’d expect to see a lot more — maybe not quite as many as HIV, but not far from it.

This is also true on a much smaller scale, looking within infected individuals (animals or people).  Even using modern deep-sequencing techniques (like those used in some of the HIV analyses) that should in theory be able to detect large numbers of mutations, there are fewer than you might expect based on the known replication fidelity — far fewer variants than in HIV:

Inasmuch as the mutation rate for type A influenza viruses is estimated at one nucleotide change per 10,000 nucleotide during replication and most infections are caused by as many as 10 to 1000 virions which likely possess varying numbers of nucleotide differences in their genomes, one can expect that each influenza A virion is possibly a quasispecies. However, we identified relatively few quasispecies – probably because the currently available sequence analysis software do not allow robust quasispecies analysis and extensive manual curation is necessary. We believe that with the help of improved bioinformatic tools we would detect more quasispecies populations in our sample sets.  3
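The quoted error rate can be turned into a rough per-genome expectation. This is a sketch under stated assumptions: the genome length of about 13,500 nucleotides is my own approximate figure for influenza A (it is not given in the quote), mutations are treated as Poisson-distributed, selection is ignored entirely, and the cycle counts are purely illustrative.

```python
import math

ERROR_RATE_PER_NT = 1.0 / 10_000  # one change per 10,000 nucleotides, from the quote
GENOME_LENGTH_NT = 13_500         # assumed approximate influenza A genome size


def expected_mutations(error_rate: float, genome_length: int, cycles: int = 1) -> float:
    """Expected mutations per genome after some replication cycles, ignoring selection."""
    return error_rate * genome_length * cycles


def fraction_mutated(expected: float) -> float:
    """Poisson probability that a genome carries at least one mutation."""
    return 1.0 - math.exp(-expected)


for cycles in (1, 2, 3):  # illustrative cycle counts
    mean = expected_mutations(ERROR_RATE_PER_NT, GENOME_LENGTH_NT, cycles)
    print(f"{cycles} cycle(s): {mean:.2f} expected mutations per genome, "
          f"{fraction_mutated(mean):.0%} of genomes with at least one")
```

Even after a single cycle, most genomes in this toy calculation carry at least one change, which is why the small number of variants actually detected is surprising.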

H1N1 (swine-origin influenza virus)

I don’t know enough about the computational side to comment on their bioinformatics point. Another recent paper4 uses a similar approach and (at least at first) seems to reach a more conservative conclusion. They talk about “quasispecies”, but they seem to be using the term rather loosely, to describe just a handful of distinct genomic sequences. These sequences differ by, for example, a single base (and a single amino acid) in the HA, where one of the sequences made up about 75% of the sequences, and the other about 25%. To me that’s not really a “quasispecies”; a quasispecies is something that needs to be defined by an average sequence even though the vast majority of the genomes are different from that average. (Here and here are Vincent Racaniello’s explanations at The Virology Blog.) Two sequences is just two sequences.

However! The authors do make their data available. I don’t have time to do a detailed look, but from what I think is a very conservative analysis, in one stretch of just 25-40 bases some 5-10% of the genomes have at least one mutation.5 If that’s roughly true across the whole genome, then each genome would have, what, maybe a half-dozen mutations on average. That, to me, really is a quasispecies.

(Do note that this is not the mutation frequency for any individual residue. No single point [with the two or three exceptions that the authors focused on] is mutated at much more than one in a thousand, and most probably more like one in many thousand, which is about what you’d expect. )
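
For anyone curious what that kind of tally looks like, here is a minimal Python sketch. It is not the xdformat/BLAST pipeline described in footnote 5: it assumes the reads have already been aligned to the reference window without gaps, and the function names, the `trim` rule and the toy reads are all invented for illustration. Only the final pair of numbers (about 120 of roughly 2050 hits) comes from the footnote.

```python
def internal_mismatches(reference, read, offset, trim=1):
    """Count mismatches between an aligned read and the reference,
    skipping `trim` bases at each end of the read. The trimming is a
    hypothetical stand-in for discarding mismatches at the ends of hits."""
    count = 0
    for i in range(trim, len(read) - trim):
        if read[i] != reference[offset + i]:
            count += 1
    return count


def fraction_with_mismatch(reference, aligned_reads, trim=1):
    """Fraction of reads with at least one internal mismatch.
    aligned_reads is a list of (offset, sequence) pairs, gap-free."""
    if not aligned_reads:
        return 0.0
    variant = sum(
        1
        for offset, read in aligned_reads
        if internal_mismatches(reference, read, offset, trim) > 0
    )
    return variant / len(aligned_reads)


# Toy data, made up for illustration only.
reference = "ACGTACGTACGTACGTACGT"
reads = [
    (0, "ACGTACGTAC"),   # perfect match
    (2, "GTACGAACGT"),   # one internal mismatch
    (5, "TGTACGTACG"),   # mismatch only at the first base, so it is ignored
    (10, "GTACGTACGT"),  # perfect match
]
print(fraction_with_mismatch(reference, reads))

# The counts reported in footnote 5: about 120 of roughly 2050 hits.
print(f"{120 / 2050:.1%}")
```

The last line works out to a little under 6%, which is where the low end of the 5-10% range comes from.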

There are a myriad of complicating factors separating the error frequency in these genomes from the raw error rate of the viral polymerase. A couple of huge ones: These viruses had undergone a bunch of replications in the host, so this isn’t the error rate per replication cycle, it’s the cumulative error rate after many cycles. The virus was from a patient who had died with (and probably of) the virus, and though we don’t know how many times the original infecting virus had replicated it was at least a half dozen cycles, perhaps two or three times that.

Influenza virion

On the other hand, during those replication cycles, many mutations (quite likely even the great majority of them) would have been deleterious or outright defective, so most of the mutations would have never propagated but would have just silently disappeared and not been counted at the end.
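
To put very rough numbers on the cumulative effect, here is a back-of-the-envelope sketch in Python. The error rate of one change per 10,000 nucleotides is the estimate quoted above, and the six to eighteen cycles are the guess made above. The genome length of about 13,500 nucleotides for influenza A is my own round figure, and `surviving_fraction` is a purely hypothetical knob standing in for the loss of deleterious mutants. The sketch ignores selection, back-mutation and sequencing error.

```python
def expected_mutations(error_rate, genome_length, cycles, surviving_fraction=1.0):
    """Expected mutations carried per genome after `cycles` rounds of copying.

    error_rate          per-nucleotide error rate per replication
    genome_length       nucleotides per genome (assumed, see above)
    cycles              replication cycles since the infecting virus
    surviving_fraction  hypothetical share of mutations that are not
                        lost as deleterious or defective
    """
    return error_rate * genome_length * cycles * surviving_fraction


ERROR_RATE = 1e-4        # one change per 10,000 nucleotides (quoted above)
GENOME_LENGTH = 13_500   # assumed approximate influenza A genome length

for cycles in (6, 12, 18):
    raw = expected_mutations(ERROR_RATE, GENOME_LENGTH, cycles)
    # 0.5 is an arbitrary illustration, not a measured value.
    kept = expected_mutations(ERROR_RATE, GENOME_LENGTH, cycles, surviving_fraction=0.5)
    print(f"{cycles} cycles: about {raw:.1f} raw, {kept:.1f} if half are lost")
```

With these inputs the raw expectation is about 8 mutations per genome after six cycles, before any are lost to selection, which is at least the same ballpark as the half-dozen estimated from the sequence data.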

The most interesting point is that these mutations aren’t arising in a vacuum. Thinking now about which mutations survive and get detected, not the baseline rate of mutation formation: The variants are forming in an environment that’s designed to be very hostile to viruses. Mutations are going to undergo selection by the immune system.

This is one place where influenza is going to experience a very different set of pressures than HIV. HIV persists in the presence of the adaptive (T cell and antibody-based) immune response, whereas once the adaptive response kicks in for flu, the virus gets evicted. HIV therefore not only has a much longer period (years instead of days) to throw out mutations, it also is shaped by the immune response. By comparison flu would probably only have a couple of replication cycles in the presence of an adaptive response.

Changes in the virus that accumulate over the handful of replication cycles would reflect a strong selection pressure. The vast majority of mutations, even those that aren’t completely defective, are going to be less fit than the original virus and won’t accumulate. Knowing which mutations do accumulate should be very interesting because it may tell us what the virus is going through in the host.  That’s what the authors of this paper focused on — the one particular site that had a much, much higher variant  frequency, more like 25% of the genomes.  The assumption is that this arose during the infection and was positively selected for. 6

The variants that replicate best in a host may be quite different from those that are effectively transmitted. That is, there may be multiple sources of selective pressure, of which we have previously mainly only seen transmission pressure (because that’s the main one that will accumulate in a population, because transmission represents a bottleneck in the virus’s evolution [link to The Virology Blog]).  The particular HA variant that was picked up here (that apparently accumulated during the infection) is rare globally. Is that a version of the HA that’s more efficient within a host, but that doesn’t transmit as well?

I think a major reason for the difference between HIV and influenza variant accumulation is the difference between within-host and between-host (transmission) selection.  HIV spends long, long periods within a single host, thousands of replication cycles, accumulating mutations.  The transmission bottlenecks come at much longer intervals and have a much larger accumulated population to work with.  Influenza has a comparatively brief period within the host, only a handful of replications before a new transmission bottleneck hits. 7

This sort of deep sequencing experiment on influenza will probably be improved over the next few years, and I’ll be very interested to see just how much variation there really is within one flu-infected host.


  1. Walker, B., & Burton, D. (2008). Toward an AIDS Vaccine. Science, 320 (5877), 760-764. DOI: 10.1126/science.1152622
  2. More correctly, I suppose, we infer the presence of thousands of new variants based on the hundreds of them that we see, and knowing that we are only examining a tiny fraction of all the flu cases that are out there.
  3. Ramakrishnan, M., Tu, Z., Singh, S., Chockalingam, A., Gramer, M., Wang, P., Goyal, S., Yang, M., Halvorson, D., & Sreevatsan, S. (2009). The Feasibility of Using High Resolution Genome Sequencing of Influenza A Viruses to Detect Mixed Infections and Quasispecies. PLoS ONE, 4 (9). DOI: 10.1371/journal.pone.0007105
  4. Kuroda, M., Katano, H., Nakajima, N., Tobiume, M., Ainai, A., Sekizuka, T., Hasegawa, H., Tashiro, M., Sasaki, Y., Arakawa, Y., Hata, S., Watanabe, M., & Sata, T. (2010). Characterization of Quasispecies of Pandemic 2009 Influenza A Virus (A/H1N1/2009) by De Novo Sequencing Using a Next-Generation DNA Sequencer. PLoS ONE, 5 (4). DOI: 10.1371/journal.pone.0010256
  5. I extracted the FASTQ data containing the short sequence reads matching influenza sequences from the supplemental PDF, converted it to FASTA, and used xdformat to move it into a BLAST database. Then I grabbed 40 bases from the GenBank sequence CY045951.1, the PB2 segment of the closest-match influenza strain, choosing a region (positions 2151-2190) with very high coverage, and BLASTed this sequence against the short sequence data, using parameters such that I retrieved sequences that match at least 25 of 40 positions. Of the ~2050 hits I retrieved, about 120 had at least one internal mismatch. I can’t distinguish these from sequencing errors, but I think it’s much higher than you’d expect from sequencing error. And I hope that my conservative approach (for example, I would have discarded mismatches at the ends of the hits) would balance out that source of confusion.
  6. One point, by the way, that the authors didn’t cover was the possibility that this patient had actually been initially infected with more than one viral sequence.  We do know that a significant number of flu cases are doubly infected. The fact that the minor variant is a very unusual strain makes this less likely, but not impossible.
  7. And I think it’s fair to say that the global population-based HIV variation (the transmission-selected amount of variation, as opposed to the vast within-individual variation) is rather more comparable to that of influenza.
March 12th, 2010

Yellow fever, stasis, and diversification

Girl with yellow fever (Wellcome Images)
“Episode de la fièvre jaune”

By analyzing hepatitis C virus genome sequences, you can trace the virus’s history through its spread by the slave trade, and link 19th-century health models in different countries to viral spread and transmission. Similarly, by looking at leprosy DNA, you can track its spread along the Silk Road and along slave routes.

Yellow Fever was one of the most dreaded plagues of the 18th and 19th centuries, waning only after it was understood to be mosquito-borne, so that mosquito control pushed the virus back. It’s still prevalent in Africa and in some parts of South America, though. Yellow Fever virus, too, originated in Africa and was spread to the New World through the slave trade:

The most commonly cited hypothesis of the origin of YFV in the Americas is that the virus was introduced from Africa, along with A. aegypti,1 in the bilges of sailing vessels during the slave trade. … We estimate that the currently circulating strains of YFV arose in Africa within the last 1,500 years and emerged in the Americas following the slave trade approximately 300–400 years ago. These viruses then spread westwards across the continent and persist there to this day in the jungles of South America.2

Mosquitoes aren’t merely passive carriers of the Yellow Fever virus. The virus actively infects the mosquitoes as well as their mammalian host, entering the insect gut, replicating and multiplying in various organs until it reaches the saliva, from which it can re-infect mammals3 when the mosquito bites and injects its anticoagulant saliva.

Mosquitoes - Harper's Weekly 1873
“Latest from the front — our friends the mosquitoes preparing and off for the summer campaign”
(Harper’s Weekly, 1873)

Another pattern is possible: The virus could also be spread vertically, from the mosquito to its egg, infecting the newborn mosquito before it hatches. However, although this was shown to happen as long ago as 1905,4 just after mosquitoes were proven to be carriers, it hasn’t been very clear if this is a significant part of the natural viral cycle or if it’s more of a lab curiosity:

Although transovarial transmission of YFV has been demonstrated, the relative importance of this in maintaining the transmission cycle is unknown. 5

Now, genome sequence analysis suggests that in fact transovarial spread of Yellow Fever virus may well be common and important in the viral life cycle.6

This was based on comparisons of Yellow Fever virus genome sequences over time, with those of a close relative, Dengue virus. Dengue and YFV probably arose about the same time, in the same area, and were both spread along the slave trade. But Dengue seems to have diversified much more than YFV:

… it is intriguing that the overall age of YFV (emergence within the last 2,500 years) is broadly similar to the time of origin of the four DEN viruses. Hence, YFV and DENV seem to have radiated at approximately the same time. However, since this time, DENV has differentiated into four antigenically distinct viruses while YFV is still classified as a single serotype.6

(This is actually clinically very significant, because the most severe form of Dengue disease is caused by sequential infection with two different Dengue serotypes.) In fact, in general YFV shows a much slower rate of evolution over time than Dengue: about 5-fold slower per year. The authors consider and reject a number of explanations for this (it’s not that they have different mutation rates, because their raw mutation rates are probably quite similar; it’s not that they infect different hosts, because they have very similar insect and mammalian hosts; and so on) and finally suggest that the difference may be because YFV spends a significant part of each year lying more or less dormant in mosquito eggs:

In particular, it is possible that a mechanism of vertical transmission, such as transovarial transmission where the virus may remain quiescent in mosquito eggs for many months, plays a more important role in YFV than in DENV6

As a result of this quiescent period, YFV would simply have fewer replication cycles per year than does Dengue, and so it appears to evolve more slowly. For this to be detectable at this level, transovarial transmission would have to be a fairly common event, not just a once-in-a-while half-accidental option.
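
The arithmetic behind that argument can be sketched in a few lines of Python. This is a toy model with made-up numbers: the per-cycle rate, the cycles per year and the dormant fraction are hypothetical placeholders, not estimates from the paper. The only point is that two viruses with identical per-cycle mutation rates will differ in changes per calendar year if one of them spends part of the year not replicating.

```python
def substitutions_per_year(per_cycle_rate, cycles_per_active_year, dormant_fraction):
    """Changes accumulated per calendar year when the virus is quiescent
    for `dormant_fraction` of the year (all inputs hypothetical)."""
    if not 0.0 <= dormant_fraction <= 1.0:
        raise ValueError("dormant_fraction must be between 0 and 1")
    return per_cycle_rate * cycles_per_active_year * (1.0 - dormant_fraction)


def dormant_fraction_for_slowdown(fold_slower):
    """Dormant fraction that would, on its own, make a virus evolve
    `fold_slower` times more slowly per year. Assumes dormancy is the
    only difference between the two viruses, which is a strong assumption."""
    if fold_slower < 1:
        raise ValueError("fold_slower must be at least 1")
    return 1.0 - 1.0 / fold_slower


# Same invented per-cycle rate and cycle count for both viruses.
always_active = substitutions_per_year(1e-4, 100, dormant_fraction=0.0)
half_dormant = substitutions_per_year(1e-4, 100, dormant_fraction=0.5)
print(always_active / half_dormant)

print(dormant_fraction_for_slowdown(5))
```

In this toy version a virus that is dormant half the year evolves half as fast, and dormancy alone would have to cover most of the year to produce a 5-fold difference, so in reality it is presumably one contributor among several.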


  1. A. aegypti is the mosquito that is most involved in spreading the virus
  2. Bryant, J., Holmes, E., & Barrett, A. (2007). Out of Africa: A Molecular Perspective on the Introduction of Yellow Fever Virus into the Americas. PLoS Pathogens, 3 (5). DOI: 10.1371/journal.ppat.0030075
  3. Mainly primates, for functional transmission
  4. Marchoux E, Simond PL. 1905. La transmission hereditaire du virus de la fievre jaune chez la Stegomyia fasciata. C. R. Soc. Biol. 59:259
  5. Barrett, A., & Higgs, S. (2007). Yellow Fever: A Disease that Has Yet to be Conquered. Annual Review of Entomology, 52 (1), 209-229. DOI: 10.1146/annurev.ento.52.110405.091454
  6. Sall, A., Faye, O., Diallo, M., Firth, C., Kitchen, A., & Holmes, E. (2009). Yellow Fever Virus Exhibits Slower Evolutionary Dynamics than Dengue Virus. Journal of Virology, 84 (2), 765-772. DOI: 10.1128/JVI.01738-09
March 2nd, 2010

Frogs and jumping viruses

Frogs (by Haeckel)
“Batrachia”, by Ernst Haeckel
(Kunstformen der Natur, 1904)

There’s a constant viral assault on us humans, as there is on just about all other species. We as a species have to contend not only with the vast pool of human pathogens, those viruses that constantly circulate among humanity; but also with the continual probes on our defenses from other viruses, viruses that normally infect other species.  All of us are exposed to these on a regular basis: Dog and cat viruses, mouse viruses, crow and pigeon viruses, bat viruses, not to mention the ocean of insect and fungus and amoeba and plant viruses.

Almost all of these assaults don’t even scratch our defenses.  The viruses can’t even enter our bodies, and if they do then they can’t enter our cells, and if they do they can’t replicate in our cells, and if they do then they can’t  …

Most viruses, in other words, can’t effectively jump species.  Even when they do, they’re usually not well adapted to the new species, and they can’t establish a productive chain of infections. Even if they cause a disease, they burn themselves out, infecting fewer and fewer individuals each round of infection, until they disappear.
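
That burn-out can be illustrated with a very simple sketch in Python. The numbers are invented: a starting group of 10 cases and an average of 0.6 new infections per case are hypothetical values chosen only to show the shape of the curve. Any average below one gives a chain that shrinks geometrically.

```python
def expected_cases_by_round(initial_cases, reproduction_number, rounds):
    """Expected number of cases in each round of infection, assuming every
    case infects `reproduction_number` others on average (hypothetical)."""
    return [initial_cases * reproduction_number ** n for n in range(rounds + 1)]


def expected_total_cases(initial_cases, reproduction_number):
    """Expected total size of the chain before it dies out. Only defined
    when each case infects fewer than one other on average."""
    if reproduction_number >= 1:
        raise ValueError("the chain only burns out if reproduction_number < 1")
    return initial_cases / (1 - reproduction_number)


for round_number, cases in enumerate(expected_cases_by_round(10, 0.6, 6)):
    print(f"round {round_number}: {cases:.2f} expected cases")

print(f"expected total: {expected_total_cases(10, 0.6):.1f}")
```

A virus that takes off in a new species is one for which that average has crept above one, at which point the same arithmetic grows instead of shrinking.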

But every so often, in a tiny minority of cases, the virus does get a foothold.  This is one of the ways that “emerging infections” get started.  It covers things like HIV, SARS, parvovirus of dogs, Ebola, and of course the new H1N1 swine-origin influenza virus (SOIV), among many others.

Why did these guys take off, when so many other viruses failed? Why did SOIV infect people last year, while decades of exposure to pigs and swine H1N1 influenza viruses didn’t lead to earlier pandemics?  Basically, we don’t know, and we’d really, really like to know, so we have a chance of predicting the next SOIV or HIV before it’s a pandemic.

OK, so that explains why I’ve written a fair number of posts here on species-jumping in viruses (here, here, here, here, and here), and partly explains why I want to mention a new paper from Bertram Jacobs’ lab1.  (The rest of the reason is, as always, that I just think it’s cool.)  I’m not sure why Jacobs has done this particular project, because he’s more of an interferon guy, but he’s looked at the origins of ranaviruses and finds evidence for lots of species shifts in their history.

Dekay - Salamanders & turtle
“The Smooth Terrapin (Emys terrapin)”, by James Dekay
(Zoology of New York; or, The New York fauna, 1843)

Ranaviruses are probably best known as frog viruses, but they infect a bunch of cold-blooded animals — fish, frogs, salamanders, turtles, and so on — and several of them are causes of emerging infectious disease (as I discussed last time I talked about ranaviruses, here).  Jacobs’ group looked at about a dozen of them whose genomes are completely sequenced2, and tried to put together their evolutionary history, which turns out to involve all kinds of cross-species jumps:

…we hypothesize that the most recent common ancestor of the ALRVs was an ancestral fish virus …  Both of these hypotheses suggest that for the majority of evolutionary time vertebrate iridoviruses were confined to fish, and much more recently, there appear to have been at least three species jumps, from fish to frogs, from fish to salamanders, and from frogs to reptiles, and perhaps as many as four species jumps, including a jump from tetrapod amphibians back to fish. It is tempting to speculate that activities associated with human harvesting of aquatic organisms during the past 40,000 years led to the more common recent jumping of ranaviruses among aquatic organisms.1

(My emphasis) They don’t offer any specific reasons why the ranaviruses should be able to leap from species to species like the chamois of the Alps, but they do make the general point that these viruses tend to be rather promiscuous to start with.  Not only are closely-related viruses able to infect different hosts, but even the same viruses often are able to infect a wide range of species; the fish virus they sequenced in this paper, epizootic hematopoietic necrosis virus, can infect a half-dozen different species of fish.  They raise an interesting comparison:

In addition, the ability of this group of viruses to infect such a wide variety of host species suggests that more host shifts are likely. Therefore, it is important that we understand more of the evolutionary traits of this unique group of viruses, as there is no other closely related group of viruses that infect such a broad group of hosts, with the possible exception of the orthomyxoviruses.1

Orthomyxoviruses, of course, include influenza viruses, which notoriously infect humans, pigs, ducks, chickens, wild waterfowl, horses, and dogs; and you’ll recall all the reports during the epidemic phase of SOIV of the virus infecting all kinds of other pets and domestic animals.  Influenza viruses are apparently evolving at an even faster pace than the ranaviruses, and experimenting with even more species; but there may be lessons for us (as influenza hosts) in the ranaviruses.


  1. Jancovich, J., Bremont, M., Touchman, J., & Jacobs, B. (2009). Evidence for Multiple Recent Host Species Shifts among the Ranaviruses (Family Iridoviridae). Journal of Virology, 84 (6), 2636-2647. DOI: 10.1128/JVI.01991-09
  2. Including epizootic hematopoietic necrosis virus, whose genome they sequenced themselves