Swine flu PB2 gene phylogenetic tree

This post will be updated through the day.
Update 1 (near the bottom): I don’t think the Ohio/07 virus is particularly closely related to the current swine flu
Update 2 (at the bottom): A helpful comment from The Virology Blog about the ancestry of the virus

As I noted yesterday, the CDC has released a number of the swine flu genome sequences; they’re available here. (More sequences were released yesterday.) In my quick and primitive glance at them yesterday, I didn’t see anything very remarkable about the sequences; they looked like more or less straightforward swine flu viruses.

Sandra, at Discovering Biology in a Digital World, has run some more sophisticated analyses on the sequences and suggests that these viruses may be “the same strain that caused an outbreak in 2007 at an Ohio country fair.”  She is referring to the virus A/SW/OH/511445/2007, which is described here:

Vincent, A., Swenson, S., Lager, K., Gauger, P., Loiacono, C., & Zhang, Y. (2009). Characterization of an influenza A virus isolated from pigs during an outbreak of respiratory disease in swine and people during a county fair in the United States. Veterinary Microbiology. DOI: 10.1016/j.vetmic.2009.01.003

In August 2007, pigs and people became clinically affected by an influenza-like illness during attendance at an Ohio county fair.  … Approximately 26 people in close association with the fair pigs were affected by an influenza-like illness. Viruses from at least two individuals were isolated, sequenced and analyzed at the Centers for Disease Control and determined to be nearly identical to the swine virus studied here (A. Klimov, personal communication).

The accession numbers for the Ohio virus are EU604689-EU604696.

Please see Sandra’s page for her explanation of what she did.  However, I am not yet convinced.  I ran my own analysis, selecting sequences in a different and, I think, more appropriate way, and I did not see the present swine flu clustering with the Ohio strain.  (See the snippet of a figure at the top left here; click for a larger version.)  I am far from an expert in phylogenetic analysis or on influenza virus, so I’m not saying I’m right by any means, but I would like to see an explanation of why my approach is wrong.

Basically — as far as I can tell; Sandra hasn’t posted her complete methods as I write this, though that will come later today — the difference between our conclusions may be the way we collected sequences for comparison.  She “used H1N1 (and a couple of H1N2) protein sequences from Swine and Humans between Jan 1 2006 and today.”  I didn’t try to restrict the sequences in any way — rather, I ran a BLAST search on the non-redundant nucleotide collection in GenBank and selected the top 100 (or 1000) most similar sequences for comparison.  To me, it makes sense that we should not artificially restrict our sample, but rather should look at all the closest matches we can find.
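For anyone who wants to try the same kind of selection, here is a minimal Python sketch of picking the top N most similar sequences from a BLAST result. It is not the script used for the tree in this post. It assumes the hits were saved in BLAST's standard 12-column tabular format and that "most similar" means highest bit score; the function name `top_hits` and the sample rows (with placeholder IDs `hitA`, `hitB`, `hitC`) are made up for illustration.

```python
import csv
import io

# Standard column order of BLAST tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def top_hits(tabular_text, n=100):
    """Return the n best-scoring distinct subject IDs from BLAST tabular text.

    Assumption: ranking is by bit score, keeping each subject's best hit.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        score = float(hit["bitscore"])
        if score > best.get(hit["sseqid"], float("-inf")):
            best[hit["sseqid"]] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return [sseqid for sseqid, _ in ranked[:n]]


# Hypothetical rows, not real BLAST output.
SAMPLE = "\n".join("\t".join(fields) for fields in [
    ("query", "hitA", "99.1", "2280", "20", "0", "1", "2280", "1", "2280", "0.0", "4100"),
    ("query", "hitB", "96.5", "2280", "80", "0", "1", "2280", "1", "2280", "0.0", "3700"),
    ("query", "hitA", "98.0", "500", "10", "0", "1", "500", "1", "500", "0.0", "900"),
    ("query", "hitC", "97.2", "2280", "64", "0", "1", "2280", "1", "2280", "0.0", "3800"),
])

if __name__ == "__main__":
    print(top_hits(SAMPLE, n=2))
```

The point of ranking by score rather than filtering by host, subtype, or date is that nothing is excluded in advance; whatever is closest in the database ends up in the tree.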

In my PB2 tree, the nearest neighbour is another California isolate of the new strain; the closest cluster then includes A/duck/NC/91347/01(H1N2), A/mallard duck/South Dakota/Sg-00125/2007(H3N2), A/pintail duck/South Dakota/Sg-00126/2007(H3N2), and A/swine/Korea/CAS05/2004(H3N2).  (The avian PB2 is expected, because current swine flu strains have acquired an avian PB2.)

One way to look more closely at this might be to look at some of the gene sequences more specifically.  Vincent et al. point to some unusual features of the Ohio virus’s HA gene.  Though these features are not likely to be unique, they may help sort out relationships here:

Unique changes at antigenic determinant sites were identified in the OH07 HA at positions 71 and 162 and may play a role in the loss of cross-reactivity. The NA gene was shown to be related to the swine N1 phylogenetic cluster (Fig. 1B). The internal genes (Fig. 2A–F) were shown to be of the triple reassortant SIV lineage and group with those of the cluster IV H3N2 viruses reported by Olsen, et al. (Olsen et al., 2006). The PB2 gene was determined to contain the conserved avian amino acid residue glutamic acid at position 627, reported to be important in avian and human host specificity (Subbarao et al., 1993).

Right now I need to get my kids ready for school, but I’ll look at those genes later today.

Update 1. OK, I’ve looked at the sites in the Ohio virus that Vincent et al. flagged as being unusual.  The current swine flu viruses do not match all those unusual sites.  They do match some, but so do many (dozens of) other swine flus.  I am not really seeing evidence that the current swine flu is the A/SW/OH/511445/2007 isolate, i.e. “the same strain that caused an outbreak in 2007 at an Ohio country fair.”

Vincent et al. (their Table 3) flagged the following sites as unusual; here are comparisons.  I ran several HAs from the current outbreak, plus a couple of HAs from swine flus that a simple BLAST search identified as highly related to the current outbreak (much more similar than A/SW/OH/511445/2007 is).  The figure at bottom shows it graphically (current outbreak in blue, A/SW/OH/511445/2007 in green, unusual residues boxed in red) and the table breaks it down.

Strain                          71   73   74   142  156  162
A/California/04/2009_H1N1       S    A    S    K    N    S
A/California/07/2009(H1N1)      S    A    S    K    N    S
A/Texas/05/2009(H1N1)           S    A    S    K    N    S
A/Swine/Ohio/891/01(H1N2)       F    A    S    K    N    S
A/SW/OH/511445/2007             S    A    S    N    N    N
A/swine/Guangxi/17/2005(H1N2)   F    A    S    K    N    S
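
The comparison in the table can also be done mechanically. The following is a small Python sketch that lists, for each strain, the flagged HA positions where it differs from the Ohio 2007 virus. The residues are copied from the table above; the function name `differing_sites` is my own, and the sketch only looks at these six sites, not at the full alignment.

```python
# Flagged HA positions and residues, copied from the table in this post.
SITES = (71, 73, 74, 142, 156, 162)
RESIDUES = {
    "A/California/04/2009_H1N1": "SASKNS",
    "A/California/07/2009(H1N1)": "SASKNS",
    "A/Texas/05/2009(H1N1)": "SASKNS",
    "A/Swine/Ohio/891/01(H1N2)": "FASKNS",
    "A/SW/OH/511445/2007": "SASNNN",
    "A/swine/Guangxi/17/2005(H1N2)": "FASKNS",
}
REFERENCE = "A/SW/OH/511445/2007"


def differing_sites(strain, reference=REFERENCE):
    """Return the flagged positions where `strain` differs from `reference`."""
    pairs = zip(SITES, RESIDUES[strain], RESIDUES[reference])
    return [site for site, a, b in pairs if a != b]


if __name__ == "__main__":
    for name in RESIDUES:
        if name != REFERENCE:
            print(name, differing_sites(name))
```

At these six sites, the three current-outbreak isolates differ from the Ohio 2007 virus at positions 142 and 162, and the two older swine H1N2 viruses differ at 71 as well.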

Swine flu HA aligned

Update 2. Vincent at The Virology Blog says:

According to ProMED-mail, the NA and MP genes are related to those of influenza viruses from Asian-European swine, and the other genes appear to originate from swine flu viruses from pigs in North America. The data are in accord with the original assertion of the CDC that all genes of the new isolate were derived from swine viruses.