Jigsaw (Wellcome Images)

Every so often — not often enough — I run across a paper that’s so ridiculously ingenious that it just makes me laugh with pleasure.

Ladies and gentlemen, a round of applause, please, for Shou-Wei Ding, of the Center for Plant Cell Biology at UC Riverside, and his brilliant, Rube Goldberg-esque technique for identifying new viruses. 1

Background: Small interfering RNA (siRNA, 2 or RNAi) is pretty well known nowadays, especially since two of its discoverers were given the Nobel Prize a few years ago. These small RNAs are found in most eukaryotes: plants, insects, worms, as well as birds and mammals. In fact, siRNA was first discovered in plants and then was widely used in insect and worm research well before it was shown in mammals. In mammals, we tend to think of small RNAs as having regulatory functions. In plants, insects, and worms, there are certainly regulatory siRNAs, but siRNAs are also an important part of the antiviral immune response. (Unsurprisingly, plant and insect viruses have their own defenses against these siRNAs, in the form of anti-siRNA genes.)

siRNAs are small (duh), maybe 20-30 bases long. Without going into mechanisms more than necessary, they recognize specific sequences of their target RNA (by being complementary to the target). When they bind to their target, they cause that target RNA to be chopped up into small pieces. Some of those small pieces can then act as new siRNA and the cycle continues.
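To make the "complementary to the target" part concrete, here is a minimal sketch of the base-pairing logic. The sequence is made up, the function names are hypothetical, and it only handles perfect matches, which is a big simplification of what the real machinery tolerates.

```python
# Watson-Crick pairing for RNA: A-U and G-C.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the strand that would base-pair with `rna`, read 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def find_target_sites(sirna: str, target: str) -> list[int]:
    """Return 0-based start positions in `target` that pair perfectly with `sirna`."""
    site = reverse_complement(sirna)  # the stretch of target the siRNA recognizes
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions


# Made-up 36-base "viral RNA" and a 21-base siRNA directed against bases 8..28.
target = "GGAUCCAUGCUAGCUAGGCUAACGUAGCUAGGAUCC"
sirna = reverse_complement(target[8:29])
print(find_target_sites(sirna, target))
```

The siRNA itself never appears in the target; it is its reverse complement that does, which is why the search runs on `reverse_complement(sirna)`.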

Ding’s group reasoned that (1) if insect viruses are being attacked by siRNA, and (2) the viruses are then chopped up into new siRNA, then (3) all you have to do is piece together the siRNA, to recover the virus sequence.

Seems sort of obvious now that I put it together that way, but I would have said that there’s no way it would work: not enough coverage, so you’d only see a tiny fraction of the genome. Even if there were enough coverage, how would you find the virus pieces in the huge pool of other siRNAs? And even if you could do that, how would you piece them together? It would be like a jigsaw puzzle where every piece was just a tiny snippet of blue sky.

But, wonderfully, it actually worked for Ding’s group. Not only did they identify viruses that they knew should be there, they also pulled five brand-new viruses out of their insect samples:

In this study, we found that viral small silencing RNAs produced by invertebrate animals are overlapping in sequence and can assemble into long contiguous fragments of the invading viral genome from small RNA libraries sequenced by next-generation platforms. Based on this finding, we developed an approach of virus discovery in invertebrates by deep sequencing and assembly of total small RNAs (vdSAR) isolated from a host organism of interest. Use of this approach revealed mix infection of Drosophila cell lines and adult mosquitoes by multiple RNA viruses, five of which were previously undescribed.

[Figure: Ding 2010 virus genome assembly. “Virus discovery in OSS cells by viral genome assembly from sequenced viral piRNAs of 25–30 nucleotides in length after viral siRNAs were removed.”]

The ability to piece these tiny fragments together is a spin-off of the new genome sequencing platforms, which by their nature make very short reads that have to be computationally stitched back together. Most of these new platforms make slightly longer fragments than siRNA size, but they’re in the same ballpark and I guess the same approaches work.
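For a feel of what "computationally stitched back together" means, here is a toy greedy overlap assembler. It is a sketch only, not the method used in the paper: the 60-base "genome" is made up, the function names and the `min_len` cutoff are my own assumptions, and real short-read assemblers use far more sophisticated graph-based approaches and have to cope with sequencing errors and repeats.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` equal to a prefix of `b` (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def drop_contained(seqs: list[str]) -> list[str]:
    """Remove exact duplicates and sequences fully contained in another sequence."""
    unique = list(dict.fromkeys(seqs))
    return [s for s in unique if not any(s != o and s in o for o in unique)]


def greedy_assemble(reads: list[str], min_len: int = 8) -> list[str]:
    """Repeatedly merge the two sequences with the largest suffix/prefix overlap."""
    contigs = drop_contained(reads)
    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i == j:
                    continue
                n = overlap(a, b, min_len)
                if n > best_len:
                    best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left that overlaps enough
        merged = contigs[best_i] + contigs[best_j][best_len:]
        rest = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs = drop_contained(rest + [merged])


# Made-up genome, chopped into overlapping 21-base "siRNAs" in scrambled order.
genome = "AUGGCUAACGUUCGAUCCGGAAUUCGCAUAGCUGACCUAGGAUCGUACGAUUAGCCGAUA"
reads = [genome[i:i + 21] for i in range(0, 40, 5)] + [genome[-21:]]
reads.reverse()

contigs = greedy_assemble(reads)
print(contigs == [genome])
```

The jigsaw only works here because neighboring reads overlap heavily and the toy genome has no long repeats; with sparse or repetitive reads the same code would return several disconnected contigs, or stitch the wrong pieces together.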

As for coverage, they didn’t recover 100% of the genomes, but they got a surprisingly high fraction back: 80% to 95% or more of the various viruses.
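A coverage figure like that is just the fraction of genome positions touched by at least one assembled contig. A minimal sketch of the arithmetic, with invented coordinates and a hypothetical function name (these are not numbers from the paper):

```python
def coverage_fraction(genome_len: int, contigs: list[tuple[int, int]]) -> float:
    """Fraction of the genome covered by (start, end) intervals; end is exclusive."""
    covered = 0
    last_end = 0  # right edge of everything counted so far
    for start, end in sorted(contigs):
        start = max(start, last_end)  # skip the part already counted
        end = min(end, genome_len)    # ignore anything past the genome end
        if end > start:
            covered += end - start
            last_end = end
    return covered / genome_len


# Invented example: three contigs mapped onto a 10,000-base genome.
example = [(0, 3200), (3000, 7500), (8100, 9400)]
print(f"{coverage_fraction(10_000, example):.0%}")
```

Overlapping contigs are only counted once, so the first two intervals contribute 7,500 bases between them rather than 7,700.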

Not only that, they got really new viruses:

As a result, none of the four viruses could be assigned into an existing virus genus. This suggests that vdSAR is capable of discovering viruses that are only distantly related to known viruses.

Giving credit where it’s due, another group recently, and independently, used a similar approach to identify viruses in sweet potato plants, 3 but I didn’t notice that article until Ding pointed it out.

Whether or not this technique proves useful in the long run, it’s just so ingenious that I want it to succeed.


  1. Wu, Q., Luo, Y., Lu, R., Lau, N., Lai, E., Li, W., & Ding, S. (2010). Virus discovery by deep sequencing and assembly of virus-derived small silencing RNAs. Proceedings of the National Academy of Sciences, 107 (4), 1606-1611. DOI: 10.1073/pnas.0911353107
  2. I’m going to call all of the various types of small RNAs “siRNA” here, but that’s just shorthand; there are different subclasses that I won’t go into.
  3. Kreuze, J., Perez, A., Untiveros, M., Quispe, D., Fuentes, S., Barker, I., & Simon, R. (2009). Complete viral genome sequence and discovery of novel viruses by deep sequencing of small RNAs: A generic method for diagnosis, discovery and sequencing of viruses. Virology, 388 (1), 1-7. DOI: 10.1016/j.virol.2009.03.024