Going FISHing

***NOTE***

For the code in this article to work you must use the reverse complement of the probe, not the probe itself.  I’ll correct it in the near future.

************

I’m preparing for some spring field work in Barrow, Alaska where I’ll be following up (hopefully) on some odd observations that we had there in 2010.  In particular I’d like to quantify some specific marine bacteria in Barrow sea ice.  The method that I’ll use to do this is called FISH, which stands for fluorescence in situ hybridization.  FISH is a microscopy technique, so it differs from my normal sequence-based approach to evaluating community composition.

On the left, marine bacteria stained with DAPI, a non-specific stain that binds to DNA. On the right, one specific clade is identified using FISH. Photo courtesy of Matthew Cottrell, from the original site at http://www.teachoceanscience.net/teaching_resources/education_modules/marine_bacteria/explore_trends/.

All we know about our mystery 2010 bacteria at this point is their classification based on partial 16S sequences.  Using this classification I’ve selected several FISH probes.  A FISH probe is a short (18-20 nucleotide) strand of DNA complementary to some diagnostic region of a target clade’s 16S rRNA (a clade is an evolutionarily related group of organisms).  Put another way, the probe is the reverse complement of that region of the 16S rRNA gene as the gene sequence is usually written.  Because the probe is complementary to the actual 16S rRNA in the target ribosomes, the probe, if it finds its way into a bacterial cell belonging to that clade, should stick to the ribosome.
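The note at the top of this post says that the search needs the reverse complement of each probe.  A minimal sketch of that conversion follows.  The helper name reverse_complement is my own for illustration, and it assumes the probes are plain DNA written with only A, C, G and T.

```python
# Map each DNA base to its Watson-Crick partner (upper and lower case).
COMPLEMENT = str.maketrans('ACGTacgt', 'TGCAtgca')


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence.

    Hypothetical helper: assumes seq contains only A, C, G and T.
    Any other character is passed through unchanged.
    """
    # Complement every base, then reverse the string.
    return seq.translate(COMPLEMENT)[::-1]


# The three probe sequences used in this post.
probes = ['TCGCTGCCCACTGTC', 'CCGACGGCTAACATTC', 'CTCCACTGTCCGCGACC']

# Strings to look for in the metagenome instead of the probes themselves.
search_strings = [reverse_complement(p) for p in probes]
print(search_strings)
```

Taking the reverse complement twice returns the original sequence, which is a quick sanity check on the helper.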

Tagged onto one end of the probe is a molecule that fluoresces under green light.  Viewed in a microscope under green light, bacteria that have taken up the probe, and in which the probe has adhered to the ribosome, show up as tiny points of red light.  Assuming you know how much water you made the slide from, counting the tiny points of light tells you how many members of the target clade were present when the sample was taken.
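The arithmetic behind that last step can be sketched as follows.  This is only an illustration under assumptions that are mine, not details of the Barrow protocol: the cells are spread evenly over a filter of known area, several microscope fields of known area are counted, and the volume of water filtered is known.  The function name and all of the numbers are made up.

```python
def cells_per_ml(counts_per_field, filter_area_mm2, field_area_mm2,
                 volume_filtered_ml):
    """Estimate cell abundance from counts of fluorescent cells.

    Hypothetical helper: assumes cells are evenly distributed on the filter.
    """
    # Average number of cells seen in one microscope field.
    mean_count = sum(counts_per_field) / len(counts_per_field)
    # How many fields of view it would take to cover the whole filter.
    fields_per_filter = filter_area_mm2 / field_area_mm2
    # Scale up to the whole filter, then divide by the volume filtered.
    return mean_count * fields_per_filter / volume_filtered_ml


# Made-up example: four fields counted, 200 mm^2 of filter,
# 0.01 mm^2 per field, 10 mL of water filtered.
estimate = cells_per_ml([12, 9, 15, 12], 200.0, 0.01, 10.0)
print(round(estimate))
```

Counting more fields tightens the estimate of the mean, which matters most when the target clade is rare.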

FISH probes can have a broad or narrow specificity.  There are probes, for example, that tag virtually all Bacteria and Archaea (universal probes), just Bacteria or just Archaea (domain level probes), and down through the taxonomic rankings to probes that tag only specific bacterial “species”.  The probes that I’ll be using target at approximately the family level (if your high school biology is too far back, the taxonomic rankings go domain, phylum, class, order, family, genus, species).

I’m fortunate to have two metagenomic datasets from our 2010 field season, as this gives me the opportunity to test the probes before going into the field.  Without testing I don’t really know if they’ll target the bacteria we observed in 2010.  A metagenome is a collection of short DNA sequences from the most abundant genes in the environment, in this case 10-20 million sequence reads.  Among that massive collection of gene fragments are fragments belonging to 16S rRNA genes, and some fraction of those belong to the bacteria that I’d like to count this year.  With a little Python it’s a simple thing to search the metagenome for sequence fragments that match the probe:

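import re

# Probe sequences to search for (see the note at the top about reverse complements)
probes = ['TCGCTGCCCACTGTC', 'CCGACGGCTAACATTC','CTCCACTGTCCGCGACC']
found_42 = []  # number of hits per probe in overlapped_42.fasta
found_44 = []  # not filled in by this snippet

for each in probes:
     n = 0  # lines read so far
     h = 0  # lines that contain the probe
     probe = re.compile(each)

     # Looping over the file handle reads one line at a time
     with open('overlapped_42.fasta','r') as fasta:
          for line in fasta:
               n = n + 1
               print(n)  # progress counter
               if probe.search(line) is not None:
                    h = h + 1
     # The with block has already closed the file at this point
     found_42.append(h)

print(found_42)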

The “with open” statement was new to me.  It closes the file automatically when the block ends, and because the loop reads the file one line at a time, the sequence files (5-10 Gb) are never held in memory all at once.  The script isn’t exactly fast; it took about 2 hours to search all the sequences in both metagenomes for all three probes.  You might notice that it wastes a lot of time looking at the sequence id lines in the fasta files.  Eliminating that inefficiency could probably cut down on time substantially.  The good news is that the number of probe hits matches really well with our expectations.  Over 2500 hits between the three probes for one of the metagenomes!  Had we used FISH in 2010 it is very likely that we would have seen an abundance of our target bacteria.  Hopefully we will this year…
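One way to avoid searching the id lines is sketched below.  I have not timed it, and it rests on a few assumptions of mine: each sequence sits on a single line of the fasta file (a match that spans a line break would be missed), id lines start with “>”, and a plain substring test is enough because the probes contain no wildcards.  It also checks every probe against each line in a single pass, instead of rereading the file once per probe.  The function name count_probe_hits and the demo reads are made up.

```python
def count_probe_hits(lines, targets):
    """Count, for each target string, the sequence lines that contain it.

    Hypothetical helper: lines is any iterable of fasta lines, such as an
    open file handle, so the file is still read one line at a time.
    """
    hits = [0] * len(targets)
    for line in lines:
        # Skip the sequence id lines.
        if line.startswith('>'):
            continue
        seq = line.strip().upper()
        # Test every target against this read in the same pass.
        for i, target in enumerate(targets):
            if target in seq:
                hits[i] += 1
    return hits


# Made-up reads standing in for a metagenome; the first id line contains
# a target string on purpose to show that id lines are ignored.
demo_lines = ['>GACAGTGGGCAGCGA\n',
              'AAGACAGTGGGCAGCGATT\n',
              '>read_2\n',
              'ccgacggctaacattc\n']
targets = ['GACAGTGGGCAGCGA', 'CCGACGGCTAACATTC']
print(count_probe_hits(demo_lines, targets))
```

For a real file the call would look like count_probe_hits(fasta, targets) inside a “with open” block, with targets holding whichever strings need to be found (per the note at the top, the reverse complements of the probes).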

Sidenote: Bacteria vs. bacteria… It may appear that my capitalization is inconsistent for bacteria.  Bacteria can refer, as a proper noun, to the domain Bacteria.  It can also be used as a common noun for prokaryotic life in general, in which case it is simply bacteria.  Unfortunately some have advocated rather strongly for the word bacteria to replace the word prokaryote on the grounds that the latter term suggests a closer evolutionary relationship between the Archaea and Bacteria than actually exists.  Given the strong ecological overlap between the Bacteria and Archaea it is essential to have a term that refers to both as an ecological unit, and prokaryote, while imperfect, is much less misleading than bacteria!
