I just hit the “submit” button for our latest frost flower paper, which reviewer-be-willing will appear in an upcoming polar and alpine special issue of FEMS Microbial Ecology. Several scattered bits of analysis from the paper have appeared in this blog, so here’s a recap of what we did and why.
In a 2013 paper we described an unusual community of bacteria in Arctic frost flowers composed primarily of members of the bacterial order Rhizobiales. This was a bit odd because Rhizobiales, while not unknown in the marine environment, are not typically found in high abundance there. Rhizobiales have definitely not been observed in association with sea ice. To try to figure out what was going on with this particular community we went back to the same samples (originally collected in April 2010) and used whole-genome amplification to amplify all the DNA until we had enough to sequence metagenomes from one frost flower and one young sea ice sample. The good people at Argonne National Lab did the actual sequencing (great work, great prices…).
I’d never worked with metagenomic data before, and once I had the actual sequence reads it took me a while to figure out how to get the most information out of them. For a first cut I threw them into MG-RAST. This confirmed the Rhizobiales dominance easily enough, but for publication-quality results you really need to do the analysis yourself. Ultimately I used a four-pronged approach to evaluate the metagenomes.
1. Extract 16S rRNA reads and analyze taxonomy. For this I used mothur, treating the 16S-associated reads like any 16S amplicon dataset (except that you can’t cluster them!).
2. Align all the reads against all available completed genomes. We’ve got 6,000 or so genomes in GenBank, so we may as well do something with them. This actually provides a ton of information, and backs up the 16S-based taxonomy.
3. Assemble. I’m not that good at this, and the final contigs weren’t that long. Cool though.
4. Targeted annotation. One would love to blastx all the reads against the nr database, or even RefSeq protein, but this is silly. It would tie up thousands of CPUs for days, and that computing power is better spent elsewhere. If you have some idea what you’re looking for, however, you can blastx against small protein databases pretty easily. After the blast search you can use pplacer or a similar tool to conduct a more fine-scale analysis of the reads with positive matches.
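To make the targeted annotation step a little more concrete, here is a minimal Python sketch of how one might pull the reads with positive matches out of a blastx result before handing them to pplacer or a similar tool. It assumes the search was saved in BLAST+ default tabular format (-outfmt 6). The function name best_hits, the 1e-5 e-value cutoff, and the read and protein names in the example are all made up for illustration; this is not the exact filtering used for the paper.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle, max_evalue=1e-5):
    """Return {read_id: (subject_id, evalue, bitscore)}.

    Keeps only hits at or below max_evalue (an assumed cutoff) and,
    for each read, the hit with the highest bitscore.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines (e.g. from -outfmt 7).
        if not row or row[0].startswith("#"):
            continue
        rec = dict(zip(FIELDS, row))
        evalue = float(rec["evalue"])
        bitscore = float(rec["bitscore"])
        if evalue > max_evalue:
            continue
        current = best.get(rec["qseqid"])
        if current is None or bitscore > current[2]:
            best[rec["qseqid"]] = (rec["sseqid"], evalue, bitscore)
    return best


# Hypothetical three-line result: read1 hits two proteins, read2 has only a weak hit.
example = (
    "read1\tprotA\t85.0\t60\t9\t0\t1\t180\t10\t69\t1e-20\t95.5\n"
    "read1\tprotB\t70.0\t55\t16\t0\t1\t165\t5\t59\t1e-08\t60.1\n"
    "read2\tprotA\t40.0\t30\t18\t0\t1\t90\t1\t30\t0.5\t25.0\n"
)
hits = best_hits(io.StringIO(example))
for read_id, (subject, evalue, bitscore) in hits.items():
    print(read_id, subject, evalue, bitscore)
```

With a real search you would pass an open file handle instead of the in-memory example. The read IDs that survive the filter are the ones worth extracting from the metagenome for the finer-scale placement step.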
So what’d we find? The frost flower metagenome contains lots of things that look like Rhizobiales, but that are pretty different from most of the Rhizobiales with sequenced genomes in GenBank. Among other things they don’t have nodulation genes or genes for nitrogen fixation, unlike many terrestrial Rhizobiales. The community does have genes for some interesting processes, however, including the utilization of dimethylsulfide (DMS), a climatically active biogenic gas; the degradation of glycine betaine, which may be an important compound in the nitrogen cycle; and haloperoxidases, oxidative stress enzymes that release climatically active methyl halides.
You can find more details on some of this analysis on this poster.