Annotate Metagenomes In An Instant

RTMg has been available for a while, and we have done some pretty cool stuff with it [e.g. web, mobile, open social, and all publicly available metagenomes], but we need to enable others to play with it too. Now everyone can enjoy metagenome annotation in an instant. (Not the flavorless instant-coffee type of instant, but the rich and bountiful instant-gratification type of instant!) Don’t believe me? Here is a video I made showing how to annotate a metagenome and create a pie chart of the data.

{youtube}OTaMNRXMBXM{/youtube}

Below, I’ll show you how to do it too.

Start by downloading and installing the SEED Servers. The code and installation instructions are all at http://servers.theseed.org/

Next, you need to grab the bioinformatics realm of my lab’s sourceforge repository: http://edwards-sdsu.cvs.sourceforge.net/edwards-sdsu/

What you really need are the Rob.pm module and the Perl scripts shown in the video (countfastachars.pl, dereplicate_metagenomes.pl, assign2DNA.pl, add_ss_to_assignedDNA.pl, and summarize.pl).

Make the code executable and ensure that the modules from the SEED servers are in your PERL5LIB environment variable, and you should be away. The only additional Perl module you may have to install is YAML.pm.
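As a rough sketch of that setup, here is one way it might look in a shell startup file. The two directory names are hypothetical placeholders (I am assuming the SEED server modules were unpacked into `$HOME/seed_servers/lib` and the scripts were checked out into `$HOME/bioinformatics/bin`), so change them to wherever you actually put things.

```shell
# Hypothetical locations: change these to match your own installation.
SEED_LIB="$HOME/seed_servers/lib"
SCRIPT_DIR="$HOME/bioinformatics/bin"

# Make the Perl scripts executable (skipped if the directory is not there yet).
if [ -d "$SCRIPT_DIR" ]; then
    chmod +x "$SCRIPT_DIR"/*.pl
fi

# Put the SEED modules on Perl's module search path, keeping any existing
# PERL5LIB entries, and put the scripts on the command search path.
export PERL5LIB="$SEED_LIB${PERL5LIB:+:$PERL5LIB}"
export PATH="$SCRIPT_DIR:$PATH"

# Check whether Perl can load YAML.pm.
if perl -MYAML -e 1 2>/dev/null; then
    echo "YAML.pm found"
else
    echo "YAML.pm not found: install it from CPAN, for example with: cpan YAML"
fi
```

Rob.pm also has to be somewhere Perl can find it, so add its directory to PERL5LIB in the same way if it lives elsewhere.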

Here is all the code shown in the video above, so you can copy and paste it into your own windows!

# here is an example metagenome.
ls -lh

countfastachars.pl -s example_metagenome.fna
# it has 33,475 sequences
# and 13,616,305 bp
# based on the average length of 406 bp,
# I’d think it is a 454 metagenome (but I don’t know!)

# First, we dereplicate the metagenome
# this makes four files – exact duplicates, 5′ duplicates, 3′ duplicates, and a dereplicated file
dereplicate_metagenomes.pl -f example_metagenome.fna -q example_metagenome.qual

# the fasta file now ends in .fa and the corresponding quality file ends in qu

# now we annotate the metagenome using the seed servers
assign2DNA.pl -r 3 -m 600 -k 8 -f example_metagenome.dereplicated.fa

# An annotated metagenome … already!!
ls -lh example_metagenome.dereplicated.fa.8-mers.txt

# add the subsystems to the annotations (by default we don’t provide them)
add_ss_to_assignedDNA.pl example_metagenome.dereplicated.fa.8-mers.txt > example_metagenome.dereplicated.fa.8-mers.subsystems.txt

# finally, we can summarize the metagenome
summarize.pl example_metagenome.dereplicated.fa.8-mers.subsystems.txt

# now, we can make a pie chart
oocalc example_metagenome.dereplicated.fa.8-mers.subsystems.one_level.txt

# THIS WORK IS COPYRIGHT ROB EDWARDS, Argonne National Lab and San Diego State University, 2010
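
If you would rather not paste the steps one at a time, they can be wrapped in a small shell function. This is only a minimal sketch, not part of the repository: the function names `dereplicated_name` and `annotate_metagenome` are made up for illustration, and it assumes that the file naming seen in the walkthrough (a `.fna` input giving a `.dereplicated.fa` file, then `.8-mers.txt`, and so on) holds for other inputs too. The flags are exactly the ones used above.

```shell
# Name of the dereplicated FASTA file for a given input, following the
# naming seen in the walkthrough:
#   example_metagenome.fna -> example_metagenome.dereplicated.fa
# (assumed to generalize to other .fna files)
dereplicated_name() {
    printf '%s.dereplicated.fa\n' "${1%.fna}"
}

# Run the four steps in order; the && chain stops at the first step that fails.
# $1 is the FASTA file, $2 is the matching quality file.
annotate_metagenome() {
    fasta=$1
    qual=$2
    derep=$(dereplicated_name "$fasta")
    annotated="$derep.8-mers.txt"
    with_ss="$derep.8-mers.subsystems.txt"

    dereplicate_metagenomes.pl -f "$fasta" -q "$qual" &&
    assign2DNA.pl -r 3 -m 600 -k 8 -f "$derep" &&
    add_ss_to_assignedDNA.pl "$annotated" > "$with_ss" &&
    summarize.pl "$with_ss" &&
    echo "Done: open $derep.8-mers.subsystems.one_level.txt in a spreadsheet"
}
```

With the scripts on your PATH, the whole example would then be run as `annotate_metagenome example_metagenome.fna example_metagenome.qual`.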