Accessing the SDSU Seed

The SDSU Seed (aka Phantome Seed, phage seed) is a complete local SEED install. I mainly update the phages on it (because it is the phage seed), but I can update microbial genomes if you need them. If you want a more up-to-date site with microbial genomes, check out the SEED servers (and my separate blog post about using those).

If you are interested, the rest of this post details how to access the local SEED.

 

To start, you need to source the fig environment variables. The SEED runs with its own version of perl and its own modules, which live in a lot of extra @INC directories.

Log in to edwards.sdsu.edu and then source the config:

source ~fig/FIGdisk/config/fig-user-env.sh

(Note: there is also a csh version if you prefer the C shell.)

This should change your version of perl:

which perl

/home/fig/FIGdisk/env/linux-debian-x86_64/bin/perl
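If you would rather have the shell check this for you than compare paths by eye, here is a minimal sketch. The function name check_seed_perl is my own invention and not part of the SEED; the expected path is simply the one shown above.

```shell
# Hypothetical helper (not part of the SEED): report whether a perl path
# is the SEED perl. The expected path is the one given in this post.
SEED_PERL=/home/fig/FIGdisk/env/linux-debian-x86_64/bin/perl

check_seed_perl() {
    # $1: perl path to test; defaults to whichever perl is first on PATH
    found=${1:-$(command -v perl)}
    if [ "$found" = "$SEED_PERL" ]; then
        echo "OK: using the SEED perl"
    else
        echo "Not the SEED perl: ${found:-none found}"
        return 1
    fi
}
```

Run check_seed_perl with no arguments after sourcing the config. If it reports a different perl, the environment file was probably not sourced in the current shell.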

Now instead of using /usr/bin/perl you should use that perl in your scripts. You should also add all the extra includes. Luckily, the headers are printed by a simple script called tool_hdr.

Open a text editor and read in the output from tool_hdr. For example, if you use vi, you can use this command:

:0r!tool_hdr

which will read the output from tool_hdr and insert it at line 0 of your file.
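If you prefer not to do this from inside an editor, redirecting the output of tool_hdr into a new file should give the same result. This is only a sketch: new_seed_script is a hypothetical helper name of my own, and the only thing it assumes about tool_hdr is what is described above, namely that it prints the header to standard output.

```shell
# Hypothetical helper (not part of the SEED): start a new script file with
# the header printed by tool_hdr. It refuses to overwrite an existing file.
new_seed_script() {
    out=$1
    if [ -z "$out" ]; then
        echo "usage: new_seed_script FILE" >&2
        return 2
    fi
    if [ -e "$out" ]; then
        echo "$out already exists" >&2
        return 1
    fi
    # tool_hdr is only on the PATH once fig-user-env.sh has been sourced
    if ! command -v tool_hdr >/dev/null 2>&1; then
        echo "tool_hdr not found; source fig-user-env.sh first" >&2
        return 1
    fi
    tool_hdr > "$out"
}
```

After that, open the new file and add your code below the header.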

Now you can write perl code as you normally would, and use all of the fig methods.

The bulk of the routines to access data are in the FIG.pm module. This is a monolithic module, and there is a lot in there. I use two aliases (provided for you once you source the fig-user-env.sh file) to expedite finding what you need.

  • figroutines — this is a list of all the subroutines in FIG.pm (basically it is grep ^sub FIG.pm | less) so you can look for the different routines you need
  • figmethods — this is the FIG.pm source code (essentially less FIG.pm). The documentation in FIG.pm is … spotty … and so this is the best place to look for answers.
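To search for a routine by keyword rather than paging through the whole list, the same grep idea can be wrapped in a small shell function. This is a sketch under assumptions: find_subs is a hypothetical name, not one of the provided aliases, and you have to pass the path to FIG.pm yourself because its location is not given here.

```shell
# Hypothetical helper: print the names of the subroutines defined in a Perl
# module, optionally keeping only names that contain a keyword
# (matched case-insensitively).
find_subs() {
    module=$1     # path to the module, for example your copy of FIG.pm
    keyword=$2    # optional; an empty keyword lists every subroutine
    # Keep lines that start with "sub NAME" and print just NAME
    sed -n 's/^sub[[:space:]][[:space:]]*\([A-Za-z_][A-Za-z0-9_:]*\).*/\1/p' "$module" |
        grep -i -- "$keyword"
}
```

For example, calling find_subs with the path to FIG.pm and the keyword function lists only the routines with "function" in their name, which is usually a shorter list to read through.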

Here is how to create a simple script to get all the functions of all the proteins in all the genomes:

 

use strict;
use warnings;
use FIG;

my $fig = FIG->new;
# with genomes() you can limit the choice by domain or completeness
foreach my $genome ($fig->genomes()) {
    # a peg is basically the same as a CDS (it stands for protein encoding gene)
    foreach my $peg ($fig->pegs_of($genome)) {
        # in array context function_of returns a tuple of [annotator, function];
        # as a scalar it just returns the function
        print join("\t", $peg, scalar($fig->function_of($peg))), "\n";
    }
}
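The output of the script above is one line per protein: the peg ID, a tab, and the function. That makes it easy to summarise from the shell. As a sketch, here is one way to count the most common functions. top_functions is a hypothetical helper of my own, and it assumes you have redirected the script's output to a file.

```shell
# Hypothetical helper: given the tab-separated output of the script
# (peg ID, tab, function), print the N most frequent functions with counts.
top_functions() {
    file=$1        # file holding the script output
    n=${2:-10}     # how many functions to show; defaults to 10
    cut -f2 "$file" | sort | uniq -c | sort -k1,1nr | head -n "$n"
}
```

With every protein in every genome the output file will be large, so it is worth saving it once and running summaries like this against the saved copy rather than rerunning the perl script.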