If you have a genome annotated in RAST and you want to create a genome scale metabolic model, here is one way to do it using PyFBA and the SEED.
Before you start, you’ll need to install a couple of libraries:
- The PyFBA library has detailed installation instructions. Don’t be scared, its mostly git clone and pip install.
- Get the SEED Servers as you can get a lot of information from them. You can install the git python repo from github
We need to get the assigned_functions file from RAST. The easiest way to find that is in the RAST directory by choosing `Genome Directory` from the Downloads menu on the job details page.
For my example, I am going to build an Acidobacteria model. Using the genome 1121860.3 that I downloaded from RAST.
Creating the basic model
The first step is to create a basic model, which probably won’t grow, but which we can use as the basis of our models.
Converting assigned_functions to reactions
We need to create an initial list of reactions for our model. There is example code that will convert roles to reactions and make a list of reactions you can use in the model:
python example_code/assigned_functions_to_reactions.py -a assigned_functions > reactions.txt
This list of reactions is essentially the basic model but is not gap filled, and so probably won’t grow. You can test that using fba from reactions.
python example_code/fba_from_reactions.py -r reactions.txt -m media/VL55.txt
Gives the output:
In parsing the bounds we found 29 media uptake and secretion reactions and 66 other u/s reactions Length of the media: 29 Number of reactions to run: 1070 Number of compounds in SM: 1095 Number of reactions in SM: 1166 Revised number of total reactions: 34791 Number of total compounds: 45617 SMat dimensions: 1095 x 1166 Initial run has 0.0 --> Growth: False
Note that the last line says Growth: False. As expected, our model did not grow.
Gap filling the model
The next step is to gap fill the model There are many different approaches to gap filling (summarized in Cuevas and Edwards), and PyFBA has a several different methods to gap fill. One that appears to be effective is to use roles from closely related genomes to complete your model. We will generate a list of reactions that were added to the model, and a list of functional roles that complete those reactions, at the end of this example.
Getting reactions in closely related genomes
Before we can start with building models and gap-filling, we want to get reactions from closely related genomes. We use the SEED servers and the RAST_neighboring_roles.py code to get a list of reactions that should also be in the genome.
[Note: If this code was not installed when you installed PyFBA. If not, you can use `git clone` to clone the repository including the example code].
python example_code/RAST_neighboring_roles.py Acidobacteria \ > Acidobacteria.txt
It turns out that the taxonomy of Acidobacteria is a little confused (its a new genus, according to Wikipedia … I’m not sure if that makes it better or worse that it is confused). Therefore, I’m also going to pull the list of roles associated with the genus Acidobacterium
python example_code/RAST_neighboring_roles.py Acidobacterium \ > Acidobacterium.txt
Then I used a one line perl code to join the two files together:
perl -ne '@a=split /\t/; $v{$a[0]}=$a[1]; END {map {print "$_\t$v{$_}"} keys \ %v}' Acidobacteri*txt > Acido.txt
This is the file that I will use for the rest of the work.
Gap filling the basic model
python example_code/gapfill_from_reactions.py -r reactions.txt \ -m media/VL55.txt -c Acido.txt gapfilled_reactions.txt \ 2> gapfilled_reactions.err
This creates two files, the err file tells you which reactions were added and why, and the .txt file has a list of all the reactions, including the ones in the basic model and the ones that were added.