Category Archives: Phage

Phage genome annotation for therapeutic phages

Annotating phage for therapeutics

It is always therapeutic to annotate phages, but in this instance, we are specifically thinking about how we annotate phages so that we can use them for therapeutic purposes.

Rob is giving a talk at ESCMID entitled “The annotation of therapeutic phages” where he discusses some of the issues that come up. This blog post accompanies that talk and provides links to some of the papers that he discusses.

Host lifestyle prediction

It is generally accepted that lysogeny is bad for therapeutic phages, but there are ways around it.
Lysogeny can lead to superinfection exclusion, recombination with other phages, and the development of phage resistance.

Here are some tools that can be used to predict whether phages are lytic or lysogenic

We can overcome lysogeny, either through engineering phages, or through Gibson Assembly based on prophage sequences. These two papers suggest some cutting edge approaches to making that happen!

Toxins

Phages encode a lot of toxins! They help the bacteria replicate and escape a nasty death, and provide a mechanism for the spread of the phage.

In Streptococcus, the presence of toxins helps the bacteria spread, and we know phages control bacterial virulence.

Antibiotic resistance

Obviously it would be bad if the phage encoded an antibiotic resistance cassette, and there is some evidence that they do occassionally:

But the jury is still out on how important this is! For many, espeically lytic phages, they may not care about antibiotics since they are going to kill the host anyway. There is some debate as to the importance of antibiotic resistance genes in phages.

Databases

Nonetheless, because of the overall importance of antibiotic resistance in bacterial genomes (which, after all, is the reason we are here), there are lots of databases that you can use to search for different antibiotic resistance genes.

Ensemble approaches for therapeutic phages

New ways of identifying phages that have the potential for therapy are starting to emerge, and these are some of the ensemble tools that are trying to integrate multiple lines of evidence and provide support for phages for therapy.

AlphaFold of all Phage Lambda Proteins

DeepMind’s AlphaFold is winning at predicting tertiary structures from primary amino acid sequences. We thought it would be fun to investigate how it performed on phage Lambda.

We took the NCBI version of λ and extracted all the proteins, and then ran them through AlphaFold. It was able to make a prediction for all the proteins except for three proteins: NP_040594.1 (144 amino acids), NP_040597.1 (232 amino acids), and NP_040645.1 (158 amino acids).

Click to see a larger version

As you can see, many of the structures are just predicted to be long alpha helices with little order, but some of the structures are complex and closer representation to the predicted structures.

There are, of course, a heap of caveats to this analysis, including the fact that we did not (at this time) filter out any of the existing phage λ structures so one would hope that those are really good!

You can download all the best ranked structures for phage Lambda so you can view them in your favorite structure viewer

Global Distribution of Crassphage Map

How to make beautiful maps

Making maps is hard. Even though we’ve been making maps for hundreds of years, it is still hard. Making good looking maps is really hard. We published a map that is both beautiful and tells a story, and this is the story of how we made that map.

But a figure like this does not appear immediately, it takes work to get something to look this good, and needless to say it wasn’t me that made it look so great!

Continue reading

Submitting a PhiSpy update to pip and conda

First, make sure everything is upto date in GitHub.

We are going to call this release version 4.0 and we will have release candidates, starting at rc1

First, create a release on GitHub. Strictly speaking you don’t need to do that but it is a great thing to do.

PyPi Release

The PyPi instructions cover this, but I have abstracted out the parts we need to focus on (since we have a setup.py already!)

As a regular user we build everything. This make a new release that we will upload

python3 setup.py sdist

This will create the tarball and the wheel file in the dist directory. Then we need to upload those to PyPi.

We are going to use the PyPi test interface to make sure that everything is OK. Do not skip this step!

If you need an API key, navigate to the PyPi login page . However, if you have done this before, you probably don’t need to save it again 😉

python3 -m twine upload --repository testpypi dist/PhiSpy-4.0.0rc1.tar.gz

Note that you can not upload the wheel. Binary wheels from linux are not supported.

Now we are going to test it out. Lets make a virtual environment and install it there

virtualenv test_phispy
cd test_phispy
source bin/activate
which pip

This should tell you that the current pip is from your virtual environment. If it is not, solve that problem!

For PhiSpy, we have a couple of dependencies that you should install with regular pip before you can install your new release candidate:

pip3 install scikit-learn biopython

This will install other things like numpy that you need.

Now you can install your new release.

pip install -i https://test.pypi.org/simple/ PhiSpy==4.0.0rc1

If you are not sure exactly the URL, logging into the PyPi test login page will show your available repositories, including the newly uploaded repository. If you click on the version you want, you can get the link to download and install that.

Once you are happy and have run some tests, login to the real PyPi page (good to do anyway, even if you have an API key)

Now you can upload the final version to PyPi for everyone to access

python3 -m twine upload dist/PhiSpy-4.0.0.tar.gz

Its worth logging into the real PyPi page to make sure that you can download it!

Making a CONDA release

It turns out that for most code all you have to do is wait! The conda bots will take care of incrementing to the next version and running the continuous integration tests for you.

However, if you need to update the code manually, you probably need to change the version in meta.yaml and then you should update the SHA hash:

URL=https://pypi.io/packages/source/p/phispy/PhiSpy-4.2.17.tar.gz
wget -O- $URL | shasum -a 256

and then paste the output of that into the SHA field. In this case, the shasum should be

38bb8f072e2eba8efe0c46258ad9b45940eed8f126901af9d455ad0bae396e99

Note: the format for this block is:

TOOL=PhiSpy-4.2.17.tar.gz
TL=$(echo $TOOL | cut -f 1 -d '-' | awk '{print tolower($0)}')
URL=https://pypi.io/packages/source/${TL:0:1}/$TL/$TOOL
wget -O- $URL | shasum -a 256

Press about Global Phylogeography of crAssphage

Our paper on the global phylogeography of crAssphage is published in Nature Microbiology. You can read the paper at the Nature Microbiology website or on ReadCube. The paper garnered international press attention, and here we have summarized the press coverage.

Please let Rob know if you are aware of any other reports that are not included here.

Continue reading

crAssphage poops up again!

Yet again, analysis of a metagenomic sample shows that crAssphage is the most abundant phage anywhere. It also shows what a dis-service NCBI did to science by deleting the crAssphage record. We used meta-spades to reconstruct the entire crAssphage genome from someone else’s data set, but in their paper, the largest contig was ~3 kb. This analysis suggests that crAssphage is present in ulcerative colitis samples but the abundance goes down after treatment!

Continue reading