Monthly Archives: August 2010

How to join the lab

The Edwards lab is a dynamic lab. Our students come for a while and do exciting things, and then head into industry to earn big salaries or move on to academic positions to break the frontiers of science. We are always looking for scientists with curiosity, drive, and the ability to understand Rob’s ideas and convert them into something meaningful!

If you are interested in joining our lab, the first thing you need to do is understand what we do. Take a look at the Research and Projects section of the web site to see our funded projects. Also, check out our open projects page.

Once you’ve done all that, talk to Rob.



Do:

  • Read the web, think about the kind of science we do
  • Stop by the lab, talk to people working in the lab
  • Find the open problems page and think about some of those
  • Come up with an idea for a project. Even if it’s unworkable, at least you have had an idea

Don’t:

  • Expect Rob to give you a project when you walk in the door
  • Expect Rob to give you a project if you don’t even know what the lab works on

Marine Sciences

The US-Brazilian Consortium for Marine Sciences is funded by the US Department of Education through its Fund for the Improvement of Postsecondary Education (FIPSE), and by the Fundação Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) of the Brazilian Ministry of Education. We’ve assembled a team of marine sciences researchers from San Diego State University and Scripps Institution of Oceanography, and a team from the Federal University of Rio de Janeiro (UFRJ), the Universidade Federal de Pernambuco, the Universidade Federal da Paraíba, FIOCRUZ, and the Rio de Janeiro Botanical Gardens. Together, we will develop a completely new marine sciences course to be held in Brazil in 2011 and 2012, and exchange students between San Diego, Rio de Janeiro, Pernambuco, and Paraíba.

Dark Matter

Viral dark matter is all the sequences we find in metagenomes that we don’t know what to do with. In a project funded by the National Science Foundation, together with Dr. Forest Rohwer and Dr. Anca Segall in the SDSU Biology Department, and Dr. Alex Burgin, we will tackle some of this dark matter. We’re going to combine metagenomics, metaproteomics, metabolomics, and structural biology to unearth the functions of sets of genes whose roles are completely unknown.

Posters from the lab

We like to present posters and we maintain a rotating display of them in the 2nd and 4th floor halls of GMCS at SDSU. Come by and see them life-sized sometime. Here are some of our posters. If you see us at a meeting, say hi!

Talks that Rob Edwards has given

Below are links to some of the talks that Rob has given. If a talk is not listed, email him and he’ll send it to you.
Recent talks are available directly for download from Rob’s Page.




You gotta lyse that lysin before the lysin lyses you!

Phages kill bacteria. That’s their ultimate goal. Yet, they have to maintain the bacterial cell integrity until they’re done with making new phage particles. So, they carefully control the bacterial genome till they replicate their DNA and package it in nascent phage particles. Once these are formed and are ready to leave, they need to leave. They engage in a highly timed and orchestrated procedure of poking holes in the bacterial membranes (using phage holins), degrading the bacterial peptidoglycan-based cell wall, then—if the bacterial host happens to be a gram-negative cell—breaking the outer membrane too!

In the event a phage decides to remain “dormant” inside a bacterium, things get a bit more complicated. A so-called “arms race” is generated. For bacteria, phages are time bombs that can be induced at any time to kill the bacteria. How would bacteria avoid this fatal vampirish ending? They have to “tolerate mutations” in the phage’s most dangerous protein-encoding genes. If the gene that controls phage induction is damaged, this may save the bacteria. Other tempting targets are the lysis modules! If lysins or holins are disabled, the dormant prophages may remain captive forever (or rather until prince “helper phage” comes and frees them from that peptidoglycan-walled prison).

So, if you’re a bacterium, it’s smart to disable the lysin genes, one way or another. If you’re a scientist studying bacterial and phage genomes, there is no better way to find this out than using the subsystems-based SEED server. Using subsystems allows you to find out how closely related phages and prophages may have very different lysin genes. In the diagram below, a bunch of staphylococcal phage and prophage genomes are compared. You will notice immediately how some of their lysins (in red, labeled #1) are truncated. A truncated lysin is bad news for a phage. It means the phage is on its way to being enslaved by the bacterium for long years to come!

Truncated and intact lysins in staphylococcal phages


Identifying Prophages in Bacterial Genomes

Finding prophages in microbial genomes remains a problem with no definitive answer. The majority of existing tools rely on detecting genomic regions enriched in proteins with known phage homologs, which hinders the de novo discovery of phage regions. In this study, a weighted phage detection algorithm, Phage_detector, was developed based on seven distinctive characteristics of prophages, namely protein length, transcription strand directionality, customized AT and GC skew, the abundance of unique phage words, phage insertion points, and the similarity of phage proteins. The first five characteristics are capable of identifying prophages without any sequence similarity to known phage genes. Phage_detector locates prophages by ranking genomic regions enriched in distinctive phage traits, which leads to the successful prediction of 92% of prophages (including 33 previously unidentified prophages) in 95 complete bacterial genomes, with 8% false negatives and 18% false positives.
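To give a feel for one of the sequence-only signals mentioned above, here is a minimal sketch of textbook GC and AT skew computed over sliding windows. It is not the Phage_detector code: the "customized" skew used in the study is not described here, and the function names, window size, and step size are all assumptions made for illustration.

```python
from collections import Counter


def skew(seq, plus, minus):
    """Return (plus - minus) / (plus + minus) for two bases, or 0.0 if neither occurs."""
    counts = Counter(seq.upper())
    total = counts[plus] + counts[minus]
    return (counts[plus] - counts[minus]) / total if total else 0.0


def windowed_skews(genome, window=1000, step=500):
    """Yield (start, gc_skew, at_skew) for each sliding window.

    window and step are illustrative defaults, not values from the study.
    A genome shorter than one window is reported as a single window.
    """
    for start in range(0, max(len(genome) - window, 0) + 1, step):
        chunk = genome[start:start + window]
        yield start, skew(chunk, "G", "C"), skew(chunk, "A", "T")


if __name__ == "__main__":
    # Toy sequence: a G-rich half followed by an A-rich half.
    toy = "GGGGCCAT" * 2 + "AAAATTGC" * 2
    for start, gc, at in windowed_skews(toy, window=16, step=16):
        print(start, round(gc, 3), round(at, 3))
```

A region whose skew pattern departs sharply from its neighbours would be one candidate signal among several; how the real tool weights and combines its seven characteristics is not covered by this sketch.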

PHACTS: Phage Classification Tool Set

There are two distinct phage lifestyles: lytic and lysogenic. The lysogenic lifestyle has many implications for phage therapy, genomics, and microbiology; however, it is often very difficult to determine whether a newly sequenced phage isolate grows lytically or lysogenically just from the genome. Using the ~200 known phage genomes, a supervised random forest classifier was built to determine which phage proteins are important for determining lytic and lysogenic traits. A similarity vector is created for each phage by comparing each protein from a random sampling of all known phage proteins to each phage genome. Each value in the similarity vector represents the protein with the highest similarity score for that phage genome. This vector is used to train a random forest to classify phages according to their lifestyle. To test the classifier, each phage is removed from the data set one at a time and treated as a single unknown. The classifier was able to successfully group 188 of the 196 phages for which the lifestyle is known, giving the algorithm an estimated 4% error rate. The classifier also identifies the most important genes for determining lifestyle; in addition to integrases, expected to be important, the composition of the phage (capsid and tail) also determines the lifestyle. A large number of hypothetical proteins are also involved in determining whether a phage is lytic or lysogenic.
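The "remove one phage at a time" test described above is a leave-one-out evaluation. Here is a small sketch of that loop in plain Python. To keep it dependency-free, a nearest-neighbour rule stands in for the random forest, so this is not the PHACTS classifier; the function names and the toy similarity vectors are made up for illustration.

```python
def leave_one_out_error(vectors, labels, train_and_predict):
    """Hold out each sample in turn, train on the rest, and return the error fraction."""
    errors = 0
    for i in range(len(vectors)):
        train_x = vectors[:i] + vectors[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if train_and_predict(train_x, train_y, vectors[i]) != labels[i]:
            errors += 1
    return errors / len(vectors)


def nearest_neighbour(train_x, train_y, query):
    """Stand-in classifier (NOT a random forest): label of the closest training vector."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    best = min(range(len(train_x)), key=lambda j: dist(train_x[j], query))
    return train_y[best]


if __name__ == "__main__":
    # Toy similarity vectors (made-up numbers), one per phage.
    vectors = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
    labels = ["lysogenic", "lysogenic", "lytic", "lytic"]
    print(leave_one_out_error(vectors, labels, nearest_neighbour))
    # The figure quoted in the text: 8 misclassified out of 196 phages.
    print(round((196 - 188) / 196, 4))
```

The last line reproduces the arithmetic behind the quoted error rate: 8 misclassified phages out of 196 is about 4%.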

Metagenome Sequence Matcher

Metagenome analysis spans a large range of different methods and tools in the bioinformatics community. These tools provide scientists with biological information present in a sequenced environmental sample, more specifically the genetic functions encoded in the DNA of the sampled metagenome. Most often those tools have been developed to compare a specific metagenome file against databases that are filled with sequences and annotation data.

This project aims to perform a comparative analysis between multiple metagenomic FASTA files. By loading n-length pieces of the sequences from one file into a hash table, sequences from the other files can be compared against them quickly and precisely. Finding similar sequences and structures between numerous metagenomes can give insight into what biological functions are shared between related and unrelated organisms.
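The hash-table idea can be sketched in a few lines of Python, using a dict keyed on n-length substrings. This is only an illustration under simple assumptions (exact matches, one strand, no reverse complements), and the function names are hypothetical, not taken from the project.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def build_index(records, k):
    """Map every k-length substring to the set of sequence IDs that contain it."""
    index = {}
    for seq_id, seq in records:
        for i in range(len(seq) - k + 1):
            index.setdefault(seq[i:i + k], set()).add(seq_id)
    return index


def shared_kmers(index, records, k):
    """Count distinct k-mers each query sequence shares with each indexed sequence."""
    hits = {}
    for seq_id, seq in records:
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        for kmer in kmers:
            for ref_id in index.get(kmer, ()):
                hits[(seq_id, ref_id)] = hits.get((seq_id, ref_id), 0) + 1
    return hits


if __name__ == "__main__":
    # Two tiny in-memory "files"; real input would come from open(path).
    file_a = [">a1", "ACGTACGT", ">a2", "TTTTGGGG"]
    file_b = [">b1", "CGTACGAA"]
    index = build_index(read_fasta(file_a), k=4)
    print(shared_kmers(index, read_fasta(file_b), k=4))
```

Building the index costs one pass over the first file, after which each lookup is a constant-time dictionary access, which is where the speed comes from. The trade-off is memory: the index grows with the number of distinct n-length pieces.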