Monthly Archives: February 2011

How to remove human DNA sequence contamination from metagenomes

The immense amount of metagenomic data produced today requires an automated approach to data processing and analysis. Before any downstream analysis is performed, the datasets should be preprocessed to ensure data quality and prevent erroneous conclusions. One step of your data preprocessing (usually the last) should be to check for sequence contamination (DNA from sources other than the sample). This post shows how to identify and remove human sequence contamination from metagenomes; the same approach can be applied to any other type of sequence dataset or contamination.
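As a rough illustration of the removal step only, the sketch below assumes that contaminant reads have already been identified (for example, by aligning the reads against a human reference with a tool of your choice) and that their read IDs are available as a set. The function names and the FASTA-only input are assumptions made for this example, not the method described in the full post.

```python
from typing import Iterable, Iterator, List, Set, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def remove_contaminants(
    records: Iterable[Record], contaminant_ids: Set[str]
) -> Tuple[List[Record], List[Record]]:
    """Split records into (clean, removed) using the read ID.

    The read ID is taken to be the first whitespace-separated word of the
    header (an assumption; adjust to match your aligner's output).
    """
    clean: List[Record] = []
    removed: List[Record] = []
    for header, seq in records:
        parts = header.split()
        read_id = parts[0] if parts else ""
        if read_id in contaminant_ids:
            removed.append((header, seq))
        else:
            clean.append((header, seq))
    return clean, removed
```

Keeping the removed reads, rather than discarding them, makes it possible to check afterwards how much of the dataset was flagged as contamination.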

poster

2008

Transition in Vibrio spp. correlates with human activity in the Northern Line Islands

Robert Schmieder, Tracy McDole, Elizabeth Dinsdale, Matthew Haynes, Forest Rohwer, Robert Edwards


Presented at: International Coral Reef Symposium (ICRS) 2008
Download PDF file (3.5 MB)

ADAPTdb/ADAPT – A Framework for the Analysis of ARISA Data Sets

Robert Schmieder, Matthew Haynes, Elizabeth Dinsdale, Forest Rohwer, and Robert Edwards


Presented at: Metagenomics 2008
Download PDF file (1.2 MB)

2009

ADAPTdb/ADAPT – A Framework for the Analysis of ARISA Data Sets

Robert Schmieder, Matthew Haynes, Elizabeth Dinsdale, Forest Rohwer, and Robert Edwards


Presented at: ISMB/ECCB and M3 2009
Download PDF file (812 KB)

Deviation of amino acid utilization and correlation with G+C composition in bacterial genome

Sajia Akhter, Hochul K Lee, Barbara Bailey, Peter Salamon, Robert Edwards


Presented at: Applied Computational Science and Engineering Student Support (ACSESS) 2009
Download PDF file (1.6 MB)

Assembler for SOLiD data: by Improving memory management of Velvet assembler

Sajia Akhter and Robert Edwards


Presented at: Rocky Mountain Bioinformatics Conference (Rocky) 2009
Download PDF file (1.0 MB)

Phage Annotation Tools and Methods

Ramy K. Aziz, Bhakti Dwivedi, Joe Anderson, Bonnie Hurwitz, JP Massar, Mya Breitbart, Matthew Sullivan, Jeff Elhai and Robert A. Edwards


Presented at: Rocky Mountain Bioinformatics Conference (Rocky) 2009
Download PDF file (940 KB)

2012

Tools for Detecting Macrolide Resistance in the Human Microbiome

Robert Schmieder, Yan Wei Lim, Anca Segall, Molly Schmid, and Robert Edwards


Presented at: Advances in Genome Biology & Technology (AGBT) 2012
Download PDF file (1.2 MB)

Host prediction for viral metagenomes using oligonucleotide profiles

Michiyo Wellington-Oguri, Robert Schmieder, Barbara Bailey, Robert A. Edwards, and Bas E. Dutilh


Presented at: Student Research Symposium (SRS) 2012
Download PDF file (705 KB)

Database Structure and Visualization Software for Microbial Physiology Data

Nicholas Turner, Haquio Liu, Jeremy Frank, and Robert Edwards


Presented at: Student Research Symposium (SRS) 2012
Download PDF file (176 KB)

Tools for Fast Sequence Alignment

Sajia Akhter and Robert Edwards


Presented at: Student Research Symposium (SRS) 2012
Download PDF file (1.3 MB)

2011

Tools for Quality Control and Preprocessing of Metagenomic Datasets

Robert Schmieder, Yan Wei Lim and Robert Edwards


Presented at: Pacific Symposium on Biocomputing (PSB) 2011
Download PDF file (4.2 MB)

FACIL: fast and accurate genetic code inference and logo

Bas E. Dutilh, Rasa Jurgelenaite, Radek Szklarczyk, Sacha A.F.T. van Hijum, Harry R. Harhangi, Markus Schmid, Bart de Wild, Kees-Jan Françoijs, Hendrik G. Stunnenberg, Marc Strous, Mike S.M. Jetten, Huub J.M. Op den Camp and Martijn A. Huynen


Presented at: SDMG All Day Meeting 2011
Download PDF file (3.6 MB)

Genomic Comparison of Salmonella enterica Serovars Enteritidis and Dublin

D. Matthews, R. Schmieder, J. Busch, N. Cassman, M. Doherty, D. Green, B. Matolock, B. Heffernan, G. Olsen, L. Farris, D. Schiffeli, S. Maloy, E. Dinsdale, and R. Edwards


Presented at: ASM 2011
Download PDF file (561 KB)

PhiSpy: A novel similarity-independent tool for predicting prophages in microbial genomes

Sajia Akhter, Ramy K Aziz, Robert A Edwards


Presented at: Evergreen International Phage Biology Meeting
Download PDF file (1.5 MB)

2010

Real-Time Metagenomics Analysis

Daniel A. Cuevas, Joshua A. Hoffman and Robert A. Edwards


Presented at: ASM 2010
Download PDF file (980 KB)

Investigating the Frequency of Quinolone Resistance Genes in Environmental Samples

Sajia Akhter, Anca M. Segall, Molly Schmid and Robert A. Edwards


Presented at: ASM 2010
Download PDF file (1.5 MB)

Identification of Macrolide Resistance Alleles in Environmental Metagenomes

Robert Schmieder, Anca Segall, Molly Schmid and Robert Edwards


Presented at: ASM 2010
Download PDF file (2.7 MB)

Fast Identification and Removal of Sequence Contaminations from Genomic and Metagenomic Datasets

Robert Schmieder and Robert Edwards


Presented at: Human Microbiome Meeting 2010
Download PDF file (1.2 MB)

PHANTOME: Phage Annotation Tools and Methods

Ramy K. Aziz, Brad Hull, Bhakti Dwivedi, Joe Anderson, Bonnie Hurwitz, JP Massar, Matthew Sullivan, Jeff Elhai, Mya Breitbart, Ross Overbeek and Robert A. Edwards


Presented at: Institut Pasteur Virus of Microbes meeting 2010
Download PDF file (808 KB)

Phages Without Borders: Distribution of Phage Nucleic Acids in 310 Metagenomes

Ramy K. Aziz, Mya Breitbart and Robert A. Edwards


Presented at: ASM 2010
Download PDF file (1.9 MB)

Phage Annotation Tools and Methods

Ramy K. Aziz, Bhakti Dwivedi, Joe Anderson, Bonnie Hurwitz, Brad Hull, JP Massar, Mya Breitbart, Matthew Sullivan, Jeff Elhai and Robert A. Edwards


Presented at: ASM 2010
Download PDF file (924 KB)

Predicting Phage Preferences: Lytic vs. Lysogenic Lifestyle from Genomes

Katelyn McNair, Rob Edwards and Barbara Bailey


Presented at: CSHL meeting 2010
Download PDF file (269 KB)

Contamination of sequencing data (Pt. 2)

It is amazing how easily sample processing can lead to contaminated data. Roughly 22% of sequenced genomes contain AluY elements from the human genome. As noted in the posting from The Scientist linked below, this alarming finding could also indicate that sequenced genomes are contaminated by DNA from other sources, such as the commonly used E. coli, which would be problematic when working with other bacterial genomes. Such contamination could have grave consequences for evaluating horizontal gene transfer.

http://www.the-scientist.com/news/display/57990/

http://www.plosone.org/article/info:doi/10.1371/journal.pone.0016410

It should be noted, though, that according to the article in The Scientist this applies only to female scientists (“But probably the most common contaminant is the scientist herself.” from paragraph 4).

Project Update: Multi-threading or Cluster Computing?

Recently, I’ve been faced with a problem: my metagenome comparator program runs too slowly. The main reason is that it repeats the same operations many times in a loop. These operations include reading lines from text, creating objects, inserting those objects into a data structure, retrieving them from the data structure, and writing the data structures to disk (just to name a few). The natural suggestion for someone in my position is to parallelize it all, and that’s exactly what I want to do. However, I’ve never written a parallel application of any kind, so I need to do some learning and research into parallel programming.
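To make the idea concrete, here is a minimal sketch of one common pattern: split the input lines into chunks, let each worker build a partial result, and merge the partial results at the end. It is not the comparator's actual code. The `label<TAB>count` line format and all function names are assumptions chosen purely for illustration.

```python
from collections import Counter
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from typing import List, Optional, Sequence

# Process pools suit CPU-bound parsing; thread pools suit I/O-bound work.
EXECUTORS = {"process": ProcessPoolExecutor, "thread": ThreadPoolExecutor}


def chunked(items: Sequence[str], size: int) -> List[Sequence[str]]:
    """Split items into consecutive chunks of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def tally_chunk(lines: Sequence[str]) -> Counter:
    """Parse hypothetical 'label<TAB>count' lines and sum counts per label."""
    totals: Counter = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        label, count = line.split("\t")
        totals[label] += int(count)
    return totals


def tally_parallel(
    lines: Sequence[str],
    chunk_size: int = 10_000,
    kind: str = "process",
    workers: Optional[int] = None,
) -> Counter:
    """Tally all lines by processing chunks in a pool and merging results."""
    merged: Counter = Counter()
    with EXECUTORS[kind](max_workers=workers) as pool:
        # Each worker returns a partial Counter; update() adds the counts.
        for partial in pool.map(tally_chunk, chunked(lines, chunk_size)):
            merged.update(partial)
    return merged
```

With a process pool, the call to `tally_parallel` should sit under an `if __name__ == "__main__":` guard on platforms that start workers by spawning a new interpreter. Whether this approach beats a single loop depends on how expensive each chunk is relative to the cost of sending data to the workers, so it is worth timing both versions on real input.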