command line deconseq

AKA: how to remove contamination from your metagenome! We use shark genomes, but it works with humans, corals, and other things too!

A while ago we wrote deconseq to allow you to remove contamination from your sequence libraries. We used an HTS mapper to map the reads in your libraries to your reference genome, and then filtered the sequences after mapping.

This is trivial to do with modern sequence analysis tools, and so we provide recipes here for filtering your reads based on matches to a reference genome. Read more to find out how!

We also provide a snakefile that does all these steps. Set up the bowtie2 index, a directory of reads, and an output location, and it will generate mapped and unmapped reads for you!
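To make the filtering step concrete, here is a minimal Python sketch of the idea, not the recipe or the snakefile from the post. It assumes the mapper has written SAM-format text and uses the standard SAM FLAG bit 0x4 ("segment unmapped") to separate reads that hit the reference from reads that did not. The function name `split_sam_records`, the reference name `shark_chr1`, and the example records are all made up for illustration.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def split_sam_records(sam_lines):
    """Return (mapped, unmapped) lists of read names from SAM text lines.

    Hypothetical helper: reads that mapped to the reference are treated as
    contamination, reads that did not map are the ones you would keep.
    """
    mapped, unmapped = [], []
    for line in sam_lines:
        # Skip blank lines and SAM header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & UNMAPPED_FLAG:
            unmapped.append(name)
        else:
            mapped.append(name)
    return mapped, unmapped


# Made-up SAM records: read1 and read3 map to the reference, read2 does not.
example = [
    "@HD\tVN:1.6\tSO:unsorted",
    "read1\t0\tshark_chr1\t100\t42\t5M\t*\t0\t0\tACGTA\tIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGCA\tIIIII",
    "read3\t16\tshark_chr1\t250\t42\t5M\t*\t0\t0\tGGCAT\tIIIII",
]

mapped, unmapped = split_sam_records(example)
print("mapped to reference (contamination):", mapped)
print("unmapped (kept):", unmapped)
```

For paired-end libraries you would probably also want to look at the mate's flag before deciding which pairs to keep; this sketch treats every record independently.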


SoCal Hackathon 2019

We are pleased to announce the second installment of the SoCal Bioinformatics Hackathon.

From 9-11 January 2019, the NCBI will help run a bioinformatics hackathon in Southern California hosted by the Computational Sciences Research Center at San Diego State University! We are going to put a few hundred thousand metagenomic datasets on cloud infrastructure and identify known, taxonomically definable, and novel viruses! We’re specifically looking for folks who have experience in Computational Virus Hunting or Adjacent Fields! If this describes you, please apply! This event is for researchers, including students and postdocs, who are already engaged in the use of bioinformatics data or in the development of pipelines for virological analyses from high-throughput experiments. The event is open to anyone selected for the hackathon and willing to travel to SDSU (see below).
