Submitting a PATRIC genome annotation using the command line

The PATRIC BRC has a great command line interface that allows you to access much of the data remotely. Here is a simple recipe to submit a genome for annotation.

To learn more about the CLI, read the docs and install it. Once you have it installed, you need to log in using the command:

p3-login username

Then, if you have a FASTA file, say phage_lencho.fasta, you can use this command to submit it for annotation:

p3-submit-genome-annotation --contigs-file phage_lencho.fasta --phage -n "Bacteroides phage lencho" -t 196894 -d Virus / Bacteroides_phage_lencho

The parts of this command are:

  • --contigs-file is your DNA sequence
  • --phage tells PATRIC to use the phage annotation pipeline. (Note: you might also try --recipe phage2, but this is likely to change.)
  • -n is the name that the genome will have in PATRIC
  • -t is the taxonomy ID. Some useful ones are 196894 (Unclassified Siphoviridae); 28883 (Caudovirales); 10239 (Viruses); 49928 (Unclassified Bacteria); 12908 (Unclassified sequences)
  • -d is the domain
  • / is the location in your PATRIC workspace where you want the output to be. Note that you may need to create this directory in the PATRIC workspace first, and it's easiest to do this on the website.
  • Bacteroides_phage_lencho is the name that the job will have in your PATRIC workspace. For retrieving jobs via the CLI (see below) you don’t want to have any spaces in this name.

Now go submit a few thousand annotations!
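If you really do have thousands of genomes, a small shell loop saves a lot of typing. Here is a minimal sketch of one way to do it, written as a dry run that only prints the commands so you can check them first. The function name print_submissions and its two arguments (a local directory of .fasta files and your workspace folder) are made up for illustration. It also assumes that the genome name can be derived from the file name and that 10239 (Viruses) is an acceptable taxonomy ID for every file; adjust both to suit your data.

```shell
# Dry run: print one p3-submit-genome-annotation command per FASTA file.
# print_submissions and its arguments are hypothetical names, not part of the PATRIC CLI.
#   $1 = local directory containing *.fasta files
#   $2 = workspace folder for the output (the same location you pass by hand)
print_submissions() {
    fasta_dir=$1
    ws_dir=$2
    for f in "$fasta_dir"/*.fasta; do
        [ -e "$f" ] || continue                        # the glob matched nothing
        # Job name: file name without .fasta, spaces replaced by underscores
        job=$(basename "$f" .fasta | tr ' ' '_')
        # Genome name: assumed here to be the job name with underscores as spaces
        name=$(printf '%s' "$job" | tr '_' ' ')
        printf 'p3-submit-genome-annotation --contigs-file "%s" --phage -n "%s" -t 10239 -d Virus "%s" "%s"\n' \
            "$f" "$name" "$ws_dir" "$job"
    done
}
```

Call it with your FASTA directory and workspace folder, read through the output, and only then pipe it to sh. The simple quoting used here will not survive file names that contain double quotes or dollar signs.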

Downloading in Bulk

Once those annotations have run, you can download them all in bulk. I’m still using the Bacteroides directory here. First we get a list of job names, and then we copy the annotations in GenBank format:

p3-ls -l --type /  | perl -ne '/job_result (.*)$/ and print "$1\n"' > jobs

while read -r g; do echo "$g"; p3-cp ws:"/$g/$" genbank/; done < jobs
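The perl filter in the listing step keeps only the lines that mention job_result and prints whatever follows it, which is the job name. If you would rather not use perl, here is a rough awk equivalent. The function name job_names is made up for illustration, and the sketch assumes only what the perl pattern already assumes: that the job name is everything after "job_result " on the line.

```shell
# Read a long listing on stdin and print only the job names.
# job_names is a hypothetical helper, not part of the PATRIC CLI.
job_names() {
    # index() finds the first "job_result " (11 characters, including the space);
    # lines without it give 0, so they are skipped.
    awk '(i = index($0, "job_result ")) { print substr($0, i + 11) }'
}
```

Pipe the output of the p3-ls command into job_names in place of the perl one-liner, and redirect the result to the jobs file as before.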