The PATRIC BRC has a great command line interface that allows you to access much of the data remotely. Here is a simple recipe to submit a genome for annotation.
If you have a FASTA file, say for phage lencho, you can submit it with this command:
p3-submit-genome-annotation --contigs-file phage_lencho.fasta --phage -n "Bacteroides phage lencho" -t 196894 -d Virus /firstname.lastname@example.org/home/Bacteroides Bacteroides_phage_lencho
- --contigs-file is your DNA sequence (the FASTA file)
- --phage tells PATRIC to use the phage annotation pipeline. (Note: you might also try --recipe phage2, but this is likely to change.)
- -n is the name that the genome will have in PATRIC
- -t is the taxonomy ID. Some useful ones are 196894 (Unclassified Siphoviridae); 28883 (Caudovirales); 10239 (Viruses); 49928 (Unclassified Bacteria); 12908 (Unclassified sequences)
- -d is the domain
- /email@example.com/home/Bacteroides is the location in your PATRIC workspace where you want the output to go. Note that you may need to create this directory in the PATRIC workspace first; it's easiest to do this on the website.
- Bacteroides_phage_lencho is the name that the job will have in your PATRIC workspace. For retrieving jobs via the CLI (see below) you don’t want to have any spaces in this name.
Now go submit a few thousand annotations!
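If you really do have thousands of genomes, a small loop saves a lot of typing. Here is a minimal sketch, not a tested pipeline. It assumes your FASTA files sit in a local directory called fasta/ (my assumption), and the job_name, run, and submit_all helpers and the DRY_RUN switch are hypothetical names made up for illustration. It reuses the flags from the command above and uses the file name, with spaces replaced by underscores, as both the genome name and the job name.

```shell
#!/bin/sh
# Sketch: submit every FASTA file in a directory for phage annotation.
# Assumptions (not from PATRIC itself): the fasta/ directory, the helper
# names, and the DRY_RUN switch are illustrative only.

WORKSPACE="/firstname.lastname@example.org/home/Bacteroides"  # workspace path as in the command above
DRY_RUN="${DRY_RUN:-1}"  # 1 = only print the commands; 0 = really submit

# Turn a file path into a name without spaces,
# e.g. "fasta/phage lencho.fasta" becomes "phage_lencho".
job_name() {
    base="$(basename "$1")"
    base="${base%.*}"                      # drop the extension
    printf '%s\n' "$base" | tr ' ' '_'     # spaces break job retrieval via the CLI
}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Submit one annotation job per *.fasta file in the given directory.
submit_all() {
    dir="$1"
    for f in "$dir"/*.fasta; do
        [ -e "$f" ] || continue            # the glob matched nothing
        name="$(job_name "$f")"
        run p3-submit-genome-annotation --contigs-file "$f" --phage \
            -n "$name" -t 196894 -d Virus "$WORKSPACE" "$name"
    done
}

submit_all fasta
```

By default this only prints the commands so you can check them first; set DRY_RUN=0 in the environment when you are happy with what it would submit. Adjust the taxonomy ID and workspace path to suit your own genomes.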
Downloading in Bulk
Once those annotations have run, you can download them all in bulk. I’m still using the Bacteroides directory here: first we get a list of job names, and then we get the annotations in GenBank format.
p3-ls -l --type /firstname.lastname@example.org/home/Bacteroides | perl -ne '/job_result (.*)$/ and print "$1\n"' > jobs
mkdir -p genbank
while read -r g; do echo "$g"; p3-cp ws:"/email@example.com/home/Bacteroides/.$g/$g.gb" genbank/; done < jobs
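With a few thousand jobs, some downloads may well fail or some jobs may not have finished, so it is worth checking what actually arrived. This is a rough sketch that assumes the jobs file and genbank/ directory from the commands above, and that each download lands as genbank/NAME.gb. The missing_downloads helper is a hypothetical name, not part of the PATRIC CLI.

```shell
#!/bin/sh
# Sketch: print every job name from the jobs file whose GenBank file
# is missing or empty in the download directory.
# missing_downloads is an illustrative helper, not a PATRIC command.
missing_downloads() {
    jobs_file="$1"
    out_dir="$2"
    [ -f "$jobs_file" ] || return 0        # nothing to check yet
    while IFS= read -r g; do
        [ -n "$g" ] || continue            # skip blank lines
        # -s is true only if the file exists and is not empty
        [ -s "$out_dir/$g.gb" ] || printf '%s\n' "$g"
    done < "$jobs_file"
}

missing_downloads jobs genbank
```

Redirect the output to a file and feed it back into the download loop to retry only the jobs that are still missing.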