EdwardsLab

Submitting to GenBank

I hate submitting to GenBank. It is a royal pain, and I often use workarounds to get my data submitted, e.g. using the awesome Sequence Read Archive (formerly the Short Read Archive). For this project, I have two complete genomes and I want them to be in GenBank, so here are my notes and thoughts about the process.

Second attempt. I put this at the top since you probably want to know what works first. Here is a web page describing some of the steps. I’m trying them and will provide some scripts.

First, get everything together:

The stuff below here didn’t work. I had an error report 1,015 lines long. I don’t understand why NCBI can identify these errors but not propose solutions to them. It is just lazy, offloading the work onto everyone else.

It would be so much less work for science if NCBI staff would work with developers at RAST to fix the errors and come up with a one-button solution to this problem. But no, everyone has to suffer because of petty politics.

First, I am using the BankIt online submission form, mostly because I refuse to use a piece of software that is so old you need to tell it that you have Internet access. You need to log in with your PubMed ID.

Before you start you need a couple of things:

Most of the form is self-explanatory, and they will ask you some questions. Answer them. However, the five-column feature table is a pain in the butt. There is a description of what they want available online.

I have written this code that uses BioPython to convert a GenBank (gbk) file to this five-column format. There are a couple of random and unpredictable errors, and you’ll have to iterate this process many times, editing the GenBank file or tsv file and repeating the upload until you get the right response.
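For anyone who wants to see the shape of that conversion, here is a minimal sketch. It is not the original script, just an illustration under some assumptions: the function names `format_feature_table` and `features_from_genbank` are made up for this example, `source` features are skipped on the assumption that BankIt collects source information through the form, and all qualifiers are passed through unfiltered. The table layout follows the five-column convention: a `>Feature` header line, then `start<TAB>stop<TAB>key` lines with 1-based coordinates (start greater than stop on the minus strand), then qualifier lines indented by three tabs.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# One feature: (feature_key, [(start, stop), ...], {qualifier: [values]}).
# Coordinates are 1-based; on the minus strand start > stop.
Feature = Tuple[str, List[Tuple[int, int]], Dict[str, List[str]]]


def format_feature_table(seq_id: str, features: Iterable[Feature]) -> str:
    """Render features as a five-column feature table (hypothetical helper)."""
    lines = [f">Feature {seq_id}"]
    for key, intervals, qualifiers in features:
        for i, (start, stop) in enumerate(intervals):
            # Only the first interval of a feature carries the feature key.
            lines.append(f"{start}\t{stop}\t{key}" if i == 0 else f"{start}\t{stop}")
        for qualifier, values in qualifiers.items():
            for value in values:
                # Qualifier lines leave the first three columns empty.
                lines.append(f"\t\t\t{qualifier}\t{value}")
    return "\n".join(lines) + "\n"


def features_from_genbank(path: str) -> Iterator[Tuple[str, List[Feature]]]:
    """Yield (record id, features) for each record in a GenBank file."""
    # Imported here so the formatter above can be used without Biopython.
    from Bio import SeqIO

    for record in SeqIO.parse(path, "genbank"):
        features: List[Feature] = []
        for feature in record.features:
            if feature.type == "source":
                continue  # assumption: source details are entered in the form
            intervals = []
            for part in feature.location.parts:
                # Biopython positions are 0-based, end-exclusive.
                start, end = int(part.start) + 1, int(part.end)
                intervals.append((end, start) if part.strand == -1 else (start, end))
            features.append((feature.type, intervals, dict(feature.qualifiers)))
        yield record.id, features
```

To use it, loop over `features_from_genbank("genome.gbk")` (the file name is a placeholder) and write `format_feature_table(record_id, features)` for each record to a file. Expect to add your own filtering, since some qualifiers that appear in a GenBank file may be rejected on upload.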
