Although GUIs and web pages are great ways for users to interact with our tools and software, the command line interface is still a prevalent medium for executing scripts in the bioinformatics field. One way to make command line scripts more interactive is to support options, flags, and arguments in our code. These allow users to change the behavior of the script, e.g., input values and input format, output file format and naming, algorithm values and thresholds, status updates, and more. Before really diving into Python, C-style argument parsing was the approach I was most familiar with, such as the getopt Python module or Getopt Perl module, but it does not follow the object-oriented style that languages like Python are known for. I usually spent two or three dozen lines of code implementing the parsing function and writing out a usage help message. I recently came across the argparse module and felt that it was exactly what I was looking for. It takes away much of the manual programming and simplifies the process. Here I’ll walk through a short tutorial with a few simple cases showing how to use argparse and the benefits I found from using it.
USING REQUIRED ARGUMENTS
A simple example is a program that takes a filename as its argument and prints it out.
import argparse
parser = argparse.ArgumentParser()  # Initiate argument parser object
# Add input file argument with help message
parser.add_argument('infile', help='Input file to print out')
args = parser.parse_args()  # Call command line parser method
print('The filename is {}'.format(args.infile))
When we run the command without giving a filename, argparse prints the usage line and an error message:
$ python sample.py
usage: sample.py [-h] infile
sample.py: error: too few arguments
We can then run the script with the -h flag to get a full help message:
$ python sample.py -h
usage: sample.py [-h] infile
positional arguments:
infile Input file to print out
optional arguments:
-h, --help show this help message and exit
Here, we can see the positional (required) argument listed along with the help message that we wrote. What is great is that we did not need to code the help message ourselves. The ArgumentParser object contains methods to format and print the help message, and it prints the usage line automatically whenever there is a problem during argument parsing.
We can run the script successfully like so:
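As a rough sketch of what happens behind the scenes, the snippet below builds the same parser and asks it for the generated text directly. Two things here are my own additions for the demonstration and are not part of sample.py: passing prog='sample.py' so the usage line looks the same wherever the snippet is run, and passing an explicit list to parse_args() in place of the real command line.

```python
import argparse

# Same parser as in the example above; prog is set explicitly (an assumption
# for this demo) so the usage line reads "sample.py" however this is run.
parser = argparse.ArgumentParser(prog='sample.py')
parser.add_argument('infile', help='Input file to print out')

# format_usage() and format_help() return the generated text as strings;
# print_usage() and print_help() write the same text out directly.
usage_text = parser.format_usage()
help_text = parser.format_help()
print(help_text)

# An explicit list stands in for sys.argv[1:]. With the required positional
# missing, argparse prints the usage and an error to stderr, then raises
# SystemExit with exit status 2.
try:
    parser.parse_args([])
    exit_code = None
except SystemExit as exc:
    exit_code = exc.code
print('Exit status on missing argument: {}'.format(exit_code))
```

Catching SystemExit like this is only useful for poking at the behavior; in a real script we would normally let argparse exit on its own.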
$ python sample.py test_file.fasta
The filename is test_file.fasta
USING OPTIONAL ARGUMENTS
Optional arguments, like flags, are also essential in many programs and the argparse module supports these.
Here, we’ll add the option to print out the number of lines in the file:
import argparse
parser = argparse.ArgumentParser()  # Initiate argument parser object
# Add input file argument with help message
parser.add_argument('infile', help='Input file to print out')
# Add line count optional argument
parser.add_argument('--linecount', help='Printout number of lines in file',
                    action='store_true')
args = parser.parse_args()  # Call command line parser method
print('The filename is {}'.format(args.infile))
# Check if the linecount flag was raised
if args.linecount:
    with open(args.infile) as f:
        numLines = len(f.readlines())
    print('Number of lines: {}'.format(numLines))
To explain what I’ve added: the name of the new argument starts with a double hyphen '--'. This lets the parser know that it is an optional argument rather than a required positional one. I also added action='store_true' to this line. This tells the parser to store True in args.linecount when the flag is given and False when the user does not include it. Without it, the default action ('store') expects a value after the flag.
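To make the difference between the two actions concrete, here is a small sketch that parses a few hand-written argument lists. The --prefix option is hypothetical and exists only to show the default 'store' action next to 'store_true'; it is not part of the sample script.

```python
import argparse

parser = argparse.ArgumentParser(prog='sample.py')
parser.add_argument('infile', help='Input file to print out')
# store_true: the flag takes no value; present -> True, absent -> False
parser.add_argument('--linecount', action='store_true',
                    help='Printout number of lines in file')
# Default action ('store'): the option expects a value after it.
# '--prefix' is a hypothetical option used only for this illustration.
parser.add_argument('--prefix', help='Text to print before the filename')

# Explicit lists stand in for what the user would type on the command line
with_flag = parser.parse_args(['test_file.fasta', '--linecount'])
without_flag = parser.parse_args(['test_file.fasta'])
with_value = parser.parse_args(['test_file.fasta', '--prefix', 'File:'])

print(with_flag.linecount)     # True
print(without_flag.linecount)  # False
print(with_value.prefix)       # File:
print(without_flag.prefix)     # None
```

Note that an option using the default action falls back to None when it is omitted, unless a default= value is supplied.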
We can run the script with the help flag to get new information:
$ python sample.py -h
usage: sample.py [-h] [--linecount] infile
positional arguments:
infile Input file to print out
optional arguments:
-h, --help show this help message and exit
--linecount Printout number of lines in file
$ python sample.py test_file.fasta
The filename is test_file.fasta
$
$ python sample.py test_file.fasta --linecount
The filename is test_file.fasta
Number of lines: 4
$
$ python sample.py --linecount test_file.fasta
The filename is test_file.fasta
Number of lines: 4
We can see here that the new help message includes the --linecount flag and its help text. I then ran the script without the flag and it completed successfully. Finally, I included the flag in the command, once after the filename and once before it, to show that the position of an optional argument relative to the positional argument does not matter.
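The same point can be checked without a terminal. This is a minimal sketch that feeds both orderings to the parser as explicit lists (my stand-in for the command line) and compares the results.

```python
import argparse

parser = argparse.ArgumentParser(prog='sample.py')
parser.add_argument('infile')
parser.add_argument('--linecount', action='store_true')

# The flag after the filename, then the flag before the filename
flag_last = parser.parse_args(['test_file.fasta', '--linecount'])
flag_first = parser.parse_args(['--linecount', 'test_file.fasta'])

# Namespace objects compare equal when they hold the same attributes
print(flag_last == flag_first)  # True
```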
We can add short arguments to the code because some users prefer them over long arguments. Changing that one line of code will give us:
import argparse
parser = argparse.ArgumentParser()  # Initiate argument parser object
# Add input file argument with help message
parser.add_argument('infile', help='Input file to print out')
# Add line count optional argument
parser.add_argument('-c', '--linecount',
                    help='Printout number of lines in file',
                    action='store_true')
args = parser.parse_args()  # Call command line parser method
print('The filename is {}'.format(args.infile))
# Check if the linecount flag was raised
if args.linecount:
    with open(args.infile) as f:
        numLines = len(f.readlines())
    print('Number of lines: {}'.format(numLines))
The new short argument is prefixed with a single hyphen. Running the script gives us the output:
$ python sample.py -h
usage: sample.py [-h] [-c] infile
positional arguments:
infile Input file to print out
optional arguments:
-h, --help show this help message and exit
-c, --linecount Printout number of lines in file
$ python sample.py test_file.fasta -c
The filename is test_file.fasta
Number of lines: 4
One thing to notice: -c is shown in the usage line at the top of the help message because we listed it first in the parser.add_argument() call. If we had put -c after --linecount, the long argument would have shown up in the usage line instead. The order would also have been flipped under the optional arguments section.
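A quick way to see this is to build the parser twice with the option strings swapped and compare the usage lines. The build_parser helper below is a hypothetical convenience function written for this comparison, not something argparse provides.

```python
import argparse

def build_parser(*flags):
    """Build the sample parser with the option strings in the given order."""
    parser = argparse.ArgumentParser(prog='sample.py')
    parser.add_argument('infile', help='Input file to print out')
    parser.add_argument(*flags, action='store_true',
                        help='Printout number of lines in file')
    return parser

short_first = build_parser('-c', '--linecount')
long_first = build_parser('--linecount', '-c')

# The usage line shows whichever option string was listed first
print(short_first.format_usage())
print(long_first.format_usage())

# Either way, the attribute name is taken from the long option
print(short_first.parse_args(['-c', 'test_file.fasta']).linecount)  # True
print(long_first.parse_args(['-c', 'test_file.fasta']).linecount)   # True
```

The first usage line contains [-c] and the second contains [--linecount], while the parsed value is reached through args.linecount in both cases.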
To conclude, the argparse module handles much of the work of parsing command line arguments and formatting help and usage messages. argparse also provides features that I did not go over here, such as type checking, limited choices for arguments, and argument counting. These are explained further in the tutorial linked below, which covers everything I presented here and more.
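For a quick taste of those extra features, here is a minimal sketch. All three option names (--minlength, --format, --verbose) are hypothetical and chosen only for illustration, and I am reading "argument counting" as action='count', which is one possible interpretation.

```python
import argparse

# All option names below are hypothetical, chosen only for illustration.
parser = argparse.ArgumentParser(prog='sample.py')
parser.add_argument('infile', help='Input file to print out')
# type=int converts the value and rejects input that is not an integer
parser.add_argument('--minlength', type=int, default=0,
                    help='Minimum sequence length to keep')
# choices restricts the value to a fixed set
parser.add_argument('--format', choices=['fasta', 'fastq'], default='fasta',
                    help='Input file format')
# action='count' counts how many times the flag is given (-v, -vv, ...)
parser.add_argument('-v', '--verbose', action='count', default=0,
                    help='Increase status output')

# An explicit list stands in for the command line
args = parser.parse_args(['reads.fastq', '--minlength', '50',
                          '--format', 'fastq', '-vv'])
print('{} {} {}'.format(args.minlength, args.format, args.verbose))  # 50 fastq 2
```

If the user passes a value that is not an integer or not one of the listed choices, argparse prints the usage line with an error and exits, so the script itself needs no extra validation code for these cases.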
More in-depth tutorial: http://goo.gl/Y4CsIH