Tag Archives: bash

Use column to display tsv files in columns

If you have a tab-separated file and view it in a pager like less or more, the columns rarely line up. Here is a simple way to make those columns line up.

column -t file.tsv

For example, here is a file with three columns of words, displayed with cat

If we pass that to column with the -t option to detect the columns, we get nicely organised columns:

However, this is not quite right: “Paradigm shift” has been split across two columns, because the -t option splits on any whitespace by default. To split the columns on tabs only, we need to add the -s option:

column -t -s$'\t' file.tsv
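
Here is a small, self-contained sketch of the difference between the two invocations. The file contents are made up for illustration, and it assumes column (shipped with util-linux or bsdmainutils on most Linux systems) is installed.

```shell
# Made-up TSV with three columns of words; the first entry contains a space.
tsv=$(mktemp)
printf 'Paradigm shift\tapple\tred\n' > "$tsv"
printf 'banana\tkiwi\tgreen\n' >> "$tsv"

# Default: -t splits on any whitespace, so "Paradigm shift" becomes two columns.
by_space=$(column -t "$tsv")

# With -s$'\t' only tabs separate columns, so "Paradigm shift" stays together.
by_tab=$(column -t -s$'\t' "$tsv")

printf 'Split on whitespace:\n%s\n\n' "$by_space"
printf 'Split on tabs:\n%s\n' "$by_tab"

rm -f "$tsv"
```

In the first output the rows end up with different numbers of columns; in the second, every row has exactly three.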

awk use tab as input separator

By default, awk splits each input line on any run of whitespace. You can use the -F option, the same construct as perl, to change the input field separator:

cat file.tsv | awk -F"\t" '!s[$1]++'

The above example splits each line on tabs and prints only the first line seen for each distinct value in the first column.
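
To see the tab splitting and the de-duplication together, here is a minimal sketch using the common `!seen[$1]++` idiom. The gene names and descriptions are invented purely for the demonstration.

```shell
# Made-up three-line TSV; the first column repeats "geneA".
input=$(printf 'geneA\tfirst hit\ngeneB\tonly hit\ngeneA\tsecond hit\n')

# -F"\t" splits on tabs only, so "first hit" stays inside one field.
# seen[$1]++ is 0 (false) the first time a key appears, so !seen[$1]++ is
# true only for the first line with each distinct first-column value.
unique=$(printf '%s\n' "$input" | awk -F"\t" '!seen[$1]++')

printf '%s\n' "$unique"
```

This prints the "first hit" and "only hit" lines and drops the "second hit" line, because geneA has already been seen.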

BASH commands: number of arguments supplied

This code checks for the number of arguments supplied to a bash script

if [[ $# -eq 0 ]]; then echo "No arguments supplied"; exit; fi

If there are no arguments you probably don’t need to do anything.

Remember the arguments are:

$0 - the name of the script
$1 - the first argument
$2 - the second argument
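
A function sees its own arguments through the same variables, which makes it easy to try this out without writing a separate script. The sketch below uses a hypothetical helper called describe_args; the name and the messages are just for illustration.

```shell
# Hypothetical helper: inside a function, $# and $1, $2, ... refer to the
# function's own arguments, just as they refer to the command-line
# arguments at the top level of a script. ($0 is still the script name.)
describe_args() {
    if [[ $# -eq 0 ]]; then
        echo "No arguments supplied"
        return 1
    fi
    echo "count=$# first=$1 second=${2:-<none>}"
}

# No arguments: capture the message and the non-zero return status.
status=0
none=$(describe_args) || status=$?

# Two arguments.
two=$(describe_args alpha beta)

printf '%s (status %s)\n' "$none" "$status"
printf '%s\n' "$two"
```

Returning a non-zero status when nothing was supplied lets the caller tell "nothing to do" apart from a normal run.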

Running Mauve with Java 10

The short answer is you can’t (currently) run the totally awesome Mauve with Java 10, but you can probably run Mauve anyway. Here’s how!

TL;DR

locate java | grep java$ | grep -w bin
export JAVA_CMD=/usr/lib/jvm/java-8-openjdk-amd64/bin/java

Reading and writing to the same file

If you try to modify a file (removing all empty lines for example) using a command like:

cat file.txt | sed '/^$/d' > file.txt

you will end up with an empty file.txt. The reason is that bash parses the command line looking for “metacharacters” (“|”, “>” and “space” in this case) that separate words, and sets up redirections before it runs the commands they belong to. This means that “> file.txt” gets processed FIRST: it creates an empty file.txt (truncating any existing file) and redirects standard output to that file. By the time “cat file.txt” reads the file, it is almost always already empty. So “cat file.txt” outputs 0 lines, “sed '/^$/d'” deletes all 0 empty lines, and 0 lines get written to file.txt. This works as “intended” and bash outputs no error.
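
The effect is easiest to reproduce with a simpler variant that drops the pipe, so that the truncation always happens before sed starts. This is only a sketch on a scratch file created with mktemp; the file contents are made up.

```shell
# Hypothetical scratch file, created only for this demonstration.
demo=$(mktemp)
printf 'one\n\ntwo\n' > "$demo"

# The shell opens (and truncates) "$demo" for writing before sed starts,
# so sed reads an already-empty file and writes nothing back.
sed '/^$/d' "$demo" > "$demo"

# Count the bytes left in the file: the original contents are gone.
wc -c < "$demo"
```

No error is reported at any point, which is what makes this mistake easy to miss.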

You can get around this using a temporary file.

cat file.txt | sed '/^$/d' > tmp_file.txt
mv tmp_file.txt file.txt

However, because mv replaces file.txt with what is technically a new file, you might lose some information, in particular its permissions and whether file.txt was originally a symbolic link.
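
One way to keep that information is to copy the temporary file's contents back into the original instead of moving it, since redirecting into an existing file rewrites it in place. This is a minimal sketch under that assumption; the directory, file contents and the 640 mode are made up for the demonstration.

```shell
# Hypothetical working directory and file, created only for this demonstration.
workdir=$(mktemp -d)
file="$workdir/file.txt"
printf 'one\n\ntwo\n\n\nthree\n' > "$file"
chmod 640 "$file"

# Filter into a temporary file first, so we never read and write the same file.
tmp=$(mktemp)
sed '/^$/d' "$file" > "$tmp"

# Write the result back INTO the original rather than replacing it with mv:
# the file keeps its permissions, and a symbolic link would be followed.
cat "$tmp" > "$file"
rm -f "$tmp"

result=$(cat "$file")
printf '%s\n' "$result"
ls -l "$file"
```

The trade-off is that the copy back is not atomic: if it is interrupted, file.txt can be left partially written, whereas mv swaps the whole file in one step.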

Another option is to use sponge, which is part of moreutils and sadly is not installed by default on many systems.

cat file.txt | sed '/^$/d' | sponge file.txt

EdwardsLab

Delivering the best in bioinformatics…