Category Archives: bash

bash

Human Readable Numbers

Human-readable numbers are convenient. Instead of writing 1,099,511,627,776 you can write 1T; instead of 1,073,741,824 you can write 1G; and instead of 1,024 you can write 1K. (Because 1K = 1,024 rather than 1,000, the full numbers don’t end in a run of zeros.)

But how do you do some (simple) math with human readable numbers, like adding up a list?

This is where numfmt comes to your aid.

For example, let’s make a list of numbers:

259G
1.1G
692G
5.5G
5.3G
140M
30G
302G
222G
281M
1.9G
60G
2.2T

If we put those in a file called sizes.txt we can sum them with a simple command like this:

cat sizes.txt | numfmt --from=iec | awk '{s+=$1} END {printf "%.0f\n", s}' | numfmt --to=iec

The first numfmt, with --from=iec, converts the human-readable values to plain numbers; awk adds them up; and the second numfmt, with --to=iec, converts the sum back to a human-readable number.
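
To check the pipeline on numbers small enough to add in your head, here is a minimal sketch that wraps it in a function. The name sum_sizes is ours, not part of numfmt, and we assume GNU coreutils numfmt is installed. The awk step uses printf so that large totals are not printed in scientific notation by some awk implementations.

```shell
# sum_sizes: hypothetical helper that reads one human-readable size per line
# on stdin and prints the total in human-readable (IEC) form.
sum_sizes() {
  numfmt --from=iec |
    awk '{ s += $1 } END { printf "%.0f\n", s }' |
    numfmt --to=iec
}

# 1G + 512M = 1,073,741,824 + 536,870,912 = 1,610,612,736 bytes
sum_sizes <<'EOF'
1G
512M
EOF
# prints 1.5G
```

The same function works on the file from the example: sum_sizes < sizes.txt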


Use column to display tsv files in columns

If you have a tab separated file and view it in a pager like less or more, the columns rarely line up, because the entries have different widths. Here is a simple way to make those columns line up correctly.

column -t file.tsv

For example, here is a file with three columns of words, displayed with cat

If we pass that to column with the -t option to detect the columns, we get nicely organised columns:

However, this is not quite right: “Paradigm shift” has been split into two columns, because the -t option splits on any whitespace by default. To split on tabs only, we need to add the -s option:

column -t -s$'\t' file.tsv
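
If you want to try this without a file to hand, here is a small sketch with made-up rows (only “Paradigm shift” comes from the example above; the other words and the align_tsv name are ours). It assumes the column utility is installed, and it builds the tab with printf so that it also works in shells that lack the $'\t' syntax.

```shell
# A literal tab character, built portably.
tab=$(printf '\t')

# align_tsv: hypothetical helper that aligns tab separated input from stdin,
# splitting on tabs only so entries containing spaces stay in one column.
align_tsv() {
  column -t -s "$tab"
}

# Three illustrative rows; the second has a space inside its first field.
printf 'Term\tKind\nParadigm shift\tphrase\nSynergy\tword\n' | align_tsv
```

With the -s option in place, “Paradigm shift” stays together in the first column instead of being split at the space.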


awk use tab as input separator

By default, awk uses any whitespace to separate the input fields. You can use the same construct as perl to change the input field separator:

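cat file.tsv | awk -F"\t" '!s[$1]++'

The above example splits each line on tabs and prints only the first line seen for each distinct value in the first column. The ++ matters: without it the counter never changes and every line is printed.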
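
Here is a small sketch of the same idea on inline data, so you can see which lines survive. The function name first_per_key and the fruit rows are made up for illustration.

```shell
# first_per_key: hypothetical helper that keeps the first line seen for each
# distinct value of the first tab separated field.
first_per_key() {
  # seen[$1]++ is 0 (false) the first time a key appears, so !seen[$1]++ is
  # true only on that first occurrence, and awk prints the line.
  awk -F'\t' '!seen[$1]++'
}

printf 'apple\tred\nbanana\tyellow\napple\tgreen\n' | first_per_key
# prints the "apple red" and "banana yellow" lines; "apple green" is dropped
```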

Calculate the SHA256 checksum

If you create a conda recipe you need to calculate the sha256 checksum. This is a quick post to explain how to do that.

We often submit things to PyPI and then use the PyPI versions to create conda installations. The beauty of this approach is that if you update the PyPI release, you don’t need to do anything else: the conda bot will automagically notice the new version and update for you. Procrastination pays off again! We talk about this in our PhiSpy blog post.

In the bioconda recipe we usually use this to point to a specific PyPI package for conda to install:

{% set name = "pyctv_taxonomy" %}
{% set version = "0.25" %}
{% set sha256 = "332e54fed6640f61e5c4722c62b9df633921358ba0eb8daf6230711970da2ad9" %}

package:
  name: "{{ name|lower }}"
  version: '{{ version }}'

source:
  url: "https://pypi.io/packages/source/{{ name[0] }}/{{ name }}/{{ name }}-{{ version }}.tar.gz"
  sha256: '{{ sha256 }}'

Note that we have the name (which is lower case) and the version number, and the URL is constructed from the first character of the name, the name, and the name-version.tar.gz. So in this case, the URL would be https://pypi.io/packages/source/p/pyctv_taxonomy/pyctv_taxonomy-0.25.tar.gz
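
If you want to build that URL in the shell rather than by hand, a minimal sketch looks like this. The function name pypi_source_url is ours; it just repeats the pattern from the recipe above.

```shell
# pypi_source_url: hypothetical helper that mirrors the recipe's URL template.
# Usage: pypi_source_url NAME VERSION
pypi_source_url() {
  name=$1
  version=$2
  # First character of the package name, as name[0] does in the recipe.
  first=$(printf '%s' "$name" | cut -c1)
  printf 'https://pypi.io/packages/source/%s/%s/%s-%s.tar.gz\n' \
    "$first" "$name" "$name" "$version"
}

pypi_source_url pyctv_taxonomy 0.25
# prints https://pypi.io/packages/source/p/pyctv_taxonomy/pyctv_taxonomy-0.25.tar.gz
```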

Now there are a couple of ways we can generate the sha256 sum:

URL=https://pypi.io/packages/source/p/pyctv_taxonomy/pyctv_taxonomy-0.25.tar.gz
wget -qO- $URL | shasum -a 256

or

URL=https://pypi.io/packages/source/p/pyctv_taxonomy/pyctv_taxonomy-0.25.tar.gz
curl -sL $URL | openssl sha256

Both commands give the same hash (only the text printed around it differs):

332e54fed6640f61e5c4722c62b9df633921358ba0eb8daf6230711970da2ad9
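
If you would rather have the shell do the comparison than eyeball 64 hex characters, here is a rough sketch. The helper name sha256_hex is ours. It uses sha256sum when available and falls back to shasum. So that it runs offline, the demo hashes empty input, whose SHA-256 is a well-known constant; for a real package you would pipe the curl or wget output in instead.

```shell
# sha256_hex: hypothetical helper that prints only the hex digest of stdin.
sha256_hex() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum | awk '{ print $1 }'
  else
    shasum -a 256 | awk '{ print $1 }'
  fi
}

# SHA-256 of empty input (stands in for the value from the recipe).
expected="e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
actual=$(printf '' | sha256_hex)

if [ "$actual" = "$expected" ]; then
  echo "checksum OK"
else
  echo "checksum MISMATCH"
fi
```

For the package above that would be something like: curl -sL "$URL" | sha256_hex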

BASH commands: number of arguments supplied

This code checks the number of arguments supplied to a bash script:

if [[ $# -eq 0 ]]; then echo "No arguments supplied"; exit; fi 

If there are no arguments you probably don’t need to do anything.

Remember the arguments are:

$0 - the name of the script
$1 - the first argument
$2 - the second argument
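
To see the argument variables in action, here is a small sketch. It uses a function (describe_args, a name we made up) in place of a whole script, because inside a function $# and $1, $2 refer to the function's own arguments in the same way. It uses return rather than exit so it does not close your shell.

```shell
# describe_args: hypothetical stand-in for a script body.
describe_args() {
  if [ "$#" -eq 0 ]; then
    echo "No arguments supplied"
    return 0
  fi
  echo "argument count: $#"
  echo "first argument: $1"
  # ${2:-} expands to an empty string when there is no second argument.
  echo "second argument: ${2:-}"
}

describe_args
# prints: No arguments supplied

describe_args alpha beta
# prints:
#   argument count: 2
#   first argument: alpha
#   second argument: beta
```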