Here are some tips and tricks for minimap2 that I keep forgetting!
--split-prefix
If you have a large (>4 GB) multisequence index file, there are two options.
The first (preferred) is to increase the value of -I when you build the index so that the whole index is kept in memory. Note: this must be done when you build the index; you can't build the index and then change -I at runtime.
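As a rough sketch of the first option, here is a small Python helper (it could live at the top of a Snakefile) that assembles the indexing command. The function name, the file names and the 16G value are placeholders I made up for illustration, not recommendations; -I and -d are minimap2's batch-size and index-output options.

```python
import shlex


def index_command(reference, index_out, batch_bases="16G"):
    """Build a minimap2 indexing command with an enlarged -I value.

    The 16G default is an illustrative placeholder: choose a value larger
    than the total number of bases in your reference.
    """
    return ["minimap2", "-I", batch_bases, "-d", index_out, reference]


# Show a copy-pasteable command line (file names are hypothetical)
print(shlex.join(index_command("ref.fa", "ref.mmi")))
```

The list form can be passed straight to subprocess.run, or joined into a string for a Snakemake shell block.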
The second is to use --split-prefix with a string, which minimap2 uses as the prefix for its temporary files. For Snakemake, there are two options:
1. You can use "{sample}" as your prefix like so:
params:
    prfx = "{sample}"
...
shell:
    """
    minimap2 --split-prefix {params.prfx} ...
    """
2. You can use a random 6-character string like so:
import random, string
...
params:
    # the name here must match the one used in the shell command
    prfx = ''.join(random.choices(string.ascii_uppercase + string.digits, k=6))
...
shell:
    """
    minimap2 --split-prefix {params.prfx} ...
    """
The catch is that things will probably break if your index file is small. If you see the error [W::sam_hdr_create] Duplicated sequence, it is probably because you have split a small index and the sequence IDs are being duplicated. Remove the --split-prefix option and you should be good.
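One way to avoid this by hand is to add the option only when the reference is big enough to be split. The sketch below assumes that minimap2 builds a multi-part index only when the reference has more bases than the -I value used at indexing time; the function names and the toy numbers are hypothetical, and batch_bases has to match whatever -I you actually used.

```python
def count_bases(fasta_lines):
    """Count sequence characters in FASTA text, skipping header lines."""
    return sum(len(line.strip()) for line in fasta_lines if not line.startswith(">"))


def split_prefix_args(n_bases, prefix, batch_bases):
    """Return the --split-prefix arguments only when the index would be split.

    Assumption: the index is multi-part only if n_bases exceeds batch_bases,
    the number of bases given to -I when the index was built.
    """
    if n_bases > batch_bases:
        return ["--split-prefix", prefix]
    return []


# Toy example with an in-memory FASTA; real use would iterate over an open file
toy_fasta = [">chr1\n", "ACGTACGT\n", ">chr2\n", "ACGT\n"]
n = count_bases(toy_fasta)
print(n, split_prefix_args(n, "sampleA", batch_bases=10))
```

In a Snakefile, the returned list could be joined with spaces and passed through params, so that small references never get the option.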