
Reproducibility can be a serious problem in bioinformatics. The same code can run on your computer and give one set of results, yet give different results, or fail to run completely, on someone else’s machine.
Several tools were developed to address this issue, mainly environments and containers.
CONTAINERS
Several programs can be used to build and run containers; Docker, Apptainer and Podman are the most widely used.
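Which of these tools is available depends on the system you are working on. As a minimal sketch, the loop below checks whether each one is on your PATH; list_runtimes is a hypothetical helper name chosen for this example, not a command shipped with any of the tools.

```shell
# Hypothetical helper: report which container runtimes are on PATH.
list_runtimes() {
    for runtime in apptainer docker podman; do
        if command -v "$runtime" >/dev/null 2>&1; then
            echo "$runtime: found"
        else
            echo "$runtime: not found"
        fi
    done
}

list_runtimes
```

The rest of this section uses Apptainer, so that is the line to look for in the output.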
1. How to obtain these containers
There are several repositories where people publish container images. Two of the most commonly used are Dockerhub and Seqera.
Dockerhub
Once you access their webpage (no need to create an account), you can search for the software that you need. In this case we are looking for VCFtools. This software is used for VCF manipulation and querying.
After going to Dockerhub and choosing a container, we copy the pull command and run the following on the project server:
apptainer pull vcftools_0.1.16-1.sif docker://biocontainers/vcftools:v0.1.16-1-deb_cv1
Seqera
With Seqera, users don’t upload their own containers; instead, the service builds container images on request.
To pull an image from this repository, set the container setting to Singularity (Apptainer’s former name).
Make sure the container has finished building before you try to pull it!
Once it’s ready you can copy the text and pull it to your system with the following command:
apptainer pull vcftools_0.1.17.sif oras://community.wave.seqera.io/library/vcftools:0.1.17--b541aa8d9f9213f9
2. Running Containers
Apptainer runs a container from the image. You can either enter the container and work as if you had exactly the same operating system as the person who built it, or run the software inside the container from outside it.
Running from “the outside”
apptainer exec vcftools_0.1.17.sif vcftools --version
You can use run or exec to use the container. Note that exec runs the given command directly, whereas run launches the container and executes the %runscript, if one is defined, passing any arguments you give to it.
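If you call the same image repeatedly, a small wrapper can save typing and catch a missing image early. The following is only a sketch: in_container is a hypothetical helper name, not an Apptainer command, and it assumes the .sif file from the pull step is in the current directory.

```shell
# Hypothetical helper (not part of Apptainer): run a command inside an image,
# stopping with a clear message if the image file is missing.
in_container() {
    image="$1"
    shift
    if [ ! -f "$image" ]; then
        echo "error: image '$image' not found" >&2
        return 1
    fi
    # Everything after the image name is passed to apptainer exec unchanged.
    apptainer exec "$image" "$@"
}

# Example call, only attempted when the image pulled earlier is present.
if [ -f vcftools_0.1.17.sif ]; then
    in_container vcftools_0.1.17.sif vcftools --version
fi
```

Because the wrapper uses exec, the command you give is run directly and the %runscript is not involved.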
Running interactively from “the inside”
There is also the possibility to enter the container and work interactively within it.
apptainer shell <name-of-container>
Remember that the container is an isolated system; if you want to use files from outside it, you need to bind file paths using -B.
apptainer shell <name-of-container>
apptainer shell -B outside/path:inside/path <name-of-container>
To exit the container, type exit and press Enter.
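When you need more than one directory inside the container, -B accepts several outside:inside pairs separated by commas. As a sketch, the hypothetical helper bind_spec below (the name and the example paths are assumptions for illustration) joins such pairs and checks that each outside path exists before you start the container.

```shell
# Hypothetical helper: join "outside:inside" pairs into one comma-separated
# value for -B, checking that every outside path exists first.
bind_spec() {
    spec=""
    for pair in "$@"; do
        host="${pair%%:*}"   # text before the first colon = outside path
        if [ ! -e "$host" ]; then
            echo "error: outside path '$host' does not exist" >&2
            return 1
        fi
        spec="${spec:+$spec,}$pair"
    done
    printf '%s\n' "$spec"
}

# Example: bind the current directory to /work inside the container.
bind_spec "$PWD:/work"
```

The result can then be passed on the command line, for example apptainer shell -B "$(bind_spec "$PWD:/work")" <name-of-container>.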
Running containers with sbatch
Apptainer containers can be run as part of a batch job if you integrate them into a SLURM job submission script.
We are going to add the container to our FastQC sbatch script.
#! /bin/bash -l
#SBATCH -A project_ID
#SBATCH -t 30:00
#SBATCH -n 1
apptainer exec container_image.sif fastqc -o . --noextract ../data/*fastq.gz
3. Run your own container
This is a computationally intensive task. Containers are built from a definition file (.def extension).
Let’s build a container with a cow telling us the date!
Create a file called lolcow.def and add the following:
Bootstrap: docker
From: ubuntu:20.04
%post
apt-get -y update
apt-get -y install cowsay lolcat fortune
%environment
export LC_ALL=C
export PATH=/usr/games:$PATH
%runscript
date | cowsay | lolcat
Then, to build the container, use:
apptainer build lolcow.sif lolcow.def
You will get information on the status of the build, and it will tell you when it’s ready.
Then you can run your new container:
apptainer run lolcow.sif
If you want, you can edit the %runscript in your lolcow.def file and replace date with fortune. After rebuilding, you will get the same cow, but telling a tale.
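One way to try the fortune variant without touching the original file is to write a second definition file. The sketch below does that with a here-document; lolcow_fortune.def and lolcow_fortune.sif are hypothetical names chosen for this example, and the build and run commands are left as comments so that nothing is built by accident.

```shell
# Write a variant of the definition file whose runscript uses fortune.
# lolcow_fortune.def is a hypothetical file name chosen for this sketch.
cat > lolcow_fortune.def <<'EOF'
Bootstrap: docker
From: ubuntu:20.04

%post
    apt-get -y update
    apt-get -y install cowsay lolcat fortune

%environment
    export LC_ALL=C
    export PATH=/usr/games:$PATH

%runscript
    fortune | cowsay | lolcat
EOF

# Build and run the variant (uncomment when you are ready to build):
# apptainer build lolcow_fortune.sif lolcow_fortune.def
# apptainer run lolcow_fortune.sif
```

The runscript is stored in the image at build time, so editing the .def file has no effect until the image is built again.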