Best practices for HPC
This page lists some useful best practices to keep in mind when coding and running applications and pipelines on HPC systems.
Code coverage, testing, continuous integration
Every time we code, testing is a concern and is usually performed by the coder(s) regularly during the project. One can identify some basic main types of test:
-
Regression test Given an expected output from a specific input, the code is tested to reproduce that same output.
-
Unit test Tests the smallest units of the software (e.g. single functions) to identify bugs, especially in extreme cases of inputs and outputs
-
Continuous integration A set of tests the software runs automatically everytime the code is updated. This is useful to spot bugs before someone even uses the code.
More things one might need to test are the performance/scalability of the code, usability, and response to all the intended types of input data.
Unit and regression test can be useful, but at some point not really feasible, since the code can scale to be quite large and complex, with a lot of things to control. It is thus a good practice to use continuous integration, and implement simple but representative tests that cover all the code, so that bugs can be spotted often before the final users do that. Code coverage tools to implement such tests exists for several programming languages, and also for testing code deployed on GitHub version control.
Link | Description |
---|---|
pyTest | A package to test python code |
Cmake | To test both C , C++ and Fortran code |
Travis CI | Tool for continuous integration in most of the used programming languages. Works on Git version control. |
covr | Test coverage reports for R |
Code styling
An important feature of a computer code is that it is understandable to other people reading it. To ensure this is the case, a clean and coherent style of coding should be used in a project. Some languages have a preferred coding style, and in some GUIs (graphical user interfaces) those styling rules can be set to be required. One can also use ones own coding style, but it should be one easily readable by others, and it should be the same style throughout the whole project.
Link | Description |
---|---|
styleguide | Google guide for coding styles of the major programming languages |
awesome guidelines | A guide to coding styles covering also documentations, tools and development environments |
Pythonic rules | Intoduction to coding style in python. |
R style | A post on R coding style |
Containerized applications
In this section the benefits of project and package managers, that are a way of organizing packages in separated environments, will be outlined. However, a higher degree of isolation can be achieved by containerization than using environments. By containerizing, a user can virtualize the entire operating system, and make it ready to be deployed on any other machine. One can for example deploy a container without the need of installing anything on the hosting machine! Note that containers are a different concept from Virtual Machines, where it is the hardware being virtualized instead.
Link | Description |
---|---|
Docker | An open source widespread container that is popular both in research and industry |
Docker course | A course on the use of Docker freely hosted on youtube |
Docker curriculum | Beginner's introduction to docker |
Docker basics | Intoduction tutorials to Docker from the official documentation page |
Singularity | Singularity is another containerization tool. It allows you to decide at which degree a container interacts with the hosting system |
Singularity tutorial | A well done Singularity tutorial for HPC users |
Singularity video tutorial | A video tutorial on Singularity |
Reproducibility by containerization | A video on reproducibility with Singularity containers |
Documentation
When creating a piece of software, it is always a good idea to create a documentation explaining the usage of each element of the code. For packages, there are software that automatically create a documentation by using the declarations of functions and eventually some text included into them as a string.
Link | Description |
---|---|
MkDocs | A generator for static webpages, with design and themes targeted to documentation pages, but also other type of websites. This website is itself made with MkDocs. |
mkdocstrings | Python handler to automatically generate documentation with MkDocs |
pdoc3 | A package that automatically creates the documentation for your coding projects. It is semi-automatic (infers your dependencies, classes, etc. but adds a description based on your docstrings) |
pdoc3 101 | How to run pdoc to create an HTML documentation |
Roxygen2 | A package to generate R documentation — it can be used also with Rcpp |
Sphinx | Another tool to write documentation — it produces also printable outputs. Sphinx was first created to write the python language documentation. Even though it is a tool especially thought for python code, it can be used to generate static webpages for other projects. |
Documents with live code
Programming languages like python
and R
allows users to write documents that contain text, images and equations together with executable code and its output. Text is usually written using the very immediate markdown language
. Markdown files for R
can be created in the GUI Rstudio
, while python
uses jupyter notebooks
.
Link | Description |
---|---|
Introduction to Markdown | Markdown for R in Rstudio |
Jupyter notebooks | create interactive code with python . You can write R code in a jupyter notebook by using the python package rpy2 |
Package/Environment management systems
When coding, it is essential that all the projects are developed under specific software conditions, i.e. the packages and libraries used during development (dependencies) should not change along the project's lifetime, so that variations in things such as output formats and new algorithmic implementations will not create conflicts difficult to trace back under development. An environment and package manager makes the user able to create separated frameworks (environments) where to install specific packages that will not influence other software outside the environment in use. A higher degree of isolation can be achieved through containers (see the related part of this page).
Link | Description |
---|---|
Conda | an easy to use and very popular environment manager |
Getting started with conda | Introduction to conda setup and usage from the official documentation |
Conda cheat sheet | Quick reference for conda usage |
YARN | An alternative to conda |
Many short jobs running
Every time a job is submitted to the job manager (e.g. Slurm) of a computing cluster, there is an overhead time necessary to elaborate resource provision, preparation for output, and queue organization. Therefore it is wise to create, when possible, longer jobs. One needs to find the correct balance for how to organizing jobs: if these are too long and fail because of some issue, than a lot of time and resources have been wasted, but such problems can be overcome by tracking the outputs of each step to avoid rerunning all computations. For example, at each step of a job outputting something relevant, there can be a condition checking if the specific output is already present.
Massive standard outputs
Try to avoid printing many outputs on the standard output (stdout
), in other words a large amount of printed outputs directly to the terminal screen. This can be problematic when a lot of parallel jobs are running, letting stdout
filling all the home directory up, and causing errors and eventual data loss. Instead use an output in software-specific data structures (such as .RData
files for the R
language) or at least simple text files.
Packaging a coding project
When coding a piece of software in which there are multiple newly implemented functions, it can be smart to organize all those functions as a package, that can be reused and eventually shared with ease. Such a practice is especially easy and can be mastered very quickly for coding projects in python
and R
.
Link | Description |
---|---|
pyPA | python packaging user guide |
R package development | Develop an R package using Rstudio |
Pipe-lining and submitting jobs in Slurm
Slurm is a job scheduler. It allows a user to specify a series of commands and resources requirements to run such commands. Slurm does consider the job submission on an HPC system together with all the other jobs, and prioritize them among other things according to the resources requirement and the available computational power.
In figure above, the priority assigned to a Slurm job when the requested time increases, by keeping the memory and CPUs fixed. Decreased priority has higher values. Adapted from A Slurm Simulator: Implementation and Parametric Analysis. Simakov et al 2017.
The Danish national HPCs, and most of the other EuroHPC supercomputers, use Slurm as job manager.
Link | Description |
---|---|
Interactive Slurm | An interactive Slurm tutorial, in which you will be guided through the process of using Slurm. |
SLURM example 1 and SLURM example 2 | Some examples of how to make a Slurm script to submit a job from the danish HPC GenomeDK and from Princeton Research Computing. |
Gwf, a simple python tool to create interdependent job submissions | Gwf, developed at the University of Aarhus, makes it easy to create Slurm jobs and organize them as a pipeline with dependencies, using the python language (you need python 3.5+). You get to simply create the shell scripts and the dependencies, without the complicating syntax of Slurm. The page contains also a useful guide. |
Version control
Version control is the tracking of your development history for a project. This allows multiple people working on the same material to keep changes in sync without stepping over each other's contributions. Version control tools allow to commit changes with a description, set up and assign project objectives, open software issues from users and contributors, test automatically the code to find bugs before users step into them. Version control is useful for both teams and single users, and it is a good practice to have version control as a standard for any project.
Link | Description |
---|---|
GitHub | the most used tool for version control |
Github 101 | quick introduction to get started on Github |
GitLab and BitBucket | Two other popular alternatives to Github |