Overview and learning objectives

The goal of this material is to show how bioinformatics workloads behave on an HPC cluster (CPU threads, memory bandwidth, filesystem I/O, dependency management, and reproducibility) and how to run representative tools efficiently on many-core nodes.

Audience

  • HPC admins who operate many-core partitions.

  • Staff who need to support users running genomics, RNA-seq, or R-based analyses and want hands-on familiarity with the tools and typical resource profiles.

Learning objectives

By the end of this material, participants should be able to:

  • Identify whether a bioinformatics tool is CPU-bound, memory-bound, or I/O-bound.

  • Correctly request resources and set thread counts for multi-threaded tools.

  • Use Biocontainers/Singularity/Apptainer to run common genomics and RNA-seq tools reproducibly.

  • Be familiar with frequently used bioinformatics tools and how to run them.

Assumptions / prerequisites

  • SLURM is used as the scheduler (all examples assume SLURM).

  • Participants have access to a HPC cluster with many-core nodes.

  • Biocontainers (or a local container registry) are available.

  • Participants are comfortable with basic shell usage.