Setting Up a 3-Node HPC Cluster with Slurm on Rocky Linux

HPC Cluster (image credit: Affan Javid)


High-Performance Computing (HPC) clusters are essential for complex computations in scientific research, engineering, and data analysis. Setting up an HPC cluster may seem daunting, but with the right guidance, it becomes an achievable task even in a test lab environment. In this post, we’ll walk through setting up a 3-node HPC cluster using Rocky Linux and Slurm as the workload manager.


Table of Contents

  1. Introduction
  2. Prerequisites
  3. Step 1: Install Rocky Linux on All Nodes
  4. Step 2: Configure Network Settings
  5. Step 3: Configure SSH Key-Based Authentication
  6. Step 4: Install Necessary Software Packages
  7. Step 5: Configure NFS for Shared Storage
  8. Step 6: Configure Slurm Workload Manager
  9. Step 7: Test the HPC Cluster
  10. Troubleshooting Common Issues
  11. Conclusion
  12. Additional Resources

Introduction

This guide aims to help you set up a 3-node HPC cluster using Rocky Linux, an enterprise-level operating system that is binary-compatible with Red Hat Enterprise Linux. We’ll use Slurm (Simple Linux Utility for Resource Management) as our workload manager, which is widely adopted in HPC environments.


Prerequisites

Hardware Requirements

  • Three Physical or Virtual Machines:
    • Head Node (SMS): Runs the cluster services (Slurm controller, NFS server) and coordinates the compute nodes.
    • Compute Nodes (HPC-N1, HPC-N2): Perform computational tasks.
  • Network Equipment:
    • A network switch or router to connect all nodes.
    • Ethernet cables for connectivity.
  • Minimum Specifications for Each Node:
    • CPU: 2 cores or more.
    • RAM: 4 GB or more.
    • Storage: 20 GB or more free disk space.

Software Requirements

  • Operating System:
    • Rocky Linux 8 installed on all nodes.
  • User Account:
    • A non-root user with sudo privileges on all nodes.
  • Software Packages:
    • OpenSSH server and client.
    • Development tools (GCC, make, etc.).
    • MPI libraries (OpenMPI).
    • Slurm Workload Manager.
    • NFS utilities for shared storage.

Step 1: Install Rocky Linux on All Nodes

  1. Download Rocky Linux ISO:
    • Get the Rocky Linux 8 ISO from the official download page at https://rockylinux.org/download and make it available to each machine (e.g., USB stick or virtual CD).
  2. Install Rocky Linux:
    • Boot each machine from the ISO and follow the installation prompts.
    • Choose a minimal installation to reduce unnecessary packages.
    • Set up the root password and create a non-root user with sudo privileges.
  3. Update the System:
    sudo dnf update -y
    

Step 2: Configure Network Settings

Assign Static IP Addresses

  1. Identify Network Interfaces:
    ip addr
    
    • Note the interface name (e.g., ens33, eth0).
  2. Configure a Static IP on Each Node:

    Edit the network configuration:

    sudo nmcli con mod "System eth0" ipv4.addresses 192.168.1.X/24 ipv4.gateway 192.168.1.1 ipv4.dns "8.8.8.8" ipv4.method manual
    
    • Replace "System eth0" with the connection name for your interface (nmcli con show lists it).
    • Replace 192.168.1.X with the desired IP address for each node (e.g., 192.168.1.100 for SMS).
  3. Restart Network Manager:
    sudo systemctl restart NetworkManager
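To confirm the static address is active (assuming the example connection name "System eth0" used above), check the interface and the connection profile:

# Show the addresses currently assigned to the interface
ip addr show eth0

# Confirm the connection profile now uses a manual (static) IPv4 method
nmcli con show "System eth0" | grep ipv4.method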
    

Set Hostnames

  1. Set Hostnames on Each Node:

    Head Node:

    sudo hostnamectl set-hostname SMS
    

    Compute Node 1 (HPC-N1):

    sudo hostnamectl set-hostname HPC-N1
    

    Compute Node 2 (HPC-N2):

    sudo hostnamectl set-hostname HPC-N2
    
  2. Verify Hostnames:
    hostname
    

Update /etc/hosts File

  1. Edit /etc/hosts on All Nodes:
    sudo nano /etc/hosts
    
    • Add the following entries (adjust IPs accordingly):
      192.168.1.100   SMS
      192.168.1.101   HPC-N1
      192.168.1.102   HPC-N2
      
  2. Ensure Consistency:
    • The /etc/hosts file should be identical on all nodes.
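With static addresses, hostnames, and /etc/hosts in place, a quick loop run from the head node confirms that every node resolves and answers (hostnames as defined above):

# Ping each cluster hostname once; all three should respond
for host in SMS HPC-N1 HPC-N2; do
    ping -c 1 "$host" > /dev/null && echo "$host is reachable" || echo "$host is UNREACHABLE"
done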

Step 3: Configure SSH Key-Based Authentication

  1. Generate SSH Key on the Head Node:
    ssh-keygen -t rsa -b 4096
    
    • Press Enter through the prompts to accept defaults and set no passphrase.
  2. Copy SSH Key to Compute Nodes:
    ssh-copy-id HPC-N1
    ssh-copy-id HPC-N2
    
    • Enter the password when prompted.
  3. Test SSH Access:
    ssh HPC-N1
    exit
    ssh HPC-N2
    exit
    
    • You should be able to log in without a password.
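The same check can be scripted; if key-based authentication is working, this loop prints each compute node's hostname without prompting for a password:

# BatchMode makes ssh fail instead of falling back to a password prompt
for host in HPC-N1 HPC-N2; do
    ssh -o BatchMode=yes "$host" hostname
done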

Step 4: Install Necessary Software Packages

Install Development Tools

On All Nodes:

sudo dnf groupinstall "Development Tools" -y
sudo dnf install epel-release -y
sudo dnf install nano wget curl git -y

Install MPI Libraries

Install OpenMPI on All Nodes:

sudo dnf install openmpi openmpi-devel -y

Set Environment Variables:

Add the following to ~/.bashrc on every node (once /home is shared via NFS in Step 5, the same file is visible everywhere):

echo 'export PATH=$PATH:/usr/lib64/openmpi/bin' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib64/openmpi/lib' >> ~/.bashrc

Reload .bashrc:

source ~/.bashrc
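A quick check confirms that the compiler wrapper and launcher are now on the PATH:

# Both should resolve under /usr/lib64/openmpi/bin
which mpicc mpirun

# Print the compiler and runtime versions
mpicc --version
mpirun --version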

Install Slurm Workload Manager

  1. Choose a Slurm Package Source:
    • For Rocky Linux 8, prebuilt Slurm packages are available from the EPEL repository, which was already enabled in Step 4 (epel-release), so no extra repository is needed.
    • Alternatively, you can build your own RPMs from the source tarballs at https://download.schedmd.com/slurm/ (SchedMD distributes sources, not prebuilt RPMs); for this test lab the EPEL packages are sufficient.
  2. Install Slurm Packages:

     On Head Node:
    sudo dnf install slurm slurm-slurmctld -y
    

    On Compute Nodes:

    sudo dnf install slurm slurm-slurmd -y
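Before writing any configuration, it can be useful to confirm that the daemons installed and that the head and compute nodes report the same Slurm version:

# On the head node
slurmctld -V

# On each compute node
slurmd -V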
    

Step 5: Configure NFS for Shared Storage

Sharing directories like /home via NFS allows users and applications to access the same files across all nodes.

Set Up NFS Server on the Head Node

  1. Install NFS Utilities:
    sudo dnf install nfs-utils -y
    
  2. Configure Exports:
    sudo nano /etc/exports
    
    • Add the following line:
      /home           *(rw,sync,no_root_squash,no_subtree_check)
    • For a test lab the wildcard export is acceptable; restricting it to the cluster subnet (for example 192.168.1.0/24 instead of *) is the safer choice.
      
  3. Start NFS Services:
    sudo systemctl enable --now nfs-server rpcbind
    
  4. Export File Systems:
    sudo exportfs -a
    

Mount NFS Shares on Compute Nodes

  1. Install NFS Utilities:
    sudo dnf install nfs-utils -y
    
  2. Create Mount Points:
    sudo mkdir -p /home
    
    • /home already exists on a default install; mounting the NFS share over it hides the local home directories on the compute nodes, which is the intent here (every node sees the head node's /home).
  3. Edit /etc/fstab:
    sudo nano /etc/fstab
    
    • Add the following lines:
      SMS:/home    /home   nfs defaults 0 0
      
  4. Mount NFS Shares:
    sudo mount -a
    
  5. Verify Mounts:
    df -h
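For an end-to-end check, create a file in the shared home directory on the head node and confirm it is visible from a compute node (hostnames as configured above):

# On the head node: create a marker file in the shared /home
touch ~/nfs_test_from_head

# From the head node, list it on a compute node over the NFS mount
ssh HPC-N1 'ls -l ~/nfs_test_from_head'

Because /home is now shared, the compute nodes also see the head node's ~/.ssh/authorized_keys rather than the local copies created by ssh-copy-id in Step 3; if passwordless SSH stops working after the mount, appending the head node's own public key to its authorized_keys (cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys) restores it.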
    

Step 6: Configure Slurm Workload Manager

Configure MUNGE Authentication

MUNGE (MUNGE Uid ‘N’ Gid Emporium) is an authentication service for creating and validating credentials.

  1. Install MUNGE on All Nodes:
    sudo dnf install munge munge-libs munge-devel -y
    
  2. Generate MUNGE Key on Head Node:
    sudo /usr/sbin/create-munge-key
    
  3. Copy MUNGE Key to Compute Nodes:
    sudo scp /etc/munge/munge.key HPC-N1:/etc/munge/
    sudo scp /etc/munge/munge.key HPC-N2:/etc/munge/
    
    • sudo only applies to the local end of scp; if /etc/munge on a compute node is not writable by your SSH user, copy the key to /tmp first and then move it into place with sudo on that node.
  4. Set Permissions on All Nodes:
    sudo chown munge: /etc/munge/munge.key
    sudo chmod 400 /etc/munge/munge.key
    
  5. Start MUNGE Service on All Nodes:
    sudo systemctl enable --now munge
    
  6. Test MUNGE:
    munge -n | unmunge | grep STATUS
    
    • Output should be STATUS: Success (0).
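Credentials must also validate across machines, not just locally; a common cross-node test is to generate a credential on the head node and decode it on each compute node:

# Both commands should report STATUS: Success (0)
munge -n | ssh HPC-N1 unmunge | grep STATUS
munge -n | ssh HPC-N2 unmunge | grep STATUS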

Configure Slurm on the Head Node

  1. Create Slurm User and Directories:
    sudo useradd slurm
    sudo mkdir -p /var/spool/slurmctld
    sudo chown slurm: /var/spool/slurmctld
    sudo touch /var/log/slurmctld.log
    sudo chown slurm: /var/log/slurmctld.log
    
  2. Edit Slurm Configuration File:
    sudo nano /etc/slurm/slurm.conf
    
    • Basic slurm.conf Example (You can edit as required):
      ClusterName=mycluster
      ControlMachine=SMS
      
      MpiDefault=none
      ProctrackType=proctrack/pgid
      ReturnToService=2
      SlurmctldPidFile=/var/run/slurmctld.pid
      SlurmctldPort=6817
      SlurmdPidFile=/var/run/slurmd.pid
      SlurmdPort=6818
      SlurmdSpoolDir=/var/spool/slurmd
      SlurmUser=slurm
      StateSaveLocation=/var/spool/slurmctld
      SwitchType=switch/none
      TaskPlugin=task/none
      
      SchedulerType=sched/backfill
      SelectType=select/cons_res
      SelectTypeParameters=CR_Core_Memory
      
      SlurmctldDebug=info
      SlurmctldLogFile=/var/log/slurmctld.log
      SlurmdDebug=info
      SlurmdLogFile=/var/log/slurmd.log
      
      NodeName=HPC-N[1-2] CPUs=2 RealMemory=2000 State=UNKNOWN
      PartitionName=debug Nodes=HPC-N[1-2] Default=YES MaxTime=INFINITE State=UP
      
    • Optionally add MailProg=/bin/true to disable job e-mail notifications and avoid the "Configured MailProg is invalid" error (see Troubleshooting below):
      MailProg=/bin/true
      
  3. Save and Exit the File.
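The CPUs and RealMemory values in the NodeName line above should match the compute-node hardware. Rather than guessing, slurmd can report them in slurm.conf syntax; run this on a compute node (slurm-slurmd was installed in Step 4) and adapt the printed line as needed:

# Prints a NodeName=... line with the detected CPUs, sockets, cores,
# threads, and RealMemory of this machine
sudo slurmd -C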

Distribute Slurm Configuration to Compute Nodes

  1. Create Slurm User and Directories on Compute Nodes:
    sudo useradd slurm
    sudo mkdir -p /var/spool/slurmd
    sudo chown slurm: /var/spool/slurmd
    sudo touch /var/log/slurmd.log
    sudo chown slurm: /var/log/slurmd.log
    
  2. Copy slurm.conf to Compute Nodes:
    sudo scp /etc/slurm/slurm.conf HPC-N1:/etc/slurm/
    sudo scp /etc/slurm/slurm.conf HPC-N2:/etc/slurm/
    
    • As with the MUNGE key, route the copy through /tmp if /etc/slurm on the compute nodes is not writable by your SSH user.
    

Start Slurm Services

  1. Start Slurm Controller on Head Node:
    sudo systemctl enable --now slurmctld
    
  2. Start Slurm Daemon on Compute Nodes:
    sudo systemctl enable --now slurmd
    
  3. Verify Slurm Services:
    • On Head Node:
      sinfo
      
      • Should display the nodes and their states.
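Before moving on to MPI, a minimal job exercises the whole stack; if MUNGE, slurmctld, and both slurmd daemons are healthy, each compute node prints its hostname:

# Run one task on each of the two compute nodes
srun -N2 -n2 hostname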

Step 7: Test the HPC Cluster

Compile a Test MPI Program

  1. Create an MPI “Hello World” Program:
    nano mpi_hello.c
    
    • mpi_hello.c:
      #include <mpi.h>
      #include <stdio.h>
      
      int main(int argc, char** argv) {
          MPI_Init(NULL, NULL);
          int world_size;
          MPI_Comm_size(MPI_COMM_WORLD, &world_size);
          int world_rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
          char processor_name[MPI_MAX_PROCESSOR_NAME];
          int name_len;
          MPI_Get_processor_name(processor_name, &name_len);
          printf("Hello world from processor %s, rank %d out of %d processors\n",
                 processor_name, world_rank, world_size);
          MPI_Finalize();
          return 0;
      }
      
  2. Compile the Program:
    mpicc mpi_hello.c -o mpi_hello
    
  3. Create a Slurm Job Script:
    nano mpi_job.sh
    
    • mpi_job.sh:
      #!/bin/bash
      #SBATCH --job-name=mpi_test
      #SBATCH --output=mpi_test.out
      #SBATCH --nodes=2
      #SBATCH --ntasks-per-node=2
      #SBATCH --time=00:05:00
      
      srun ./mpi_hello
      
  4. Submit the Job:
    sbatch mpi_job.sh
    
  5. Monitor the Job:
    squeue
    
  6. Check the Output:
    cat mpi_test.out
    
    • You should see one "Hello world" line from each rank across both compute nodes, similar to the output below.
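The exact line order varies between runs and the node names depend on where Slurm places the tasks, but with 2 nodes and 2 tasks per node the file should contain four lines along these lines:

Hello world from processor HPC-N1, rank 0 out of 4 processors
Hello world from processor HPC-N1, rank 1 out of 4 processors
Hello world from processor HPC-N2, rank 2 out of 4 processors
Hello world from processor HPC-N2, rank 3 out of 4 processors

If every line instead reports rank 0 out of 1 processors, srun launched independent copies rather than a single MPI job (the sample slurm.conf sets MpiDefault=none, and the stock OpenMPI packages may lack PMI support for srun); replacing srun ./mpi_hello with mpirun ./mpi_hello in mpi_job.sh, so that OpenMPI's own launcher starts the ranks inside the Slurm allocation, is a common workaround.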


Troubleshooting Common Issues

Issue 1: Unable to Contact Slurm Controller

Error Message:

slurm_load_partitions: Unable to contact slurm controller (connect failure)

Solution:

  1. Ensure slurmctld is running on the head node:
    sudo systemctl status slurmctld
    
  2. Check slurm.conf for correct ControlMachine:
    • Ensure it matches the head node’s hostname.
  3. Verify MUNGE is running on all nodes:
    sudo systemctl status munge
    
  4. Ensure firewalls are not blocking the Slurm ports (6817 for slurmctld on the head node, 6818 for slurmd on the compute nodes); for example:
    sudo firewall-cmd --permanent --add-port=6817-6818/tcp
    sudo firewall-cmd --reload
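It also helps to confirm that the daemons are actually listening on their ports (assuming the default ports from slurm.conf):

# On the head node: slurmctld should be listening on TCP 6817
sudo ss -tlnp | grep 6817

# On each compute node: slurmd should be listening on TCP 6818
sudo ss -tlnp | grep 6818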
    

Issue 2: Configured MailProg is Invalid

Error Message:

error: Configured MailProg is invalid

Solution:

  1. Install the mailx package:
    sudo dnf install mailx -y
    
  2. Update slurm.conf:
    • Set MailProg to the correct path or disable it:
      MailProg=/bin/true
      
  3. Restart slurmctld:
    sudo systemctl restart slurmctld
    

Issue 3: Cluster Name Mismatch

Error Message:

fatal: CLUSTER NAME MISMATCH.

Solution:

  1. Stop slurmctld:
    sudo systemctl stop slurmctld
    
  2. Remove old state files (this discards saved job and node state, which is fine in a fresh test lab):
    sudo rm -rf /var/spool/slurmctld/*
    
  3. Ensure ClusterName is consistent in slurm.conf:
    ClusterName=mycluster
    
  4. Restart slurmctld:
    sudo systemctl start slurmctld
    

Conclusion

Setting up an HPC cluster with Slurm on Rocky Linux involves several steps, from configuring network settings to installing and configuring essential services like MUNGE and Slurm. By carefully following this guide and addressing common issues, you can build a functional HPC test lab that provides a platform for exploring parallel computing and cluster management.


Additional Resources

  • Slurm documentation: https://slurm.schedmd.com/documentation.html
  • Slurm Quick Start Administrator Guide: https://slurm.schedmd.com/quickstart_admin.html
  • MUNGE project page: https://dun.github.io/munge/
  • Open MPI documentation: https://www.open-mpi.org/doc/
  • Rocky Linux documentation: https://docs.rockylinux.org/

Note: Always ensure your systems are secure by following best practices, such as keeping software up to date, using strong passwords, and limiting access to authorized users.

Happy Computing!

