Hybrid programming MPI/OpenMP

Parallel programming: why and when

For HPC applications, OpenMP should be used instead of serial code, since nodes are multi-CPU and multi-core. When the problem size (in time and/or memory) exceeds what a single node can support, you should move to MPI or hybrid MPI/OpenMP programming.

Processor and Memory affinity

Processor and memory affinity is the capability to query and control the location (processor and memory) where a process executes.

This capability is especially useful on NUMA (Non-Uniform Memory Access) architectures.

To inspect the numbering of processors, cores and sockets:

cat /proc/cpuinfo | egrep '(processor|core id|physical id)'

Tools and libraries


NUMACTL implements a simple NUMA policy, supporting memory and process/thread (CPU-level) affinity.

It provides a command line tool.

numactl --show
numactl --hardware
numactl --cpubind=1 --membind=0 ./life

and a set of APIs, e.g. numa_set_membind() and numa_run_on_node()

sched library

The sched library is a glibc extension for process/thread affinity.

Main routines:

sched_getaffinity(), sched_setaffinity()

MPI thread safety

By default, an MPI implementation may not be thread safe.

The user can request thread support by calling MPI_Init_thread (in place of MPI_Init). There are four support levels:

  • MPI_THREAD_SINGLE: no thread support
  • MPI_THREAD_FUNNELED: only the master thread may call MPI (default)
  • MPI_THREAD_SERIALIZED: more than one thread can call MPI, but only one at a time
  • MPI_THREAD_MULTIPLE: fully thread safe


int MPI_Init_thread(int *argc, char ***argv, int required, int *provided)

“required” is the level requested by the user; “provided” is the level actually granted by the MPI implementation.

Hybrid program

#include "mpi.h"
int main(int argc, char **argv){
int rank, size, ierr, i;
  MPI_Comm_rank (...,&rank);
  MPI_Comm_size (...,&size);
#pragma omp parallel for
for(i=0; i<n; i++)
  printf("do some work\n"; 