Version 2


    Parallel computing brings to mind bulky, expensive hardware such as desktop PCs or complex virtual machines, and the same can be said for building a cluster. The Raspberry Pi is a refreshing alternative: it makes parallel computing relatively inexpensive and easy to implement, allowing those interested to explore both the software and hardware aspects of computing.


    What is a Computer Cluster?

    A computer cluster is a group of computers connected via a local area network (LAN) that act as a single, more powerful machine. Compared with a single computer, a cluster can offer faster processing, better data integrity and reliability, and greater storage capacity.



    A cluster offers the following benefits:

    • Real-time constraints: A cluster helps a program or process finish within a required period of time
    • Throughput: A cluster can deliver the resources to process many related simulations
    • Memory: A cluster can provide the large amounts of memory that some programs need
    • Fault tolerance: A cluster can keep running when an individual node fails



    The architecture of a computer cluster is described in Figure 1 below.


    Figure 1: Cluster Layers


    • Nodes: A cluster is composed of compute nodes and supporting nodes. Compute nodes are dedicated servers that constitute the cluster's processing power. The supporting nodes consist of management nodes, storage nodes, and head nodes. Users communicate only with the head node, which also interfaces the cluster to the outside world.
    • Interconnect Hardware: Interconnect hardware connects the compute nodes and carries data and control messages between them. It includes network interface cards, switches, and cables.
    • Cluster Management Tools: Since a cluster is a group of many different parts, including multiple computer systems, it is vital to be able to manage the cluster as a single entity. Cluster management tools give system administrators a single-entity view of the system.
    • Parallel Programming Library: Programmers deploy a number of parallel programming techniques. Parallelism can be implemented in three ways: (a) By using a distributed communication library, like the Message Passing Interface (MPI) library. (b) By making ad hoc use of a lower-level communication protocol, for example by using a sockets interface. (c) By deploying a software layer which hides the interconnect from the programmer, essentially providing a virtual shared memory environment.
    • Interconnect Messaging Protocol: The TCP/IP protocol is suitable for many cluster applications. It is the easiest to use when building a cluster with inexpensive Fast Ethernet products. The MPI library passes each message to the protocol layer on which the library is implemented. That layer then places the message on the physical network, either directly or via the operating system.
    • User Application / OS: The operating system on each cluster node provides basic system support for cluster operations. It is involved whenever a user opens files, sends messages, or starts additional processes. Its primary role is to map multiple processes onto the hardware components that make up the system (resource management and scheduling). It also provides a high-level software interface for user applications.
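
    With a message-passing library such as MPI, option (a) above, every process usually runs the same program and uses its rank to decide which part of the work is its own. The following is a minimal sketch of that bookkeeping in plain C, without any MPI calls. The function name block_range and its parameters are illustrative assumptions, not part of any library; in an MPI program the rank and size values would come from MPI_Comm_rank and MPI_Comm_size.

```c
#include <assert.h>

/*
 * Computes which contiguous slice of n work items belongs to one process.
 * The first (n % size) ranks receive one extra item, so slice sizes differ
 * by at most one.
 */
void block_range(int rank, int size, int n, int *start, int *count)
{
    assert(size > 0 && rank >= 0 && rank < size && n >= 0);

    int base = n / size;  /* items that every rank receives */
    int extra = n % size; /* leftover items, one each for the lowest ranks */

    *count = base + (rank < extra ? 1 : 0);
    *start = rank * base + (rank < extra ? rank : extra);
}
```

    For 10 items on 4 processes, the ranks receive 3, 3, 2, and 2 items, starting at offsets 0, 3, 6, and 8.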



    Clusters are classified into the following groups:

    • Storage: Storage clusters provide a consistent file system image across the servers in a cluster, allowing the servers to read and write to a single shared file system simultaneously.
    • High availability: High-availability clusters ensure continuous availability of services by eliminating single points of failure.
    • Load balancing: These clusters operate by evenly distributing a workload over multiple backend nodes.
    • High-performance: High-performance clusters, also called computational clusters or grid computing, use cluster nodes to perform concurrent calculations.
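
    As an illustration of the load-balancing idea, the sketch below hands out requests to backend nodes in round-robin order, which is one simple way to spread a workload evenly. It is only an assumed example: the round_robin type and the rr_init and rr_pick functions are hypothetical names, and real load balancers often also take node health and current load into account.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical dispatcher state: it only remembers which node is next. */
typedef struct {
    size_t node_count; /* number of backend nodes in the cluster */
    size_t next;       /* index of the node that gets the next request */
} round_robin;

/* Prepares a dispatcher for node_count backend nodes, starting at node 0. */
void rr_init(round_robin *rr, size_t node_count)
{
    assert(rr != NULL && node_count > 0);
    rr->node_count = node_count;
    rr->next = 0;
}

/* Returns the node index for the next request and advances the cursor. */
size_t rr_pick(round_robin *rr)
{
    size_t chosen = rr->next;
    rr->next = (rr->next + 1) % rr->node_count; /* wrap around at the end */
    return chosen;
}
```

    With three backend nodes, successive calls to rr_pick return 0, 1, 2, and then 0 again, so each node receives every third request.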


    Beowulf Cluster

    The simplest definition of a Beowulf cluster is a virtual parallel supercomputer made of computers connected by a small local area network. One computer functions as the server, and the remaining computers are the nodes. All computers in the network have the same programs and libraries installed, which allows the nodes to share processes, data, and computation among themselves. Beowulf clusters deliver cost-effective computing power for scientific applications. They can be used as storage, high-availability, load-balancing, and high-performance clusters.


    The advantages of Beowulf clusters include:

    • Scalability
    • Performance
    • Flexible configuration
    • Ability to keep up with technological changes
    • Better fault tolerance and reliability
    • Users enjoy a better level of control
    • Easier maintenance


    Raspberry Pi Beowulf Cluster

    Most Raspberry Pi models are sold with a built-in Ethernet port and can easily be connected to a router or switch. Several Raspberry Pis connected to a switch or router can form a cluster. A cluster should start with at least two or three nodes (Raspberry Pis); if that proves inadequate, more can be added later. The hardware requirements for building a Raspberry Pi cluster are:


    • Raspberry Pi modules
    • 16GB micro SD cards
    • External hard disk
    • USB to Micro USB cables
    • Port switch
    • Router
    • Ethernet cable


    Figure 2: Block Diagram of Raspberry Pi Cluster


    Figure 2 shows a cluster consisting of multiple Raspberry Pis connected to an Ethernet switch. One Raspberry Pi is the master, also called the head of the cluster. The rest are slaves, or workers. Each node has a static IP address. The master communicates with the other nodes only through Secure Shell (SSH). A Network File System (NFS) share can be configured across the master and slave Raspberry Pis. For the master node, the boot partition is on the microSD card, and the root partition is on the external hard disk. Power can be supplied to a Raspberry Pi via the micro USB connector or through the GPIO pins.


    Node and Software Installation

    Each node in the cluster consists of a Raspberry Pi with a 16GB microSD card. The first step is to prepare the master node by loading the Raspbian operating system onto the microSD card. Raspbian is based on Debian Linux and designed specifically for the Raspberry Pi.


    Once a single Pi is booted up, the following packages can be installed:

    • GCC FORTRAN software package

    The GNU Fortran compiler (gfortran), part of GCC, has optimization and multithreading features. GCC is a widely used compiler suite in High-Performance Computing (HPC).


    • Message Passing Interface (MPI)

    MPI can be used by installing the MPICH software package and the MPI4PY software. The MPI implementations commonly used on Raspberry Pis are Open MPI and MPICH (Message Passing Interface Chameleon). MPICH, which is used in this case, is a free, high-performance, and widely portable implementation of the MPI standard written for UNIX-like operating systems.


    MPI4PY stands for MPI for Python. It provides MPI bindings for Python and allows Python programs to make use of multiple processors.


    Once one Raspberry Pi is set up, an image of its microSD card can be copied onto the microSD cards of all the slave Raspberry Pis in the cluster.


    The next step is to configure the network. Each Raspberry Pi is given a static IP address on the local network so that it can always be reached at the same address. Next, SSH (Secure Shell) keys are configured for each Pi, because the master node requires passwordless access to the slave Raspberry Pis over Secure Shell.


    The head node (master) contains a file system that is shared via a Network File System (NFS) server. This file system is then mounted by the sub-nodes. Using NFS brings multiple advantages, such as allowing programs, packages, and features to be installed on a single file system and then shared throughout the network. This is faster and easier to maintain than manually copying files and programs. Passwordless SSH allows you to run commands on the sub-nodes efficiently.


    Using the Cluster

    To run a program on the cluster, the source must include the MPI header (mpi.h) and must be compiled with the MPICH compiler wrapper (mpicc for C programs, mpicxx for C++ programs). The resulting program can then be launched on multiple nodes, for example with mpiexec.


    Below is a simple hello world example:


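    #include "mpi.h"
    #include <stdio.h>

    int main( int argc, char *argv[] )
    {
        int rank, size;

        /* Start the MPI runtime before any other MPI call. */
        MPI_Init( &argc, &argv );
        /* rank: ID of this process; size: total number of processes. */
        MPI_Comm_rank( MPI_COMM_WORLD, &rank );
        MPI_Comm_size( MPI_COMM_WORLD, &size );
        printf( "Hello World from process %d of %d\n", rank, size );

        /* Shut down MPI cleanly before exiting. */
        MPI_Finalize();
        return 0;
    }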


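    When run with four processes, the output may look like the following. The order of the lines can vary from run to run, because the processes execute independently: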


    Hello World from process 0 of 4

    Hello World from process 2 of 4

    Hello World from process 3 of 4

    Hello World from process 1 of 4
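
    Beyond printing a greeting, the same rank and size values are what let each process take its own share of a computation. The sketch below estimates pi by numerical integration, with each rank handling every size-th interval. It is written in plain C with no MPI calls, and the function names partial_pi and estimate_pi are illustrative assumptions. In an MPI program, each process would call partial_pi once with its own rank, and the loop in estimate_pi would be replaced by a sum reduction such as MPI_Reduce with MPI_SUM.

```c
#include <assert.h>

/*
 * Partial midpoint-rule sum for the integral of 4 / (1 + x^2) on [0, 1],
 * which equals pi. The process with the given rank handles intervals
 * rank, rank + size, rank + 2 * size, ... out of n intervals in total.
 */
double partial_pi(int rank, int size, int n)
{
    assert(size > 0 && rank >= 0 && rank < size && n > 0);

    double h = 1.0 / n; /* width of one interval */
    double sum = 0.0;
    for (int i = rank; i < n; i += size) {
        double x = h * (i + 0.5); /* midpoint of interval i */
        sum += 4.0 / (1.0 + x * x);
    }
    return h * sum;
}

/*
 * Serial stand-in for the combining step: adds up the partial results of
 * all ranks, which is what a sum reduction would deliver to one process.
 */
double estimate_pi(int size, int n)
{
    double total = 0.0;
    for (int rank = 0; rank < size; rank++) {
        total += partial_pi(rank, size, n);
    }
    return total;
}
```

    The estimate depends on the number of intervals n, not on how many processes share the work, so estimate_pi(4, 1000) and estimate_pi(1, 1000) agree up to floating-point rounding.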


    Raspberry Pi Cluster Applications

    Applications of Raspberry Pi clusters include edge computing, expendable computing, and portable clusters, all of which are hard to implement with traditional computing hardware. Micro-scale data centers built from Raspberry Pis have helped students learn to work with a software stack similar to those found in large-scale commercial data centers. Newer-generation data centers are likely to be equipped with ARM processing cores, so a micro-cluster can be an ideal environment for prototyping future data center applications.