Embedded Multicore Building Blocks

Boost your applications' performance


To get started with EMB², follow the instructions below. They guide you through the installation and show how to create a simple application on both Linux and Windows. After EMB² has been installed, you can also try the examples in the subdirectory doc/examples or the tutorial application (doc/tutorial/application). If you want to use EMB² in an application built with CMake, see the README.md file, which explains how to integrate the library into your build process.

Linux

Installation

  1. Make sure that you have CMake installed. On a Debian or Ubuntu system, for example, type:
    sudo apt-get install cmake
  2. Download the latest release of EMB² (.tar.gz file) from GitHub and save it in a directory of your choice.
  3. Open a shell, change to that directory, and untar the file (replace X.Y.Z with the actual version number):
    tar xfz embb-X.Y.Z.tar.gz
  4. Create a subdirectory build and change to it:
    cd embb-X.Y.Z
    mkdir build
    cd build
  5. Generate the build files using CMake:
    cmake ..
  6. Now, you can compile EMB² using the generated build files:
    cmake --build .
  7. After compilation has finished, execute the tests:
    binaries/run_tests.sh
    None of the tests should fail.
  8. Finally, you can install EMB² (the default path is /usr/local):
    sudo cmake --build . --target install

Creating and Running an Application

  1. Change to a directory where you want to keep the application and open a text editor.
  2. Copy the source code of the application (see below), paste it into the editor, and save it in the file sum_of_squares.cpp. Note: For simplicity, we are using C++11 although EMB² can also be used with C++03. Please make sure that you have a recent compiler installed.
  3. Compile the application as follows:
    g++ -O3 -Wall --std=c++11 sum_of_squares.cpp -o sum_of_squares -lembb_dataflow_cpp -lembb_algorithms_cpp -lembb_containers_cpp -lembb_mtapi_cpp -lembb_mtapi_c -lembb_base_cpp -lembb_base_c -lpthread
    Note: For the sake of generality, we link all libraries provided by EMB², not only the required ones.
  4. Run the generated executable:
    ./sum_of_squares
    You should get something like this:
    Result:  1.00744e+23
    Runtime: 85 ms
    Note: The result may vary due to rounding errors, and the runtime depends on the processor speed as well as the number of cores. You may increase or decrease vector_size so that the runtime is long enough to measure reliably but still acceptable (if the vector is too large, you will get a bad_alloc exception).
  5. To compare the parallel version with a sequential implementation, replace the call to Reduce (lines 25-30) with a simple loop:
    double result = 0.0;
    for (double x : vec)
      result += pow(x, 2.0);
  6. Compile and execute the program as in steps 3 and 4, respectively.
  7. Get a coffee and stop programming sequentially (or using threads)!
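
For the comparison in steps 5 and 6, it can help to time both versions with the same helper instead of editing the program back and forth. The following is a minimal sketch in plain C++11 that does not use EMB²; the names MakeInput, SumOfSquaresSequential, and TimeMs are hypothetical helpers introduced here for illustration and are not part of the library.

```cpp
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Fills a vector with 0, 1, ..., n-1, as the sample application does.
std::vector<double> MakeInput(std::size_t n) {
  std::vector<double> vec(n);
  for (std::size_t i = 0; i < vec.size(); i++)
    vec[i] = static_cast<double>(i);
  return vec;
}

// Sequential sum of squares (the loop from step 5).
double SumOfSquaresSequential(const std::vector<double>& vec) {
  double result = 0.0;
  for (double x : vec)
    result += std::pow(x, 2.0);
  return result;
}

// Calls f once and returns the elapsed wall-clock time in milliseconds.
template <typename F>
long long TimeMs(F f) {
  auto start = std::chrono::steady_clock::now();
  f();
  auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::milliseconds>(
      stop - start).count();
}
```

For example, TimeMs([&] { result = SumOfSquaresSequential(vec); }) returns the runtime of the sequential loop in milliseconds; the call to Reduce could be wrapped in the same way so that both measurements come from one program run.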

Windows

Installation

  1. Download and run the Windows installer of CMake. Make sure that CMake is added to the system path.
  2. Download the latest release of EMB² (.zip file) from GitHub and save it in a directory of your choice.
  3. Open a developer command prompt for Visual Studio (use a version supporting C++11 to be able to compile the sample application) with administrator privileges (right-click on the icon and select "Run as administrator"). Change to the directory where you saved the .zip file and unzip it (replace X.Y.Z with the actual version number):
    unzip embb-X.Y.Z.zip
  4. Create a subdirectory build and change to it:
    cd embb-X.Y.Z
    mkdir build
    cd build
  5. Generate the build files using CMake, for example:
    cmake -G "Visual Studio 14 2015" ..
    Note: Make sure that you specify the correct version of Visual Studio. A list of supported CMake generators can be displayed by typing:
    cmake --help
    In the following, we assume a 32-bit configuration (64-bit configurations are handled similarly).
  6. Now, you can compile EMB² using the generated build files:
    cmake --build . --config Release
    Note: Unlike a Linux build, the build type (Release or Debug) has to be specified explicitly.
  7. After compilation has finished, execute the tests:
    binaries\run_tests.bat
    None of the tests should fail.
  8. Finally, you can install EMB² (the default path is C:\Program Files\embb-X.Y.Z or C:\Program Files (x86)\embb-X.Y.Z, depending on the architecture):
    cmake --build . --target install --config Release
    Note: In case of errors, you probably did not run the developer command prompt as administrator.

Creating and Running an Application

  1. Open Visual Studio and click on File → New → Project...
  2. Select Win32 Console Application, enter a (solution) name, e.g., sum_of_squares, and click OK.
  3. Click Next >, deselect Precompiled header, and click Finish.
  4. Copy the source code of the application (see below) and paste it into sum_of_squares.cpp.
  5. Press Alt+F7 to open the project properties.
  6. In the Configuration field, select All Configurations.
  7. Select C/C++ → General and add C:\Program Files (x86)\EMBB-X.Y.Z\include to the Additional Include Directories field.
  8. Select Linker → General and add C:\Program Files (x86)\EMBB-X.Y.Z\lib to the Additional Library Directories field.
  9. Select Linker → Input and add embb_dataflow_cpp.lib; embb_algorithms_cpp.lib; embb_containers_cpp.lib; embb_mtapi_cpp.lib; embb_mtapi_c.lib; embb_base_cpp.lib; embb_base_c.lib; to the Additional Dependencies field.
    Note: For the sake of generality, we link all libraries provided by EMB², not only the required ones.
  10. Select Release configuration in the toolbar.
  11. Press F7 to build the solution and CTRL+F5 to run it. You should get something like this:
    Result:  1.00744e+23
    Runtime: 936 ms
    Note: The result may vary due to rounding errors, and the runtime depends on the processor speed as well as the number of cores. You may increase or decrease vector_size so that the runtime is long enough to measure reliably but still acceptable (if the vector is too large, you will get a bad_alloc exception).
  12. To compare the parallel version with a sequential implementation, replace the call to Reduce (lines 25-30) with a simple loop:
    double result = 0.0;
    for (double x : vec)
      result += pow(x, 2.0);
  13. Compile and execute the program as in step 11.
  14. Get a coffee and stop programming sequentially (or using threads)!

Sample Application

The following program serves as an example of how to use the parallel patterns provided by EMB². The program squares each element of a vector and computes the sum of the resulting values, a typical scenario for exploiting data parallelism. As a first step, it initializes the task scheduler and the input vector. Then, it calls the Reduce function, which is part of the algorithms namespace. Similar to std::accumulate, this function takes as arguments the input range, the neutral element with respect to the performed operation (initial value), and the operation itself (reduction function). Moreover, it takes a transformation function which is applied to each element before the actual reduction takes place. Internally, Reduce splits the input range into blocks which are processed in parallel by the task scheduler. Hence, there is no need to care about thread management and synchronization.
#include <cmath>
#include <vector>
#include <iostream>
#include <chrono>
#include <functional>

#include <embb/algorithms/algorithms.h>

// Domain and Node ID for MTAPI
const mtapi_domain_t domain_id(1);
const mtapi_node_t node_id(1);

// Size of vector (adjust depending on processor speed)
const size_t vector_size(1 << 26);

int main() {
  // Initialize task scheduler
  embb::mtapi::Node::Initialize(domain_id, node_id);
  // Create and initialize vector
  std::vector<double> vec(vector_size);
  for (size_t i = 0; i < vec.size(); i++)
    vec[i] = static_cast<double>(i);
  // Compute sum of squares
  auto start = std::chrono::steady_clock::now();
  double result = embb::algorithms::Reduce(
    vec.begin(), vec.end(),  // input range
    0.0,  // neutral element (w.r.t. addition)
    std::plus<double>(),  // reduction fn. (addition)
    [] (double x) {return pow(x, 2.0);}  // transformation fn. (square)
  );
  auto stop = std::chrono::steady_clock::now();
  // Print result and runtime
  auto runtime = std::chrono::duration_cast<std::chrono::milliseconds>
    (stop - start).count();
  std::cout << "Result:  " << result << std::endl;
  std::cout << "Runtime: " << runtime << " ms" << std::endl;
}
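
To illustrate the idea of splitting the input range into blocks, here is a minimal sketch in plain C++11 based on std::async. It is not how EMB² implements Reduce: EMB² hands the blocks to its task scheduler, whereas this sketch simply starts one asynchronous task per block. The function name BlockwiseReduce and the num_blocks parameter are assumptions made for this example; only the argument order mirrors the description above.

```cpp
#include <algorithm>
#include <cstddef>
#include <future>
#include <iterator>
#include <vector>

// Illustrative block-wise reduction (hypothetical helper, not the EMB² API).
// Arguments: input range, neutral element, reduction function,
// transformation function, and the desired number of blocks.
template <typename It, typename T, typename Reduction, typename Transformation>
T BlockwiseReduce(It first, It last, T neutral, Reduction reduce,
                  Transformation transform, std::size_t num_blocks) {
  const std::size_t n = static_cast<std::size_t>(std::distance(first, last));
  if (n == 0) return neutral;
  // Use at least one block and at most one block per element.
  num_blocks = std::max<std::size_t>(1, std::min(num_blocks, n));
  const std::size_t block_size = (n + num_blocks - 1) / num_blocks;

  // Start one asynchronous task per block; each computes a partial result.
  std::vector<std::future<T> > partials;
  It block_first = first;
  for (std::size_t begin = 0; begin < n; begin += block_size) {
    const std::size_t length = std::min(block_size, n - begin);
    It block_last = block_first;
    std::advance(block_last, length);
    partials.push_back(std::async(
        std::launch::async,
        [block_first, block_last, neutral, reduce, transform]() -> T {
          T acc = neutral;
          for (It it = block_first; it != block_last; ++it)
            acc = reduce(acc, transform(*it));
          return acc;
        }));
    block_first = block_last;
  }

  // Combine the partial results in block order.
  T result = neutral;
  for (std::size_t i = 0; i < partials.size(); ++i)
    result = reduce(result, partials[i].get());
  return result;
}
```

The sketch also shows why the initial value has to be a neutral element and why the reduction function should be associative: the neutral element is used once per block, and the partial results are combined in a different grouping than in a sequential loop. With GCC on Linux, the code needs the thread library at link time (-lpthread, as in the compile command above).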