This page describes the generic-parallel interface.

The generic-parallel interface has been designed to be compatible with current parallel programming models (see the Backends section).

Prototypes of the generated functions

The generic-parallel interface for material properties generates two functions. Those functions can be used:

In addition, the generic-parallel interface also exports:

First prototype supporting data with strides

The first generated function matches the following prototype:

void (*)(mfront_gmp_OutputStatus* const,       // output status
         mfront_gmp_real* const,               // output values
         const mfront_gmp_size_type,           // output stride
         const mfront_gmp_real* const,         // values of the arguments
         const mfront_gmp_size_type* const,    // strides of the arguments
         const mfront_gmp_size_type,           // number of arguments
         const mfront_gmp_size_type,           // number of points
         const mfront_gmp_OutOfBoundsPolicy);  // out of bounds policy

The mfront_gmp_OutputStatus structure and the mfront_gmp_OutOfBoundsPolicy enumeration type are described in the page dedicated to the generic interface for material properties. By default, the mfront_gmp_real type is an alias for double-precision floating-point numbers.

If the number of points is zero, no computation is performed.

A stride equal to zero means that the associated argument is uniform. The stride of the output can be zero only if the strides of all the arguments are zero; otherwise an error is reported.
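
The stride rules above can be illustrated with a small host-side sketch. It is not the code generated by MFront: the helper names `check_strides` and `evaluate_strided`, the use of `std::size_t` in place of `mfront_gmp_size_type`, the passing of the arguments as an array of pointers (as in the usage examples of this page) and the toy property are all assumptions made for illustration. The sketch assumes that the value of argument `j` at point `i` is read at `args[j][i * strides[j]]`, so that a zero stride always reads the same, uniform value.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true if the strides satisfy the rule stated
// above (a zero output stride requires all argument strides to be zero).
bool check_strides(const std::size_t output_stride,
                   const std::vector<std::size_t>& strides) {
  if (output_stride != 0) {
    return true;
  }
  for (const auto s : strides) {
    if (s != 0) {
      return false;
    }
  }
  return true;
}

// Hypothetical reference loop over the points for a toy property of two
// arguments, G = T * (1 - f). Returns false if the strides are inconsistent.
bool evaluate_strided(double* const out,
                      const std::size_t output_stride,
                      const double* const* const args,
                      const std::vector<std::size_t>& strides,
                      const std::size_t npoints) {
  if (!check_strides(output_stride, strides)) {
    return false;  // inconsistent strides are reported as an error
  }
  // If npoints is zero, the loop body is never executed.
  for (std::size_t i = 0; i != npoints; ++i) {
    const double T = args[0][i * strides[0]];  // stride 1: one value per point
    const double f = args[1][i * strides[1]];  // stride 0: uniform value
    out[i * output_stride] = T * (1 - f);
  }
  return true;
}
```

With strides `{1, 0}`, the first argument provides one value per point while the second is shared by all points, which mirrors the uniform porosity used in the examples of the Backends section.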

Most backends will treat the case where all strides are equal to one as a special case and will use an optimized implementation.

Second prototype

The second generated function matches the following prototype:

void (*)(mfront_gmp_OutputStatus* const,       // output status
         mfront_gmp_real* const,               // output values
         const mfront_gmp_real* const,         // values of the arguments
         const mfront_gmp_size_type,           // number of arguments
         const mfront_gmp_size_type,           // number of points
         const mfront_gmp_OutOfBoundsPolicy);  // out of bounds policy

This function assumes that the values of the arguments and the values of the output are stored contiguously in memory. In other words, this prototype is equivalent to the first prototype when all strides are equal to one.

Backends

Backends are associated with parallel programming models.

CUDA backend

Example of usage

Let UO2_ShearModulus.mfront be an implementation of a material property computing the shear modulus of uranium dioxide, which depends on the temperature and on the porosity. This file can be compiled as follows:

$ mfront --obuild --configuration-file=config-cuda.json \
         --interface=generic-parallel UO2_ShearModulus.mfront
The following library has been built :
- libGenericParallelUO2-cuda.so :  UO2_ShearModulus UO2_ShearModulus2

where the configuration file config-cuda.json provides the information required to call a CUDA compiler (nvcc or clang++) with the appropriate flags. This file may also contain the options associated with the CUDA backend, see below.

UO2_ShearModulus implements the first prototype, while UO2_ShearModulus2 implements the second.

The following code shows how to call the function UO2_ShearModulus. In this example, the porosity is assumed to be uniform and thus has a stride equal to zero. For the sake of simplicity, memory allocations on the host and on the device are handled by the Thrust library.

thrust::host_vector<double> G;
thrust::host_vector<double> T = {300, 500, 300, 800};
thrust::host_vector<double> f = {0.1};

thrust::device_vector<double> d_T(T);
thrust::device_vector<double> d_f(f);
thrust::device_vector<double> d_G(T.size());

auto output = mfront_gmp_OutputStatus{};
const auto policy = GENERIC_MATERIALPROPERTY_NONE_POLICY;
const auto args = std::array<double *, 2u>{thrust::raw_pointer_cast(d_T.data()),
                                     thrust::raw_pointer_cast(d_f.data())};
const auto args_stride = std::array<mfront_gmp_size_type, 2u>{1, 0};
UO2_ShearModulus(&output, thrust::raw_pointer_cast(d_G.data()), 1,
                 args.data(), args_stride.data(), 2, 4, policy);
G = d_G;

The following remarks can be made:

Example of configuration file for nvcc

The following configuration file exemplifies how to use the nvcc compiler to build the source code generated by the CUDA backend:

compilation_options : {
  cuda : {
    compiler: "/usr/local/cuda-12.8/bin/nvcc",
    compilation_flags: {"-O2 -std=c++20 -diag-suppress 20012",
       "--expt-relaxed-constexpr", "-Xcompiler -fPIC"}
  }
}

Example of configuration file for clang++

The following configuration file exemplifies how to use the clang++ compiler to build the source code generated by the CUDA backend:

compilation_options : {
  cuda : {
    compiler: "clang++",
    compilation_flags: {"-O3 -std=c++20 -march=native",
      "-x cuda  --cuda-compile-host-device --cuda-gpu-arch=sm_86",
      "--cuda-path=/usr/local/cuda-12.8/ -fPIC -DPIC"}
  }
},
linking_options: {
  linker_flags: "-L/usr/local/cuda-12.8/lib64 -lcudart -ldl -lrt -pthread"
}

Options

The only option available is number_of_threads_per_block, the number of threads per block. By default, the value used is \(64\). This number is generally chosen as a multiple of the number of threads per warp, which is typically \(32\) or \(64\). Typical values are thus \(64\), \(128\) and \(256\).
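
To see how number_of_threads_per_block relates to the number of evaluation points, the sketch below computes the number of blocks that a kernel launch would typically need so that each point is handled by one thread. The function name `number_of_blocks` is hypothetical, and the one-thread-per-point mapping is a common CUDA convention assumed here, not a documented detail of the generated code.

```cpp
#include <cstddef>

// Hypothetical helper: smallest number of blocks such that
// number_of_blocks * threads_per_block >= npoints (ceiling division).
constexpr std::size_t number_of_blocks(const std::size_t npoints,
                                       const std::size_t threads_per_block) {
  return (npoints + threads_per_block - 1) / threads_per_block;
}
```

Under these assumptions, with the default value of 64 threads per block, 1000 points would require 16 blocks, that is 1024 threads, of which the last 24 have no point to process.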

Internal details

When required, the strides are stored in constant memory on the device.

If the material property exposes parameters, those are stored in a global variable on the host (the CPU). For efficiency, parameters are copied to constant memory on the device when the material property is evaluated. This mechanism preserves the flexibility provided by parameters, which is required to perform sensitivity analyses, uncertainty propagation studies or a new identification. Note that the handling of parameters can still be disabled by setting the parameters_as_static_variables DSL option to true.

The errors are handled by allocating a managed array of integers associated with all possible kinds of errors. This array is only accessed if an error occurs, which minimizes the cost of error handling. On output, the user knows whether an error occurred, but not where it occurred. For instance, the UO2_ShearModulus function may report that a negative temperature was passed on input at some evaluation point, without identifying that point. For material properties, errors are currently associated with violations of bounds and physical bounds. Those tests are disabled by setting the disable_runtime_checks DSL option to true. Note that this option also disables other checks performed on the host (notably regarding the number of arguments passed).
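
The error-handling scheme described above can be mimicked on the host as follows. This is a simplified sketch, not the generated CUDA code: the enumeration `ErrorKind`, the function `evaluate_with_checks` and the toy property are hypothetical, and a plain array stands in for the managed array. One flag is kept per kind of error and a flag is written only when the corresponding check fails, which is why the caller learns that an error occurred but not at which point.

```cpp
#include <array>
#include <cstddef>

// Hypothetical kinds of errors: one flag is stored per kind.
enum ErrorKind : std::size_t {
  PhysicalBoundsViolation = 0,
  BoundsViolation = 1,
  NumberOfErrorKinds = 2
};

using ErrorFlags = std::array<int, NumberOfErrorKinds>;

// Evaluates a toy property (out = 2 * T) at each point. The flags are only
// written when a check fails, so the error-free path never touches them.
ErrorFlags evaluate_with_checks(double* const out,
                                const double* const T,
                                const std::size_t npoints) {
  ErrorFlags flags{};  // all flags start at zero (no error)
  for (std::size_t i = 0; i != npoints; ++i) {
    if (T[i] < 0) {
      // A negative temperature violates a physical bound: the kind of error
      // is recorded, but the index of the faulty point is not.
      flags[PhysicalBoundsViolation] = 1;
      continue;
    }
    out[i] = 2 * T[i];
  }
  return flags;
}
```

After the call, the caller only has to inspect one integer per kind of error, whatever the number of evaluation points.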

Parallel STL backend (stdpar)

Example of usage

Let UO2_ShearModulus.mfront be an implementation of a material property computing the shear modulus of uranium dioxide, which depends on the temperature and on the porosity. This file can be compiled as follows:

$ mfront --obuild --configuration-file=config-stdpar.json \
         --interface=generic-parallel UO2_ShearModulus.mfront
The following library has been built :
- libGenericParallelUO2-stdpar.so :  UO2_ShearModulus UO2_ShearModulus2

where the configuration file config-stdpar.json provides the information required to compile the generated source files and also selects the execution policy to be used (see below).

UO2_ShearModulus implements the first prototype, while UO2_ShearModulus2 implements the second.

auto G = std::vector<double>(4);
const auto T = std::vector<double>{300, 500, 300, 800};
const auto f = std::vector<double>{0.1};
auto output = mfront_gmp_OutputStatus{};
const auto policy = GENERIC_MATERIALPROPERTY_NONE_POLICY;
const auto args = std::array<const double *, 2u>{T.data(), f.data()};
const auto args_strides = std::array<mfront_gmp_size_type, 2u>{1, 0};
UO2_ShearModulus(&output, G.data(), 1, args.data(), args_strides.data(), 2, 4,
                 policy);

Example of configuration file for g++

The following configuration file exemplifies how to use the g++ compiler to build the source code generated by the stlpar backend:

interfaces_options: {
  generic-parallel: {
    backend: {stlpar: {execution_policy: "parallel_unsequenced_policy"}}
  }
},
compilation_options : {
  cxx : {
    compiler: "g++",
    compilation_flags: "-O2 -std=c++20 -march=native" 
  }
},
linking_options : {
  linker_flags : "-ltbb"
}

Example of configuration file for nvc++

The following configuration file exemplifies how to use the nvc++ compiler (part of the NVIDIA HPC SDK) to build the source code generated by the stlpar backend and execute the computation of the material property on the device:

interfaces_options: {
  generic-parallel: {
    backend: {stlpar: {execution_policy: "parallel_unsequenced_policy"}}
  }
},
compilation_options : {
  cxx : {
    compiler: "nvc++",
    compilation_flags: "-O2 -stdpar=gpu -std=c++20 -march=native -gpu=sm_89" 
  }
},
linking_options : {
  linker_flags: "-stdpar=gpu"
}

Options

The only option available is execution_policy, which can have one of the following values:

The exact meaning of those policies is implementation-defined and may depend on the compiler flags used. See this page for details.

Notes