The page describes the new functionalities of Version 4.0 of the TFEL project.

This version was released on October 26, 2021 along with TFEL-3.0.10, TFEL-3.1.10, TFEL-3.2.7, TFEL-3.3.3 and TFEL-3.4.3 and inherits from the issues fixed in those releases.

1 Known incompatibilities

1.1 Changes to existing models

1.1.1 Mohr Coulomb Abbo Sloan

2 Rejuvenation of the TFEL/Math library

2.1 Description of some shortcomings of previous versions

2.1.1 Template metaprogramming

The first versions of TFEL were based on C++-98 standard which had limited support for template metaprogramming and no support for concept based requirements.

Template metaprogramming is the basis of many optimization techniques used in the library, such as expression templates, loop unrolling, compile-time conditionals, etc.

Metaprogramming in C++ is becoming much easier at each revision of the standard. For example, constexpr functions were introduced in C++-11 and extended in later revisions C++-11 and C++-17.

Compile-time conditionals based in the if constexpr construct have been introduced in C++-17. This new construct eliminates the need of recurring to partial template specialization of auxiliary class template classes to implement compile-time conditionals, a pattern frequently used within the TFEL/Math library which is both verbose and difficult to read.

The sigmaeq function compute the von Mises norm of a symmetric second order tensor. Depending on the space dimension, some components are null are shall be neglected. Since the space dimension is known at compile-time, the correct implementation can be chosen at this stage, thus avoiding a conditional branch at runtime.

2.1.1.1.1 Implementation in Version 3.4.1

The implementation of this function in TFEL-3.4.1 is illustrated in Listing lst. 1.

Listing 1: Implementation of the sigmaeq function in TFEL-3.4.1

template <typename T>
typename std::enable_if<tfel::meta::Implements<T, StensorConceptBase>::cond,
                        StensorNumType<T>>::type
sigmaeq(const T& s) {
  typedef tfel::math::internals::SigmaEqImpl<StensorTraits<T>::dime> Impl;
  return Impl::exe(s);
}

The returned type of this function is quite complex and will be discussed in depth in Section 2.1.2. In this section, let us focus on the body of this function which selects a specialization of the class SigmaEqImpl based on the space dimension and call its static function member exe.

Listing lst. 2 shows this specialization in \(1\) dimension.

Listing 2: Implementation of the SigmaEqImpl class in TFEL-3.4.1

namespace internals {

  struct SigmaEqImplBase {
    template <typename T>
    static auto square(const T x) {
      return x * x;
    }
  };

  template <unsigned short N>
  struct SigmaEqImpl;

  template <>
  struct SigmaEqImpl<1u> : public SigmaEqImplBase {
    template <class T>
    static typename std::enable_if<
        tfel::meta::Implements<T, StensorConceptBase>::cond,
        StensorNumType<T>>::type
    exe(const T& s) {
      typedef StensorNumType<T> NumType;
      typedef tfel::typetraits::base_type<NumType> base;
      constexpr auto cste = base{3} / base{2};
      const auto tr = trace(s) / 3;
      return std::sqrt(cste * (SigmaEqImplBase::square(s(0) - tr) +
                               SigmaEqImplBase::square(s(1) - tr) +
                               SigmaEqImplBase::square(s(2) - tr)));
    }
  };

} // end of namespace internals

The specializations in \(2D\) and \(3D\) are of course very similar. In fine, the whole implementation is hardly readable and verbose with about 80 lines.

2.1.1.1.2 Implementation in Version 4.0

In constrast, the use of the if constexpr construct of C++-17 leads to this aknowledgly much more compact and readable code:

Listing 3: Implementation of the sigmaeq function in TFEL-4.0

template <typename StensorType>
std::enable_if_t<implementsStensorConcept<StensorType>(),
                 numeric_type<StensorType>>
sigmaeq(const StensorType& s) {
  using real = base_type<numeric_type<StensorType>>;
  constexpr auto N = getSpaceDimension<StensorType>();
  constexpr auto one_third = real(1) / 3;
  constexpr auto cste = real(3) / 2;
  static_assert((N == 1) || (N == 2) || (N == 3), "invalid space dimension");
  auto square = [](const auto x) { return x * x; };
  const auto tr = one_third * trace(s);
  if constexpr (N == 1u) {
    return std::sqrt(cste * (square(s(0) - tr) +  //
                             square(s(1) - tr) +  //
                             square(s(2) - tr)));
  } else if constexpr (N == 2u) {
    return std::sqrt(cste * (square(s(0) - tr) +  //
                             square(s(1) - tr) +  //
                             square(s(2) - tr) + square(s(3))));
  } else if constexpr (N == 3u) {
    return std::sqrt(cste * (square(s(0) - tr) +  //
                             square(s(1) - tr) +  //
                             square(s(2) - tr) +  //
                             square(s(3)) + square(s(4)) + square(s(5))));
  }
}  // end of sigmaeq

2.1.2 Concepts

Concepts are used in the library to relate objects having the same mathematical nature, such as symmetric second order tensors.

Indeed, several data structures in the library matches the concept of symmetric second order tensors:

Support for concepts is not part of the C++-17 standard and has only been introduced in the C++-20 standard. Hence, the support of concepts currently rely on a hand crafted machinery which has been simplified using C++-17 facilities and prepared for its elimination when the library will be ported to C++-20.

Requirements on template arguments are currently based on the SFINAE rule and the std::enable_if facility, which takes as first template argument a boolean value which states if the template parameters of the function satisfies the requirements and as second argument giving the result of the function. To illustrate this, one may consider the declaration of the sigmaeq function:

template <typename StensorType>
std::enable_if_t<implementsStensorConcept<StensorType>(),
                 numeric_type<StensorType>>
sigmaeq(const StensorType&);

In this example, the implementsStensorConcept function can be called at compile-time to check if the type StensorType is a symmetric second order tensor. If not, this function is not considered by the compiler.

Several functions similar to implementsStensorConcept has been introduced in TFEL-4.0 to facilitate the transition to C++-20 built-in support of concepts.

Declaration in the future TFEL-5.0

With C++-20, the declaration of the sigmaeq will probably be much simplier and is expected to look as follows:

auto sigmaeq(const StensorConcept auto&);

2.1.3 Need of a higher level of abstraction and genericity

2.1.3.1 Basic assignement operators, scaling operators

A mathematical object must implement some basic assignement operators such as:

Those operators must be valid for any mathematical object assignable to the current object or to an object representing an operation whose result is assignable to this object (see the expression template technique described in Section 2.2.1.6). For example, assigning a strain tensor to a stress tensor is not allowed because a strain value can be assigned to a stress value.

A mathematical object must implement scaling operators allowing to mulply (operator*=) and divide (operator/=) every components of the object by a scalar value. This operation is only valid if the result of the operation of multiplying the component by the scalar value is assignable to the component. For example, scaling a strain tensor by a stress value is not allowed (because the result of the multiplication of a strain value by a stress value is a stress value which is not assignable to a strain value).

Those operators were implemented once per supported mathematical object (vector, symmetric tensors, etc.) in dedicated base classes such as:

`tvector_base`, `tmatrix_base`, `stensor_base`, `tensor_base`,
`st2tost2_base`, `t2tot2_base`, `st2tot2_base`, `t2tost2_base`

The use of base class allowed to factorize code between the class implementing the mathematical objects and surrogate classes acting like those mathematical objects like views (see Section 2.2.1.7).

Nevertheless, implementing such operators increased the total code size and were extra-work which prevented introduction of new mathematical objects.

This was required by the lack of a common way to access individual components of a mathematical object. Providing such access is one of the driving force for the re-design of the TFEL/Math library.

2.1.3.2 Views

Views are an important feature of the library which allows to map an externally allocated memory area to mathematical objets. The concept of views is fully described in Section 2.2.1.7.

In previous versions, views were implemented as lightweight classes `which were introduced as needed. With time, a myriad of such classes were introduced:

`ConstTMatrixView`, `tmatrix_column_view`,
`tmatrix_const_column_view`, `tmatrix_const_row_view`,
`tmatrix_const_submatrix_view`, `tmatrix_row_view`,
`tmatrix_submatrix_view` `TMatrixView`, `ConstST2toST2View`,
`ST2toST2View`, `ST2toT2FromTinyMatrixView2`,
`ST2toT2FromTinyMatrixView`, `ST2toT2View`, `ConstStensorView`,
`StensorFromTinyVectorView`, `StensorView`, `ConstT2toST2View`,
`T2toST2FromTinyMatrixView2`, `T2toST2FromTinyMatrixView`,
`T2toST2View`, `T2toT2FromTinyMatrixView2`, `T2toT2FromTinyMatrixView`,
`T2toT2View`, `TensorFromTinyVectorView`, `TensorView`,
`ConstTVectorView`, `TinyVectorFromTinyVectorView`,
`TinyVectorOfStensorFromTinyVectorView`,
`TinyVectorOfTinyVectorFromTinyVectorView`,
`TVectorFromTinyVectorView` `TVectorView`, TensorViewFromStensor`,
TensorFromTinyMatrixRowView`, TensorFromTinyMatrixColumnView2`,
MatrixViewFromTensor`, TensorFromTinyMatrixRowView2`,
TensorFromTinyMatrixColumnView`, VectorFromTinyMatrixColumnView2`,
VectorFromTinyMatrixRowView2`, VectorFromTinyMatrixRowView`,
VectorFromTinyMatrixColumnView`, StensorFromTinyMatrixColumnView`,
StensorFromTinyMatrixRowView`, StensorFromTinyMatrixColumnView2`,
StensorFromTinyMatrixRowView2`

Multiplying such utility classes became a maintenability burden and was also a reason to limit the number of mathematical objects supported by the library and in MFront.

Moreover, those classes were not designed to handle quantities appropriately (see Section 2.2.1.1), i.e. one could not map math objets with units.

All those classes are now remplaced by the two classes View and ViewsArray which heavily relies on the notion of indexing policy described in Section 2.2.1.3 and which correctly handles quantities. Thoses classes are described in depth in Section 2.2.1.7.

2.1.4 Support for compile-time operations (constexpr)

The constexpr keywords has been introduced in C++-11. They can qualify functions and member functions. However, they were rarely used in the TFEL/Math library in previous versions because of important restrictions put by the C++-11 standard and brittle compiler support, as least when TFEL-3.0 was released. Those restrictions were relaxed by C++-14 and C++-17 standards.

Declaring most functions constexpr allow to test them at compile-time which is incredibly confortable: basically, if a unit-test based on the evaluation of constexpr function compiles, then the test does not even have to be run.

2.1.5 Error handling and noexcept correctness.

The default way of reporting errors in the TFEL/Math library was exceptions thrown using one of the tfel::raise functions.

In hindsight, it appears that relying on exceptions was in most cases a bad design choice1.

Most errors can be grabbed at compile-time by the compiler or are just avoided by design. For example, checking if the indices passed to the access operator to a symmetric tensors are valid is mostly useless as:

As runtime-time checks may prevent optimisations, such tests were inhibited in release builds and only enabled in debug builds. In this case, it seems pretty clear that aborting the execution is a much better choice than throwing an exception. This is the purpose of the reportContractViolation function.

As a consequence, most functions and methods of the TFEL/Math library are marked as noexcept which may allow more aggressive optimisation by the compiler.

2.2 Overview of the new design of the TFEL/Math library

This overview of Version 4.0 of the TFEL/Math library mostly focuses on two important features:

2.2.1 Mathematical objects

The mathematical objects provided by the TFEL/Math library are based on scalar values which can be standard C++ numeric types (float, double or long double), or so-called quantities, i.e. an object describing a value with an unit. The later allows to distinguish physical quantities such as a strain from a stress and prevent illegal operations between them. Quantities are described in depth in Section 2.2.1.1.

The TFEL/Math library then introduces some mathematical objects of rank 1, i.e. objects which can be stored in a contiguous memory location and access by a single index, such as:

Higher order objects are then defined as the derivatives of objects of lower ranks. For instance, the derivative of a symmetric tensor with respect to an unsymmetric tensor is implemented by the t2tost2 class. The latter can also be viewed a linear application between unsymmetric tensor and symmetric tensor or as a fourth order tensor. In practice, there are four kind of fourth order tensor defined in the library respectively implemented by the st2tost2, st2tot2, t2tost2, and t2tot2 classes.

Currently only tensors up to the fourth order are specialized which allows to provide some specific features.

Higher order tensors may be represented by a generic class which represent derivatives of two mathematical objects. Currently, the number of operations allowed by this generic class is rather limited and will be extended as needed in future versions.

The main concepts behing the implementation of the mathematical objects are described in Section 2.2.1.2.

2.2.1.1 Quantities

Quantities were introduced in TFEL/Math following ideas of the book D. Abrahams and A. Gurstovoy [1] to allow dimensional analysis at compile time. Basically, a quantity is the combinaison of a value and an unit. The type of the value is called the base type of the quantity.

The unit is encrypted in a type called Unit so that the C++ type-system can be used to detect at compile-time (without runtime-checks) if operations are legal. According to the International System of Units (SI), an unit is decomposed along \(7\) basic units which are the mass (kg), the length (l), the time (s), the electric current (A), the temperature (K), the luminous intensity (cd) and the amount of substance (mole). To be able to describe fractional unit, such as the fracture toughness which is has the unit of a stress times the square root of a length, the Unit class is parametrized by \(14\) integers. The TFEL/Math library provides some convenient aliases for the most common units.

The NoUnit unit is a special case where all those integers are set to 0 for the numerator parts and 1 (by convention) for the denomiator parts.

2.2.1.1.1 The qt class

A quantity is presented by the qt class which is parametrized by a class describing the unit and a numeric base type (generally float, double or long double) which stores the value of the quantity. By default, the double type is used. This class can be used as follows:

constexpr qt<Mass> m1(100.);
constexpr qt<Mass> m2(100.);
constexpr qt<Mass> m3 = m1 + 0.5 * m2;
constexpr qt<Acceleration> a(2);
constexpr qt<Force> f = m1 * a;

The previous code sample illustrates how to declare a new quantity and how to perform operations on quantities. It also highlights that those operations can be used in a constexpr context, i.e. that at compile-time.

2.2.1.1.2 The qt_ref and const_qt_ref classes

The library also provides two wrapper classes called qt_ref and const_qt_ref which allows to wrap a value into an object which acts respectively as an mutable an immutable quantity, as illustrated in the following code sample:

constexpr auto eps = 1e-14;
constexpr auto value = [] ()constexpr {
 constexpr qt<Mass> m(100.);
 constexpr qt<Acceleration> a(2);
 auto vf = double{};
 auto f = qt_ref<Force>(vf);
 f = m * a;
 return vf;
}();
TFEL_TESTS_STATIC_ASSERT(std::abs(value - 200.) < eps);

While a bit contrieved, this example shows that the qt_ref class can also be used in a constexpr context.

The qt_ref and const_qt_ref classes are parametrized by two template parameters which describe respectively the unit and a numeric base type.

The qt_ref and const_qt_ref classes are essentially used to build views of mathematical objects from a raw memory area, as described in Section 2.2.1.7.

2.2.1.1.3 The Quantity class

For the sake of simplicity, the qt, qt_ref and const_qt_ref were described as classes. This is not actually the case in the current implementation which defines them through type aliases to a more general Quantity class which is parametrized by three template arguments: the unit, the basic numeric type and a third argument describing how the value associated with the quantity is handled.

2.2.1.1.4 Operations on quantities

Common operations on quantities, such as additions, substraction, multiplications, division and negations are supported.

Scaling and multiplication by a standard numeric value works as expected, as demonstrated by the following code:

constexpr qt<Mass> m(100.);
// multiplication by a raw numeric type
auto m2 = 2. * m;
// scaling
m2 *= 4;

A quantity with unit NoUnit is implicitly convertible to its base type:

constexpr qt<NoUnit> q(1.2);
constexpr double q_value = q;
const auto cos_q = std::cos(q);

The latter line shows that standard mathematical functions can be called directly.

Fractional power of a quantity are supported through the power function as follows:

// declaration of a stress symmetric tensor
stensor<3u,qt<Stress>> s;
// contracted product of s which has the unit
// of the square of a stress
const auto s2 = s|s;
// norm of s
const auto norm = power<1,2>(s2);

About the power function

The power function is a very convenient function which takes two template parameters N and D defining the exponent as the fraction \(N/D\). The default value of D is \(1\), so the following syntax is valid:

auto v = power<3>(12);

which computes \(12^{3}\).

The power function is optimised for the following values of the denominator:

2.2.1.2 A generic framework to build tensorial-like objects

This section first introduces the notion of array policy and indexing policy which describes respectively how data are handled (stored) and how those data are accesssed.

Common objects of rank \(1\) are then defined, including tiny vectors, symmetric tensors and unsymmetric tensors. Higher order objects are defined recursively as derivative of objects of lower ranks.

2.2.1.2.1 Description of the ArrayPolicy concept

Array policies describes how data are handled and accessed. For the latter, the description of the access partern is delegated to an indexing policy which will be described later in Section 2.2.1.3. Thus, this section mostly focuses on how array policies describe how the data are handled.

A class matching the ArrayPolicy concept must provide:

It is worth illustrating those concepts in two commonly used cases:

2.2.1.2.2 Standard array policies

The StandardArrayPolicy class is based in the following statements:

Standard array policies are which are used to implement concrete mathematical objects such as tvector, tmatrix, stensor, tensor, etc.

Note

Contrary to view array policies described in the next paragraph, the storage_type is not restricted to be a basic numeric type (float, double or long double).

2.2.1.2.3 View array policies

Views generally maps a memory location made of basic numeric types (float, double or long double) to a mathematical objects.

To support quantities, the type stored (storage_type) must be distinguished from the logical type of the mapped object (value_type). The reference type is then a simple wrapper around a numeric value which acts as the desired quantity.

For example, let us consider the case of a view mapping a stress symmetric tensor from an array of double precision numbers. In this case, storage_type is the base type of value_type, so:

This mostly describes the implementation of the ViewArrayPolicy class.

However, those rules only applies when storage_type is a quantity. Otherwise, i.e. when value_type is a basic numeric type, the implementation of the ViewArrayPolicy class is equivalent to the StandardArrayPolicy class.

2.2.1.3 Description of the IndexingPolicy concept

A class matching the IndexingPolicy concept must provide:

2.2.1.3.1 An example of indexing policy
Storage of the elements of a matrix using the row-major format
Figure 1: Storage of the elements of a matrix using the row-major format

Let us consider an \(N\,\times\,M\) matrix stored in a contiguous memory location using the row major format, i.e. all elements of the same row are stored continuously as depicted on Figure 1. The position of \(i,j\) element from the start of the memory area is given by \(i \, M + j\).

Let us now consider a view on a \(L\,\times\,K\) submatrix in a which starts at row \(i0\) and column \(j0\). The position of the \(i,j\) element of the submatrix in the matrix is also given by \(i \, M + j\) if we start from the first element of the submatrix.

This formula is implemented by the getIndex method of the FixedSizeRowMajorMatrixIndexingPolicy which is used by many classes provide by the library, such as tmatrix (matrices whose size is known at compile time), st2tost2, t2tot2, etc.. and views to those objects in a matrix.

2.2.1.3.2 Higher order indexing policies for fixed-size mathematical objects

Let us consider an mathematical object \(o_{3}\) resulting from the derivation of an mathematical object \(o_{1}\) of arity \(a_{1}\) with respect to an mathematical object \(o_{2}\) of arity \(a_{2}\). \(o_{3}\) has an arity of \(a_{1}+a_{2}\).

The storage of those objects are described respectively by the indexing policies \(p_{1}\) and \(p_{2}\). We assume that the object \(o_{1}\) can be stored in a memory area of size \(s_{1}\) and that the object \(o_{2}\) can be stored in a memory area of size \(s_{2}\). Then the object \(o_{3}\) can be stored in a memory location \(s_{1}\,s_{2}\).

Then, an indexing policy \(p_{3}\) suitable to describe \(o_{3}\) may compute the position of the derivative of the component \(o_{1}{\left(i_{0},\ldots,i_{a_{1}-1}\right)}\) with respect to the component \(o_{2}{\left(j_{0},\ldots,j_{a_{2}-1}\right)}\) is given by:

\[ p_{3}{\left(i_{0},\ldots,i_{a_{1}-1},j_{0},\ldots,j_{a_{2}-1}\right)}= p_{1}{\left(i_{0},\ldots,i_{a_{1}-1}\right)}\,s_{2}+p_{2}{\left(j_{0},\ldots,j_{a_{2}-1}\right)} \]

This choice is implemented in the FixedSizeIndexingPoliciesCartesianProduct class which is used to by the FixedSizeArrayDerivative class to describe derivatives of two arbitrary mathematical objects.

2.2.1.4 Low rank mathematical objects

The library is based on a few low rank mathematical objects:

All those objects are represented by objects of rank one using a vector-like notations. For example, a \(3D\) symmetric tensor is represented as follows:

\[ \underline{s}= \begin{pmatrix} s_{\,11}\quad s_{\,22}\quad s_{\,33}\quad \sqrt{2}\,s_{\,12}\quad \sqrt{2}\,s_{\,13}\quad \sqrt{2}\,s_{\,23} \end{pmatrix}^{T} \]

This notations has the property that the contracted product of two symmetric tensors is the scalar product of this vector form (hence the \(\sqrt{2}\)).

In a similar manner, an unsymmetric tensor is represented as follows: \[ \underline{s}= \begin{pmatrix} s_{\,11}\quad s_{\,22}\quad s_{\,33}\quad s_{\,12}\quad s_{\,21}\quad s_{\,13}\quad s_{\,31}\quad s_{\,23}\quad s_{\,32} \end{pmatrix}^{T} \]

2.2.1.5 Higher order objects defined as derivatives

The library provides a generic class called FixedSizeArrayDerivative which allows to create higher order objects as being the derivative of two objects of lowest ranks.

This class is currently very limited but will be extended in future versions of the library.

2.2.1.5.1 The case of fourth order tensors

Fourth order tensors can be defined as derivatives of two tensors or as linear mappings from the second order tensors to second order tensors.

As there are two kinds of second order tensors (i.e. symmetric and non symmetric tensors), there are four kinds of fourth order tensors defined in the TFEL/Math library, which satisfy the following concepts:

An end user will mostly use the following implementations of those concepts: st2tost2, st2tot2, t2tost2 and t2tot2 respectively. Those classes have the following template arguments:

2.2.1.5.2 The tmatrix case

A fixed size matrix, implemented by the tmatrix class can be seen as the derivative of a tiny tensor with respect to a tiny vecor.

2.2.1.5.3 The derivative_type metafunction

The TFEL/Math library provides a very convenient type alias called derivative_type which automatically selects the correct type as the derivative of two objects of fixed sizes. This alias also works with scalars. This type alias takes quantities into account if required.

Here are a few examples:

2.2.1.6 Expressions templates

One may expect that the addition of two vector results in a new vector. This naive approach may lead to poor performances, due to temporaries objects and data copying [2].

Expression templates is a C++ template metaprogramming technique which introduces additional classes which represent the actions to be performed on some objects and lazily delay the execution of those actions until the result is explicitly requested.

The objects of those classes are placeholders, also called handlers within the library, that are meant to be assigned to an object whose type is the expected result of the operation treated.

To illustrate this technique, let us consider the addition of three vectors \(\vec{a}\), \(\vec{b}\) and \(\vec{c}\) and its assignment to a vector \(\vec{d}\):

d = a + b + c;

The addition of the vectors \(\vec{a}\) and \(\vec{b}\) produces an intermediate object \(\vec{e1}\) of type Expr1 which keeps a reference to those two vectors. Similarly, the addition of those three vectors defines another object \(\vec{e2}\) of type Expr2 which stands for the addition of \(\vec{e1}\) and the vector \(\vec{c}\).

Application of the expression templates technique to the addition of three vectors.
Figure 2: Application of the expression templates technique to the addition of three vectors.

Figure 2 shows how the access operator of e2 is implemented. The assignment of an object of type Expr2 to a vector d is implemented as a standard for loop:

for(size_type i=0;i!=d.size();++i){
  d(i) = e2(i);
}

The temporary objects e1 and e2 are meant to be eliminated by the compiler optimisation process. Thanks to function inlining, the compiler is able to produce a code that is equivalent to what would have been obtained with the following instructions:

for(size_type i=0;i!=d.size();++i){
  d[i] = a[i] + b[i] + c[i];
}

About loop unrolling

When dealing with objects whose size is known at compile-time, the TFEL/Math library also performs an additional optimisation technique known as loop unrolling. For example, if \(\underline{a}\), \(\underline{b}\) and \(\underline{c}\) are three \(1D\) symmetric tensors, the code:

c = a + 2 * b;

is equivalent to:

c[0] = a[0] + 2 * b[0];
c[1] = a[1] + 2 * b[1];
c[2] = a[2] + 2 * b[2];
2.2.1.6.1 The eval function

One possible caveat of the expression template technique can be illustrated by the following example. Let us consider two vectors a and b:

a[0] = 1;
b[0] = 2;
auto c = a + b;
// Here, c[0] evaluates to 3, as expected.
a[0] = 2;
// However, at this stage, c[0] evaluates to 4 !

Another caveat is that is sometimes more efficient to evaluate the result of an operation once and use the result of this evaluation rather than performing the evaluation of the operation several times.

To avoid those two caveats, the eval function allows the evaluation of an expression, as follows:

const auto c = eval(a + b);

2.2.1.7 Views

Views allows to map memory area to mathematical objets.

Typical usage of views in MFront

A typical usage of views is given by the example of the integration of a behaviour using an implicit scheme. In such a scheme, a non linear solver tries to determine the increments \(\Delta\,Y\) of a set of internal state variables packed in a vector \(Y\) which are the zero of residual denoted \(F\). The derivative of the residual \(F\) with respect to \(\Delta\,Y\) is called the jacobian and is denoted \(J\).

If one considers a simple plastic law with isotropic hardening, the vector of internal state variables typically contains the elastic strain, a symmetric tensor, and the equivalent plastic strain, a scalar.

In the general case, the vector of internal state variables, the residual and the jacobian can be decomposed as follows:

\[ Y= \begin{pmatrix} y_{1} \\ \vdots \\ y_{i} \\ \vdots \\ y_{n} \\ \end{pmatrix} \quad\quad F= \begin{pmatrix} f_{y_{1}} \\ \vdots \\ f_{y_{i}} \\ \vdots \\ f_{y_{n}} \\ \end{pmatrix} \quad\quad J = {\displaystyle \frac{\displaystyle \partial F}{\displaystyle \partial \Delta\,Y}} = \begin{pmatrix} {\displaystyle \frac{\displaystyle \partial f_{y_{1}}}{\displaystyle \partial \Delta\,y_{1}}} & \ldots & \ldots & \ldots & \ldots \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ \vdots & \vdots & {\displaystyle \frac{\displaystyle \partial f_{y_{i}}}{\displaystyle \partial \Delta\,y_{j}}} & \vdots & \vdots \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ \ldots & \ldots & \ldots & \ldots & {\displaystyle \frac{\displaystyle \partial f_{y_{N}}}{\displaystyle \partial \Delta\,y_{N}}} \\ \end{pmatrix} \]

MFront automatically define views to the objects \(y_{i}\), \(f_{y_{i}}\), \({\displaystyle \frac{\displaystyle \partial f_{y_{i}}}{\displaystyle \partial \Delta\,y_{j}}}\) which allows to compute the residual blocks and the jacobian blocks in a very intuitive ways using tensorial operations. Hence, the user never manipulate directly the vectors \(Y\), \(\Delta\,Y\) and \(F\) nor the jacobian matrix \(J\) but views which acts as tensorial objects.

2.2.1.7.1 The map function

The map function is a small utility function which simplifies the creation of views from either raw pointers or from tiny vectors (i.e. objects of type tvector).

For example, Listing lst. 4 shows how a vector containing the elastic strain and the equivalent plastic strain can be decomposed by blocks. This decomposition is illustrated on Figure 3.

Listing 4: Usage of the `map` function

auto Y = tvector<7, double>{0};
auto eel = map<stensor<3u, double>>(Y);
auto& p = Y[6];
“Exemple of decomposition by blocks of a memory area using views”
Figure 3: “Exemple of decomposition by blocks of a memory area using views”

The map function allows to define offset at compile-time: this allows to checks at compile-time that the memory area is large enough to store the mapped object when mapping a fixed size object (i.e. an object whose size is known at compile-time) from a memory area hold by a tiny vector.

2.2.1.7.2 The map_array function

The map_array returns an object which acts like a fixed size of mathematical objects. It takes one template argument which describes an arry of of mathematical objects. This template argument is used to determine the number of object mapped and the kind of object mapped.

It can be used as follows:

auto a = map_array<tvector<2u, stensor<2u, double>>>(ptr);

where ptr is a pointer to a suitable memory location.

2.2.1.7.3 The map_derivative function

The map_derivative function allows to create a view of the derivative of two math objects in a matrix as illustrated by the following example which create a view of the object resulting of the derivation of a symmetric stress tensor with respect to a scalar whose first element is located in element \(1,1\) of the matrix:

auto r = tmatrix<5, 3>{0};
map_derivative<1, 1, stensor<2u, qt<Stress>>, double>(r) = stensor<2u, Stress>::Id();

The result of this operation is the matrix:

\[ \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ \end{pmatrix} \]

2.2.2 Solvers for fixed size non linear systems

The TFEL/Math library provides several non linear solvers which are mainly used by MFront implicit schemes. Those solvers mostly targets systems of small size and focuse on robustness and flexibility. Each solver implements a classical algorithm by also provides several customisation points which allows to create many variants of this basic algorithms.

The solvers available are based on the following algorithms:

Those classes implements the curiously recurring template pattern (CRTP) to avoid the use of virtual calls [3]. The derived class must provide a method called computeResidual which must compute the residual for the current estimate of the unknowns and, if required by the solver, the jacobian.

A typical usage of those classes is given by the following example:

struct NewtonRaphsonSolver
    : public tfel::math::
          TinyNewtonRaphsonSolver<2u, double, NewtonRaphsonSolver> {
  NewtonRaphsonSolver() {
    this->zeros = {0., 0.};
    this->epsilon = 1.e-14;
    this->iterMax = 20;
  }

  bool solve() { return this->solveNonLinearSystem(); }

  auto getCurrentEstimate() const noexcept { return this->zeros; }

  bool computeResidual() noexcept {
    constexpr double a = 1.;
    constexpr double b = 10.;
    auto& x = this->zeros;
    auto& f = this->fzeros;
    auto& J = this->jacobian;
    f = {1 - x(0), 100 * (x(1) - x(0) * x(0))};
    J = {-1., 0.,  //
         -200 * x(0), 100.};
    return true;
  }  // end of computeResidual

};  // end of struct NewtonRaphsonSolver

which solves the non linear system:

\[ \vec{f}{\left(x,y\right)} = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix} \quad\text{with}\quad \vec{f}{\left(x,y\right)} = \begin{pmatrix} 1-x \\ 10\,{\left(y-x^{2}\right)} \\ \end{pmatrix} \]

whose obvious root is \({\left(1,1\right)}\).

This previous example shows that:

  1. The current estimate of the unknowns is stored in a data member called zeros.
  2. The residual is stored in a data member called fzeros.
  3. The jacobian is stored in a data member called jacobian.

Those names have been chosen in the early versions of MFront and are kept for back-ward compatibility.

This section describes the generic framework used to implement those algorithms.

2.2.2.1 The TinyNonLinearSolverBase class

All the available solvers are based on the TinyNonLinearSolverBase which provides two main methods solveNonLinearSystem and solveNonLinearSystem2 for the end users and many methods that can be overloaded to customize the behaviour of the algorithm.

In pratice, the NewtonRaphsonSolver class presented in the previous example inherits from TinyNewtonRaphson Solver<2u, double, NewtonRaphsonSolver> which itself inherits from TinyNonLinearSolverBase<2u, double, NewtonRaphsonSolver>.

While a bit contrieved, this design ensures that all algorithms share the same customization points, a constraint which is not easy to enforce when relying on CRTP (compared to standard approach based on virtual calls).

As a consequence, the TinyNewtonRaphsonSolver class only implements some methods specific to the Newton-Raphson algorithm while the global algorithmic structure in handled by the TinyNonLinearSolverBase class.

More precisely, the TinyNonLinearSolverBase provides a method called solveNonLinearSystem. This method internally calls a method called solveNonLinearSystem2 which indeed implements the core of the resolution algorithm. The solveNonLinearSystem2 method is described in depth in Section 2.2.2.1.2.

The point of the solveNonLinearSystem is to handle failures of the resolution algorithms. The current strategy to handle those failures is described in depth in Section 2.2.2.1.3.

2.2.2.1.1 Some data members of the TinyNonLinearSolverBase class

The TinyNonLinearSolverBase class provides the following data members:

It is worth metionning that a few variables must be initalized, by the base class before calling this method, such as:

The iter and has_delta_zeros members are automatically initialized at the beginning of the solveNonLinearSystem method. This method also calls the processNewEstimate method.

2.2.2.1.2 The solveNonLinearSystem2 method

This method is called internally by the solveNonLinearSystem method. It could be called directly if the required initialization are performed beforehand.

The algorithm implemented by solveNonLinearSystem2 method is depicted in Figure 4.

“Flowchart for the resolution of non linear systems proposed by the solveNonLinearSystem2 of the TinyNonLinearSolverBase class”
Figure 4: “Flowchart for the resolution of non linear systems proposed by the solveNonLinearSystem2 of the TinyNonLinearSolverBase class”

The computeNewCorrection is the only method that must be implemented in the derived class. It totally defines the resolution algorithm. This method thus has no default implementation.

This flowchart also shows that the solveNonLinearSystem method has many customization points which either defines the underlying algorithm and or that can be use to enhance directly affects the performance and the robustness of the algorithm:

The algorithm also provides methods which are meant to display informations about the state of the resolution:

Note that other methods for reporting the current status of the algorithm, such as reportFailure and reportSuccess are also available but are called in the solveNonLinearMethod.

Usage in MFront

It is worth hilighting how the implicit DSL’s provided by MFront may override those methods:

2.2.2.1.3 The solveNonLinearSystem method

As highlighted by Figure 4, the solveNonLinearSystem2 methods may fail for many reasons. A very common one is that the current estimate of the unknowns are unphysical, leading to a failure in the evaluation of the residual.

The solveNonLinearSystem method implements a simple algorithm which can be seen as a hand crafted line search method which greatly improve the robustness of the non linear solvers. When used correctly, this method may also be used to increase the performances of the non linear solvers (i.e. reduce the total number of iterations).

“Flowchart for the resolution of non linear systems proposed by the solveNonLinearSystem of the TinyNonLinearSolverBase class”

The idea of this hand-crafted linesearch is simply to take the last correction to the unknowns and divide it by two and restart the core algorithm. In other words, the search direction is leaved unchanged, but the norm of the correction is reduced. This operation can be repeated several times to find a suitable estimate of the unknowns.

Of course, this only works a correction is known, i.e. if the has_delta_zeros flag is true. Otherwise, this means that the initial guess of the unknowns is incorrect. In this case, we just divide this initial guess by zero. This choice may seem arbitrary but does makes sense in most cases of MFront implicit schemes where the unknowns are almost always the increment of the state variables: if we divide the increments of the state variables, their estimates at the middle of the time step tends to theirs values at the beginning of the time step, which is generally physically acceptable2.

Pratical importance

While very simple, the strategy described in this section is in practice extremly powerful and can be used to easily build very robust and efficient algorithms based on physical considerations. Let us consider a few example:

The reader may find an example of such algorithms in the case of a perfect plastic behaviour based on the Hosford stress criterion on this page of the MFront gallery: http://tfel.sourceforge.net/hosford.html

An important caveat to this strategy is link to the use of an active-set method to describe multi-surface plasticity. The active-set method makes a priori assumptions on which plastic mechanisms are active and solve the non-linear equations with those assumptions. After convergence, those assumptions are checked and some mechanisms may be activated or desactivated and the non linear solver is restarted. The activation of a plastic mechanism can lead the computeResidual method to fail (as described earlier, rejecting steps leading to a prediction well beyond the current plastic limit is generally a good strategy). The trouble here is that the last correction computed by the solver is very small since the algorithm had converged. Thus, the strategy implemented by the solveNonLinearSystem method would divide a correction that is already almost null. To avoid this caveat, MFront automatically resets the is_delta_zeros_defined member to false. More precisely, here is the implementation of the checkConvergence method when the user has defined additional convergence checks:

bool checkConvergence(const NumericType error) {
  auto converged = error < this->epsilon;
  auto mfront_internals_converged = converged;
  this->additionalConvergenceChecks(converged, error);
  if((mfront_internals_converged) && (!converged)){
    this->is_delta_zeros_defined = false;
  }
}

3 New MFront features

3.1 Improvements to implicit DSLs

3.1.1 New keywords

3.1.1.1 The @ProcessNewCorrection keyword

The @ProcessNewCorrection keyword introduces a code block called when a new correction of the increments of the integration variables is available.

This method can be used to:

This increment is stored in an array called delta_zeros. The delta_zeros variable is not meant to be used directly by the users and views to the corrections of the increments of the integration variables are automatically declared in this code block.

Let v be an integration variable, then the variable delta_dv is a view to the correction of the increment of this variable. If unicode notations are used, let υ be the symbolic name of v, then δΔv is an alias for delta_dv.

The increments of the integration variables are not updated at this stage of the resolution algorithm.

3.1.1.1.1 Example

The following code limits the amplitude of the correction given to the increment of the elastic strain:

@ProcessNewCorrection{
 constexpr const real δΔεᵉˡ_m = 1.e-4;
 const auto e = abs(δΔεᵉˡ);
 if(e > δΔεᵉˡ_m){
   δΔεᵉˡ *= e / δΔεᵉˡ_m;
 }
}

3.1.1.2 The @ProcessNewEstimate keyword

The @ProcessNewEstimate keyword introduces a code block called after the update of the increments of the integration variables.

This method may be used to compute local variables dependent on the updated value of the integration variables.

For example, MFront may define or update this code block to evaluate material properties dependent on the value of the state variable (for example, a Young modulus depending of the porosity), if any.

4 Issues fixed

4.1 Issue:#277: [material properties] Support for quantities

The UseQt keyword now allows to turn on support for quantities in material properties for interfaces that supports it.

All interfaces delivered by MFront have proper support for quantities.

For more details, see : https://sourceforge.net/p/tfel/tickets/277/

4.2 Issue #276: Support for quantities in TFEL/PhysicalConstants

The PhysicalConstants class now have an additional boolean template parameter stating if quantities are to be used. For backward compatibility, this boolean value is false by default.

The inline variables in the tfel::constants now also have a similar template parameter.

For more details, see : https://sourceforge.net/p/tfel/tickets/276/

4.3 Issue #275: [material properties] Define standard MFront scalar types

For consistency with behaviours, aliases to many scalar types are now automatically defined in material properties, such as:

A complete list of those aliases can be found on this page.

For more details, see : https://sourceforge.net/p/tfel/tickets/275/

4.4 Issue #160: Add the ability to change the linear solver used by the Implicit DSL

Changing the linear solver can now be done by defining an user defined algorithm. The general framework to a new linear solver is documented here: http://tfel.sourceforge.net/tfel-math.html

For more details, see : https://sourceforge.net/p/tfel/tickets/160/

References

1.
Abrahams, David and Gurtovoy, Aleksey. C++ template metaprogramming: Concepts, tools, and techniques from boost and beyond. Boston : Addison-Welsley, 2004. ISBN 0321227255 9780321227256.
2.
Veldhuizen, Todd. Techniques for Scientific C++. 1999.
3.
Coplien, James O. Curiously recurring template patterns. C++ Report. 1995. Vol. 7, no. 2, p. 24–27.
4.
Abbo, A. J. and Sloan, S. W. A smooth hyperbolic approximation to the Mohr-Coulomb yield criterion. Computers & Structures. February 1995. Vol. 54, no. 3, p. 427–441. DOI 10.1016/0045-7949(94)00339-5. Available from: http://linkinghub.elsevier.com/retrieve/pii/0045794994003395
5.
Nagel, Thomas, Minkley, Wolfgang, Böttcher, Norbert, Naumov, Dmitri, Görke, Uwe-Jens and Kolditz, Olaf. Implicit numerical integration and consistent linearization of inelastic constitutive models of rock salt. Computers & Structures. April 2017. Vol. 182, p. 87–103. DOI 10.1016/j.compstruc.2016.11.010. Available from: http://www.scopus.com/inward/record.url?eid=2-s2.0-85006482432{\&}partnerID=MN8TOARS http://linkinghub.elsevier.com/retrieve/pii/S0045794916306319