Compiler making a wrong assumption on stack alignment

It appears that this was a GCC bug that has been fixed in GCC 4.5. If you hit this issue, please upgrade to GCC 4.5 and report to us, so we can update this page.

This is an issue that, so far, we met only with GCC on Windows: for instance, MinGW and TDM-GCC.

By default, in a function like this,

void foo()
{
//...
}
The quaternion class used to represent 3D orientations and rotations.
Definition: Quaternion.h:276

GCC assumes that the stack is already 16-byte-aligned so that the object q will be created at a 16-byte-aligned location. For this reason, it doesn't take any special care to explicitly align the object q, as Eigen requires.

The problem is that, in some particular cases, this assumption can be wrong on Windows, where the stack is only guaranteed to have 4-byte alignment. Indeed, even though GCC takes care of aligning the stack in the main function and does its best to keep it aligned, when a function is called from another thread or from a binary compiled with another compiler, the stack alignment can be corrupted. This results in the object 'q' being created at an unaligned location, making your program crash with the assertion on unaligned arrays. So far we found the three following solutions.

Local solution

A local solution is to mark such a function with this attribute:

__attribute__((force_align_arg_pointer)) void foo()
{
//...
}
svint32_t PacketXi __attribute__((arm_sve_vector_bits(EIGEN_ARM64_SVE_VL)))

Read this GCC documentation to understand what this does. Of course this should only be done on GCC on Windows, so for portability you'll have to encapsulate this in a macro which you leave empty on other platforms. The advantage of this solution is that you can finely select which function might have a corrupted stack alignment. Of course on the downside this has to be done for every such function, so you may prefer one of the following two global solutions.

Global solutions

A global solution is to edit your project so that when compiling with GCC on Windows, you pass this option to GCC:

-mincoming-stack-boundary=2

Explanation: this tells GCC that the stack is only required to be aligned to 2^2=4 bytes, so that GCC now knows that it really must take extra care to honor the 16 byte alignment of fixed-size vectorizable Eigen types when needed.

Another global solution is to pass this option to gcc:

-mstackrealign

which has the same effect than adding the force_align_arg_pointer attribute to all functions.

These global solutions are easy to use, but note that they may slowdown your program because they lead to extra prologue/epilogue instructions for every function.