Memory Alignment – Comparison of Today and 20 Years Ago

assembly, c++, gcc, x86

In the famous paper "Smashing the Stack for Fun and Profit", its author takes a C function

void function(int a, int b, int c) {
  char buffer1[5];
  char buffer2[10];
}

and generates the corresponding assembly code output

pushl %ebp
movl %esp,%ebp
subl $20,%esp

The author explains that since computers address memory in multiples of word size, the compiler reserved 20 bytes on the stack (8 bytes for buffer1, 12 bytes for buffer2).

I tried to recreate this example and got the following

pushl   %ebp
movl    %esp, %ebp
subl    $16, %esp

A different result! I tried various combinations of sizes for buffer1 and buffer2, and it seems that modern gcc no longer pads buffer sizes to multiples of the word size. Instead it abides by the -mpreferred-stack-boundary option.

As an illustration: using the paper's arithmetic rules, for buffer1[5] and buffer2[13] I'd get 8+16 = 24 bytes reserved on the stack. But in reality I got 32 bytes.

The paper is quite old and a lot has happened since. I'd like to know what exactly motivated this change in behavior. Is it the move towards 64-bit machines? Or something else?

Edit

The code is compiled on an x86_64 machine using gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) like this:

$ gcc -S -o example1.s example1.c -fno-stack-protector -m32

Best Answer

What has changed is SSE, which requires 16-byte alignment. This is covered in this older gcc document for -mpreferred-stack-boundary=num, which says:

On Pentium and PentiumPro, double and long double values should be aligned to an 8 byte boundary (see -malign-double) or suffer significant run time performance penalties. On Pentium III, the Streaming SIMD Extension (SSE) data type __m128 suffers similar penalties if it is not 16 byte aligned.

This is also backed up by the paper Smashing The Modern Stack For Fun And Profit, which covers this and other modern changes that break Smashing the Stack for Fun and Profit.
