C Programming – Relying on Machine-Dependent Behavior for Undefined Behavior


Generally, UB is regarded as something to be avoided, and the current C standard itself lists quite a few examples in Annex J.

However, there are cases where I can see no harm in exploiting UB other than sacrificing portability.

Consider the following definition:

int a = INT_MAX + 1;

Evaluating this expression leads to UB. However, if my program is intended to run on, say, a 32-bit CPU that uses modular arithmetic on two's-complement values, I'm inclined to believe I can predict the outcome.
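For what it's worth, here is a minimal sketch (my own illustration, not part of the question) contrasting the undefined signed case with the wraparound the standard actually guarantees, which applies only to unsigned types:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* Undefined: signed overflow. On many two's-complement targets this
       happens to wrap to INT_MIN, but no conforming compiler is required
       to produce that result. */
    /* int a = INT_MAX + 1; */

    /* Defined: unsigned arithmetic wraps modulo UINT_MAX + 1 on every
       conforming implementation. */
    unsigned int b = UINT_MAX + 1u;   /* guaranteed to be 0 */

    printf("%u\n", b);
    return 0;
}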

In my opinion, UB is sometimes just the C standard's way of telling me: "I hope you know what you're doing, because we can't make any guarantees on what will happen."

Hence my question: is it safe to sometimes rely on machine-dependent behavior, even if the C standard considers it to invoke UB, or is "UB" really to be avoided, no matter what the circumstances are?

Best Answer

No, unless you're also keeping your compiler the same and your compiler documentation defines the otherwise undefined behavior.

Undefined behavior means that the compiler is free to assume it never happens and to transform your code on that assumption, so things you expect to be true may simply not be.
Sometimes this freedom is used for optimization, and sometimes it exists because of architectural differences such as how signed overflow behaves in hardware.
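As an illustration of that point (this example is mine, not the original answer's): a post-hoc overflow check that itself relies on signed overflow may be deleted outright, because the compiler is allowed to assume the overflow never happens.

#include <limits.h>

/* Broken: the condition can only be true if a + 1 already overflowed,
   which is undefined, so an optimizing compiler may fold it to false
   and remove the branch entirely. */
int add_one_unsafe(int a)
{
    int b = a + 1;
    if (b < a)
        return INT_MAX;   /* intended saturation; may never be reached */
    return b;
}

/* Defined: test before the operation so no overflow ever occurs. */
int add_one_safe(int a)
{
    if (a == INT_MAX)
        return INT_MAX;   /* saturate without invoking UB */
    return a + 1;
}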


I suggest you read this, which addresses your exact example. An excerpt:

Signed integer overflow:

If arithmetic on an int type (for example) overflows, the result is undefined. One example is that INT_MAX + 1 is not guaranteed to be INT_MIN. This behavior enables certain classes of optimizations that are important for some code.

For example, knowing that INT_MAX + 1 is undefined allows optimizing X + 1 > X to true. Knowing the multiplication "cannot" overflow (because doing so would be undefined) allows optimizing X * 2 / 2 to X. While these may seem trivial, these sorts of things are commonly exposed by inlining and macro expansion. A more important optimization that this allows is for <= loops like this:

for (i = 0; i <= N; ++i) { ... }

In this loop, the compiler can assume that the loop will iterate exactly N + 1 times if i is undefined on overflow, which allows a broad range of loop optimizations to kick in. On the other hand, if the variable is defined to wrap around on overflow, then the compiler must assume that the loop is possibly infinite (which happens if N is INT_MAX) - which then disables these important loop optimizations. This particularly affects 64-bit platforms since so much code uses int as induction variables.
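To make the loop point concrete, here is a small sketch of my own (not part of the quoted text) contrasting a signed induction variable with an unsigned one:

/* Signed induction variable: overflow of i would be undefined, so the
   compiler may assume the loop runs exactly n + 1 times and terminates. */
void fill_signed(double *a, int n)
{
    for (int i = 0; i <= n; ++i)
        a[i] = 0.0;
}

/* Unsigned induction variable: wraparound is well defined, so if n were
   UINT_MAX the loop would never terminate; the compiler must allow for
   that, which can block some loop transformations. */
void fill_unsigned(double *a, unsigned int n)
{
    for (unsigned int i = 0; i <= n; ++i)
        a[i] = 0.0;
}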
