C++ no_unique_address – How to Determine Which Member Needs no_unique_address and Why in C++20

c++ c++20 language-lawyer

Consider the following two classes, whose sizes are 8 and 1 bytes respectively:

class blob;  // forward declaration; only used as a constructor parameter

class eight {
    int i;
    char c;

    eight(const blob&) {}
};

class one {
    char s;
    
    one(const blob&) {}
};

When they are embedded in another struct like this:

struct example1 {
    eight b0;
    one b1;
};

sizeof(example1) is 12 bytes because b0 and b1 occupy non-overlapping storage (8 + 1 = 9 bytes), and the 4-byte alignment requirement of eight then rounds that up to 12. See it on godbolt.

C++20 introduces the [[no_unique_address]] attribute, which allows an empty member to share its address with another member. It also explicitly allows the scenario described above: using the tail padding of one member to store another. From cppreference:

Indicates that this data member need not have an address distinct from all other non-static data members of its class. This means that if the member has an empty type (e.g. stateless Allocator), the compiler may optimise it to occupy no space, just like if it were an empty base. If the member is not empty, any tail padding in it may be also reused to store other data members.
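The empty-member case from that quote can be sketched as follows. The names empty_policy, plain_holder, compact_holder and report_holder_sizes are hypothetical, made up for this illustration, and the sizes printed depend on the compiler and ABI.

```cpp
#include <cstdio>

// Hypothetical stateless type: no non-static data members.
struct empty_policy {};

struct plain_holder {
    empty_policy policy;  // needs its own address, so it occupies at least one byte
    int value;
};

struct compact_holder {
    [[no_unique_address]] empty_policy policy;  // may share its address with `value`
    int value;
};

// Prints both sizes; the exact numbers depend on the compiler and ABI.
void report_holder_sizes() {
    std::printf("plain_holder:   %zu\n", sizeof(plain_holder));
    std::printf("compact_holder: %zu\n", sizeof(compact_holder));
}
```

Calling report_holder_sizes() from main shows whether the compiler in use actually applies the optimisation. The attribute only grants permission, so a compiler that ignores it is still conforming.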

If we use [[no_unique_address]] on the first member (b0, of type eight), the size drops to 8 bytes:

struct example2 {
    [[no_unique_address]] eight b0;
    one b1;
};

However, if we put it on the second member, the size is still 12 bytes (see it on godbolt):

struct example3 {
    eight b0;
    [[no_unique_address]] one b1;
};
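To compare the three layouts side by side, here is a minimal self-contained sketch. It repeats the question's classes and adds a forward declaration of blob plus a hypothetical report_sizes helper. The 12, 8 and 12 quoted in the question are what an Itanium-ABI compiler such as GCC or Clang on x86-64 would typically report. Other ABIs may differ, and some compilers (MSVC, for instance) may ignore the standard spelling of the attribute.

```cpp
#include <cstddef>
#include <cstdio>

class blob;  // forward declaration; only used as a constructor parameter type

class eight {
    int i;
    char c;  // typically followed by 3 bytes of tail padding

    eight(const blob&) {}
};

class one {
    char s;

    one(const blob&) {}
};

struct example1 { eight b0; one b1; };
struct example2 { [[no_unique_address]] eight b0; one b1; };
struct example3 { eight b0; [[no_unique_address]] one b1; };

// Prints the size of each struct; no objects are constructed.
void report_sizes() {
    std::printf("example1: %zu\n", sizeof(example1));
    std::printf("example2: %zu\n", sizeof(example2));
    std::printf("example3: %zu\n", sizeof(example3));
}
```

Only sizeof is used, so the private constructors of eight and one do not get in the way.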

Why?

Intuitively I would expect it to either be required on both ("both overlapped members must opt in") or on either one ("anyone can opt in and then overlapping is allowed"), but not for it to work on one and not the other.

Best Answer

To understand why it goes on the first member, you need to understand the reason why the compiler is forbidden from doing this without an explicit markup. One particular reason is listed in [basic.types]/2&3, which outlines the ability to do a memcpy from one trivially copyable object to another of the same type. This copy acts exactly like copy-assigning those objects:

For any object (other than a potentially-overlapping subobject) of trivially copyable type T, whether or not the object holds a valid value of type T, the underlying bytes (6.7.1) making up the object can be copied into an array of char, unsigned char, or std::byte (17.2.1). If the content of that array is copied back into the object, the object shall subsequently hold its original value.

For any trivially copyable type T, if two pointers to T point to distinct T objects obj1 and obj2, where neither obj1 nor obj2 is a potentially-overlapping subobject, if the underlying bytes (6.7.1) making up obj1 are copied into obj2, obj2 shall subsequently hold the same value as obj1.

This prevents the compiler from allowing b1 to take up storage in b0. Why? Because if it did, then doing a memcpy into b0 would modify b1. There's nothing in the above section which allows such a memcpy to affect objects other than the ones being copied. Therefore, the compiler is forbidden from allowing b1 to take up storage inside of b0.

But note that memcpying into b1 is fine. There's nothing wrong with that memcpy operation even if b1's storage is inside of b0.

You may have noticed that the cited text provides exceptions for "potentially-overlapping subobjects". Putting [[no_unique_address]] on a member variable makes that member a "potentially-overlapping subobject". The guarantees above then no longer cover it, so performing the above memcpy into it yields undefined behavior. And therefore, it is now OK for the compiler to use b0's storage for b1.

That's why the attribute goes on b0: because it is b0 that must be prevented from being used in certain ways, not b1.
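The byte ranges behind this argument can be inspected without performing any memcpy. The sketch below is illustrative only. It uses stand-in versions of eight and one with public members and default constructors (an assumption, to keep offsetof straightforward), and the helper copy_touches and the two constants are hypothetical names. Whether b1 really lands inside b0 in example2 depends on the compiler and ABI.

```cpp
#include <cstddef>
#include <cstdio>

// Stand-ins for the question's types; public members are an assumption of this sketch.
struct eight {
    int i;
    char c;
    eight() {}  // user-provided constructor, as in the question
};

struct one {
    char s;
    one() {}
};

struct example1 { eight b0; one b1; };
struct example2 { [[no_unique_address]] eight b0; one b1; };

// True if copying `len` bytes starting at offset `begin` would write the byte at `target`.
constexpr bool copy_touches(std::size_t begin, std::size_t len, std::size_t target) {
    return target >= begin && target < begin + len;
}

// Would a memcpy of sizeof(eight) bytes into b0 also overwrite b1?
constexpr bool b0_copy_hits_b1_in_example1 =
    copy_touches(offsetof(example1, b0), sizeof(eight), offsetof(example1, b1));
constexpr bool b0_copy_hits_b1_in_example2 =
    copy_touches(offsetof(example2, b0), sizeof(eight), offsetof(example2, b1));

// Prints the result for both layouts.
void report_overlap() {
    std::printf("example1: copy into b0 touches b1? %s\n",
                b0_copy_hits_b1_in_example1 ? "yes" : "no");
    std::printf("example2: copy into b0 touches b1? %s\n",
                b0_copy_hits_b1_in_example2 ? "yes" : "no");
}
```

For example1 the answer is always "no", because b0 is an ordinary member and nothing else may live in its bytes. If the compiler reuses the tail padding in example2, the answer there becomes "yes". That is exactly the copy that would clobber b1, which is why b0 must be the potentially-overlapping one. In the other direction, a copy of sizeof(one) bytes into b1 stays within b1's own byte.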