C++ – Identifying Undefined Behavior in Code

c++ undefined-behavior

(I am aware of the fact that returning address/reference to a variable local to the function should be avoided and a program should never do this.)


Does returning a reference to a local variable/reference result in Undefined Behavior? Or does the Undefined Behavior only occur later, when the returned reference is used (or "dereferenced")?

That is, at which exact statement (#1, #2 or #3) does the code sample below invoke Undefined Behavior? (I've written my theory alongside each one.)

#include <iostream>

struct A
{
    int m_i;
    A() : m_i(10) {}
};

A& foo()
{
    A a;
    a.m_i = 20;
    return a;
}

int main()
{
    foo();                 // #1 - Not UB; return value was never used
    A const &ref = foo();  // #2 - Not UB; return value still not yet used
    std::cout << ref.m_i;  // #3 - UB: returned value is used
}

I am interested to know what the C++ standard specifies in this regard.

I would like a citation from the C++ standard that tells me which exact statement gives this code undefined behavior.

Discussions about how specific implementations handle this are welcome, but as I said, an ideal answer would cite a passage from the C++ Standard that clarifies this beyond doubt.

Best Answer

Of course, when the reference is first initialised, it is initialised validly, satisfying the following:

[C++11: 8.3.2/5]: There shall be no references to references, no arrays of references, and no pointers to references. The declaration of a reference shall contain an initializer (8.5.3) except when the declaration contains an explicit extern specifier (7.1.1), is a class member (9.2) declaration within a class definition, or is the declaration of a parameter or a return type (8.3.5); see 3.1. A reference shall be initialized to refer to a valid object or function. [ Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior. As described in 9.6, a reference cannot be bound directly to a bit-field. —end note ]

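The expression foo() is not a prvalue, however. Because foo returns an lvalue reference (A&), the call is an lvalue ([C++11: 5.2.2/10]); had it returned an rvalue reference, the call would instead be an xvalue: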

[C++11: 3.10/1]: [..] An xvalue (an “eXpiring” value) also refers to an object, usually near the end of its lifetime (so that its resources may be moved, for example). An xvalue is the result of certain kinds of expressions involving rvalue references (8.3.2). [ Example: The result of calling a function whose return type is an rvalue reference is an xvalue. —end example ] [..]

That means the following does not apply:

[C++11: 12.2/1]: Temporaries of class type are created in various contexts: binding a reference to a prvalue (8.5.3), returning a prvalue (6.6.3), a conversion that creates a prvalue (4.1, 5.2.9, 5.2.11, 5.4), throwing an exception (15.1), entering a handler (15.3), and in some initializations (8.5).

[C++11: 6.6.3/2]: A return statement with neither an expression nor a braced-init-list can be used only in functions that do not return a value, that is, a function with the return type void, a constructor (12.1), or a destructor (12.4).

A return statement with an expression of non-void type can be used only in functions returning a value; the value of the expression is returned to the caller of the function. The value of the expression is implicitly converted to the return type of the function in which it appears. A return statement can involve the construction and copy or move of a temporary object (12.2). [ Note: A copy or move operation associated with a return statement may be elided or considered as an rvalue for the purpose of overload resolution in selecting a constructor (12.8). —end note ] A return statement with a braced-init-list initializes the object or reference to be returned from the function by copy-list-initialization (8.5.4) from the specified initializer list. [ Example:

std::pair<std::string,int> f(const char* p, int x) {
   return {p,x};
}

—end example ]

Additionally, even if we interpret the following to mean that an initialisation of a new reference "object" is performed, the referent is probably still alive at the time:

[C++11: 8.5.3/2]: A reference cannot be changed to refer to another object after initialization. Note that initialization of a reference is treated very differently from assignment to it. Argument passing (5.2.2) and function value return (6.6.3) are initializations.

  • This makes #1 valid.

However, your initialisation of a new reference ref inside main quite clearly violates [C++11: 8.3.2/5], because the local object a no longer exists at that point. I can't find explicit wording for it, but it stands to reason that the function's scope has been exited by the time the initialisation is performed.

  • This would make #2 (and consequently #3) invalid.

At the very least, the standard does not appear to state anything further on the matter, so if the above reasoning is not sufficient, we have to conclude that the standard is ambiguous on this point. Fortunately, it's of little consequence in practice, at least in the mainstream.
