C++ Performance – Passing by Value vs Reference vs Pointer

Tags: c++, pass-by-pointer, pass-by-reference, pass-by-value, pointers

Let's consider an object foo (which may be an int, a double, a custom struct, a class, whatever). My understanding is that passing foo by reference to a function (or just passing a pointer to foo) leads to higher performance since we avoid making a local copy (which could be expensive if foo is large).

However, from another answer it seems that pointers on a 64-bit system can be expected in practice to be 8 bytes in size, regardless of what they point to. On my system, a float is 4 bytes. Does that mean that if foo is of type float, it is more efficient to just pass foo by value rather than pass a pointer to it (assuming no other constraints that would make one more efficient than the other inside the function)?
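To make the comparison in the question concrete, here is a minimal sketch of the two ways of passing a float. The helper names `twice_by_value` and `twice_by_pointer` are made up for illustration, and the sizes are implementation-defined: 4 and 8 bytes are only what a typical 64-bit desktop platform reports.

```cpp
#include <cassert>
#include <cstddef>

// Implementation-defined sizes; commonly 4 and 8 on 64-bit platforms,
// but the C++ standard guarantees neither value.
constexpr std::size_t float_size = sizeof(float);
constexpr std::size_t float_ptr_size = sizeof(const float*);

// Hypothetical helper: receives a copy of the float itself.
float twice_by_value(float x) { return x * 2.0f; }

// Hypothetical helper: receives the address and must load through it.
float twice_by_pointer(const float* x) { return *x * 2.0f; }
```

Both versions compute the same result. The by-pointer version additionally needs the float to have an address, and the callee has to dereference it.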

Best Answer

This answer applies to static functions, or to builds with LTO (link-time optimization). It does not apply in general across boundaries where the calling convention has to be followed, such as between a library and the main executable or, without LTO, between source files.


GCC has an optimization called IPA-SRA that automatically replaces "pass by reference" with "pass by value": https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html (-fipa-sra)

This is most likely done for scalar types (e.g. int, double), which do not have non-default copy semantics and fit into CPU registers.

This makes

void g(const int &f);

probably as fast (and as space-efficient) as

void g(int f);

So with this optimization enabled, using references for small types should be as fast as passing them by value.
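A minimal sketch of the situation where this can happen, assuming the functions have internal linkage so the compiler sees every caller. The names `add_one_ref`, `add_one_val` and `run_both` are hypothetical. Whether GCC actually rewrites the reference parameter depends on the compiler version and flags, so compare the generated assembly (for example with `g++ -O2 -S`) rather than taking it for granted.

```cpp
#include <cassert>

namespace {

// Internal linkage: no other translation unit can call these, so the
// compiler is free to change how the argument is actually passed.
int add_one_ref(const int& x) { return x + 1; }

// The hand-written by-value equivalent, for comparison.
int add_one_val(int x) { return x + 1; }

}  // namespace

// Externally visible caller that uses both variants.
int run_both(int n) { return add_one_ref(n) + add_one_val(n); }
```

The two helpers are interchangeable at the source level. The point of the optimization is that they may also end up interchangeable in the generated code.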

On the other hand, passing (for example) a std::string by value cannot be optimized to by-reference speed, because its custom copy semantics are involved.
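The cost of those copy semantics can be made visible with a small sketch. `Tracked` is a hypothetical stand-in for any class with a user-defined copy constructor (std::string itself gives no way to count its copies), and `copy_count` is a global added purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Illustration-only counter of copy-constructor calls.
int copy_count = 0;

// Hypothetical type with custom copy semantics, wrapping a std::string.
struct Tracked {
    std::string payload;
    explicit Tracked(std::string s) : payload(std::move(s)) {}
    Tracked(const Tracked& other) : payload(other.payload) { ++copy_count; }
};

// Passing an lvalue here runs the copy constructor once per call.
std::size_t length_by_value(Tracked t) { return t.payload.size(); }

// Passing by const reference performs no copy at all.
std::size_t length_by_ref(const Tracked& t) { return t.payload.size(); }
```

Calling `length_by_ref` with an existing object leaves `copy_count` unchanged, while calling `length_by_value` with the same object increments it by one.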

From what I understand, where this optimization applies, passing everything by reference should never be slower than manually picking what to pass by value and what to pass by reference.

This is especially useful for templates:

template<class T>
void f(const T&)
{
    // Something
}

is always optimal when the template inlines, or at least is private to the call sites using it, e.g. when you don't take the address of the template function and pass it as a callback.
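As a sketch of both cases, assuming a hypothetical template `is_default` and a hypothetical function-pointer alias `IntPredicate`: ordinary calls leave the compiler free to inline or rewrite the parameter passing, whereas taking the template's address fixes the by-reference signature for anyone calling through the pointer.

```cpp
#include <cassert>
#include <string>

// Generic code takes const T& without knowing whether T is cheap to copy.
template <class T>
bool is_default(const T& value) {
    return value == T{};
}

// Taking the address pins the signature: a caller going through this
// pointer must pass a real reference (an address), even for int.
using IntPredicate = bool (*)(const int&);

IntPredicate as_callback() { return &is_default<int>; }
```

Direct calls such as `is_default(0)` or `is_default(std::string("x"))` are candidates for inlining. The call through `as_callback()` is the case the caveat above describes.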