Prefer passing primitive types (int, char, float, ...) and POD structs that are cheap to copy (Point, complex) by value.
This will be more efficient than the indirection required when passing by reference.
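As a rough illustration of the advice above, here is a minimal sketch with a hypothetical two-float `Point` (the struct and function names are assumptions, not taken from any library). Whether the by-value form is actually faster depends on the ABI and on whether the call is inlined.

```cpp
#include <cassert>

// Hypothetical small POD type: two floats, cheap to copy.
struct Point {
    float x;
    float y;
};

// By value: the callee works on its own copy of each Point.
float dot_by_value(Point a, Point b) {
    return a.x * b.x + a.y * b.y;
}

// By const reference: same result, but a non-inlined callee may have to
// load the fields through an address.
float dot_by_ref(const Point& a, const Point& b) {
    return a.x * b.x + a.y * b.y;
}

// Both forms compute the same value; only the parameter passing differs.
void check_dot() {
    Point a{1.0f, 2.0f};
    Point b{3.0f, 4.0f};
    assert(dot_by_value(a, b) == dot_by_ref(a, b));
}
```

The two functions are interchangeable at the call site, so switching between them later is a local change.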
See Boost's Call Traits.
The template class call_traits<T> encapsulates the "best" method to pass a parameter of some type T to or from a function, and consists of a collection of typedefs (listed in a table in the Boost documentation). The purpose of call_traits is to ensure that problems like "references to references" never occur, and that parameters are passed in the most efficient manner possible.
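The idea of choosing the parameter form from the type can be sketched with the standard library alone. The trait below, `param_type_sketch`, is a hypothetical stand-in written for illustration. It is not Boost's implementation, and it only approximates the rule "small built-in types and pointers by value, everything else by const reference".

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical, simplified stand-in for the idea behind a "best parameter type" trait.
// Arithmetic types and pointers are passed by (const) value, everything else by const&.
template <class T>
struct param_type_sketch {
    using type = typename std::conditional<
        std::is_arithmetic<T>::value || std::is_pointer<T>::value,
        const T,
        const T&>::type;
};

static_assert(std::is_same<param_type_sketch<int>::type, const int>::value,
              "int is passed by value");
static_assert(std::is_same<param_type_sketch<std::string>::type,
                           const std::string&>::value,
              "std::string is passed by const reference");

// The parameter is a non-deduced context, so T must be given explicitly.
template <class T>
T twice(typename param_type_sketch<T>::type x) {
    return x + x;
}

void check_twice() {
    assert(twice<int>(21) == 42);
    assert(twice<std::string>("ab") == "abab");
}
```

One cost of this design is visible in `twice`: because the parameter type is computed from `T`, callers lose template argument deduction and have to write `twice<int>(...)`.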
This answer applies to static functions, or with LTO (link-time optimization), but not in general across boundaries where the calling convention has to be followed (for example between libraries and the main executable or, without LTO, between source files).
There is a GCC optimization called IPA-SRA (-fipa-sra) that automatically replaces "pass by reference" with "pass by value": https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
This is most likely done for scalar types (e.g. int, double) that do not have non-default copy semantics and can fit into CPU registers.
This makes void(const int &f) probably as fast (and as space-efficient) as void(int f).
So with this optimization enabled, using references for small types should be as fast as passing them by value.
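A minimal sketch of the situation described above, with hypothetical function names. Both helpers are `static`, so the compiler sees every call site and is free to change how the argument is passed. Whether IPA-SRA actually does so depends on the GCC version and flags, so the disassembly is the only reliable check.

```cpp
#include <cassert>

// Internal linkage: no external caller can rely on the calling convention,
// so the optimizer may turn the reference parameter into a plain value.
static int times_five_ref(const int& f) {
    return f * 5;
}

// The hand-written by-value equivalent.
static int times_five_val(int f) {
    return f * 5;
}

// Calls both forms; with optimizations on, both calls are expected
// (not guaranteed) to compile to the same code.
int call_both(int x) {
    return times_five_ref(x) + times_five_val(x);
}

void check_call_both() {
    assert(call_both(2) == 20);
}
```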
On the other hand, passing (for example) std::string by value cannot be optimized to by-reference speed, because custom copy semantics are involved.
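The cost of custom copy semantics can be made visible with a small copy-counting type. `Tracked` and the helper functions are hypothetical names used only for this sketch. `Tracked` stands in for a type such as std::string whose copy constructor does real work.

```cpp
#include <cassert>

// Hypothetical type with a user-defined copy constructor that counts copies.
struct Tracked {
    static int copies;
    int payload = 0;
    Tracked() = default;
    Tracked(const Tracked& other) : payload(other.payload) { ++copies; }
};
int Tracked::copies = 0;

int by_value(Tracked t) { return t.payload; }      // copies its argument
int by_ref(const Tracked& t) { return t.payload; } // binds to the caller's object

// Number of copy-constructor calls caused by one by-value call with an lvalue.
int copies_made_by_value() {
    Tracked t;
    int before = Tracked::copies;
    by_value(t);
    return Tracked::copies - before;
}

// Number of copy-constructor calls caused by one by-reference call.
int copies_made_by_ref() {
    Tracked t;
    int before = Tracked::copies;
    by_ref(t);
    return Tracked::copies - before;
}
```

Because the copy constructor has an observable side effect, the compiler cannot silently drop the copy when an lvalue is passed by value.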
From what I understand, using pass by reference for everything should never be slower than manually picking what to pass by value and what to pass by reference.
This is especially useful for templates:
template<class T>
void f(const T&)
{
// Something
}
is always optimal when the template is inlined, or is at least private to the call sites using it (e.g. you don't take the address of the template function and pass it as a callback).
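As a sketch of that template pattern, here is one generic function taking `const T&` and used with both a scalar and a class type. The name `is_less` is hypothetical. The claim that the reference costs nothing for `int` assumes the call is inlined or otherwise optimized as described above.

```cpp
#include <cassert>
#include <string>

// One signature for every T: no need to choose between value and reference per type.
template <class T>
bool is_less(const T& a, const T& b) {
    return a < b;
}

void check_is_less() {
    // Scalar type: after inlining, the reference is expected to disappear.
    assert(is_less(1, 2));
    // Class type: no copy of either string is made.
    assert(!is_less(std::string("b"), std::string("a")));
}
```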
Best Answer
A good way to find out why there are any differences is to check the disassembly. Here are the results I got on my machine with Visual Studio 2012.
With optimization flags, both functions generate the same code:
This is basically equivalent to:
Without optimization flags, you will probably get different results.
function (no optimizations):
function2 (no optimizations):
Why is pass by value faster (in the no optimization case)?
Well, function() has two extra mov operations. Let's take a look at the first extra mov operation:
Here we are dereferencing the pointer. In function2(), we already have the value, so we avoid this step. We first move the address held by the pointer into register eax. Then we move the value at that address into register ecx. Finally, we multiply the value by five.
Let's look at the second extra mov operation:
Now we are moving backwards. We have just finished multiplying the value by 5, and we need to store the result back at the memory address.
Because function2() does not have to deal with referencing and dereferencing a pointer, it gets to skip these two extra mov operations.
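For reference, the two functions being compared may have looked roughly like the sketch below. The bodies are a guess reconstructed from the description (multiply by five, with the pointer version storing the result back). The original source and its disassembly are not reproduced here, and `run_function_on` is a hypothetical helper added for the check.

```cpp
#include <cassert>

// Assumed shape of the by-pointer version: load the address, load the value,
// multiply, then store the result back through the pointer.
void function(int* ptr) {
    *ptr = *ptr * 5;
}

// Assumed shape of the by-value version: the value is already at hand,
// so there is no load through a pointer and no store back.
int function2(int val) {
    return val * 5;
}

// Hypothetical helper: runs the pointer version on a local copy and returns the result.
int run_function_on(int v) {
    function(&v);
    return v;
}

void check_functions() {
    assert(run_function_on(3) == function2(3));
}
```

Compiling this without optimizations and inspecting the output (for example with the compiler's assembly listing option) is one way to look for the extra load and store described above.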