C++ – Why is Iterating Through std::vector Faster Than std::array?

Tags: benchmarking, c++, performance

I recently asked this question:
Why is iterating an std::array much faster than iterating an std::vector?

As people quickly pointed out, my benchmark had many flaws. So as I was trying to fix my benchmark, I noticed that std::vector wasn't slower than std::array and, in fact, it was quite the opposite.

#include <array>
#include <chrono>
#include <cstdio>
#include <vector>

using namespace std;

constexpr int n = 100'000'000;
vector<int> v(n);       // vector run
//array<int, n> v;      // array run: swap with the line above

int main()
{
    int res = 0;
    auto start = chrono::steady_clock::now();
    for(int x : v)
        res += x;
    auto end = chrono::steady_clock::now();
    // Elapsed time of the loop only, in milliseconds.
    double elapsed =
        chrono::duration_cast<
            chrono::duration<double, std::milli>
        >(end - start).count();
    // Printing res keeps the loop from being optimized away.
    printf("result: %d\ntime: %f\n", res, elapsed);
}

Things I've tried to improve from my previous benchmark:

Made sure the result is used, so the whole loop is not optimized away.
Compiled with the -O3 flag for speed.
Used std::chrono instead of the time command, so that only the part we want to measure (just the for loop) is timed. Static initialization of variables and things like that are not measured.
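The measurement approach in the list above can be sketched as a small reusable helper. This is only an illustration, not part of the original benchmark: the names time_ms and sum_all are hypothetical, and the 64-bit accumulator is an added assumption to avoid overflow for non-zero data.

```cpp
#include <chrono>
#include <vector>

// Hypothetical helper: runs `work` once and returns the elapsed time in milliseconds.
// steady_clock is used because it is monotonic.
template <typename Work>
double time_ms(Work&& work)
{
    auto start = std::chrono::steady_clock::now();
    work();
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}

// Hypothetical workload: sums every element of the container.
long long sum_all(const std::vector<int>& data)
{
    long long total = 0;
    for (int x : data)
        total += x;
    return total;
}
```

A caller would wrap only the loop in the lambda passed to time_ms and print the sum afterwards, so that the compiler cannot discard the work being timed.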

The measured times (in milliseconds):

array:

$ g++ arrVsVec.cpp -O3
$ ./a.out
result: 0
time: 99.554109

vector:

$ g++ arrVsVec.cpp -O3
$ ./a.out
result: 0
time: 30.734491

I'm just wondering what I'm doing wrong this time.

Watch the disassembly in godbolt

Best Answer

The difference is due to the array's memory pages not being resident in the process address space. A global-scope array is zero-initialized and stored in the .bss section of the executable, which has not been paged in yet, so the timed loop pays for a page fault on the first access to each page. The vector, by contrast, has just been allocated and zero-filled, so its memory pages are already present.

If you add

std::fill_n(v.data(), n, 1); // requires #include <algorithm>

as the first line of main to bring the pages in (pre-fault), the array time becomes the same as the vector time.

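On Linux, you can instead call mlock(v.data(), v.size() * sizeof(v[0])); (declared in <sys/mman.h>) to bring the pages into the address space. The call can fail, for example when the size exceeds the process's RLIMIT_MEMLOCK limit, so check its return value. See man mlock for full details.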
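The pre-faulting idea can be sketched in isolation as follows. This is a minimal illustration under stated assumptions: kCount, g_data, timed_sum and prefault are hypothetical names, and the array is deliberately much smaller than the 100,000,000 elements in the question.

```cpp
#include <algorithm>
#include <array>
#include <chrono>
#include <cstddef>

// Assumed size for illustration only: 4M ints (about 16 MB).
constexpr std::size_t kCount = 4'000'000;

// Zero-initialized global: typically placed in .bss, so its pages are
// not touched until the program first accesses them.
std::array<int, kCount> g_data;

// Sums the array and stores the elapsed time of the loop (ms) in elapsed_ms.
long long timed_sum(double& elapsed_ms)
{
    auto start = std::chrono::steady_clock::now();
    long long total = 0;
    for (int x : g_data)
        total += x;
    auto end = std::chrono::steady_clock::now();
    elapsed_ms = std::chrono::duration<double, std::milli>(end - start).count();
    return total;
}

// Pre-faults the array by writing every element, as suggested with std::fill_n.
void prefault()
{
    std::fill_n(g_data.data(), g_data.size(), 1);
}
```

Calling timed_sum once before and once after prefault() gives a cold pass and a warm pass to compare. The actual numbers depend on the operating system and hardware, so none are shown here.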

Related Solutions

C++ Performance – Is std::vector Slower Than Plain Arrays?

Using the following:

g++ -O3 Time.cpp -I <MyBoost>
./a.out
UseArray completed in 2.196 seconds
UseVector completed in 4.412 seconds
UseVectorPushBack completed in 8.017 seconds
The whole thing completed in 14.626 seconds

So array is twice as quick as vector.

But after looking at the code in more detail, this is expected: you run across the vector twice and the array only once. Note: when you resize() the vector, you are not only allocating the memory but also running through the vector and calling the constructor on each member.

Re-Arranging the code slightly so that the vector only initializes each object once:

 std::vector<Pixel>  pixels(dimensions * dimensions, Pixel(255,0,0));

Now doing the same timing again:

g++ -O3 Time.cpp -I <MyBoost>
./a.out
UseVector completed in 2.216 seconds

The vector now performs only slightly worse than the array. IMO this difference is insignificant and could be caused by a whole bunch of things not associated with the test.

I would also take into account that you are not correctly initializing/destroying the Pixel object in the UseArray() method, as neither the constructor nor the destructor is called. This may not be an issue for this simple class, but anything slightly more complex (i.e. with pointers or members with pointers) will cause problems.
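The difference between the two vector initialization strategies discussed above can be sketched as follows. The Pixel layout here is an assumption, since the original class is not shown, and make_two_pass and make_one_pass are hypothetical names used only for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Assumed layout: the original Pixel class is not shown in this text.
struct Pixel {
    unsigned char r, g, b;
    Pixel() : r(0), g(0), b(0) {}
    Pixel(unsigned char r_, unsigned char g_, unsigned char b_)
        : r(r_), g(g_), b(b_) {}
};

// Two passes: resize() default-constructs every element,
// then the loop visits every element again to overwrite it.
std::vector<Pixel> make_two_pass(std::size_t dimensions)
{
    std::vector<Pixel> pixels;
    pixels.resize(dimensions * dimensions);
    for (Pixel& p : pixels) {
        p.r = 255;
        p.g = 0;
        p.b = 0;
    }
    return pixels;
}

// One pass: each element is copy-constructed from the prototype exactly once.
std::vector<Pixel> make_one_pass(std::size_t dimensions)
{
    return std::vector<Pixel>(dimensions * dimensions, Pixel(255, 0, 0));
}
```

Both functions produce the same contents; the second simply avoids touching each element a second time, which is the rearrangement the timing above refers to.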

Vector vs Array Performance in C++ – Benchmarking Guide

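A simpler explanation: you're building with optimisations disabled. You want -O3, not -o3. Lowercase -o3 is parsed as the output-file option -o with the file name 3, so no optimisation level is set.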

I don't have clang available to exactly reproduce your tests, but my results are as follows:

// Array run #1
$ g++ -std=c++11 -O3 test.cpp -o b.out && time ./b.out

real    0m25.323s
user    0m25.162s
sys     0m0.148s

// Vector run #1
$ g++ -std=c++11 -O3 test.cpp -o b.out && time ./b.out

real    0m25.634s
user    0m25.486s
sys     0m0.136s
