It appears that wrapping a generator expression in [] (test1, i.e. a list comprehension) performs substantially better than passing it to list() (test2). The slowdown isn't there when I simply pass a list to list() for a shallow copy (test3). Why is this?
Evidence:
from timeit import Timer
t1 = Timer("test1()", "from __main__ import test1")
t2 = Timer("test2()", "from __main__ import test2")
t3 = Timer("test3()", "from __main__ import test3")
x = [34534534, 23423523, 77645645, 345346]
def test1():
[e for e in x]
print t1.timeit()
#0.552290201187
def test2():
list(e for e in x)
print t2.timeit()
#2.38739395142
def test3():
list(x)
print t3.timeit()
#0.515818119049
Machine: 64 bit AMD, Ubuntu 8.04, Python 2.7 (r27:82500)
Best Answer
Well, my first step was to set the two tests up independently to ensure that this is not a result of e.g. the order in which the functions are defined.
Sure enough, I can replicate this. OK, next step is to have a look at the bytecode to see what's actually going on:
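(The disassembly the answer refers to did not survive; here is a sketch of that inspection step, written in modern Python 3 syntax, comparing the bytecode of the two forms.)

```python
# Compare the bytecode of a list comprehension against list() wrapped
# around a generator expression.
import dis

x = [34534534, 23423523, 77645645, 345346]

def test1():
    [e for e in x]

def test2():
    list(e for e in x)

dis.dis(test1)
dis.dis(test2)

# test2 has to look up the global name `list` and call it with a
# generator object, so 'list' appears among its referenced names;
# test1 builds the list directly and never touches that name.
print('list' in test2.__code__.co_names)
print('list' in test1.__code__.co_names)
```

The exact instructions vary between Python versions, but the shape is the same: the comprehension builds the list in place, while the list() form constructs a generator, looks up a global, and makes a call.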
Notice that the first method creates the list directly, whereas the second method creates a genexpr object and passes that to the global list(). This is probably where the overhead lies. Note also that the difference is approximately a microsecond, i.e. utterly trivial.
Other interesting data
This still holds for non-trivial lists:
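(The timings originally pasted here were lost; a sketch of the kind of comparison described, using a hypothetical 1000-element list and Python 3 syntax:)

```python
# Repeat the comparison on a larger list to check that the gap stays
# roughly constant -- a fixed per-call cost, not a per-element one.
from timeit import Timer

x = list(range(1000))  # a less trivial list; the size is an arbitrary choice

def test1():
    [e for e in x]

def test2():
    list(e for e in x)

n = 10000
print(Timer(test1).timeit(number=n))
print(Timer(test2).timeit(number=n))
```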
and for less trivial map functions:
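(Again, the original snippet is missing; a reconstruction with one hypothetical mapping, doubling each element, so real per-element work dilutes the fixed call overhead:)

```python
# Same comparison, but with a mapping applied to each element.
from timeit import Timer

x = list(range(1000))

def test1():
    [e * 2 for e in x]

def test2():
    list(e * 2 for e in x)

n = 10000
print(Timer(test1).timeit(number=n))
print(Timer(test2).timeit(number=n))
```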
and (though less strongly) if we filter the list:
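(A reconstruction of the filtering variant, with one hypothetical predicate, keeping only even elements:)

```python
# Same comparison, but filtering the list instead of mapping over it.
from timeit import Timer

x = list(range(1000))

def test1():
    [e for e in x if e % 2 == 0]

def test2():
    list(e for e in x if e % 2 == 0)

n = 10000
print(Timer(test1).timeit(number=n))
print(Timer(test2).timeit(number=n))
```

In every variant both functions build the same list; the difference is only the constant cost of creating a generator and calling list(), which is why the gap shrinks in relative terms as the per-element work grows.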