Why are Python's arrays slow? -
i expected array.array
faster lists, arrays seem unboxed.
however, following result:
in [1]: import array in [2]: l = list(range(100000000)) in [3]: = array.array('l', range(100000000)) in [4]: %timeit sum(l) 1 loop, best of 3: 667 ms per loop in [5]: %timeit sum(a) 1 loop, best of 3: 1.41 s per loop in [6]: %timeit sum(l) 1 loop, best of 3: 627 ms per loop in [7]: %timeit sum(a) 1 loop, best of 3: 1.39 s per loop
what cause of such difference?
the storage "unboxed", every time access element python has "box" (embed in regular python object) in order it. example, sum(a)
iterates on array, , boxes each integer, 1 @ time, in regular python int
object. costs time. in sum(l)
, boxing done @ time list created.
so, in end, array slower, requires substantially less memory.
here's relevant code recent version of python 3, same basic ideas apply cpython implementations since python first released.
here's code access list item:
pyobject * pylist_getitem(pyobject *op, py_ssize_t i) { /* error checking omitted */ return ((pylistobject *)op) -> ob_item[i]; }
there's little it: somelist[i]
returns i
'th object in list (and python objects in cpython pointers struct initial segment conforms layout of struct pyobject
).
and here's __getitem__
implementation array
type code l
:
static pyobject * l_getitem(arrayobject *ap, py_ssize_t i) { return pylong_fromlong(((long *)ap->ob_item)[i]); }
the raw memory treated vector of platform-native c
long
integers; i
'th c long
read up; , pylong_fromlong()
called wrap ("box") native c long
in python long
object (which, in python 3, eliminates python 2's distinction between int
, long
, shown type int
).
this boxing has allocate new memory python int
object, , spray native c long
's bits it. in context of original example, object's lifetime brief (just long enough sum()
add contents running total), , more time required deallocate new int
object.
this speed difference comes from, has come from, , come in cpython implementation.
Comments
Post a Comment