DJV DJV - 1 month ago 30
Python Question

Is it possible to efficiently initialize a bytearray with a non-zero value?

I need a huge boolean array. All values should be initialized to "True":

arr = [True] * (10 ** 9)

But created this way, it takes too much memory. So I decided to use bytearray for that:

arr = bytearray(10 ** 9) # initialized with zeroes

Is it possible to initialize a bytearray with a non-zero value as efficiently as it is initialized with zeros?

I understand I could initialize the bytearray with zeros and invert my logic. But I'd prefer not to do that if possible.


>>> from timeit import timeit
>>> def f1():
...     return bytearray(10**9)
...
>>> def f2():
...     return bytearray(b'\x01'*(10**9))
...
>>> timeit(f1, number=100)
>>> timeit(f2, number=100)


Consider using NumPy for this sort of thing. On my computer, np.ones (which creates an array filled with ones) with a boolean dtype is just as fast as the bare bytearray constructor:

>>> import numpy as np
>>> from timeit import timeit
>>> def f1(): return bytearray(10**9)
>>> def f2(): return np.ones(10**9, dtype=bool)
>>> timeit(f1, number=100)
>>> timeit(f2, number=100)

If you don't want to use third-party modules, another option with competitive performance is to create a one-element bytearray and then expand that, instead of creating a large byte-string and converting it to a bytearray.

>>> def f3(): return bytearray(b'\x01')*(10**9)
>>> timeit(f3, number=100)
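As a small illustration of how the repeated bytearray can stand in for the boolean list, here is a minimal sketch. The helper name make_flags and the tiny size are assumptions chosen for readability; they are not part of the question or of the timings here.

```python
def make_flags(n):
    """Return a bytearray of n bytes, each set to 1 (truthy)."""
    # Repeat a one-element bytearray instead of converting a large bytes object.
    return bytearray(b'\x01') * n

flags = make_flags(10)
flags[3] = 0                            # mark index 3 as "False"
print(sum(flags))                       # 9 flags are still set
print(bool(flags[3]), bool(flags[4]))   # False True
```

Each element is a plain integer (0 or 1), so it works directly in conditions such as "if flags[i]:".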

Since my computer appears to be slower than yours, here is the performance of your original option for comparison:

>>> def fX(): return bytearray(b'\x01'*(10**9))
>>> timeit(fX, number=100)

The cost in all cases is going to be dominated by allocating a decimal gigabyte of RAM and writing to every byte of it. fX is roughly twice as slow as the other three functions because it has to do this twice: once for the temporary bytes object and once more for the bytearray that copies it. A good rule of thumb when working with code like this is to minimize the number of allocations. It may be worth dropping down to a lower-level language in which you can explicitly control allocation (if you don't know any such language already, I recommend Rust).
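The "allocates twice" point can be checked roughly with the standard tracemalloc module. This is a sketch under assumptions: it relies on CPython, where tracemalloc sees these buffer allocations, it uses a much smaller size than the question so it runs quickly, and the helper names (peak_bytes, convert_bytes, repeat_bytearray) are made up for illustration.

```python
import tracemalloc

def peak_bytes(build, n):
    """Return the peak traced allocation (in bytes) while build(n) runs."""
    tracemalloc.start()
    result = build(n)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del result
    return peak

def convert_bytes(n):
    # Builds a temporary bytes object, then copies it into a bytearray.
    return bytearray(b'\x01' * n)

def repeat_bytearray(n):
    # Builds a one-element bytearray, then repeats it into the final buffer.
    return bytearray(b'\x01') * n

n = 10**6  # far smaller than 10**9, just to keep the check fast
peak_convert = peak_bytes(convert_bytes, n)
peak_repeat = peak_bytes(repeat_bytearray, n)
print(peak_convert, peak_repeat)
```

On CPython the first peak should come out at roughly twice the second, since the temporary bytes object and the bytearray copy are alive at the same time.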