Hoap Humanoid - 11 months ago
R Question

Representing numbers as 8-bit objects

I have a large multidimensional array and I want it to occupy as little memory as possible. In Python, it occupies about 66 MB:

import sys
import numpy as np

m = np.zeros([1000, 70, 1, 1000], dtype='bool')
size = sys.getsizeof(m)/1024/1024
print("Size: %s MB" % size)

However, in R, the same array occupies four times as much memory (267 MB):

m <- array(FALSE, dim = c(1000, 70, 1, 1000))
format(object.size(m), units = "auto")

Any idea on how to reduce the array size in R?

This array will be used as the X input in an external API. This function takes as argument an array or an internal iterator called

Answer

Your assertion that these arrays are the same is clearly wrong. If they were the same, they would need the same amount of memory in R as in any other language.

From the help for ?as.integer:

Note that current implementations of R use 32-bit integers for integer vectors

So the 4x memory usage arises because R stores each element in 32 bits (logical vectors take the same 4 bytes per element as integer vectors), whereas the NumPy bool array in Python uses 8 bits per element.

To use 8-bit objects in R, you can use raw vectors. From the help for ?as.raw:

The raw type is intended to hold raw bytes

Try this:

# as.raw(0) is a single zero byte, recycled to fill the array
m3 <- array(as.raw(0), dim = c(1000, 70, 1, 1000))
format(object.size(m3), units = "auto")

[1] "66.8 Mb"

This is identical to the value you report that Python uses.
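
Both reported sizes can be checked with simple arithmetic. The following is a minimal sketch, not part of the original answer: the names `dims` and `payload_mib` are illustrative, and the calculation ignores the small per-object header that both `sys.getsizeof` and `object.size` include.

```python
from math import prod

# Dimensions of the array from the question.
dims = (1000, 70, 1, 1000)
n_elements = prod(dims)  # 70,000,000 elements


def payload_mib(bytes_per_element):
    """Size of the element data in units of 1024 * 1024 bytes (header ignored)."""
    return n_elements * bytes_per_element / 1024 / 1024


size_8bit = payload_mib(1)   # NumPy bool, or an R raw vector
size_32bit = payload_mib(4)  # R logical or integer vector

print("8-bit elements:  %.1f MB" % size_8bit)
print("32-bit elements: %.1f MB" % size_32bit)
print("ratio: %.0fx" % (size_32bit / size_8bit))
```

With one byte per element the data comes to roughly 66.8 MB, and with four bytes per element to roughly 267 MB, which is consistent with the figures quoted for Python and R.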