Hoap Humanoid - 2 months ago
R Question

Representing numbers as 8-bit objects

I have a large multidimensional array and I want it to occupy as little memory as possible. In Python, it occupies about 66 MB.

import sys
import numpy as np

m = np.zeros([1000, 70, 1, 1000], dtype='bool')
size = sys.getsizeof(m)/1024/1024
print("Size: %s MB" % size)


However, in R, the same array occupies about 4 times as much memory (267 MB).

m <- array(FALSE, dim = c(1000, 70, 1, 1000))
format(object.size(m), units = "auto")


Any idea on how to reduce the array size in R?




EDIT:
This array will be used as the X input in an external API. This function takes as argument an array or an internal iterator called mx.io.arrayiter.

Answer

Your assertion that these arrays are the same is incorrect. If they were the same, they would need the same amount of memory in R as in any other language.

From the help for ?as.integer:

Note that current implementations of R use 32-bit integers for integer vectors

So the 4x memory usage arises because R stores each element of this array in 32 bits (logical vectors use the same 4 bytes per element as integer vectors), whereas you are using 8-bit elements in Python.
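As a rough check of that arithmetic, here is a minimal sketch that multiplies the element count by the bytes per element. It ignores the small fixed object header that both languages add, and the names dims and size_mib are only illustrative:

```python
from math import prod

# Dimensions taken from the question: 1000 x 70 x 1 x 1000.
dims = (1000, 70, 1, 1000)
n_elements = prod(dims)  # 70,000,000 elements


def size_mib(bytes_per_element):
    """Payload size in MiB, ignoring the small fixed object header."""
    return n_elements * bytes_per_element / 1024 / 1024


size_8bit = size_mib(1)   # 8-bit elements (NumPy bool, R raw)
size_32bit = size_mib(4)  # 32-bit elements (R logical or integer)

print("8-bit:  %.1f MiB" % size_8bit)
print("32-bit: %.1f MiB" % size_32bit)
```

The two results should line up with the roughly 66 MB and 267 MB figures reported in the question.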

To use 8-bit objects in R, you can use raw vectors. From the help for ?as.raw:

The raw type is intended to hold raw bytes

Try this:

# as.raw(0) is a single zero byte, recycled to fill the whole array
m3 <- array(as.raw(0), dim = c(1000, 70, 1, 1000))
format(object.size(m3), units = "auto")

[1] "66.8 Mb"

This matches the size you report for Python.
