python_freak - 22 days ago
Python Question

python ravel vs. transpose when used in reshape

I have a 2D array v, with v.shape=(M_1,M_2), which I want to reshape into a 3D array with v.shape=(M_2,N_1,N_2), where M_1=N_1*N_2.

I came up with the following lines which produce the same result:

np.reshape(v.T, reshape_tuple)


and

np.reshape(v.ravel(order='F'), reshape_tuple)


for reshape_tuple=(M_2,N_1,N_2).

Which one is computationally better, and in what sense (computation time, memory, etc.), if the original v is a huge (possibly complex-valued) matrix?

My guess would be that using the transpose is better, but if reshape does an automatic ravel then maybe the ravel option is faster (though reshape might be doing the ravel in C or Fortran, and then it's not clear)?

Answer

The order in which they do things - reshape, change strides, and make a copy - differs, but they end up doing the same thing.
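As a quick check with the shapes from the question, here is a minimal sketch (the sizes are made up for illustration). It confirms the two expressions give equal arrays, and uses np.shares_memory to see whether each result still points into v's buffer. For this particular target shape, which only splits the last axis of v.T, the transpose route comes back as a view while the ravel route copies.

```python
import numpy as np

# Hypothetical sizes for illustration; M_1 must equal N_1 * N_2.
N_1, N_2, M_2 = 3, 4, 5
M_1 = N_1 * N_2

v = np.arange(M_1 * M_2, dtype=complex).reshape(M_1, M_2)
reshape_tuple = (M_2, N_1, N_2)

a = np.reshape(v.T, reshape_tuple)                 # transpose, then reshape
b = np.reshape(v.ravel(order='F'), reshape_tuple)  # F-order ravel, then reshape

same_values = np.array_equal(a, b)
a_shares = np.shares_memory(a, v)  # True means a is still a view of v
b_shares = np.shares_memory(b, v)  # False means b lives in a new buffer

print(same_values, a_shares, b_shares)  # True True False
```

Whether a view is possible depends on the shapes involved, so it is worth running this kind of check on the real reshape_tuple.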

I like to use __array_interface__ to see where the data buffer is located, along with other changes. I suppose I should add the flags to see the order, but we already know that transpose changes the order to F, right?

In [549]: x=np.arange(6).reshape(2,3)
In [550]: x.__array_interface__
Out[550]: 
{'data': (187732024, False),
 'descr': [('', '<i4')],
 'shape': (2, 3),
 'strides': None,
 'typestr': '<i4',
 'version': 3}

transpose is a view, with different shape, strides and order:

In [551]: x.T.__array_interface__
Out[551]: 
{'data': (187732024, False),
 'descr': [('', '<i4')],
 'shape': (3, 2),
 'strides': (4, 12),
 'typestr': '<i4',
 'version': 3}
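The flags show the order change directly. A small sketch (x is rebuilt here so the snippet runs on its own):

```python
import numpy as np

x = np.arange(6).reshape(2, 3)

# x is C-contiguous; its transpose is F-contiguous instead.
print(x.flags['C_CONTIGUOUS'], x.flags['F_CONTIGUOUS'])      # True False
print(x.T.flags['C_CONTIGUOUS'], x.T.flags['F_CONTIGUOUS'])  # False True

# The transpose is a view: it still uses x's data buffer.
print(np.shares_memory(x, x.T))                              # True
```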

ravel with different order is a copy (different data buffer pointer)

In [552]: x.ravel(order='F').__array_interface__
Out[552]: 
{'data': (182286992, False),
 'descr': [('', '<i4')],
 'shape': (6,),
 'strides': None,
 'typestr': '<i4',
 'version': 3}

transpose ravel is also a copy. I think the same data pointer is just a case of memory reuse (since I'm not assigning to a variable) - but that can be checked.

In [553]: x.T.ravel().__array_interface__
Out[553]: 
{'data': (182286992, False),
 'descr': [('', '<i4')],
 'shape': (6,),
 'strides': None,
 'typestr': '<i4',
 'version': 3}

add the reshape:

In [554]: x.T.ravel().reshape(2,3).__array_interface__
Out[554]: 
{'data': (182286992, False),
 'descr': [('', '<i4')],
 'shape': (2, 3),
 'strides': None,
 'typestr': '<i4',
 'version': 3}
In [555]: x.ravel(order='F').reshape(2,3).__array_interface__
Out[555]: 
{'data': (182286992, False),
 'descr': [('', '<i4')],
 'shape': (2, 3),
 'strides': None,
 'typestr': '<i4',
 'version': 3}

I think there's an implicit 'ravel' in reshape:

In [558]: x.T.reshape(2,3).__array_interface__
Out[558]: 
{'data': (182286992, False),
 'descr': [('', '<i4')],
 'shape': (2, 3),
 'strides': None,
 'typestr': '<i4',
 'version': 3}

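(I should rework these examples to get rid of that memory reuse ambiguity.) In any case, for this example reshape after transpose requires the same memory copy that a ravel with order change does. And as far as I can tell only one copy is required in either case. The other operations just involve changes to attributes like shape and strides. Note that reshape only copies when the new shape can't be expressed by adjusting strides; a target like (M_2,N_1,N_2), which just splits the last axis of v.T, can come back as a view with no copy at all.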
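One way to remove that ambiguity is to assign every intermediate to a name, so no buffer can be freed and recycled, and to ask np.shares_memory rather than compare pointers by eye. A sketch along those lines (the variable names are mine):

```python
import numpy as np

x = np.arange(6).reshape(2, 3)

# Keep every intermediate alive so buffers cannot be reused between steps.
xt = x.T                   # view
r_f = x.ravel(order='F')   # copy
r_t = xt.ravel()           # copy
y = xt.reshape(2, 3)       # copy: this shape cannot be expressed with strides

for name, arr in [('x.T', xt),
                  ("x.ravel(order='F')", r_f),
                  ('x.T.ravel()', r_t),
                  ('x.T.reshape(2,3)', y)]:
    print(name, 'shares memory with x:', np.shares_memory(x, arr))

# With all three copies alive at once, their data pointers must differ.
pointers = {arr.__array_interface__['data'][0] for arr in (r_f, r_t, y)}
print(len(pointers))  # 3
```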

Maybe it's clearer if we just look at the arrays:

In [565]: x.T
Out[565]: 
array([[0, 3],
       [1, 4],
       [2, 5]])

In x.T we can still step through the underlying buffer in numeric order (0, 1, 2 run down the first column). But after reshape, the 1 is no longer next to the 0, so there has clearly been a copy.

In [566]: x.T.reshape(2,3)
Out[566]: 
array([[0, 3, 1],
       [4, 2, 5]])

The order of values after the ravel is the same, which is more obvious after the reshape:

In [567]: x.ravel(order='F')
Out[567]: array([0, 3, 1, 4, 2, 5])
In [568]: x.ravel(order='F').reshape(2,3)
Out[568]: 
array([[0, 3, 1],
       [4, 2, 5]])
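For the timing part of the question, a rough way to compare the two on your own data is timeit. This is only a sketch with made-up sizes; I'm not quoting numbers because they depend on the machine, the NumPy version, and whether reshape can return a view for the shape in question.

```python
import timeit
import numpy as np

# Hypothetical sizes, chosen only to keep the run short.
N_1, N_2, M_2 = 20, 30, 500
v = np.ones((N_1 * N_2, M_2), dtype=complex)
reshape_tuple = (M_2, N_1, N_2)

t_transpose = timeit.timeit(
    lambda: np.reshape(v.T, reshape_tuple), number=50)
t_ravel = timeit.timeit(
    lambda: np.reshape(v.ravel(order='F'), reshape_tuple), number=50)

print('transpose + reshape:        %.6f s' % t_transpose)
print("ravel(order='F') + reshape: %.6f s" % t_ravel)
```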