user2909415 - 1 year ago 62

Python Question

I have rows of unequal length and want to pad the shorter ones with zeros so that I end up with a rectangular NumPy array. The functionality I am looking for looks something like this:

Edit: Data is read in from disk as a list of lists.

```
data = np.array([[1, 2, 3, 4],
                 [2, 3, 1],
                 [5, 5, 5, 5],
                 [1, 1]])

result = fix(data)

print result

[[ 1. 2. 3. 4.]
 [ 2. 3. 1. 0.]
 [ 5. 5. 5. 5.]
 [ 1. 1. 0. 0.]]
```

The data arrays I'm working with are really large, so I would appreciate the most efficient solution.

Answer Source

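One approach is to build a boolean mask of the valid positions in each row, then assign the flattened data into a zero-filled array at those positions: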

```
import numpy as np

# Input object dtype array (recent NumPy versions reject ragged input without dtype=object)
data = np.array([[1, 2, 3, 4],
                 [2, 3, 1],
                 [5, 5, 5, 5],
                 [1, 1]], dtype=object)
# Get lengths of each row of data
lens = np.array([len(row) for row in data])
# Mask of valid places in each row; the column count is the longest row's length
mask = np.arange(lens.max()) < lens[:, None]
# Setup output array and put elements from data into masked positions
out = np.zeros(mask.shape)
out[mask] = np.concatenate(data)
```
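As an aside, the mask step is easier to follow on a smaller case. The sketch below uses made-up row lengths (the `lens` values are an illustration, not the question's data) and shows how broadcasting a row of column indices against a column of lengths marks the valid slots in each row. Note that the number of columns should come from the longest row, not from the number of rows.

```python
import numpy as np

# Hypothetical lengths of three ragged rows (illustration only)
lens = np.array([3, 1, 2])

# np.arange(lens.max()) has shape (3,): the column indices 0, 1, 2.
# lens[:, None] has shape (3, 1): one length per row.
# Broadcasting the comparison gives a (3, 3) boolean array that is
# True wherever the column index is smaller than that row's length.
mask = np.arange(lens.max()) < lens[:, None]

print(mask)
```

Each row of `mask` has as many leading `True` values as the corresponding row has elements, which is why assigning the concatenated data into `out[mask]` fills the rows left to right and leaves the padding at zero.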

Sample input, output -

```
In [84]: data
Out[84]: array([[1, 2, 3, 4], [2, 3, 1], [5, 5, 5, 5], [1, 1]], dtype=object)
In [85]: out
Out[85]:
array([[ 1., 2., 3., 4.],
       [ 2., 3., 1., 0.],
       [ 5., 5., 5., 5.],
       [ 1., 1., 0., 0.]])
```
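
Since the question mentions that the data is read from disk as a list of lists, the same idea could be wrapped in a function that works on the plain list directly and skips the object array. This is only a sketch: the name `fix` is taken from the question, while the `fill_value` parameter and the handling of empty input are my own assumptions.

```python
import numpy as np


def fix(rows, fill_value=0.0):
    """Pad variable-length rows into a 2D float array (sketch, see assumptions above)."""
    # Length of every row
    lens = np.array([len(row) for row in rows])
    # Assumption: an empty input yields an empty 2D array
    if lens.size == 0:
        return np.empty((0, 0))
    # True where a row actually has an element at that column
    mask = np.arange(lens.max()) < lens[:, None]
    # Start from an array filled with the padding value
    out = np.full(mask.shape, fill_value, dtype=float)
    # Boolean assignment fills in row-major order, which matches
    # the order of the concatenated rows
    out[mask] = np.concatenate(rows)
    return out


data = [[1, 2, 3, 4],
        [2, 3, 1],
        [5, 5, 5, 5],
        [1, 1]]

result = fix(data)
print(result)
```

There is still one Python-level loop to collect the row lengths, but the padding itself is a single vectorized assignment, which should matter most for large inputs.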