I have the following code in Matlab which I'm not familiar with:
function segments = segmentEnergy(data, th)
mag = sqrt(sum(data(:, 1:3) .^ 2, 2));
mag = mag - mean(mag);
above = find(mag>=th*std(mag));
indicator = zeros(size(mag));
indicator(above) = 1;
plot(mag); hold on; plot(indicator*1000, 'r')
def segment_energy(data, th):
    mag = np.linalg.norm((data['x'], data['y'], data['z']))
    print "This is the mag: " + str(mag)
    mag -= np.mean(mag)
    above = np.where(mag >= th * np.std(mag))
    indicator = np.zeros(mag.shape)
    indicator[above] = 1
    plt.plot(indicator * 1000, 'r')
line 23, in segment_energy
indicator[above] = 1
IndexError: too many indices for array
Given how you are currently calling it, numpy.linalg.norm returns a single scalar value by default. Because mag is now a scalar, the rest of the code will not work as intended, for the following reasons:
Performing mean subtraction with a single scalar will give you a value of 0 (i.e.
mag <- mag - np.mean(mag) --> 0).
The statement that computes above (the np.where call) will always return a tuple of a single element. This element is a NumPy array of length 1 containing the index 0, meaning that the first element of the "array" (a scalar in this case) satisfies the constraint. The constraint is always satisfied because the standard deviation of a single constant is also 0 under the default definition used by np.std.
The shape of a single scalar value is undefined, and mag.shape will actually give you an empty shape: (). Note that if mag were a plain Python number and you did not subtract np.mean, mag.shape would actually give you an error, as it would not be a NumPy type. Subtracting np.mean coerces the scalar to a NumPy scalar (numpy.float64), which does have a shape attribute:
In : mag = 10
In : type(mag)
Out: int
In : mag -= np.mean(mag)
In : type(mag)
Out: numpy.float64
Finally, the indicator creation code will produce a zero-dimensional array, and since you are trying to index into an array that has no dimensions, it will give you an error.
Observe this reproducible error assuming that
mag was calculated to be some value... say... 10 and
th = 1:
In : mag = 10
In : mag -= np.mean(mag)
In : mag.shape
Out: ()
In : th = 1
In : above = np.where(mag >= th * np.std(mag))
In : indicator = np.zeros(mag.shape)
In : indicator
Out: array(0.0)
In : mag
Out: 0.0
In : indicator[above] = 1
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-67-adf9cff7610a> in <module>()
----> 1 indicator[above] = 1

IndexError: too many indices for array
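To make the root cause concrete, here is a minimal sketch of how np.linalg.norm behaves with and without an axis. The arrays x, y and z are made-up stand-ins for the three columns, not data from the question:

```python
import numpy as np

# Hypothetical accelerometer samples: one value per axis and time step.
x = np.array([3.0, 0.0, 1.0])
y = np.array([4.0, 0.0, 2.0])
z = np.array([0.0, 5.0, 2.0])

# Without an axis, the stacked 3x3 input collapses to one number
# (the Frobenius norm of the whole matrix).
total = np.linalg.norm((x, y, z))

# With axis=0, the norm is taken across x, y, z for each time step.
per_sample = np.linalg.norm((x, y, z), axis=0)

print(np.ndim(total))  # 0
print(per_sample)      # [5. 5. 3.]
```

The first call is what the code in the question does, which is why everything after it operates on a scalar.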
Therefore, the solution is to rethink how you are writing this function. The MATLAB code assumes that data is already a 2D matrix, so it computes the norm, or length, of each row independently. Because we now know that the input is a DataFrame, we can very easily apply numpy operations to it, just as is done in MATLAB. Assuming that your columns are labelled x, y and z as in your code, and that each column is a numpy array of values, just change the first line of code.
import numpy as np
import matplotlib.pyplot as plt

def segment_energy(data, th):
    mag = np.sqrt(np.sum(data.loc[:, ['x', 'y', 'z']] ** 2.0, axis=1))  # Change
    mag = np.array(mag)  # Convert to NumPy array
    mag -= np.mean(mag)
    above = np.where(mag >= th * np.std(mag))
    indicator = np.zeros(mag.shape)
    indicator[above] = 1
    plt.plot(mag)
    plt.plot(indicator * 1000, 'r')
    plt.show()
The first statement in the function is the direct NumPy translation of the MATLAB code. We use the loc indexer of the pandas DataFrame to select the three columns you are looking for. We also need to convert the result to a NumPy array for the rest of the calculations to work.
You can also use numpy.linalg.norm, but specify an axis on which to operate. Assuming that the data is 2D as before, specify axis=1 to compute the row-wise norms of your matrix:
mag = np.linalg.norm(data.loc[:, ['x', 'y', 'z']], axis=1)
The above will also convert the data into a NumPy array for you.
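As a quick sanity check, the sketch below runs both formulations on a small, made-up DataFrame and confirms they agree. The helper name energy_indicator and the sample values are assumptions for illustration only. It also builds the indicator from a boolean mask instead of np.where, and leaves the plotting out:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names x, y, z follow the question.
data = pd.DataFrame({
    "x": [3.0, 0.0, 1.0, 0.0],
    "y": [4.0, 0.0, 2.0, 0.0],
    "z": [0.0, 5.0, 2.0, 1.0],
})

def energy_indicator(data, th):
    """Return the mean-centred magnitude and a 0/1 indicator (no plotting)."""
    # Row-wise Euclidean norm over the three axis columns.
    mag = np.linalg.norm(data.loc[:, ["x", "y", "z"]].to_numpy(), axis=1)
    mag = mag - np.mean(mag)
    # Boolean mask converted to floats, in place of np.where plus indexing.
    indicator = (mag >= th * np.std(mag)).astype(float)
    return mag, indicator

mag, indicator = energy_indicator(data, th=0.5)

# The explicit sqrt-of-sum-of-squares form, for comparison.
manual = np.sqrt(np.sum(data.loc[:, ["x", "y", "z"]].to_numpy() ** 2, axis=1))
manual = manual - np.mean(manual)

print(np.allclose(mag, manual))  # True
print(indicator)                 # [1. 1. 0. 0.]
```

Returning the arrays rather than plotting inside the function makes the result easy to check, and the plot can then be drawn from mag and indicator separately.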