canada11 - 1 year ago
Python Question

Why does librosa.feature.mfcc() spit out a 2D array?

Calling librosa.feature.mfcc() on an audio file spits out a 2D array like so:

array([[ -5.229e+02, -4.944e+02, ..., -5.229e+02, -5.229e+02],
[ 7.105e-15, 3.787e+01, ..., -7.105e-15, -7.105e-15],
[ 1.066e-14, -7.500e+00, ..., 1.421e-14, 1.421e-14],
[ 3.109e-14, -5.058e+00, ..., 2.931e-14, 2.931e-14]])

My question is: what are these values? I was expecting a 1D array of coefficients, so why is it 2D, and what do the two dimensions represent? Maybe I'm misunderstanding what I should be getting back, but any explanation would be appreciated. I tried looking online, but everyone seems to just know what it is.

Answer Source

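One dimension is time, the other is the coefficient index. librosa cuts the signal into short overlapping frames and computes one set of MFCCs per frame, so the result has shape (n_mfcc, t): each row is one coefficient and each column is one time frame. The rows are cepstral coefficients, not raw frequency bins, although they are derived from the mel-scaled spectrum of each frame. If you plot the array, time runs along one axis and the coefficient index along the other.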
