rsnaveen rsnaveen - 1 month ago 9
Python Question

MATLAB's smooth implementation (n-point moving average) in NumPy/Python

Matlab's

smooth
function, by default, smooths data using a 5-point moving average. What would be the best way to do the same in python?
For example, if this is my data

0
0.823529411764706
0.852941176470588
0.705882352941177
0.705882352941177
0.676470588235294
0.676470588235294
0.500000000000000
0.558823529411765
0.647058823529412
0.705882352941177
0.705882352941177
0.617647058823529
0.705882352941177
0.735294117647059
0.735294117647059
0.588235294117647
0.588235294117647
1
0.647058823529412
0.705882352941177
0.764705882352941
0.823529411764706
0.647058823529412
0.735294117647059
0.794117647058824
0.794117647058824
0.705882352941177
0.676470588235294
0.794117647058824
0.852941176470588
0.735294117647059
0.647058823529412
0.647058823529412
0.676470588235294
0.676470588235294
0.529411764705882
0.676470588235294
0.794117647058824
0.882352941176471
0.735294117647059
0.852941176470588
0.823529411764706
0.764705882352941
0.558823529411765
0.588235294117647
0.617647058823529
0.647058823529412
0.588235294117647
0.617647058823529
0.647058823529412
0.794117647058824
0.823529411764706
0.647058823529412
0.617647058823529
0.647058823529412
0.676470588235294
0.764705882352941
0.676470588235294
0.647058823529412
0.705882352941177
0.764705882352941
0.705882352941177
0.500000000000000
0.529411764705882
0.529411764705882
0.647058823529412
0.676470588235294
0.588235294117647
0.735294117647059
0.794117647058824
0.852941176470588
0.764705882352941


the smoothed data should be

0
0.558823529411765
0.617647058823530
0.752941176470588
0.723529411764706
0.652941176470588
0.623529411764706
0.611764705882353
0.617647058823530
0.623529411764706
0.647058823529412
0.676470588235294
0.694117647058824
0.700000000000000
0.676470588235294
0.670588235294118
0.729411764705882
0.711764705882353
0.705882352941177
0.741176470588235
0.788235294117647
0.717647058823529
0.735294117647059
0.752941176470588
0.758823529411765
0.735294117647059
0.741176470588235
0.752941176470588
0.764705882352941
0.752941176470588
0.741176470588235
0.735294117647059
0.711764705882353
0.676470588235294
0.635294117647059
0.641176470588236
0.670588235294118
0.711764705882353
0.723529411764706
0.788235294117647
0.817647058823530
0.811764705882353
0.747058823529412
0.717647058823530
0.670588235294118
0.635294117647059
0.600000000000000
0.611764705882353
0.623529411764706
0.658823529411765
0.694117647058824
0.705882352941176
0.705882352941176
0.705882352941176
0.682352941176471
0.670588235294118
0.676470588235294
0.682352941176471
0.694117647058824
0.711764705882353
0.700000000000000
0.664705882352941
0.641176470588236
0.605882352941177
0.582352941176471
0.576470588235294
0.594117647058824
0.635294117647059
0.688235294117647
0.729411764705882
0.747058823529412
0.803921568627451
0.764705882352941


The syntax in Matlab to get this is

smooth(data)


I want to do the same in python but I am unable to find any function that would do this.

Answer

MATLAB's smoooth func is basically same as averaging across sliding windows of length 5, except the way it treats the 2 elems at either ends. As per the linked docs, those boundary cases are computed with these formulae -

yy = smooth(y) smooths the data in the column vector y ..
The first few elements of yy are given by

yy(1) = y(1)
yy(2) = (y(1) + y(2) + y(3))/3
yy(3) = (y(1) + y(2) + y(3) + y(4) + y(5))/5
yy(4) = (y(2) + y(3) + y(4) + y(5) + y(6))/5
...

So, to replicate the same implementation on NumPy/Python, we can use NumPy's 1D convolution for getting sliding windowed summations and divide them by the window length to give us the average results. Then, simply append the special case treated values for the boundary elems.

Thus, we would have an implementation to handle generic window sizes, like so -

def smooth(a,WSZ):
    out0 = np.convolve(a,np.ones(WSZ,dtype=int),'valid')/WSZ    
    r = np.arange(1,WSZ-1,2)
    start = a[:WSZ-1].cumsum()[::2]/r
    stop = (a[-WSZ+1:][::-1].cumsum()[::2]/r)[::-1]
    return np.hstack((  start , out0, stop  ))