ckp ckp - 2 months ago 7
Python Question

Replacing a value with the median in Python

lat
50.63757782
50.6375742
50.6375742
50.6374077762
50.63757782
50.6374077762
50.63757782
50.63757782


I plotted a graph of these latitude values and noticed a sudden spike (an outlier). I want to replace every lat value with the median of the last three values so that I get a meaningful result.

The output might be

lat lat_med
50.63757782 50.63757782
50.6375742 50.6375742
50.6375742 50.6375742
50.63740778 50.6375742
50.63757782 50.6375742
50.63740778 50.6375742
50.63757782 50.6375742
50.63757782 50.6375742


I have thousands of such lat values and need to solve this using a for loop. I know that the following code has errors, and since I am a beginner in Python, I would appreciate your help in solving this.

for i in range(0,len(df['lat'])):
    df['lat_med'][i]=numpy.median(numpy.array(df['lat'][i],df['lat'][i-2]))

Answer

Just go through the elements from the second to the second-to-last and, for each one, save the median of that element, the previous one and the next one. Note that the first and last elements are left as they were.

Try this:

lat = [50.63757782, 50.6375742, 50.6375742, 50.6374077762, 50.63757782, 50.6374077762, 50.63757782, 50.63757782]

# returns median value out of the three values
def median(a, b, c):
    if a > b and a > c:
        return b if b > c else c

    if a < b and a < c:
        return b if b < c else c

    return a


# add the first element
filtered = [lat[0]]

for i in range(1, len(lat) - 1):
    filtered += [median(lat[i - 1], lat[i], lat[i + 1])]

# add the last element
filtered += [lat[-1]]

print(filtered)
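
If the values already sit in a pandas DataFrame, as in the question, roughly the same filter can be sketched with `rolling`. This assumes the column is named `lat` and that the first and last rows should keep their original values, as in the loop above; treat it as a sketch rather than the only way to do it.

```python
import pandas as pd

df = pd.DataFrame({"lat": [50.63757782, 50.6375742, 50.6375742, 50.6374077762,
                           50.63757782, 50.6374077762, 50.63757782, 50.63757782]})

# Median of each value together with its previous and next neighbour.
# The first and last rows have no full window, so they come out as NaN.
centered = df["lat"].rolling(window=3, center=True).median()

# Assumption: keep the original value wherever the window was incomplete.
df["lat_med"] = centered.fillna(df["lat"])

print(df)
```

Dropping `center=True` makes the window cover the current value and the two before it, which is closer to "the last three values" in the question.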

What you are doing is a very basic median filter.
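
The question asks for the median of the last three values rather than of the neighbours on both sides, so a trailing window may be closer to what is wanted. Here is a minimal sketch using only the standard library. The function name `trailing_median_filter` and its `window` parameter are made up for illustration, and it assumes the first `window - 1` values should be kept unchanged.

```python
from statistics import median


def trailing_median_filter(values, window=3):
    """Replace each value with the median of itself and the window - 1 values before it."""
    filtered = []
    for i, value in enumerate(values):
        if i < window - 1:
            # Assumption: not enough history yet, so keep the original value.
            filtered.append(value)
        else:
            filtered.append(median(values[i - window + 1:i + 1]))
    return filtered


lat = [50.63757782, 50.6375742, 50.6375742, 50.6374077762,
       50.63757782, 50.6374077762, 50.63757782, 50.63757782]

lat_med = trailing_median_filter(lat)
print(lat_med)
```

A larger odd `window` smooths more aggressively, at the cost of reacting more slowly to real changes in the data.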