Mdras - 3 months ago
Python Question

Comparing previous and next values in a list or in a dataframe

I am new here and have, after a lot of research, not been able to crack this one.

My list looks something like this:

lister=["AB1","AB2","AB3","AB3-2","AB3-3","AB3-4","AB4","AB4-2","AB5"]


It is a list of existing folders and cannot be changed into something more practical.
I also have this list as a pandas df column along with some other values.

The goal: whenever an element has variants with a "-2", "-3", ... "-#" suffix, keep only the variant with the highest number. These suffix numbers can go up to 10.

The result for the list above would be:

resulter=["AB1","AB2","AB3-4","AB4-2","AB5"]


Thanks a lot for the help!

UPDATE:

John Zwinck's answer works for lists. However, when I try to use it on a pandas DataFrame, it raises errors. So reframing my question might be more helpful:

My Dataframe looks like this:

   COL1  COL2   COL3      COL4      COL5      COL6
0     1    77    AB1  0.609856  2.145556  2.115333
1     2    77    AB2  0.603378  2.146333  2.125667
2     3    77    AB3  0.600580  2.150667  2.135000
3     4    89    AB1  0.609129  2.149056  2.097667
4     5    89    AB2  0.604061  2.175333  2.142667
5     6    89    AB3  0.606987  2.139944  2.107333
6     7    89    AB4  0.603696  2.122000  2.102000
7     8    94    AB1  0.606438  2.156444  2.142000
8     9    94  AB1-2  0.611260  2.133556  2.095000
9    10    94    AB2  0.596059  2.169056  2.137333


In this case I need to remove row 7 (COL3 value AB1) because an AB1-2 value exists in row 8.

Thanks again!

Answer
With pandas imported as pd and numpy as np:
gb = pd.Series(lister).str.split('-', n=1, expand=True).groupby(0)[1].last().fillna('')

Gives you:

AB1     
AB2     
AB3    4
AB4    2
AB5     

Then:

gb.index + np.where(gb, '-' + gb, '')

Gives you:

['AB1', 'AB2', 'AB3-4', 'AB4-2', 'AB5']
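
For the DataFrame in the update, something along these lines might work. It is only a rough sketch under two assumptions that the question does not state outright: suffixed names compete only within the same COL2 value (rows 0 and 3 also hold AB1 but are not meant to be dropped), and suffixes are compared as numbers so that "-10" ranks above "-9" regardless of row order. The names base, suffix and keep are purely illustrative.

```python
import pandas as pd

# Sample rows shaped like the question's DataFrame (only the columns needed here).
df = pd.DataFrame({
    "COL1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "COL2": [77, 77, 77, 89, 89, 89, 89, 94, 94, 94],
    "COL3": ["AB1", "AB2", "AB3", "AB1", "AB2", "AB3", "AB4",
             "AB1", "AB1-2", "AB2"],
})

# Split "AB1-2" into base "AB1" and suffix "2"; a name without a suffix
# yields NaN in the second column.
parts = df["COL3"].str.extract(r"^(.*?)(?:-(\d+))?$")
base = parts[0]

# Treat a missing suffix as 0 so "AB1" ranks below "AB1-2", and compare
# numerically rather than as strings.
suffix = pd.to_numeric(parts[1], errors="coerce").fillna(0).astype(int)

# Assumption: names are only compared within the same COL2 value.
# idxmax returns the original row label of the highest suffix per group.
keep = suffix.groupby([df["COL2"], base]).idxmax()

# Sort the labels to restore the original row order.
result = df.loc[sorted(keep)]
print(result)
```

If the names should instead be compared across the whole frame, grouping by base alone would be the change to make.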