Daniel Messias - 3 months ago 10

Python Question

I have a DataFrame of the form

`person1, person2, ..., someMetric`

John, Steve, ..., 20

Peter, Larry, ..., 12

Steve, John, ..., 20

Rows 0 and 2 are interchangeable duplicates (the same two people in swapped order), so I'd want to drop the last row. I can't figure out how to do this in pandas.

Thanks!

Answer

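Here's a NumPy-based solution. It compares every row's `person1` against every row's `person2`, keeps the upper triangle of that comparison matrix, and drops row `j` whenever some row `i <= j` has `person1[i] == person2[j]`. Note that it checks only that one direction and ignores `someMetric`, so it assumes a name reappears in the other column only as part of a swapped pair -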

```
import numpy as np

df[~(np.triu(df.person1.values[:,None] == df.person2.values)).any(0)]
```
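To see what the one-liner is doing, here is a rough unpacking into named steps. The frame is the one from the sample run; the intermediate names (`p1`, `p2`, `eq`, `upper`, `drop`) are mine, and `to_numpy(dtype=object)` stands in for `.values` on the assumption that the name columns hold plain Python strings:

```python
import numpy as np
import pandas as pd

# Sample frame copied from the answer's sample run.
df = pd.DataFrame({
    "person1": ["John", "Peter", "Steve", "Peter", "Larry"],
    "person2": ["Steve", "Larry", "John", "Parker", "Peter"],
    "someMetric": [20, 13, 19, 5, 7],
})

# Assumption: both columns hold plain strings, so object arrays are fine.
p1 = df["person1"].to_numpy(dtype=object)
p2 = df["person2"].to_numpy(dtype=object)

# eq[i, j] is True when person1 of row i equals person2 of row j
# (broadcasting a column vector against a row vector gives an n x n matrix).
eq = p1[:, None] == p2

# Zero out everything below the diagonal, keeping only entries with i <= j.
upper = np.triu(eq)

# drop[j] is True when some row i <= j has person1[i] == person2[j].
drop = upper.any(0)

# Keep the rows that were not flagged.
result = df[~drop]
```

The comparison matrix is n x n, so memory grows quadratically with the number of rows.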

Sample run -

```
In [123]: df
Out[123]:
  person1 person2  someMetric
0    John   Steve          20
1   Peter   Larry          13
2   Steve    John          19
3   Peter  Parker           5
4   Larry   Peter           7

In [124]: df[~(np.triu(df.person1.values[:,None] == df.person2.values)).any(0)]
Out[124]:
  person1 person2  someMetric
0    John   Steve          20
1   Peter   Larry          13
3   Peter  Parker           5
```
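
If a name can show up in both columns without the whole pair being swapped, a stricter variant is to compare order-independent keys instead. The sketch below is not the method above but a different technique: it sorts the two names within each row and lets `DataFrame.duplicated` flag repeats. The helper name `drop_swapped_pairs` and its parameters are hypothetical, and like the one-liner it ignores `someMetric`:

```python
import numpy as np
import pandas as pd


def drop_swapped_pairs(df, col_a="person1", col_b="person2"):
    """Keep the first row for each unordered (col_a, col_b) pair.

    Hypothetical helper; assumes both columns hold comparable strings.
    """
    # Sort the two names within each row so that (John, Steve) and
    # (Steve, John) end up with the same key.
    pairs = np.sort(df[[col_a, col_b]].to_numpy(dtype=object), axis=1)

    # Flag every row whose key already occurred in an earlier row.
    # Reusing df.index keeps the mask aligned with the original frame.
    is_repeat = pd.DataFrame(pairs, index=df.index).duplicated(keep="first")

    return df[~is_repeat]


# Sample frame copied from the sample run above.
df = pd.DataFrame({
    "person1": ["John", "Peter", "Steve", "Peter", "Larry"],
    "person2": ["Steve", "Larry", "John", "Parker", "Peter"],
    "someMetric": [20, 13, 19, 5, 7],
})

deduped = drop_swapped_pairs(df)
```

On the sample frame this keeps the same rows as the one-liner. Because the key is the full pair, a row such as `Steve, Peter` following `Peter, Larry` is kept, and rows repeated in the same order are treated as duplicates too.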