ShinjiOno - 7 months ago 34
Python Question

# Smallest Difference Between 2 object in Dataframe

``````df = pd.DataFrame( {'Stock'  : ['Apple','Broadcomm','Citi','D&G'],
'PE' : pd.Series([1.5,3.9,5.6,6.8]),
})
``````

I'm looking for an algorithm to rank stock pair from a pool based on the difference of PE i.e PE stock 1 - PE stock 2

i.e pool of 40 stocks, rank based on unique stock pair based on smallest PE difference. Total will have 20 unique pairs
eg. MSFT appears in pair 1, with smallest PE associated with MSFT pair, MSFT should not reappear again in the subsequent pair

What's the correct algorithm for doing this?

So far I have tried to find the PE difference of each and every pair and rank ascending. What should I do next?

This is an approach that uses `itertools.combinations()`, `isin()`, and `drop()`:

``````import pandas as pd
import itertools as it

df = pd.DataFrame({'Stock' : ['Apple', 'Broadcomm', 'Citi', 'D&G', 'Elixir', 'Foxtrot'],
'PE'    : [3.8, 3.9, 5.6, 6.8, 0.5, 3.9]})
print(df)

assert len(df) % 2 == 0
m = df.set_index('Stock')
ranking = pd.DataFrame(columns=['StockA', 'StockB', 'minPE', 'deltaPE'],
data=[(a, b, min(m.PE[a], m.PE[b]), abs(m.PE[a] - m.PE[b]))
for a, b in it.combinations(m.index, 2)])
print(ranking)
ranking.sort_values(['deltaPE', 'minPE'], inplace=True)
print(ranking)

# ranking is sorted from best to worst.
# Start with first line, eliminate other lines that belong to either one of
# this line's stocks (but not both), then proceed to next line and repeat.
for i in range(len(df) // 2):
a = ranking.iloc[i].StockA
b = ranking.iloc[i].StockB
contenders = ranking[ranking.StockA.isin([a, b]) ^ ranking.StockB.isin([a, b])]
ranking.drop(contenders.index, inplace=True)

print(ranking)
``````

Output:

``````    PE      Stock
0  3.8      Apple
2  5.6       Citi
3  6.8        D&G
4  0.5     Elixir
5  3.9    Foxtrot

# ---- Ranking before sorting:
StockA     StockB  minPE  deltaPE
1       Apple       Citi    3.8      1.8
2       Apple        D&G    3.8      3.0
3       Apple     Elixir    0.5      3.3
4       Apple    Foxtrot    3.8      0.1
9        Citi        D&G    5.6      1.2
10       Citi     Elixir    0.5      5.1
11       Citi    Foxtrot    3.9      1.7
12        D&G     Elixir    0.5      6.3
13        D&G    Foxtrot    3.9      2.9
14     Elixir    Foxtrot    0.5      3.4

# ---- Ranking after sorting:
StockA     StockB  minPE  deltaPE
4       Apple    Foxtrot    3.8      0.1
9        Citi        D&G    5.6      1.2
11       Citi    Foxtrot    3.9      1.7
1       Apple       Citi    3.8      1.8
13        D&G    Foxtrot    3.9      2.9
2       Apple        D&G    3.8      3.0
3       Apple     Elixir    0.5      3.3