Ohumeronen - 3 months ago
Python Question

Pandas DataFrame with continuous index

I have the following code:

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'Index' : ['1', '2', '5','7', '8', '9', '10'],
     'Vals' : [1, 2, 3, 4, np.nan, np.nan, 5]})


This gives me:

  Index  Vals
0     1   1.0
1     2   2.0
2     5   3.0
3     7   4.0
4     8   NaN
5     9   NaN
6    10   5.0


But what I want is something like this:

   Index      Vals
0      1  1.000000
1      2  2.000000
2      3       NaN
3      4       NaN
4      5  3.000000
5      6       NaN
6      7  4.000000
7      8       NaN
8      9       NaN
9     10  5.000000


I tried to achieve this by creating a new DataFrame with a continuous index, but I don't know how to assign the values I already have to it. All I have so far is this:

clean_data = pd.DataFrame({'Index' : range(1,11)})


Which gives me:

   Index
0      1
1      2
2      3
3      4
4      5
5      6
6      7
7      8
8      9
9     10

Answer

For your example, convert the Index column to integers and merge df into a frame that holds the full range:

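import pandas as pd
import numpy as np

df = pd.DataFrame(
    {'Index' : ['1', '2', '5','7', '8', '9', '10'],
     'Vals' : [1, 2, 3, 4, np.nan, np.nan, 5]})
# Index holds strings; cast to int so it matches the integer range below
df['Index'] = df['Index'].astype(int)
# Frame with every Index value from 1 to 10
clean_data = pd.DataFrame({'Index' : range(1,11)})
# Rows of clean_data with no match in df get NaN in Vals
result = clean_data.merge(df,on="Index",how='outer')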

And the result is:

   Index  Vals
0      1   1.0
1      2   2.0
2      3   NaN
3      4   NaN
4      5   3.0
5      6   NaN
6      7   4.0
7      8   NaN
8      9   NaN
9     10   5.0
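
If you would rather not build a second frame, a possible alternative (not part of the merge approach above, just a sketch) is to make Index the row labels and reindex onto the full range. This assumes the Index values are unique after conversion to int; the name full_range is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'Index': ['1', '2', '5', '7', '8', '9', '10'],
     'Vals': [1, 2, 3, 4, np.nan, np.nan, 5]})

# Index holds strings, so cast to int before using it as row labels
df['Index'] = df['Index'].astype(int)

# Illustrative name: every label from 1 to 10
full_range = pd.RangeIndex(1, 11, name='Index')

# Labels missing from df become rows with NaN in Vals;
# reset_index turns the labels back into an ordinary column
result = df.set_index('Index').reindex(full_range).reset_index()
print(result)
```

reindex raises an error if the existing labels contain duplicates, so the merge is the safer choice when Index can repeat.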