Amadan Amadan -4 years ago 95
Python Question

Minimum of two uint64 series with missing values without float64 conversion

I have a types conundrum:

import pandas as pd

a = pd.Series([5, 3, 5], index=[1, 3, 4]) # int64
b = pd.Series([1, 9, 4], index=[1, 2, 4]) # int64

m = pd.DataFrame([a, b]).min() # float64


I know exactly why it happens: once I put a and b in the same dataframe, there are missing values, and missing values can't be represented in int64, so the dtype is bumped up to float64. But I'd really like to get that minimum without the conversion. Is there a way to pre-fill the missing values from the other column, or any other technique that would let me take the minimum of the two series without having to deal with NaN?
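
To make the promotion concrete, here is a minimal sketch that counts the gaps and checks the resulting dtype. The names df and missing are only for illustration:

```python
import pandas as pd

a = pd.Series([5, 3, 5], index=[1, 3, 4])  # int64
b = pd.Series([1, 9, 4], index=[1, 2, 4])  # int64

# Rows are a and b; columns are the union of their index labels.
df = pd.DataFrame([a, b])

# a has no label 2 and b has no label 3, so two cells are NaN.
missing = int(df.isna().sum().sum())

# min() skips NaN, but the result is still float64.
m = df.min()

print(missing)
print(m.dtype)
```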

Answer Source

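You can reindex both series on the union of their indexes; the fill_value parameter replaces what would otherwise be NaN with a scalar of your choice. Since you need the minimum, the fill value must be at least as large as every real value in either series, so that it never wins the comparison. One option is some huge int like 10000 if you know an upper bound; another is the larger of the two series' maxima:

idx = b.index.union(a.index)
fill = max(a.max(), b.max())  # 9 here: never smaller than any real value

print (pd.DataFrame([a.reindex(idx, fill_value=fill), 
                     b.reindex(idx, fill_value=fill)]))

   1  2  3  4
0  5  9  3  5
1  1  9  9  4

m = pd.DataFrame([a.reindex(idx, fill_value=fill), 
                  b.reindex(idx, fill_value=fill)]).min()
print (m)
1    1
2    9
3    3
4    4
dtype: int64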
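
If no safe upper bound is known in advance, the dtype's own maximum can serve as the fill value. The following is a minimal sketch, assuming both series share the same integer dtype; int_min is a hypothetical helper name, not a pandas function:

```python
import numpy as np
import pandas as pd

def int_min(a: pd.Series, b: pd.Series) -> pd.Series:
    """Element-wise minimum over the union of both indexes, without NaN."""
    idx = a.index.union(b.index)
    # Assumption: a and b have the same integer dtype.
    # The largest representable value can never beat a real value in min().
    sentinel = np.iinfo(a.dtype).max
    a_filled = a.reindex(idx, fill_value=sentinel)
    b_filled = b.reindex(idx, fill_value=sentinel)
    # Both operands now share one index, so no alignment gaps appear.
    return np.minimum(a_filled, b_filled)

a = pd.Series([5, 3, 5], index=[1, 3, 4])
b = pd.Series([1, 9, 4], index=[1, 2, 4])
m = int_min(a, b)
print(m)
```

Because idx is the union of the two indexes, every label has a real value in at least one series, so the sentinel itself should never show up in the result.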