user2570465 - 1 month ago
Python Question

Stop Pandas from Converting Int to Float

I have a DataFrame. Two relevant columns are the following: one is a column of int and another is a column of str.

I understand that if I insert NaN into the int column, Pandas will convert all the int into float because there is no NaN value for an int.

However, when I insert None into the str column, Pandas converts all my int to float as well. This doesn't make sense to me - why does the value I put in column 2 affect column 1?

Here's a simple working example (Python 2):

import pandas as pd
df = pd.DataFrame()
df["int"] = pd.Series([], dtype=int)
df["str"] = pd.Series([], dtype=str)
df.loc[0] = [0, "zero"]
print df
print
df.loc[1] = [1, None]
print df


The output is

   int   str
0    0  zero

   int   str
0  0.0  zero
1  1.0   NaN


Is there any way to make the output the following:

   int   str
0    0  zero

   int   str
0    0  zero
1    1   NaN


without recasting the first column to int?


  • I prefer using int instead of float because the actual data in that column are integers. If there's no workaround, I'll just use float though.

  • I prefer not having to recast because in my actual code, I don't store the actual dtype.

  • I also need the data inserted row-by-row.



Thanks in advance for the help.

Answer

If you set dtype=object, your series will be able to contain arbitrary data types:

import pandas as pd
df = pd.DataFrame()
df["int"] = pd.Series([], dtype=object)
df["str"] = pd.Series([], dtype=str)
df.loc[0] = [0, "zero"]
print(df)
print()
df.loc[1] = [1, None]
print(df)

  int   str
0   0  zero

  int   str
0   0  zero
1   1  None
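
To see why dtype=object avoids the conversion, here is a minimal sketch that compares the two dtypes on a standalone Series. The variable names (inferred, as_object) are made up for illustration, and this only shows the dtype behaviour, not the row-by-row insert itself:

```python
import pandas as pd

# Default inference: a missing value forces the integers up to float64,
# because the integer dtype has no way to represent NaN.
inferred = pd.Series([0, 1, None])
print(inferred.dtype)

# dtype=object stores the Python objects as given, so the ints stay ints
# and None is kept as None.
as_object = pd.Series([0, 1, None], dtype=object)
print(as_object.dtype)
print(type(as_object[0]))
```

The trade-off is that an object column holds generic Python objects, so numeric operations on it are typically slower than on a real int or float column.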