Morganis - 6 months ago
Python Question

ValueError: invalid literal for long() with base 10: '5B'

As I understand this error, some column is being treated as type long, but it contains the value '5B', which cannot be converted to a long.

This is the line where the error occurs:

df_Company = df1.groupby(by=['manufacturer','quality_issue'], as_index=False)['quality_issue2'].count()


I have checked the types of all columns of the dataframe df1, but none of them is of type long. 5B is the name of a manufacturer, so I assume the manufacturer column has somehow become type long during this statement.

I checked which types the dataframe df1 has:

print (df1.dtypes)
manufacturer object
yearweek int64
quality_issue object
quality_issue2 object


I 'think' I have to do something with
df_Company.astype(long)
but I can't seem to make it work. Does anyone have an idea how to fix this?

Note: the strange thing is that on my other computer, where I have Python 3.5.1, the same code works just fine. But when I run the code on my current computer, where I have Python 2.7.9, I get this long() error.

Answer

The problem is different, see 8381, but in my pandas version 0.18.1 it works fine.
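For context, the wording of the message comes from an integer conversion of a non-numeric string. Here is a minimal sketch that reproduces it without pandas; it does not show where pandas triggers the conversion. The helper name to_int_or_none is hypothetical and used only for illustration. Under Python 3 the message names int() rather than long(), because long was merged into int.

```python
def to_int_or_none(text):
    """Try to parse text as a base-10 integer; return None if it is not numeric.

    Hypothetical helper, not part of pandas or of the original code.
    """
    try:
        return int(text)
    except ValueError as exc:
        # For '5B' this prints the same kind of "invalid literal" message as in the question.
        print(exc)
        return None


print(to_int_or_none('29'))  # a numeric string converts without error
print(to_int_or_none('5B'))  # a manufacturer-style name does not
```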

I think you can change as_index=False to as_index=True and then call reset_index:

df_Company = (df1.groupby(by=['manufacturer','quality_issue'], as_index=True)['quality_issue2']
                 .count()
                 .reset_index())
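A self-contained sketch of this workaround, assuming made-up sample data in which '5B' stands in for the manufacturer name from the question (the rows and the other values are invented for illustration):

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df1 = pd.DataFrame({
    'manufacturer': ['5B', '5B', 'acme', 'acme', 'acme'],
    'quality_issue': ['dent', 'dent', 'rust', 'rust', 'rust'],
    'quality_issue2': [None, 'x', 'y', None, 'z'],
})

keys = ['manufacturer', 'quality_issue']

# Keep the group keys in the index (the default), count the non-null values,
# then move the keys back into ordinary columns with reset_index.
df_company = df1.groupby(by=keys)['quality_issue2'].count().reset_index()
print(df_company)
```

The result has the same three columns that as_index=False would give, with the count stored under quality_issue2.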

Differences between size and count: size counts every row in a group, while count counts only the non-null values (see differences with numeric values):

Sample with string values:

import pandas as pd
import numpy as np

df1=pd.DataFrame([['foo','foo','bar','bar','bar','oats'],
                  ['foo','foo','bar','bar','bar','oats'],
                  [None,'foo','bar',None,'bar','oats']]).T
df1.columns=['manufacturer','quality_issue','quality_issue2']
print (df1)
  manufacturer quality_issue quality_issue2
0          foo           foo           None
1          foo           foo            foo
2          bar           bar            bar
3          bar           bar           None
4          bar           bar            bar
5         oats          oats           oats

df_Company = (df1.groupby(by=['manufacturer','quality_issue'], as_index=False)['quality_issue2']
                 .count())
print (df_Company)

  manufacturer quality_issue  quality_issue2
0          bar           bar               2
1          foo           foo               1
2         oats          oats               1

df_Company1 = (df1.groupby(by=['manufacturer','quality_issue'])['quality_issue2']
                  .size()
                  .reset_index(name='quality_issue2'))
print (df_Company1)

  manufacturer quality_issue  quality_issue2
0          bar           bar               3
1          foo           foo               2
2         oats          oats               1
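The same difference shows up with numeric values. This is a minimal sketch, assuming a made-up value column that contains NaN (the frame df_num and its contents are invented for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical numeric sample: two of the five values are missing.
df_num = pd.DataFrame({
    'manufacturer': ['foo', 'foo', 'bar', 'bar', 'bar'],
    'value': [1.0, np.nan, 2.0, np.nan, 3.0],
})

grouped = df_num.groupby('manufacturer')['value']

sizes = grouped.size()    # number of rows per group, NaN included
counts = grouped.count()  # number of non-NaN values per group

print(sizes)
print(counts)
```

Here size gives 3 for bar and 2 for foo, while count gives 2 and 1, because each group holds one NaN.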