tchakravarty - 3 months ago 7
Python Question

How to slice a pandas.Series of type int by length

I have a pandas.Series of 5-digit integers. The first 3 digits are days from an epoch, and the last 2 are half-hours. I want to split the integer series into two Series, holding the first 3 digits and the last 2 digits respectively.

Here is one way to do it, which requires two type conversions:

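import numpy.random as npr
import pandas as pd

# 5-digit integers: first 3 digits are days, last 2 are half-hours
# (randint's upper bound is exclusive, so high=100000 allows 99999)
days_hours = pd.Series(npr.randint(low=10000, high=100000, size=1000))
days = days_hours.astype('str').str.slice(start=0, stop=3).astype('int64')
hours = days_hours.astype('str').str.slice(start=3, stop=5).astype('int64')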


This is very time-consuming given that on average my Series are 25e6 rows each (there are 6 such Series). Is there a way that I can avoid the type conversions?

I tried an alternate solution which involved applying a lambda function to each element of the Series, but that took longer.

Answer

It will be much quicker to do these operations arithmetically using integer division and the modulo operator:

days = days_hours // 100

hours = days_hours % 100
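
To see that the arithmetic version gives the same result as the string slicing in the question, here is a minimal sketch. The sample data, the seed, and the names as_text, days_str and hours_str are made up for illustration, and it assumes every value is a non-negative integer with exactly 5 digits.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: non-negative 5-digit integers, as in the question.
rng = np.random.default_rng(0)
days_hours = pd.Series(rng.integers(low=10000, high=100000, size=1000))

# Arithmetic split: integer division drops the last two digits,
# modulo keeps only the last two digits.
days = days_hours // 100
hours = days_hours % 100

# String-based split from the question, used here only as a cross-check.
as_text = days_hours.astype(str)
days_str = as_text.str.slice(0, 3).astype("int64")
hours_str = as_text.str.slice(3, 5).astype("int64")

# Both approaches should agree element by element.
print((days == days_str).all() and (hours == hours_str).all())
```

Both operations are vectorized and stay in integer dtype, so no intermediate string Series is built. If the values could be negative or have fewer than 5 digits, the two approaches would no longer be equivalent, so that case would need separate handling.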