tchakravarty - 1 year ago
Python Question

How to slice a pandas.Series of type int by length

I have a pandas.Series of 5-digit integers. The first 3 digits are days from an epoch, and the last 2 are half-hours. I want to split the integer series into two Series, holding the first 3 digits and the last 2 digits respectively.

Here is one way to do it, that requires two type conversions:

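import numpy as np
import pandas as pd

# 1000 random 5-digit integers as sample data
days_hours = pd.Series(np.random.randint(low=10000, high=99999, size=1000))
# Convert to string, slice by position, then convert back to integer
days = days_hours.astype('str').str.slice(start=0, stop=3).astype('int64')
hours = days_hours.astype('str').str.slice(start=3, stop=5).astype('int64')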


This is very time-consuming given that my Series average 25e6 rows each (there are 6 such Series). Is there a way that I can avoid the type conversions?

I tried an alternate solution which involved applying a lambda function to each element of the Series, but that took longer.

Answer Source

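It will be much quicker to do these operations arithmetically using integer division and the modulo operator. Dividing a 5-digit integer by 100 drops the last 2 digits, and taking it modulo 100 keeps only those 2 digits, so no conversion to string is needed: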

days = days_hours // 100

hours = days_hours % 100
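
To check that the arithmetic version gives the same result as the string slicing in the question, here is a minimal sketch. The sample data, the seed, and the names rng, days_dm, hours_dm, as_str, days_str and hours_str are assumptions made for illustration, not part of the original post. It also shows Python's built-in divmod, which returns both parts in one call.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 1000 random 5-digit integers (seed chosen arbitrarily).
rng = np.random.default_rng(0)
days_hours = pd.Series(rng.integers(low=10000, high=100000, size=1000))

# Arithmetic split: no string conversion involved.
days = days_hours // 100   # first 3 digits
hours = days_hours % 100   # last 2 digits

# Same split in a single call; divmod returns (quotient, remainder).
days_dm, hours_dm = divmod(days_hours, 100)

# Cross-check against the string-slicing approach from the question.
as_str = days_hours.astype('str')
days_str = as_str.str.slice(start=0, stop=3).astype('int64')
hours_str = as_str.str.slice(start=3, stop=5).astype('int64')

print((days == days_str).all(), (hours == hours_str).all())
```

Note that this only matches the string slicing when every value has exactly 5 digits and is non-negative; for other widths the divisor would need to change.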