Dance Party2 - 25 days ago
Python Question

Pandas String to Integer by Character

For a column in a pandas DataFrame, I want to convert each character of a string to its integer code (as ord() does), zero-pad each code to three digits, and prepend 100 to the result. I know how to do this with a regular string:

st = "JOHNSMITH4817001141979"
a = [ord(x) for x in st]
b = []
for x in a:
    b.append('{:03}'.format(x))  # zero-pad each code to 3 digits
b = ['100'] + b
b = ''.join(b)
b = int(b)
b


Result:
100074079072078083077073084072052056049055048048049049052049057055057


But what if I wanted to perform this operation on every cell of a column in a Pandas data frame like this one?

import pandas as pd
df = pd.DataFrame({'string':['JOHNSMITH4817001141979','JOHNSMYTHE4817001141979']})
df

                    string
0   JOHNSMITH4817001141979
1  JOHNSMYTHE4817001141979


I just need a separate column with the result as an integer for each cell in 'string'.

Thanks in advance!

Answer

First, wrap your processing steps in a function, such as:

def get_it(st):
    a = [ord(x) for x in st]  # character codes of the input string
    b = []
    for x in a:
        b.append('{:03}'.format(x))  # zero-pad each code to 3 digits
    b = ['100'] + b
    b = ''.join(b)
    return int(b)

Then call it on each element of the column and assign the resulting list as the new column:

df['result'] = [get_it(i) for i in df['string']]

Although this works, I still think you could find a better solution by simplifying get_it.
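
One possible simplification, offered as a sketch rather than the definitive answer: build the zero-padded codes in a single join and let Series.map call the function once per cell. The name encode_chars is hypothetical, and the sketch assumes every character has a code below 1000 (true for ASCII), since otherwise three digits per character are not enough.

```python
import pandas as pd


def encode_chars(text):
    """Return int('100' + each character code zero-padded to 3 digits).

    encode_chars is a hypothetical name; it assumes every code point
    is below 1000, so that three digits per character suffice.
    """
    # format(ord(c), "03d") pads a code such as 74 to "074"
    digits = "100" + "".join(format(ord(c), "03d") for c in text)
    return int(digits)


df = pd.DataFrame(
    {"string": ["JOHNSMITH4817001141979", "JOHNSMYTHE4817001141979"]}
)

# Series.map applies encode_chars to every cell of the column.
df["result"] = df["string"].map(encode_chars)
print(df["result"].tolist())
```

Each result has about 70 digits, far beyond what a 64-bit integer can hold, so expect pandas to keep the new column as an object column of Python ints rather than int64. If that is a problem downstream, storing the digits as a string may be more practical.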