Jari Klingler - 2 months ago
Python Question

How to convert str to float in pandas

I'm trying to convert a string column of my dataset to a float type. Here is some context:

import pandas as pd
import numpy as np
import xlrd
file_location = "/Users/sekr2/Desktop/Jari/Leistungen/leistungen2_2017.xlsx"
workbook = xlrd.open_workbook(file_location)
sheet = workbook.sheet_by_index(0)

df = pd.read_excel("/Users/.../bla.xlsx")

df.head()

   Leistungserbringer  Anzahl         Leistung      AL      TL  TaxW  Taxpunkte
0      McGregor Sarah      12      'Konsilium'  147.28   87.47   KVG     234.75
1      McGregor Sarah      12  'Grundberatung'   47.00   67.47   KVG     114.47
2      McGregor Sarah      12     'Extra 5min'   87.28   87.47   KVG     174.75
3      McGregor Sarah      12     'Respirator'  147.28  102.01   KVG     249.29
4      McGregor Sarah      12         'Besuch'  167.28   87.45   KVG     254.73


To keep working on this I need to find a way to create a new column:
df['Leistungswert'] = df['Taxpunkte'] * df['Anzahl'] * df['TaxW']

TaxW shows the string 'KVG' for each entry. I know from the data that 'KVG' = 0.89, but I have hit a wall trying to convert the string into a float. I cannot just create a new column of floats by hand, because this code should also work with further inputs: the column TaxW contains about 7 different strings, each with a different value.

I'm thankful for all information on this matter.

KVG = 0.89

Answer

Assuming 'KVG' isn't the only possible string value in TaxW, you should store a mapping from each string to its float equivalent, like this:

map_ = {'KVG': 0.89}  # add the other TaxW codes and their values here

Then, you can use Series.map:

In [424]: df['Leistungswert'] = df['Taxpunkte'] * df['Anzahl'] * df['TaxW'].map(map_); df['Leistungswert']
Out[424]: 
0    2507.1300
1    1222.5396
2    1866.3300
3    2662.4172
4    2720.5164
Name: Leistungswert, dtype: float64
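
For reference, here is a self-contained sketch of the Series.map approach. It rebuilds the five rows shown in the question by hand instead of reading the Excel file, and it adds a check for codes that have no entry in the dict. The names tax_values, factor and unmapped are only illustrative, and 'KVG' is the only code whose value is known from the question.

```python
import pandas as pd

# Sample rows copied from the question's df.head() output.
df = pd.DataFrame({
    "Leistungserbringer": ["McGregor Sarah"] * 5,
    "Anzahl": [12] * 5,
    "Leistung": ["Konsilium", "Grundberatung", "Extra 5min", "Respirator", "Besuch"],
    "AL": [147.28, 47.00, 87.28, 147.28, 167.28],
    "TL": [87.47, 67.47, 87.47, 102.01, 87.45],
    "TaxW": ["KVG"] * 5,
    "Taxpunkte": [234.75, 114.47, 174.75, 249.29, 254.73],
})

# Only 'KVG' is known from the question; the other codes must be added by hand.
tax_values = {"KVG": 0.89}

# Codes that are missing from the dict become NaN.
factor = df["TaxW"].map(tax_values)

# Fail early if a code has no value instead of silently producing NaN results.
unmapped = df.loc[factor.isna(), "TaxW"].unique()
if len(unmapped) > 0:
    raise KeyError(f"No value defined for TaxW codes: {list(unmapped)}")

df["Leistungswert"] = df["Taxpunkte"] * df["Anzahl"] * factor
print(df["Leistungswert"])
```

Because map leaves NaN for unknown codes, the explicit check is what makes a new, unlisted code visible when further inputs arrive.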

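Alternatively, you can compute the value row by row with df.apply and axis=1 (df.transform is not suitable here, because the lambda reduces each row to a single value, which recent pandas versions reject in transform):

In [435]: df['Leistungswert'] = df.apply(lambda x: x['Taxpunkte'] * x['Anzahl'] * map_[x['TaxW']], axis=1); df['Leistungswert']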
Out[435]: 
0    2507.1300
1    1222.5396
2    1866.3300
3    2662.4172
4    2720.5164
Name: Leistungswert, dtype: float64
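
One practical difference between the two approaches above: Series.map turns a code that is missing from the dict into NaN, whereas indexing the dict inside a row-wise lambda raises KeyError. A small sketch of that difference, assuming a made-up code 'XYZ' that is not in the data from the question and using df.apply for the row-wise call:

```python
import pandas as pd

map_ = {"KVG": 0.89}

# 'XYZ' is a made-up code that is deliberately absent from map_.
df = pd.DataFrame({
    "Anzahl": [12, 12],
    "TaxW": ["KVG", "XYZ"],
    "Taxpunkte": [234.75, 114.47],
})

# Series.map: the unknown code becomes NaN, so the product is NaN as well.
via_map = df["Taxpunkte"] * df["Anzahl"] * df["TaxW"].map(map_)
print(via_map.isna().tolist())  # [False, True]

# Row-wise apply with dict indexing: the unknown code raises KeyError.
try:
    df.apply(
        lambda row: row["Taxpunkte"] * row["Anzahl"] * map_[row["TaxW"]],
        axis=1,
    )
    raised = False
except KeyError:
    raised = True
print(raised)  # True
```

Which behaviour is preferable depends on whether an unknown code should stop the calculation or be flagged afterwards; the vectorised map version is also usually faster on larger frames than a row-wise lambda.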