Runner Bean - 1 month ago
Python Question

Python: convert string [('a',0.2),('b',0.9),('a',0.4)] to dataframe

In Python, how do I convert a string like

thisStr = '[('a', 0.332), ('d', 0.43766), ('b', 0.3244), ('b', 0.76577), ('a', 0.863), ('d', 0.96789)]'


into a DataFrame something like

index item value
0 a 0.332
1 d 0.43766
2 b 0.3244
3 b 0.76577
4 a 0.863
5 d 0.96789

Answer

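It sounds like you want to turn the string into a pandas DataFrame and then do some manipulations on it. As written, the string is not a valid Python literal, because the inner single quotes end it early. I'd fix that with a few simple replaces: escape the inner quotes and leave only the outer pair at the beginning and end unescaped. The string can then be passed to eval(), which returns a list of tuples. Keep in mind that eval() executes whatever the string contains, so only use it on input you trust.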

import pandas as pd

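# The inner quotes are escaped so the whole literal can be passed to eval(),
# which returns a list of (item, value) tuples.
thisStr = eval('[(\'a\', 0.332), (\'d\', 0.43766), (\'b\', 0.3244), (\'b\', 0.76577), (\'a\', 0.863), (\'d\', 0.96789)]')

# Build the DataFrame, then give the two positional columns readable names.
df = pd.DataFrame(thisStr)
df.rename(columns={0:'item', 1:'value'}, inplace=True)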

# one approach to solving the problem of removing rows where
# item a has values less than 0.8.
mask = (df['item'] == 'a') & (df['value'] < 0.8)
df2 = df[~mask]
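
If the string comes from somewhere you don't fully control, a variant of the same idea is to parse it with ast.literal_eval from the standard library, which only accepts Python literals (lists, tuples, strings, numbers and so on) and does not run arbitrary code. The sketch below is not from the original answer. The names this_str and records are my own, and it assumes the string is wrapped in double quotes so the inner single quotes need no escaping.

```python
import ast

import pandas as pd

# Assumption: the raw text is held in a double-quoted string, so the inner
# single quotes can stay as they are.
this_str = "[('a', 0.332), ('d', 0.43766), ('b', 0.3244), ('b', 0.76577), ('a', 0.863), ('d', 0.96789)]"

# literal_eval parses Python literals only; it raises an error on anything
# else instead of executing it.
records = ast.literal_eval(this_str)

# Name the columns directly when building the DataFrame.
df = pd.DataFrame(records, columns=["item", "value"])

# Same filter as above: drop rows where item is 'a' and value is below 0.8.
mask = (df["item"] == "a") & (df["value"] < 0.8)
df2 = df[~mask]
```

Passing columns= to the constructor replaces the separate rename step. df2 keeps the original index labels, so call df2.reset_index(drop=True) if you want them renumbered from 0.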