Michael Michael - 3 months ago 11
Python Question

Condensing repeat code with a "for" statement using strings - Python

I am very new with "for" statements in Python, and I can't get something that I think should be simple to work. My code that I have is:

import pandas as pd

df1 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})
df2 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})
df3 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})

DF1 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})
DF2 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})
DF3 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})


Then:

A1 = len(df1.loc[df1['Column1'] <= DF1['Column1'].iloc[2]])
Z1 = len(df1.loc[df1['Column1'] >= DF1['Column1'].iloc[3]])

A2 = len(df2.loc[df2['Column1'] <= DF2['Column1'].iloc[2]])
Z2 = len(df2.loc[df2['Column1'] >= DF2['Column1'].iloc[3]])

A3 = len(df3.loc[df3['Column1'] <= DF3['Column1'].iloc[2]])
Z3 = len(df3.loc[df3['Column1'] >= DF3['Column1'].iloc[3]])


As you can see, it is a lot of repeat code with just the identifying numbers being different. So my first attempt at a "for" statement was:

Numbers = [1,2,3]

for i in Numbers:
"A" + str(i) = len("df" + str(i).loc["df" + str(i)['Column1'] <= "DF" + str(i)['Column1'].iloc[2]])
"Z" + str(i) = len("df" + str(i).loc["df" + str(i)['Column1'] >= "DF" + str(i)['Column1'].iloc[3]])


This yielded the SyntaxError: "can't assign to operator". So I tried:

Numbers = [1,2,3]

for i in Numbers:
A = "A" + str(i)
Z = "Z" + str(i)
A = len("df" + str(i).loc["df" + str(i)['Column1'] <= "DF" + str(i)['Column1'].iloc[2]])
Z = len("df" + str(i).loc["df" + str(i)['Column1'] >= "DF" + str(i)['Column1'].iloc[3]])


This yielded the AttributeError: 'str' object has no attribute 'loc'. I tried a few other things like:

Numbers = [1,2,3]

for i in Numbers:
A = "A" + str(i)
Z = "Z" + str(i)
df = "df" + str(i)
DF = "DF" + str(i)
A = len(df.loc[df['Column1'] <= DF['Column1'].iloc[2]])
Z = len(df.loc[df['Column1'] <= DF['Column1'].iloc[3]])


But that just gives me the same errors. Ultimately what I would want is something like:

Numbers = [1,2,3]

for i in Numbers:
Ai = len(dfi.loc[dfi['Column1'] <= DFi['Column1'].iloc[2]])
Zi = len(dfi.loc[dfi['Column1'] <= DFi['Column1'].iloc[3]])


Where the output would be equivalent if I typed:

A1 = len(df1.loc[df1['Column1'] <= DF1['Column1'].iloc[2]])
Z1 = len(df1.loc[df1['Column1'] >= DF1['Column1'].iloc[3]])

A2 = len(df2.loc[df1['Column1'] <= DF2['Column1'].iloc[2]])
Z2 = len(df2.loc[df1['Column1'] >= DF2['Column1'].iloc[3]])

A3 = len(df3.loc[df3['Column1'] <= DF3['Column1'].iloc[2]])
Z3 = len(df3.loc[df3['Column1'] >= DF3['Column1'].iloc[3]])

Answer

It is "restricted" to generate variables in for loop (you can do that, but it's better to avoid. See other posts: post_1, post_2).

Instead use this code to achieve your goal without generating as many variables as your needs (actually generate only the values in the for loop):

# Lists of your dataframes
Hanimals = [H26, H45, H46, H47, H51, H58, H64, H65]
Ianimals = [I26, I45, I46, I47, I51, I58, I64, I65]

# Generate your series using for loops iterating through your lists above
BPM = pd.DataFrame({'BPM_Base':pd.Series([i_a for i_a in [len(i_h.loc[i_h['EKG-evt'] <=\
     i_i[0].iloc[0]]) / 10 for i_h, i_i in zip(Hanimals, Ianimals)]]),
    'BPM_Test':pd.Series([i_z for i_z in [len(i_h.loc[i_h['EKG-evt'] >=\
     i_i[0].iloc[-1]]) / 30 for i_h, i_i in zip(Hanimals, Ianimals)]])})

UPDATE

A more efficient way (iterate over "animals" lists only once):

# Lists of your dataframes
Hanimals = [H26, H45, H46, H47, H51, H58, H64, H65]
Ianimals = [I26, I45, I46, I47, I51, I58, I64, I65]

# You don't need using pd.Series(),
# just create a list of tuples: [(A26, Z26), (A45, Z45)...] and iterate over it
BPM = pd.DataFrame({'BPM_Base':i[0], 'BPM_Test':i[1]} for i in \
    [(len(i_h.loc[i_h['EKG-evt'] <= i_i[0].iloc[0]]) / 10,
    len(i_h.loc[i_h['EKG-evt'] >= i_i[0].iloc[-1]]) / 30) \
    for i_h, i_i in zip(Hanimals, Ianimals)])
Comments