Sean Peters - 2 months ago
Python Question

Using a column of one DataFrame in a function - Python

I have a DataFrame with a column of unique IDs. The same IDs are used as arguments to functions I call elsewhere. Passing a single ID to my function works fine, but I would like to pass a subset of those IDs and append the results to a new DataFrame.

This is what my base DataFrame looks like:

First_Df

+---+------------+-------------+-------------+
|   | Unique ID  | A           | B           |
+---+------------+-------------+-------------+
| 1 | 123456     | xxxxx       | aaaaa       |
| 2 | 234567     | yyyyy       | bbbbb       |
| 3 | 345678     | zzzzz       | ccccc       |
| 4 | 456789     | uuuuu       | ddddd       |
| 5 | 567890     | vvvvv       | eeeee       |
| 6 | 678901     | wwwww       | fffff       |
+---+------------+-------------+-------------+


I have a subset of those values in a separate DataFrame like so:

Subset_Df

+---+------------+-------------+-------------+
|   | Unique ID  | A           | B           |
+---+------------+-------------+-------------+
| 2 | 234567     | yyyyy       | bbbbb       |
| 3 | 345678     | zzzzz       | ccccc       |
| 5 | 567890     | vvvvv       | eeeee       |
+---+------------+-------------+-------------+


If I run my function with a single ID, it returns a DF with the right values. However, if I pass it the ID column of my subset, I get ValueError: No JSON object could be decoded.

Function123(Subset_Df['Unique ID'],arg1,arg2)


Thanks in advance, I can provide specific lines of code if needed.

EDIT:

This is what my base DataFrame looks like:

all_players_df.head(6)

+---+------------+---------------+-------------+-------------+
|   | PERSON_ID  |DISPLAY_LAST.. | TEAM_CITY   | TEAM_NAME   |
+---+------------+---------------+-------------+-------------+
| 1 | 123456     |Adams, Jordan  | Memphis     | Grizzlies   |
| 2 | 234567     |Anderson, Alan | LA          | Clippers    |
| 3 | 345678     |Ayres, Jeff    | LA          | Clippers    |
| 4 | 456789     |Aldrich, Cole  | Minnesota   | Timberwolves|
| 5 | 567890     |Albrines,Alex  |Oklahoma City| Thunder     |
| 6 | 678901     |Bass,Brandon   | LA          | Clippers    |
+---+------------+---------------+-------------+-------------+


I have a subset of those values in a separate DataFrame like so:

clippers_players_df.head(3)

+---+------------+---------------+-------------+-------------+
|   | PERSON_ID  |DISPLAY_LAST.. | TEAM_CITY   | TEAM_NAME   |
+---+------------+---------------+-------------+-------------+
| 2 | 234567     |Anderson, Alan | LA          | Clippers    |
| 3 | 345678     |Ayres, Jeff    | LA          | Clippers    |
| 6 | 678901     |Bass,Brandon   | LA          | Clippers    |
+---+------------+---------------+-------------+-------------+


Then I'll run a function:

player_shooting_stats_overall_df(234567,'2015-16','Playoffs')


Running this returns the correct DF for that one player, but I want to run every PERSON_ID for the Clippers through my function. So I try:

player_shooting_stats_overall_df(clippers_players_df['PERSON_ID'],'2015-16','Playoffs')


but this is where I get the error ValueError: No JSON object could be decoded.

Answer

Most likely your function expects a scalar ID, while you are passing in a pandas Series (i.e., a column of a DataFrame). Consider pandas.Series.apply() for element-wise operations. Pass the other positional arguments through the args keyword.

dfSeries = clippers_players_df['PERSON_ID'].apply(player_shooting_stats_overall_df,
                                                  args=('2015-16', 'Playoffs'))
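
To see what apply() does with the args keyword, here is a minimal sketch. Note that fake_shooting_stats is a hypothetical stand-in I made up for illustration, not your real player_shooting_stats_overall_df; I am only assuming that your function takes one scalar ID plus two more positional arguments and returns a DataFrame.

```python
import pandas as pd

def fake_shooting_stats(person_id, season, season_type):
    # Hypothetical stub: accepts ONE scalar ID and returns a one-row
    # DataFrame, standing in for the real stats function.
    return pd.DataFrame({"PERSON_ID": [person_id],
                         "SEASON": [season],
                         "SEASON_TYPE": [season_type]})

ids = pd.Series([234567, 345678, 678901], name="PERSON_ID")

# apply() calls the function once per element of the Series; args supplies
# the remaining positional arguments that follow the element itself.
per_player = ids.apply(fake_shooting_stats, args=("2015-16", "Playoffs"))

print(len(per_player))            # one entry per ID
print(type(per_player.iloc[0]))   # each entry is itself a DataFrame
```

Each call receives a single ID rather than the whole column, which is what a scalar-only function needs.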

Note that the above creates a Series whose elements are DataFrames, assuming your function returns a DataFrame. To combine them into a single DataFrame, convert the Series to a list and pass it to pd.concat():

clippersdf = pd.concat(dfSeries.tolist())
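
If you would rather skip the intermediate Series, a list comprehension followed by pd.concat() is an equivalent way to write it. This is only a sketch: fake_shooting_stats is a hypothetical stub standing in for your function, and the small clippers_players_df below is rebuilt from the IDs in your question.

```python
import pandas as pd

def fake_shooting_stats(person_id, season, season_type):
    # Hypothetical stub standing in for the real stats function;
    # it returns a one-row DataFrame for a single scalar ID.
    return pd.DataFrame({"PERSON_ID": [person_id],
                         "SEASON": [season],
                         "SEASON_TYPE": [season_type]})

clippers_players_df = pd.DataFrame({"PERSON_ID": [234567, 345678, 678901]})

# Call the function once per ID and collect the resulting DataFrames.
frames = [fake_shooting_stats(pid, "2015-16", "Playoffs")
          for pid in clippers_players_df["PERSON_ID"]]

# ignore_index=True renumbers the rows 0..n-1 instead of repeating
# each small frame's own index.
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

Whether to pass ignore_index=True depends on whether the index of each returned DataFrame means anything to you; leave it out to keep the original indexes.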