I have a Pandas data frame with columns that are 'dynamic' (meaning that I don't know what the column names will be until I retrieve the data from the various databases).
The data frame is a single row and looks something like this:
    Make        Date    Red   Blue  Green  Black  Yellow  Pink  Silver
89   BMW  2016-10-28  300.0  240.0    2.0  500.0     1.0   1.0   750.0
Note that '89' is the index label of that particular row in the data frame.
I have the following code:
cars_bar_plot = df_cars.loc[(df_cars.Make == 'BMW') & (df_cars.Date == as_of_date)]
cars_bar_plot = cars_bar_plot.replace(0, value=np.nan)
cars_bar_plot = cars_bar_plot.dropna(axis=1, how='all')
This works fine for creating the single-row data frame shown above, but some of the values are very small (e.g. 1.0 and 2.0) relative to the others, and they distort a horizontal bar chart that I'm creating with Matplotlib. I'd like to get rid of values that are smaller than some minimum threshold (e.g. 3.0).
Any idea how I can do that?
The following line of code helps, but does not fully solve the problem.
cars_bar_plot = cars_bar_plot.loc[:, (cars_bar_plot >= 3.0).any(axis=0)]
The problem is that it eliminates columns I didn't intend to remove. For example, referring to the original data frame, is it possible to modify this code so that it only removes columns with a value less than 3.0 that are to the right of the "Black" column (on the assumption that we actually want to retain the value of 2.0 in the "Green" column)?
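
To make the desired behaviour concrete, here is a minimal sketch of one possible approach, not a definitive solution. It rebuilds the sample row from above and applies the threshold only to the columns positioned after "Black". The names MIN_VALUE, ANCHOR, right, keep and filtered are made up for illustration, and the sketch assumes that every column to the right of the anchor is numeric.

```python
import pandas as pd

# Sample row rebuilt from the question; the index label 89 is kept.
cars_bar_plot = pd.DataFrame(
    {
        "Make": ["BMW"],
        "Date": [pd.Timestamp("2016-10-28")],
        "Red": [300.0],
        "Blue": [240.0],
        "Green": [2.0],
        "Black": [500.0],
        "Yellow": [1.0],
        "Pink": [1.0],
        "Silver": [750.0],
    },
    index=[89],
)

MIN_VALUE = 3.0   # hypothetical name for the minimum threshold
ANCHOR = "Black"  # columns up to and including this one are always kept

# Integer position of the first column after the anchor
# (assumes the column names are unique).
start = cars_bar_plot.columns.get_loc(ANCHOR) + 1

# Only the columns to the right of the anchor are tested against the threshold
# (assumes they are all numeric).
right = cars_bar_plot.iloc[:, start:]
keep_right = right.loc[:, (right >= MIN_VALUE).any(axis=0)].columns

# Keep the left-hand columns untouched, then the surviving right-hand columns.
keep = cars_bar_plot.columns[:start].append(keep_right)
filtered = cars_bar_plot[keep]

print(filtered)
```

With the sample row, this keeps Make, Date, Red, Blue, Green (including its 2.0) and Black, drops Yellow and Pink, and keeps Silver. Because the comparison is restricted to the slice after the anchor, the non-numeric Make and Date columns are never compared against the threshold.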