ShanZhengYang - 3 months ago
Python Question

How to plot a "grouped scatterplot" with non-categorical data?

I am a bit confused about how to use seaborn.stripplot() to plot multiple columns of data points when the data do not have "categorical" labels.

For example, users can plot "grouped" scatterplots as follows, with the tips dataset:

import numpy as np
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt

import seaborn as sns

tips = sns.load_dataset("tips") # internal dataset

print(tips)

   total_bill   tip     sex  smoker  day    time  size
0       16.99  1.01  Female      No  Sun  Dinner     2
1       10.34  1.66    Male      No  Sun  Dinner     3
2       21.01  3.50    Male      No  Sun  Dinner     3
3       23.68  3.31    Male      No  Sun  Dinner     2
4       24.59  3.61  Female      No  Sun  Dinner     4
5       25.29  4.71    Male      No  Sun  Dinner     4
.... ..... ..... .....


The measurements are grouped by the category day, so we can produce a scatterplot as follows:

sns.stripplot(x="day", y="total_bill", data=tips)


[Figure: strip plot of total_bill grouped by day]

Now I would like to reproduce this "grouped scatterplot" format with non-categorical data, where each group is stored in its own column:

df = pd.read_csv("my_data.csv")

df

   total_bill_A  total_bill_B  total_bill_C  total_bill_D
0         16.99         21.01         15.99         14.50
1         10.34         21.66         12.99         16.50
2         21.01         23.50          7.25         17.50
3         23.68         23.31          9.99         12.50
4         24.59         23.61         10.00         15.50
5         25.29         24.71         11.00         19.50
.... ....


The y-axis here is price, and the x-axis should show each of these columns, total_bill_A, total_bill_B, total_bill_C, and total_bill_D, in the same way that the plot above shows Thursday, Friday, Saturday, and Sunday.

How could I plot something like this with seaborn? Is it possible to do this with seaborn.stripplot()?

Answer

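You can melt the DataFrame from wide to long format, so that the original column names become values of one column and the measurements become values of another, and then pass those two column names to stripplot as follows: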

df_strip = pd.melt(df, var_name='total_bill', value_name='price')
sns.stripplot(x="total_bill", y="price", data=df_strip)

[Figure: strip plot of price grouped by total_bill column]
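To make the reshaping step concrete, here is a minimal sketch of what pd.melt produces for this kind of table. It rebuilds only the six rows shown in the question (the real my_data.csv presumably has more), and the helper name plot_strip is a hypothetical wrapper added for illustration, not part of seaborn or pandas.

```python
import pandas as pd

# First six rows of the wide-format table shown in the question
# (assumption: the real CSV has the same four columns).
df = pd.DataFrame({
    "total_bill_A": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29],
    "total_bill_B": [21.01, 21.66, 23.50, 23.31, 23.61, 24.71],
    "total_bill_C": [15.99, 12.99, 7.25, 9.99, 10.00, 11.00],
    "total_bill_D": [14.50, 16.50, 17.50, 12.50, 15.50, 19.50],
})

# Wide -> long: each original column name becomes a label in "total_bill",
# and each cell value lands in "price".
df_strip = pd.melt(df, var_name="total_bill", value_name="price")

print(df_strip.shape)                   # (24, 2): 4 columns x 6 rows
print(df_strip["total_bill"].unique())  # the four original column names
print(df_strip.head())


def plot_strip(long_df):
    """Hypothetical helper: draw the grouped strip plot from the long table.

    seaborn is imported here so the reshaping above runs without it.
    """
    import seaborn as sns
    return sns.stripplot(x="total_bill", y="price", data=long_df)
```

Calling plot_strip(df_strip) should then draw one strip of points per original column. As far as I know, seaborn's categorical plots also accept wide-form data, so sns.stripplot(data=df) may give a similar figure without melting, although the axis labels may then need to be set by hand.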