HonzaB - 1 year ago
Python Question

Plot venn diagram with pandas and matplotlib_venn

I'd like to plot Venn diagrams based on my pandas data frame. I understand that matplotlib_venn accepts sets as input. My dataset contains a client id and two other columns indicating whether the client was in a campaign or not.

df_dataset = pd.read_csv('...path...',delimiter=',',decimal=',')
campaign_a = df_dataset[(df_dataset['CAM_A'] == 1)]
campaign_b = df_dataset[(df_dataset['CAM_B'] == 1)]

set1 = set(campaign_a['CLI_ID'])
set2 = set(campaign_b['CLI_ID'])

venn3([set1, set2], ('Set1', 'Set2'))

However, I get an error:

File "C:\Python27\Lib\site-packages\matplotlib_venn\_venn3.py", line 44, in compute_venn3_areas
areas = np.array(np.abs(diagram_areas), float)

TypeError: bad operand type for abs(): 'set'


Based on lanS's advice, it works now. However, for some reason the circles are drawn apart instead of together, even though the same code works in their documentation.


set1 = set(campaign_a['CLI_ID'])
set2 = set(campaign_b['CLI_ID'])
set3 = set(union['CLI_ID'])

venn3([set1, set2, set3], ('A', 'B', 'union'))


UPDATE 2 - solution

In the end, the simplest approach seems to be to pass only the size of each region rather than the sets themselves. Inspiration here.
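For anyone landing here, this is roughly what passing region sizes could look like for the two-campaign case. It is only a sketch: the helper names `region_sizes` and `plot_campaign_overlap` and the client ids are made up for illustration, and it uses `venn2` (the two-circle function in matplotlib_venn) rather than `venn3`. `venn2` takes the sizes in the order (only A, only B, both).

```python
def region_sizes(set_a, set_b):
    """Return (only A, only B, both), the order venn2 expects for `subsets`."""
    only_a = len(set_a - set_b)   # clients in campaign A but not B
    only_b = len(set_b - set_a)   # clients in campaign B but not A
    both = len(set_a & set_b)     # clients in both campaigns
    return only_a, only_b, both


def plot_campaign_overlap(set_a, set_b, labels=("CAM_A", "CAM_B")):
    """Draw a two-circle diagram from region sizes (hypothetical helper)."""
    # Imported here so region_sizes() can be used without matplotlib installed.
    from matplotlib_venn import venn2
    return venn2(subsets=region_sizes(set_a, set_b), set_labels=labels)


# Made-up client ids standing in for campaign_a['CLI_ID'] and campaign_b['CLI_ID'].
set1 = {101, 102, 103, 104}
set2 = {103, 104, 105}
sizes = region_sizes(set1, set2)
```

With real data you would call `plot_campaign_overlap(set1, set2)` and then `plt.show()` from `matplotlib.pyplot` to display the figure.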

Answer Source

I believe you need to pass 3 sets. Based on the code here, if you pass three subsets then they are transformed into a tuple before being passed to compute_venn3_areas, where np.abs can handle them. The case when you pass only 2 sets looks like an unhandled error.
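If you want to keep passing the sets themselves, one way to avoid the error is to pick the plotting function by the number of sets. The sketch below is not part of matplotlib_venn: `draw_venn` is a hypothetical wrapper, and it assumes that `venn2` and `venn3` accept a list of two or three sets respectively along with the `set_labels` keyword.

```python
def draw_venn(sets, labels):
    """Draw a 2- or 3-circle diagram depending on how many sets are given."""
    if len(sets) != len(labels):
        raise ValueError("need exactly one label per set")
    if len(sets) not in (2, 3):
        raise ValueError("expected 2 or 3 sets, got %d" % len(sets))
    # Imported lazily so the argument checks above run without matplotlib.
    from matplotlib_venn import venn2, venn3
    plot = venn2 if len(sets) == 2 else venn3
    return plot(list(sets), set_labels=tuple(labels))
```

For the code in the question this would be `draw_venn([set1, set2], ('Set1', 'Set2'))`, which ends up calling `venn2` instead of handing two sets to `venn3`.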