New New - 27 days ago 10
Python Question

Comparing two sets of data with Intersection in Python

When I compare two sets, following_id and follower_id, the result is split into individual characters instead of whole IDs.

import re
id1 = '[User(ID=1234567890, ScreenName=RandomNameHere), User(ID=233323490, ScreenName=AnotherRandomName), User(ID=4459284, ScreenName=YetAnotherName)]'
id2 = '[User(ID=1234467890, ScreenName=sdf), User(ID=233323490, ScreenName=AnotherRandomName), User(ID=342, ScreenName=443)]'

following_id = ', '.join( re.findall(r'ID=(\d+)', id1) )
follower_id = ', '.join( re.findall(r'ID=(\d+)', id2) )

a = list(set(following_id).intersection(follower_id))
print(a)


This results in
[' ', ',', '1', '0', '3', '2', '5', '4', '7', '6', '9', '8']


I would like the result to be
['233323490']
which is the only ID that appears in both sets.

The following works for me:

list1 = [1234567890, 233323490, 4459284, 230, 200, 234, 200, 2]
list2 = [1234467890, 233323490, 342, 101, 234]
a = list(set(list1).intersection(list2))
print(a)


With a result of
[233323490, 234]


Does this have to do with the datatype for following_id and follower_id?

Answer

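This is because ', '.join(...) produces a single string, not a list. When set() is given a string, it iterates over it character by character, so you end up intersecting sets of characters: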

following_id = ', '.join( re.findall(r'ID=(\d+)', id1) )
follower_id = ', '.join( re.findall(r'ID=(\d+)', id2) )
print(following_id) # '1234567890, 233323490, 4459284'
print(follower_id) # '1234467890, 233323490, 342'
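To see the effect in isolation, here is a minimal sketch that uses a shortened, made-up pair of IDs (not the data from the question). The names joined and chars are only for illustration:

```python
# Hypothetical short example: set() iterates over its argument,
# and iterating over a string yields one character at a time.
joined = ', '.join(['12', '23'])  # a single string: '12, 23'
chars = set(joined)

# sorted() is used only to make the printed order predictable
print(sorted(chars))  # [' ', ',', '1', '2', '3']
```

The comma and the space from the separator end up in the set as well, which is why they show up in your result.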

You just need to use:

following_id = re.findall(r'ID=(\d+)', id1)
follower_id = re.findall(r'ID=(\d+)', id2)

This works because re.findall already returns a list of matches, so each ID stays a separate string when it is passed to set().
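Putting it together, a complete version could look like the sketch below. It reuses id1 and id2 from the question; the variable name common and the use of sorted() (to get a stable order) are my own additions:

```python
import re

id1 = '[User(ID=1234567890, ScreenName=RandomNameHere), User(ID=233323490, ScreenName=AnotherRandomName), User(ID=4459284, ScreenName=YetAnotherName)]'
id2 = '[User(ID=1234467890, ScreenName=sdf), User(ID=233323490, ScreenName=AnotherRandomName), User(ID=342, ScreenName=443)]'

# Keep the lists returned by re.findall; do not join them into one string
following_id = re.findall(r'ID=(\d+)', id1)
follower_id = re.findall(r'ID=(\d+)', id2)

# Intersect whole ID strings; sorted() gives a list in a stable order
common = sorted(set(following_id).intersection(follower_id))
print(common)  # ['233323490']
```

Note that the IDs are still strings here. If you need integers, as in your second example, convert each match with int() before building the sets.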
