I would like to loop over two equally long sets and determine whether elements in each set are also in an array.
This is a Hackerrank question of which I have already solved. However, I am using Hackerrank to further understanding of Python. I have been learning about list comprehension and whilst I do believe how I am attempting to use it to be considered bad production code I still would like to explore the possibilities of the language syntax for my own knowledge.
This is the code that sets it up:
n, m = map(int, input().split())
arr = list(map(int, input().split()))
A = set(map(int, input().split()))
B = set(map(int, input().split()))
1 5 3
print(sum([1 for a in A if a in arr]) + sum([-1 for b in B if b in arr]))
sum([1 if a in arr else -1 if b in arr for a, b in zip(A, B)])
print(sum(1 if a in arr -1 if b in arr for a, b in zip(A, B)))
First of all, forgo the list comprehension; feed the values directly into
sum() with a generator expression:
sum(1 for a in A if a in arr)
A is a set, use the
set.intersection() method to extract the common values, then take the length of the result:
This is faster than trying to test
arr for each value. This does produce a new
set() object first however, but you were creating a list before so that's not much difference.
At that point it is far simpler just to subtract the second length:
len(A.intersection(arr)) - len(B.intersection(arr))
You can avoid creating sets altogether by looping over
arr and testing each value in
arr against either
B; this too is faster because set membership tests are O(1) constant time:
sum(1 if v in A else -1 if v in B else 0 for v in arr)
Your method, of testing
a in arr for every value in the set
A, requires a full list scan of
arr if the value
a is not present; this makes membership testing against a list a O(N) linear time problem, and you do so for every value in
A and for every value in
B, so you end up with O((A+B) * N) == O(KN) time. Testing each value in
arr against the set is only O(N * 1) == O(N) time.
Moreover, if values in
arr are not unique, your approach would actually lead to to the wrong answer; you'd only count happy or unhappy numbers once, while the problem requires them to be counted each time they appear.