joshuatvernon - 9 months ago 123

Python Question

I would like to loop over two equally long sets and determine whether elements in each set are also in an array.

This is a Hackerrank question of which I have already solved. However, I am using Hackerrank to further understanding of Python. I have been learning about list comprehension and whilst I do believe how I am attempting to use it to be considered bad production code I still would like to explore the possibilities of the language syntax for my own knowledge.

This is the code that sets it up:

`n, m = map(int, input().split())`

arr = list(map(int, input().split()))

A = set(map(int, input().split()))

B = set(map(int, input().split()))

The task is to output an integer with a value of +1 for every element both in A and arr and -1 for every element both in B and arr.

Sample Input:

`3 2`

1 5 3

3 1

5 7

Sample Output:

`1`

This achieves the required results:

`print(sum([1 for a in A if a in arr]) + sum([-1 for b in B if b in arr]))`

However, this is closer to what I would like to achieve:

`sum([1 if a in arr else -1 if b in arr for a, b in zip(A, B)])`

EDIT (this is closer actually):

`print(sum(1 if a in arr -1 if b in arr for a, b in zip(A, B)))`

As you can see both are one-liners so it's not about attempting to reduce the code but rather to just understand the possibilities of list comprehension and pythonic code. If this is not possible or even bad practice I am very interested also.

This is the Hackerrank link:

https://www.hackerrank.com/challenges/no-idea

Answer Source

First of all, forgo the list comprehension; feed the values directly into `sum()`

with a generator expression:

```
sum(1 for a in A if a in arr)
```

If `A`

is a *set*, use the `set.intersection()`

method to extract the common values, then *take the length of the result*:

```
len(A.intersection(arr)))
```

This is faster than trying to test `arr`

for each value. This *does* produce a new `set()`

object first however, but you were creating a list before so that's not much difference.

At that point it is *far simpler* just to subtract the second length:

```
len(A.intersection(arr)) - len(B.intersection(arr))
```

You can avoid creating sets altogether by looping over `arr`

and testing each value in `arr`

against either `A`

or `B`

; this too is faster because set membership tests are O(1) constant time:

```
sum(1 if v in A else -1 if v in B else 0 for v in arr)
```

Your method, of testing `a in arr`

for every value in the set `A`

, requires a full list scan of `arr`

if the value `a`

is not present; this makes membership testing against a list a O(N) linear time problem, and you do so for every value in `A`

*and* for every value in `B`

, so you end up with O((A+B) * N) == O(KN) time. Testing each value in `arr`

against the set is only O(N * 1) == O(N) time.

Moreover, if values in `arr`

*are not unique*, your approach would actually lead to to the wrong answer; you'd only count happy or unhappy numbers *once*, while the problem requires them to be counted each time they appear.