sebap123 - 8 months ago 13

Python Question

I have 3 very long (100K+ elements) vectors of product names. Each vector has a different length. I want to count how many vectors each product appears in. So something like this:

`v1 = ['product1','product2','product3']`

`v2 = ['product3','product1','product5','product7','product10']`

`v3 = ['product1','product10']`

'product1' 3

'product2' 1

'product3' 2

'product5' 1

'product7' 1

'product10' 2

Products may be in any order within a vector, and each product appears only once within a given vector.

I wanted to use a pandas `DataFrame`.

Does anyone have an idea of the best way to do this? I know that I can write a simple brute-force loop, but I'd rather not if I can use something from numpy or pandas.

Answer

You can use `Counter` and `chain` to do this in a few lines:

```
from collections import Counter
from itertools import chain
v1 = ['product1','product2','product3']
v2 = ['product3','product1','product5','product7','product10']
v3 = ['product1','product10']
# chain() walks the three lists in turn without building a combined copy,
# so this is more space-efficient than Counter(v1 + v2 + v3)
c = Counter(chain(v1, v2, v3))
# Counter({'product1': 3, 'product10': 2, 'product3': 2, 'product7': 1, 'product5': 1, 'product2': 1})
c['product10']
# 2
```
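Note that this gives the number of vectors containing each product only because, as the question states, a product never repeats within a single vector. If that guarantee might not hold for your data, a minimal sketch of a more defensive variant is to deduplicate each vector with `set` before counting. The helper name `count_vectors_containing` and the sample data below are made up for illustration:

```python
from collections import Counter
from itertools import chain


def count_vectors_containing(*vectors):
    """Count, for each item, how many of the given vectors contain it."""
    # set(v) collapses repeats within a vector, so each vector adds at most 1
    return Counter(chain.from_iterable(set(v) for v in vectors))


# Hypothetical input: 'product1' is repeated within the first vector
v1 = ['product1', 'product2', 'product1']
v2 = ['product3', 'product1']

counts = count_vectors_containing(v1, v2)
print(counts['product1'])  # 2: counted once per vector, not once per occurrence
print(counts['product4'])  # 0: a Counter returns 0 for keys it has never seen
```

The trade-off is that building a set per vector costs extra memory, so if the uniqueness guarantee really does hold, the plain `Counter(chain(...))` version is likely the leaner choice.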