HighwayJohn HighwayJohn - 2 months ago 9
Python Question

Special order of a list

I have the following problem. It starts with a list which I get. Let's say for example I got the list:

A=['1-00', '10--']
(The list can be longer or even shorter e.g.
B=['01-1', '1-00', '0-01']
). It can also have more entries than four or less. Now I first have to sort for the rank. The rank is defined as the number of digits, that you see.

In A I have one string of rank two
'10--'
and one of rank three
'1-00'
. (In B I have three of rank three) Now I need to get the number of four binary digits numbers which I get with all of the lowest rank strings first.

Here I have only one rank two string
'10--'
I can produce:
'1000', '1001', '1010', '1011'
. So I get 4 binary numbers for rank two. With
'1-00'
I get
'1100'
and
'1000'
. But I already got the second one by the rank two string. So the number I should get is 5 for rank three. That is the number of distinct strings I get with rank two and rank three.

I think its a tough problem but guess there are some commands which I don't know that could help. Hope you can give me some hints :)

Answer

If I understand your question, you want to sort by the number of fixed digits ("rank") first, then compute the total number of unique possible bit strings at or below each rank. So for A, you'd want it sorted to ['10--', '1-00'] (putting lower rank first), and you'd want to get the number of unique patterns for all rank 2 and lower patterns (in this case, there is only one), then for all rank 3 and lower patterns (again, in this case, only one), etc.

If that understanding is correct, the first step is sorting:

A.sort(key=lambda x: x.count('0') + x.count('1'))
# Or if all patterns are of length 4, the slightly simpler:
A.sort(key=lambda x: 4 - x.count('-'))

After that, you process by ranks using itertools.groupby, generating unique outputs with itertools.product and storing them in a set so you remove duplicates:

from __future__ import print_function  # For consistent print on Py2/Py3
from future_builtins import map  # Only on Py2, to get Py3 generator based map

from itertools import groupby, product

uniquebits = set()
for rank, pats in groupby(A, key=lambda x: x.count('0') + x.count('1')):
    for pat in pats:
        # product can be abused to sub 0, then 1 for each wildcarded space
        patpieces = [let if let != '-' else '01' for let in pat]
        # Generate all patterns (using ''.join reduces storage by converting
        # tuples back to str) updating the set to get unique values seen so far
        uniquebits.update(map(''.join, product(*patpieces)))
    print("For rank", rank, "and below,", len(uniquebits), "unique bit strings")

Which for your A would output:

For rank 2 and below, 4 unique bit strings
For rank 3 and below, 5 unique bit strings

If you want to avoid gaps, and output for all ranks out to the maximum length of the patterns, it's slightly more complicated (since groupby only produces groups when there is at least one input with that "key", in this case, at least one input of a given rank):

lastrank = None  # Change to 1 or 2 to force ranks prior to lowest rank to output
for rank, pats in groupby(A, key=lambda x: x.count('0') + x.count('1')):
    if lastrank is not None and lastrank + 1 != rank:
        for fillrank in range(lastrank + 1, rank):
            print("For rank", fillrank, "and below,", len(uniquebits), "unique bit strings")
    lastrank = rank

    for pat in pats:
        # product can be abused to sub 0, then 1 for each wildcarded space
        patpieces = [let if let != '-' else '01' for let in pat]
        # Generate all patterns (using ''.join reduces storage by converting
        # tuples back to str) updating the set to get unique values seen so far
        uniquebits.update(map(''.join, product(*patpieces)))
    print("For rank", rank, "and below,", len(uniquebits), "unique bit strings")

# Or len(max(A, key=len))+1 if uneven lengths
for rank in range(lastrank + 1, len(A[0]) + 1):
    print("For rank", rank, "and below,", len(uniquebits), "unique bit strings")