djtama djtama - 6 months ago 17
Python Question

Generating different combinations of string in python

I am working on a project and stuck at the point I am discussing below.
I have sets of permissions. P0, P1 etc. Permissions can be individual like P0, P1, P2 etc or in groups P1P2, P3P4P5, P1P7P11P34......the permissions are treated as strings and we have them in in sorted order(based on length). Here is one possible sequence of input strings:
P1
P2
P4
P23
P0P1
P3P5
P8P13P45P67
..........
.......

Now my work is to see if each longer string can be formed by some combination of the smaller strings. What I am doing is I am inserting the strings into a trie and checking whether the larger strings can be made using the previous string. If yes, I am doing nothing; otherwise, I put the latest seen string into the trie.

The problem I have is that, as the length of the strings increase, I have to do permutations/combinations of the strings and check whether they are present or not. Now the classic string permutations will not work, because first I have to have all the permutations together (not one by one) to check if they are in the trie. The permutation is order specific so P0P1 is feasible and not P1P0. Also, there are far too many permutations.

I am giving an example of the different combinations to make it more clear. Suppose I have a new permission string like P0P1P2. The different combinations I have to check before declaring that my previous entries in the trie is not suitable to build the string.

Note that the permutation must not include any permissions not in the target (new input) string.

P0 P1 P2
P0P1 P2
P0P1 P1P2
P0P1 P0P2
P0P2 P1
P0P2 P0P1
P0 P1P2


I am stuck at this point and want to know if there is some algorithm which generates such combinations for a very large strings, or I have to drop this idea and go in a different route.

Answer

I misread your problem statement earlier: I thought you had to obtain the permissions in a given order, such that P0P1 would later not be sufficient to cover a line of P1P0. Sorry about the delay.

I suggest an alternative to bit vectors: sets. They're slower, but automatically expandable, and perhaps easier to use. The process for each input line is:

  • Extract the permission numbers into a set.
  • Clear out the master set.
  • For each prior string,
  • If the prior is a subset of the current string (no extra permissions),
  • add it to the master set.
  • Check the master set; if it is a superset of the input line,
  • Report that the new string is covered by the priors;
  • else add the new string to the priors list.

I'm also assuming that you don't need to know which prior input lines are needed to cover the current one.

Code:

prior = []    # list of prior input sets (only those that weren't redundant)

with open("permission.txt") as in_file:
  for in_line in in_file:
    master = set([])
    perm_line = set([int(s) for s in in_line.split('P') if len(s)>0])

    # Check each of the shorter (prior) permission strings
    for short in prior:
      # If the prior string has no superfluous permissions, add it to the master set.
      extra = short - perm_line
      if len(extra) == 0:
        master = master.union(short)

    # Did the combination of prior strings cover all of the input line permissions?
    print in_line,
    if len(perm_line - master) == 0:
      print "\tPermissions included in prior strings"
    else:
      print "\tFound New permission combinations; adding to reference list"
      prior.append(perm_line)

Output: (input file echoed to make that part obvious; I added a few lines)

P1
    Found New permission combinations; adding to reference list
P2
    Found New permission combinations; adding to reference list
P4
    Found New permission combinations; adding to reference list
P23
    Found New permission combinations; adding to reference list
P0P1
    Found New permission combinations; adding to reference list
P3P5
    Found New permission combinations; adding to reference list
P8P13P45P67
    Found New permission combinations; adding to reference list
P0P2
    Found New permission combinations; adding to reference list
P0P1P3P5
    Permissions included in prior strings
P0P1P2
    Permissions included in prior strings
P0P1P2P3
    Found New permission combinations; adding to reference list

UPDATE: Here's how to track the permission sets used to make the new string. Focus on the new objects cover and cover_team. Note that this does not find the minimal coverage, although you'll get close to that if you add new elements to prior at the front, rather than the end. This makes the routine search from longest to shortest.

I've taken the "cheap" way out for reporting, just printing the permission sets. I'll let you worry about the formatting.

prior = [] # list of prior input sets (only those that weren't redundant)

with open("permission.txt") as in_file:
  for in_line in in_file:
    master = set([])
    cover = set([])
    cover_team = []
    perm_line = set([int(s) for s in in_line.split('P') if len(s)>0])

    # Check each of the shorter (prior) permission strings
    for short in prior:
      # If the prior string has no superfluous permissions, add it to the master set.
      extra = short - perm_line
      if len(extra) == 0:
        master = master.union(short)
        # Does this string add anything new to the coverage?
        ### print "compare", short, cover
        if len(short - cover) > 0:
          cover = cover.union(short)
          cover_team.append(short)
          ### print "Add to cover_team:", short

    # Did the combination of prior strings cover all of the input line permissions?
    print in_line,
    if len(perm_line - master) == 0:
      print "\tPermissions included in prior strings", cover_team
    else:
      print "\tFound New permission combinations; adding to reference list"
      prior.append(perm_line)

Output:

P1
    Found New permission combinations; adding to reference list
P2
    Found New permission combinations; adding to reference list
P4
    Found New permission combinations; adding to reference list
P23
    Found New permission combinations; adding to reference list
P0P1
    Found New permission combinations; adding to reference list
P3P5
    Found New permission combinations; adding to reference list
P8P13P45P67
    Found New permission combinations; adding to reference list
P0P2
    Found New permission combinations; adding to reference list
P0P1P3P5
    Permissions included in prior strings [set([1]), set([0, 1]), set([3, 5])]
P0P1P2
    Permissions included in prior strings [set([1]), set([2]), set([0, 1])]
P0P1P2P3
    Found New permission combinations; adding to reference list
P0P1P2P3P4P5
    Permissions included in prior strings [set([1]), set([2]), set([4]), set([0, 1]), set([3, 5])]
Comments