Rodolphe Rodolphe - 1 month ago 25
Python Question

Fast calculation of Pareto front in Python

I have a set of points in a 3D space, from which I need to find the Pareto frontier. Speed of execution is very important here, and time increases very fast as I add points to test.

The set of points looks like this:

[[0.3296170319979843, 0.0, 0.44472108843537406], [0.3296170319979843,0.0, 0.44472108843537406], [0.32920760896951373, 0.0, 0.4440408163265306], [0.32920760896951373, 0.0, 0.4440408163265306], [0.33815192743764166, 0.0, 0.44356462585034007]]


Right now, I'm using this algorithm:

def dominates(row, candidateRow):
return sum([row[x] >= candidateRow[x] for x in range(len(row))]) == len(row)

def simple_cull(inputPoints, dominates):
paretoPoints = set()
candidateRowNr = 0
dominatedPoints = set()
while True:
candidateRow = inputPoints[candidateRowNr]
inputPoints.remove(candidateRow)
rowNr = 0
nonDominated = True
while len(inputPoints) != 0 and rowNr < len(inputPoints):
row = inputPoints[rowNr]
if dominates(candidateRow, row):
# If it is worse on all features remove the row from the array
inputPoints.remove(row)
dominatedPoints.add(tuple(row))
elif dominates(row, candidateRow):
nonDominated = False
dominatedPoints.add(tuple(candidateRow))
rowNr += 1
else:
rowNr += 1

if nonDominated:
# add the non-dominated point to the Pareto frontier
paretoPoints.add(tuple(candidateRow))

if len(inputPoints) == 0:
break
return paretoPoints, dominatedPoints


Found here: http://code.activestate.com/recipes/578287-multidimensional-pareto-front/

What's the fastest way to find the set of non-dominated solutions in an ensemble of solutions? Or, in short, can Python do better than this algorithm?

Answer

I took a shot at rewriting the same algorithm with a couple of tweaks. I think most of your problems come from inputPoints.remove(row). This requires searching through the list of points; removing by index would be much more efficient. I also modified the dominates function to avoid some redundant comparisons. This could be handy in a higher dimension.

def dominates(row, rowCandidate):
    return all(r >= rc for r, rc in zip(row, rowCandidate))

def cull(pts, dominates):
    dominated = []
    cleared = []
    remaining = pts
    while remaining:
        candidate = remaining[0]
        new_remaining = []
        for other in remaining[1:]:
            [new_remaining, dominated][dominates(candidate, other)].append(other)
        if not any(dominates(other, candidate) for other in new_remaining):
            cleared.append(candidate)
        else:
            dominated.append(candidate)
        remaining = new_remaining
    return cleared, dominated
Comments