ARKBAN ARKBAN - 1 month ago 8
C# Question

Sorting algorithm for a non-comparison based sort problem?

I am currently faced with a difficult sorting problem. I have a collection of events that need to be sorted against each other (a comparison sort) and against their relative position in the list.

In the simplest terms I have list of events that each have a priority (integer), a duration (seconds), and an earliest occurrence time that the event can appear in the list. I need to sort the events based on priority, but no event can appear in the list before its earliest occurrence time. Here's an example to (hopefully) make it clearer:

// Psuedo C# code
class Event { int priority; double duration; double earliestTime ; }

void Example()
{
Event a = new Event { priority = 1, duration = 4.0, earliestTime = 0.0 };
Event b = new Event { priority = 2, duration = 5.0, earliestTime = 6.0 };
Event c = new Event { priority = 3, duration = 3.0, earliestTime = 0.0 };
Event d = new Event { priority = 4, duration = 2.0, earliestTime = 0.0 };

// assume list starts at 0.0 seconds
List<Event> results = Sort( new List<Event> { a, b, c, d } );

assert( results[ 0 ] == a ); // 4.0 seconds elapsed
assert( results[ 1 ] == c ); // 7.0 seconds elapsed
assert( results[ 2 ] == b ); // 12.0 seconds elapsed
assert( results[ 3 ] == d ); // 14.0 seconds elapsed
}


Item "b" has to come last because it isn't allowed to start until 6.0 seconds into the list, so it is deferred and "c" gets to go before "b" even though its priority is lower. (Hopefully the above explains my problem, if not let me know and I'll edit it.)

My current idea is to use an insertion sort to manage the sorting process. Unlike many of the other common sorting algorithms, insertion sort decides the order of the list one at a time and in order. So for each index I should be able to find the next lowest priority event whose earliest occurrence time will be satisfied.

I'm hoping to find resources about sorting algorithms and data structures to help me design a good solution for this "sort" of problem. My real problem is actually more complex than this: hierarchical sorting, variable buffers between events, multiple non-constant time constraints, so the more information or ideas the better. Speed and space are not really a concern. Accuracy in sorting and maintainability of the code are a concern.

Edit: Clarifications (based on comments)


  • Events consume their entire duration (that is there is no overlap of events allowed)

  • Events must occur at or after their earliestTime, they cannot occur before their earliestTime.

  • Events can occur later than their earliestTime if lower priority events exist

  • Events cannot be interrupted

  • There is a maximum duration the sum of all events that can fit in a list. This is not shown above. (In reality the duration of all events will be far greater than the time list's maximum duration.)

  • There cannot be any gaps. (There are no holes to try and back fill.)



Edit: Answer

While David Nehme gave the answer I selected, I wanted to point out that his answer is an insertion sorts at heart, and several other people provided insertions sort type answers. This confirms for me that a specialized insertion sort is probably the way to go. Thanks to all of you for your answers.

Answer

This is actually more than a sorting problem. It's a single-machine scheduling problem with release dates. Depending on what you are trying to do, the problem might be NP-Hard. For example, if you are trying to mimimize the weighted-sum of the completion times (the weight being inversely proportional to the priority), the the problem is categorized as

1|ri;pmtn|Σ wiCi

and is NP-hard. There are numerous papers on this topic, but it might be more than what you need.

In your case, you never want a solution with gaps, so what you might just need to do is a simple discrete-event simulation ( O(n log(n)) ) time. You need to store released_jobs as a priority queue.

unreleased_jobs = jobs  // sorted list of jobs, by release date
released_jobs = {}      // priority queue of jobs, by priority
scheduled_jobs = {}     // simple list
while (!unreleased_jobs.empty() || !released_jobs.empty()) {

    while (unreleased_jobs.top().earliestTime  <= t) {
        released_jobs.push(unreleased_jobs.pop())
    }
    if (!released_jobs.empty()) {
       next_job = released_jobs.pop();
       scheduled_jobs.push_back(next_job)
       t = t + next_job.duration
    } else {
       // we have a gap
       t = unreleased_jobs.top().earliestTime
    }
}

One problem is that you might have a low-priority job with a release time just before a short, high-priority job, but it will produce a schedule with the property that there are no gaps (if a schedule with no gaps is possible).