grebos grebos - 1 month ago 6
C Question

Why does my quicksort crash with large, reverse-sorted arrays?

I'm learning C and I tried out a recursive quicksort algorithm. At small input sizes, it works as expected; with random generated arrays it had no problems with all tested sizes (up to 100,000). With an descending array, it somehow breaks (Windows gives me a message, that the program has stopped working) at a certain array size (32,506). Is there any error in my code (for example any wrong memory allocation - I'm not sure if I got this right) or does C have a limit in recursive calls or anything else?

Edit:
I know that my Quicksort implementation is rather naive and that it behaves terribly with this sort of Input, but I didn’t expect it to crash.

I am using GCC with MinGW on the command prompt on Windows 10. I’m not sure how to find out what happens exactly because I’m not really getting any specified error message despite of Windows telling me that my program has stopped working.

#include <stdio.h>
#include <stdlib.h>

int partition(int *a, int lo, int hi) {
int i = lo; int j = hi+1; int v,t;
v = a[lo]; //partition element
while (1) {
while (a[++i] < v) {if (i == hi) break;}
while (v < a[--j]) {if (j == lo) break;}
if (i >= j) break;
t = a[j]; a[j] = a[i]; a[i]= t; //swap
}
t = a[lo]; a[lo] = a[j]; a[j]= t;//swap
return j;
}

void quicksort(int a[], int lo, int hi) {
int j;
if (hi <= lo) return;
j = partition(a, lo, hi);
quicksort(a, lo, j-1);
quicksort(a, j+1, hi);
}

int main() {
int len;
for (len = 32000;len < 40000;len+=100) {
printf("New Arr with len = %d\n",len);
int *arr;
arr = (int*) calloc(len,sizeof(int));
int j;
//create descending Array
for (j = 0; j < len; ++j) {
arr[j] = len-j;
}
printf("start sorting\n");
quicksort(arr,0,len-1);
free(arr);
}
}

Answer

For me, your code fails at much larger sizes (c. 370,000 elements). You are likely running into a platform limit (probably limits to recursion depth due to stack overflow). Without the exact error message, it's hard to be sure, of course.

Your input set is likely a pathological case for your implementation - see What makes for a bad case for quick sort?

You can reduce the recursion depth by a better choice of pivot - a common technique is to take the median of the first, central and last elements. Something like this:

int v0 = a[lo], v1 = a[(lo+hi+1)/2], v2 = a[hi];
/* pivot: median of v0,v1,v2 */
int v = v0 < v1 ? v1 < v2 ? v1 : v0 < v2 ? v2 : v0 : v0 < v2 ? v0 : v1 < v2 ? v2 : v1;

You can also reduce the recursion depth by recursing only for the smaller of the partitions, and using iteration to process the larger one. You may be able to get your compiler's tail-call eliminator to convert the recursion to iteration, but if that doesn't work, you'll need to write it yourself. Something like:

void quicksort(int a[], int lo, int hi) {
    while (lo < hi) {
        int j = partition(a, lo, hi);
        if (j - lo < hi -j) {
            quicksort(a, lo, j-1);
            lo = j+1;
        } else {
            quicksort(a, j+1, hi);
            hi = j-1;
        }
    }
}

With the above changes, I can sort arrays of over a billion elements without crashing (I had to make some performance improvements - see below - and even then, it took 17 seconds).

You may also want to return early when you find a sub-array is already sorted. I'll leave that as an exercise.


P.S. A couple of issues in your main():

You don't test the result of calloc() - and you probably should be using malloc() instead, as you will write every element anyway:

int *arr = malloc(len * sizeof *arr);
if (!arr) return fprintf(stderr, "allocation failed\n"), EXIT_FAILURE;

Full listing

Here's the code I ended up with:

#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

int partition(int *a, int i, int j) {
    int v0 = a[i], v1 = a[(i+j+1)/2], v2 = a[j];
    /* pivot: median of v0,v1,v2 */
    int v = v0 < v1 ? v1 < v2 ? v1 : v0 < v2 ? v2 : v0 : v0 < v2 ? v0 : v1 < v2 ? v2 : v1;
    while (i < j) {
        while (a[i] < v && ++i < j)
            ;
        while (v < a[j] && i < --j)
            ;
        int t = a[j]; a[j] = a[i]; a[i]= t; //swap
    }
    /* i == j; that's where the pivot belongs */
    a[i] = v;
    return j;
}

void quicksort(int a[], int lo, int hi) {
    while (lo < hi) {
        int j = partition(a, lo, hi);
        if (j - lo < hi -j) {
            quicksort(a, lo, j-1);
            lo = j+1;
        } else {
            quicksort(a, j+1, hi);
            hi = j-1;
        }
    }
}

int main()  {
    int len = INT_MAX/2+1;
    printf("New Arr with len = %d\n",len);
    int *arr = malloc(len * sizeof *arr);
    if (!arr) return fprintf(stderr, "allocation failed\n"), EXIT_FAILURE;

    /* populate pessimal array */
    for (int j = 0; j < len; ++j) {
        arr[j] = len-j;
    }

    printf("start sorting\n");
    quicksort(arr, 0, len-1);

    /* test - is it sorted? */
    for (int i = 0;  i+1 < len;  ++i)
        if (arr[i] >= arr[i+1])
            return fprintf(stderr, "not sorted\n"), EXIT_FAILURE;
    free(arr);
}
Comments