Black Dahlia1147 - 2 months ago
C Question

C loop optimization help for final assignment

So for my final assignment in my Computer Systems class, we need to optimize these for loops to be faster than the original. The basic grade is under 7 seconds and the full grade is under 5 seconds on our Linux server. The code I have here takes about 5.6 seconds. I think I may need to use pointers in some way to make it faster, but I'm not really sure. Could anyone offer any tips or options? Thank you so much!

QUICKEDIT: The file must remain 50 lines or less and I am ignoring those commented lines the instructor has included.

#include <stdio.h>
#include <stdlib.h>

// You are only allowed to make changes to this code as specified by the comments in it.

// The code you submit must have these two values.
#define N_TIMES 600000
#define ARRAY_SIZE 10000

int main(void)
{
    double *array = calloc(ARRAY_SIZE, sizeof(double));
    double sum = 0;
    int i;

    // You can add variables between this comment ...
    register double sum1 = 0, sum2 = 0, sum3 = 0, sum4 = 0, sum5 = 0, sum6 = 0, sum7 = 0, sum8 = 0, sum9 = 0;
    register int j;
    // ... and this one.

    printf("CS201 - Asgmt 4 - \n");

    for (i = 0; i < N_TIMES; i++)
    {
        // You can change anything between this comment ...
        for (j = 0; j < ARRAY_SIZE; j += 10)
        {
            sum += array[j];
            sum1 += array[j + 1];
            sum2 += array[j + 2];
            sum3 += array[j + 3];
            sum4 += array[j + 4];
            sum5 += array[j + 5];
            sum6 += array[j + 6];
            sum7 += array[j + 7];
            sum8 += array[j + 8];
            sum9 += array[j + 9];
        }
        // ... and this one. But your inner loop must do the same
        // number of additions as this one does.
    }

    // You can add some final code between this comment ...
    sum += sum1 + sum2 + sum3 + sum4 + sum5 + sum6 + sum7 + sum8 + sum9;
    // ... and this one.

    return 0;
}


You may be on the right track, though you'll need to measure it to be certain (my normal advice to measure rather than guess seems a little superfluous here, since the whole point of the assignment is to measure).

With an optimising compiler you will probably not see much of a difference, since they're pretty clever about that sort of stuff. But since we don't know what optimisation level your code will be compiled at, you may get a substantial improvement.

Using pointers in the inner loop is a simple matter of first adding a pointer variable:

register double *pj;

then changing the loop to:

for (pj = &(array[0]); pj < &(array[ARRAY_SIZE]); pj++) {
        sum += *pj++;
        sum1 += *pj++;
        sum2 += *pj++;
        sum3 += *pj++;
        sum4 += *pj++;
        sum5 += *pj++;
        sum6 += *pj++;
        sum7 += *pj++;
        sum8 += *pj++;
        sum9 += *pj;
}

This keeps the number of additions within the loop the same (assuming you're counting += and ++ as addition operators, of course) but uses pointers rather than array indexes.

With no optimisation [1] on my system, this drops it from 9.868 seconds (CPU time) to 4.84 seconds. Your mileage may vary.

[1] With optimisation level -O3, both are reported as taking 0.001 seconds so, as mentioned, the optimisers are pretty clever. However, given you're seeing 5+ seconds, I'd suggest it wasn't compiled with optimisation on.

As an aside, this is a good reason why it's usually advisable to write your code in a readable manner and let the compiler take care of getting it running faster. While my meager attempts at optimisation roughly doubled the speed, using -O3 made it run some ten thousand times faster :-)