Francesco Di Lauro - 2 months ago
C Question

omp parallel for output order

I made this simple program to test the OpenMP library:

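#include <omp.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

int main()
{
    int i;
    char buffer[10];
    FILE *fp1;

    sprintf(buffer, "out.txt");
    fp1 = fopen(buffer, "a");   /* append to out.txt */
    if (fp1 == NULL)
        return 1;               /* the file could not be opened */
    #pragma omp parallel for
    for (i = 0; i < 16; i++)
        fprintf(fp1, "%d ", i);
    fclose(fp1);
    return 0;
}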


And I obtain the following results:

8 9 14 15 0 1 6 7 4 5 2 3 12 13 10 11

So I tried a simpler program to just put the numbers on the terminal:

#include <omp.h>
#include <stdio.h>

int main()
{
    int i;
    #pragma omp parallel for
    for (i = 0; i < 16; i++)
        printf("%d ", i);
}


When I run it, here's my output:

12 13 4 5 0 1 6 7 2 3 14 15 10 11 8 9

I'd like to understand why the variable i isn't taken in order by my cores. I expected something like: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15.

Edit: Yes I have 8 cores. Thanks all for the answers, now it's clear.

Answer

Because the loop iterations execute in parallel, and nothing guarantees the order in which the threads reach the output.

Understanding how a parallel program handles output is one of the first concepts you should grasp before moving on, since it will help you later when you actually develop something cool.


That means that every thread starts, and whichever one gets hold of the output resource first (here, the shared stdout buffer) prints first.

Notice the output:

8 9 14 15 0 1 6 7 4 5 2 3 12 13 10 11

You can see that the numbers are printed in pairs (4 5, for example).

Do you have 8 cores? If so, you can work out the chunks: 16 numbers divided by 8 cores gives 2 numbers per core, which would explain why the output is printed in pairs.

Notice that by pairs I mean that you can see the sequential behavior in that chunk of numbers. Every thread will print its numbers in order.


The same applies to your simpler program.


Let me put it in other words:

Assuming you have 8 cores, you can imagine that your computer will actually execute 8 programs. Every program will print 16/8 = 2 numbers. The body of every program would actually look like this:

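int i;
int offset = 0;  /* placeholder: OpenMP knows what to put here (0, 2, 4, ..., 14) */
/* no pragma here: each (sub)program runs its own two iterations serially */
for (i = offset; i < offset + 2; i++)
    printf("%d ", i);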

So for the 2nd (sub)program offset will be equal to 2.

Now imagine you have 8 terminals open and you have the above (sub)program compiled into an executable named subProgram, ready to run.

Let's assume you are as fast as a Jolteon and manage to start ./subProgram in every single terminal at once, so that every instance is submitted to the OS for execution at the same time.

Which subprogram will print first?

You cannot tell, since you have 8 subprograms that all want to use one resource (the screen). Whoever arrives first at the finish line, the screen, will print first.

That's an analogy to what happens in parallel execution.


Notice that in real-world programs, when the data are bigger (of course they are! Otherwise why bother with parallelism?), even the output of each (sub)program is not guaranteed to stay together: the chunks can interleave.

That means that the 1st (sub)program may execute its first iteration, but by the time it gets to its second iteration, another (sub)program may have already executed its own first iteration and claimed the resource (the screen).