Rajesh Rajesh - 3 months ago 8
Java Question

How does join() work in java? Does it guarantee the execution before main()?

I am trying to understand the code flow with join().

public class Multi extends Thread {

    public void run() {
        for (int i = 0; i < 5; i++) {
            System.out.println(Thread.currentThread().getName());
        }
    }

    public static void main(String[] args) {

        Thread t1 = new Multi();
        Thread t2 = new Multi();
        Thread t3 = new Multi();
        Thread t4 = new Multi();

        t1.start();
        try {
            t1.join();
        } catch (Exception e) {
        }

        t2.start();
        t3.start();
        try {
            t3.join();
        } catch (Exception e) {
        }
        t4.start();

        System.out.println("........" + Thread.currentThread().getName());

        t1.setName("A");
        t2.setName("B");
        t3.setName("C");
        t4.setName("D");
    }
}


After running the program many times, I observe that the output is always the same: thread t1 executes first and completes its execution without any context switching, and whenever t3 starts, it also runs to completion. Is my understanding correct?

I also observe that if no join() is used, main() executes anywhere between the executions of the threads, meaning I see the ........main output in between the other outputs of my program. With join(), however, it always executes after thread t3. Here is my doubt: main() starts before the join() call, so shouldn't it be independent of the completion of t1/t3? Does that make sense, or am I missing something?

Answer

Short answer

How does join() work in java?

I grant you that the javadoc for join() is a little bit unclear.

It means that calling t.join() makes the calling thread wait for the thread t to finish its execution. The word "this" in the doc refers to t here, not to the thread that calls join().
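To make this concrete, here is a minimal sketch (the class name JoinWaitsDemo and the 100 ms sleep are my own choices for illustration, not something from your code). The main thread calls t.join(), so it is the main thread that waits, and t is the thread being waited for:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class: shows that the CALLER of t.join() is the one that waits.
class JoinWaitsDemo {

    static boolean workerDoneAfterJoin() throws InterruptedException {
        AtomicBoolean done = new AtomicBoolean(false);

        Thread t = new Thread(() -> {
            try {
                Thread.sleep(100); // simulate some work (arbitrary duration)
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            done.set(true);
        }, "t");

        t.start();
        t.join(); // the calling thread blocks here until t has terminated

        // After join() returns, t is no longer alive and its work is visible.
        return done.get() && !t.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("worker finished before join() returned: " + workerDoneAfterJoin());
    }
}
```

If you remove the t.join() line, the method will most likely return false, because main reads the flag while t is still sleeping.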

Does it guarantee the execution before main()?

[...] if no join is used, main() executes anywhere b/w the execution of threads [...]

You shouldn't consider main() as a whole. Parts of main() are executed before the other threads, parts of it in parallel, and parts of it after. That's actually what start() and join() control. Let me explain below.

What happens in your main()

Here is the sequence of events regarding t1.start() and t1.join(). You can obviously think the same way for t3.

  1. The instructions of main() before t1.start() are executed before t1.run().

  2. t1.start() starts the thread t1 in parallel(*) with the main thread
    Note: t1.run() might not start right away.

  3. The instructions of main() between t1.start() and t1.join() are executed in parallel(*) with t1.run().
    Note: You have none in your example, so nothing runs concurrently with t1, which defeats the purpose of multithreading.

  4. t1.join():

    • if t1.run() has already finished, nothing happens and main() keeps going
    • if t1.run() has not finished yet, the main thread blocks and waits until t1.run() finishes, and then main() resumes.

  5. The instructions of main() after t1.join() are guaranteed to be executed after t1.run() has finished.

(*) See the section below about parallelism.
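The five steps above can be sketched as follows. This is only an illustration under my own assumptions: the class StartJoinSequence and the log messages are hypothetical, and I record events in a synchronized list instead of printing so the order can be inspected afterwards.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class mirroring steps 1 to 5 with a single thread t1.
class StartJoinSequence {

    static List<String> run() throws InterruptedException {
        List<String> log = Collections.synchronizedList(new ArrayList<>());

        log.add("main: before start");            // step 1: always before t1.run()

        Thread t1 = new Thread(() -> log.add("t1: run"), "t1");
        t1.start();                               // step 2: t1.run() may not start right away

        log.add("main: between start and join");  // step 3: may land before or after "t1: run"

        t1.join();                                // step 4: wait for t1 if it is still running

        log.add("main: after join");              // step 5: always after t1.run()
        return log;
    }

    public static void main(String[] args) throws InterruptedException {
        run().forEach(System.out::println);
    }
}
```

Only the first and last entries have a fixed position. The two middle entries can swap from one run to the next, which is exactly the part that start() and join() leave to the scheduler.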

What I mean by "executed in parallel"

Suppose you have these 2 sets of instructions being executed in 2 threads A and B:

// Thread A                   |     // Thread B
                              | 
System.out.println("A1");     |     System.out.println("B1");
System.out.println("A2");     |     System.out.println("B2");
System.out.println("A3");     |     System.out.println("B3");

If these 2 threads are "executed in parallel", this means 3 things:

  • the order of execution of the instructions of thread A is guaranteed:
    A1 will execute before A2, and A2 before A3.

  • the order of execution of the instructions of thread B is guaranteed:
    B1 will execute before B2, and B2 before B3.

  • however, A's and B's instructions can be interleaved, which means all of the following are possible (and more):

A1, B1, A2, B2, B3, A3

B1, B2, A1, B3, A2, A3

A1, A2, A3, B1, B2, B3 // special case where A's are all executed before B's

B1, B2, B3, A1, A2, A3 // special case where B's are all executed before A's
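If you want to see these guarantees for yourself, here is a small sketch (InterleavingDemo and its helper inOrder are names I made up for this example). Both threads append to a shared synchronized list, and after joining both we check that each thread's own entries appear in order, whatever the interleaving turned out to be:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class: per-thread order is fixed, cross-thread order is not.
class InterleavingDemo {

    static List<String> interleave() throws InterruptedException {
        List<String> order = Collections.synchronizedList(new ArrayList<>());

        Thread a = new Thread(() -> {
            for (int i = 1; i <= 3; i++) order.add("A" + i);
        }, "A");
        Thread b = new Thread(() -> {
            for (int i = 1; i <= 3; i++) order.add("B" + i);
        }, "B");

        a.start();
        b.start();
        a.join(); // wait for both threads before reading the result
        b.join();
        return order;
    }

    // True if the entries starting with the given prefix appear as prefix1, prefix2, prefix3.
    static boolean inOrder(List<String> order, String prefix) {
        int expected = 1;
        for (String s : order) {
            if (s.startsWith(prefix)) {
                if (!s.equals(prefix + expected)) return false;
                expected++;
            }
        }
        return expected == 4;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> order = interleave();
        System.out.println(order);
        System.out.println("A in order: " + inOrder(order, "A"));
        System.out.println("B in order: " + inOrder(order, "B"));
    }
}
```

With loops this short you will often see one of the two "special cases", because the first thread tends to finish before the second one even gets going. That is a matter of timing, not a guarantee.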


Note: this section treats parallelism as an illusion created by the OS to make the user feel that things run at the same time, when there is actually only one core executing instructions sequentially, jumping from one process/thread to another.

In fact, an A instruction and a B instruction could be executed simultaneously (real parallelism) on 2 separate cores. The 3 bullet points above still stand anyway. As @jameslarge pointed out, we usually model concurrency as a sequence of events, even on multicore machines. This leaves aside the notion of 2 events being simultaneous, which adds complications without bringing anything useful.