bcorso bcorso - 3 months ago 63
Java Question

RxJava Schedulers Use Cases

In RxJava there are 5 different schedulers to choose from:



  1. immediate(): Creates and returns a Scheduler that executes work immediately on the current thread.

  2. trampoline(): Creates and returns a Scheduler that queues work on the current thread to be executed after the current work completes.

  3. newThread(): Creates and returns a Scheduler that creates a new Thread for each unit of work.

  4. computation(): Creates and returns a Scheduler intended for computational work. This can be used for event-loops, processing callbacks and other computational work. Do not perform IO-bound work on this scheduler. Use Schedulers.io() instead.

  5. io(): Creates and returns a Scheduler intended for IO-bound work.
    The implementation is backed by an Executor thread-pool that will grow as needed. This can be used for asynchronously performing blocking IO. Do not perform computational work on this scheduler. Use Schedulers.computation() instead.
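To make the difference between immediate() and trampoline() concrete, here is a rough sketch that uses plain `java.util.concurrent.Executor` rather than RxJava itself. The class names `ImmediateExecutor`, `TrampolineExecutor`, and the `trace` helper are made up for illustration; they only mimic the behavior described above and are not how RxJava implements its schedulers. The sketch is single-threaded and not thread-safe.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.Executor;

public class TrampolineSketch {

    // Hypothetical "immediate" analogue: runs each task right away on the caller's thread.
    static final class ImmediateExecutor implements Executor {
        @Override
        public void execute(Runnable task) {
            task.run();
        }
    }

    // Hypothetical "trampoline" analogue: work scheduled while another task is
    // running is queued and only starts after the current task completes.
    static final class TrampolineExecutor implements Executor {
        private final Deque<Runnable> queue = new ArrayDeque<>();
        private boolean draining = false;

        @Override
        public void execute(Runnable task) {
            queue.add(task);
            if (draining) {
                return; // an outer call is already draining the queue
            }
            draining = true;
            try {
                Runnable next;
                while ((next = queue.poll()) != null) {
                    next.run();
                }
            } finally {
                draining = false;
            }
        }
    }

    // Schedules an outer task that schedules an inner task, and records the order of events.
    static List<String> trace(Executor executor) {
        List<String> log = new ArrayList<>();
        executor.execute(() -> {
            log.add("outer start");
            executor.execute(() -> log.add("inner"));
            log.add("outer end");
        });
        return log;
    }

    public static void main(String[] args) {
        System.out.println(trace(new ImmediateExecutor()));  // [outer start, inner, outer end]
        System.out.println(trace(new TrampolineExecutor())); // [outer start, outer end, inner]
    }
}
```

With the immediate-style executor the nested task interrupts the outer one; with the trampoline-style executor it waits until the outer task has finished.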




Questions:



The first 3 schedulers are pretty self-explanatory; however, I'm a little confused about computation() and io().


  1. What exactly is "IO-bound work"? Is it used for dealing with streams (java.io) and files (java.nio.file)? Is it used for database queries? Is it used for downloading files or accessing REST APIs?

  2. How is computation() different from newThread()? Is it that all computation() calls are on a single (background) thread instead of a new (background) thread each time?

  3. Why is it bad to call computation() when doing IO work?

  4. Why is it bad to call io() when doing computational work?


Answer

Great questions. I think the documentation could do with some more detail.

  1. io() is backed by an unbounded thread pool and is meant for tasks that are not computationally intensive, that is, work that spends most of its time waiting rather than loading the CPU. So yes: interacting with the file system, with databases, or with services on another host are all good examples.
  2. computation() is backed by a bounded thread pool whose size equals the number of available processors, so it is not a single background thread but a fixed set of reused ones. If you schedule CPU-intensive work in parallel across more threads than there are processors (say, using newThread()), you pay for thread creation and for context switching as the threads compete for a processor, which can be a big performance hit.
  3. It's best to reserve computation() for CPU-intensive work; otherwise you won't get good CPU utilization. A task blocked on IO occupies one of the few pool threads while doing no computation.
  4. Calling io() for computational work is bad for the reason discussed in 2. io() is unbounded, so if you schedule a thousand computational tasks on it in parallel, each of the thousand tasks gets its own thread and competes for the CPU, incurring context-switching costs.
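The bounded-versus-unbounded point in 2 and 4 can be seen with plain `java.util.concurrent` pools. This is only an analogy: `Executors.newFixedThreadPool(cores)` stands in for a computation()-style pool and `Executors.newCachedThreadPool()` for an io()-style pool, and the `PoolSizing` class and `distinctThreads` helper are made-up names. It is not RxJava's actual implementation.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolSizing {

    // Runs `tasks` short blocking tasks on `pool` and returns how many
    // distinct threads ended up executing them. Shuts the pool down afterwards.
    static int distinctThreads(ExecutorService pool, int tasks) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(tasks);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                names.add(Thread.currentThread().getName());
                try {
                    Thread.sleep(50); // stand-in for a blocking call
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } finally {
            pool.shutdown();
        }
        return names.size();
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int tasks = cores * 4;

        // Bounded pool: never more threads than processors, extra tasks wait in a queue.
        int bounded = distinctThreads(Executors.newFixedThreadPool(cores), tasks);

        // Unbounded pool: starts a new thread whenever no idle thread is available.
        int unbounded = distinctThreads(Executors.newCachedThreadPool(), tasks);

        System.out.println("tasks=" + tasks + " bounded threads=" + bounded
                + " unbounded threads=" + unbounded);
    }
}
```

The bounded pool uses exactly as many threads as there are processors, while the unbounded pool will typically use close to one thread per task. That is fine when the tasks are mostly waiting, and costly when each of them wants a CPU.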