I am learning how to use the
What Giulio Franco says is true for multithreading vs. multiprocessing in general.
However, Python* has an added issue: There's a Global Interpreter Lock that prevents two threads in the same process from running Python code at the same time. This means that if you have 8 cores, and change your code to use 8 threads, it won't be able to use 800% CPU and run 8x faster; it'll use the same 100% CPU and run at the same speed. (In reality, it'll run a little slower, because there's extra overhead from threading, even if you don't have any shared data, but ignore that for now.)
There are exceptions to this. If your code's heavy computation doesn't actually happen in Python, but in some library with custom C code that does proper GIL handling, like a numpy app, you will get the expected performance benefit from threading. The same is true if the heavy computation is done by some subprocess that you run and wait on.
More importantly, there are cases where this doesn't matter. For example, a network server spends most of its time reading packets off the network, and a GUI app spends most of its time waiting for user events. One reason to use threads in a network server or GUI app is to allow you to do long-running "background tasks" without stopping the main thread from continuing to service network packets or GUI events. And that works just fine with Python threads. (In technical terms, this means Python threads give you concurrency, even though they don't give you core-parallelism.)
But if you're writing a CPU-bound program in pure Python, using more threads is generally not helpful.
Using separate processes has no such problems with the GIL, because each process has its own separate GIL. Of course you still have all the same tradeoffs between threads and processes as in any other languages—it's more difficult and more expensive to share data between processes than between threads, it can be costly to run a huge number of processes or to create and destroy them frequently, etc. But the GIL weighs heavily on the balance toward processes, in a way that isn't true for, say, C or Java. So, you will find yourself using multiprocessing a lot more often in Python than you would in C or Java.
Meanwhile, Python's "batteries included" philosophy brings some good news: It's very easy to write code that can be switched back and forth between threads and processes with a one-liner change.
If you design your code in terms of self-contained "jobs" that don't share anything with other jobs (or the main program) except input and output, you can use the
concurrent.futures library to write your code around a thread pool like this:
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor: executor.submit(job, argument) executor.map(some_function, collection_of_independent_things) # ...
You can even get the results of those jobs and pass them on to further jobs, wait for things in order of execution or in order of completion, etc.; read the section on
Future objects for details.
Now, if it turns out that your program is constantly using 100% CPU, and adding more threads just makes it slower, then you're running into the GIL problem, so you need to switch to processes. All you have to do is change that first line:
with concurrent.futures.ProcessPoolExecutor(max_workers=4) as executor:
The only real caveat is that your jobs' arguments and return values have to be pickleable (and not take too much time or memory to pickle) to be usable cross-process. Usually this isn't a problem, but sometimes it is.
But what if your jobs can't be self-contained? If you can design your code in terms of jobs that pass messages from one to another, it's still pretty easy. You may have to use
multiprocessing.Process instead of relying on pools. And you will have to create
multiprocessing.Queue objects explicitly. (There are plenty of other options—pipes, sockets, files with flocks, … but the point is, you have to do something manually if the automatic magic of an Executor is insufficient.)
But what if you can't even rely on message passing? What if you need two jobs to both mutate the same structure, and see each others' changes? In that case, you will need to do manual synchronization (locks, semaphores, conditions, etc.) and, if you want to use processes, explicit shared-memory objects to boot. This is when multithreading (or multiprocessing) gets difficult. If you can avoid it, great; if you can't, you will need to read more than someone can put into an SO answer.
From a comment, you wanted to know what's different between threads and processes in Python. Really, if you read Giulio Franco's answer and mine and all of our links, that should cover everything… but a summary would definitely be useful, so here goes:
threadingmodule doesn't have some of the features of the
multiprocessingmodule. (You can use
multiprocessing.dummyto get most of the missing API on top of threads, or you can use higher-level modules like
concurrent.futuresand not worry about it.)
* It's not actually Python, the language, that has this issue, but CPython, the "standard" implementation of that language. Some other implementations don't have a GIL, like Jython.
** If you're using the fork start method for multiprocessing—which you can on most non-Windows platforms—each child process gets any resources the parent had when the child was started, which can be another way to pass data to children.