Why Multiprocessing Is Slow
Solution 1:
I think your multiprocessing approach is the problem. Rather than splitting the work across 10 processes and starting them all at once, you're starting one process at a time, and each one does a single unit of work and then exits. Your implementation creates (and then destroys) 50 processes over its lifetime, which adds a lot of overhead.
You're also joining each process immediately after starting it, which means you never actually run multiple processes at once: the join makes the parent wait for the child process to finish before continuing.
Finally, there's got to be a better way to return results than using a queue and getting a single value at a time. If you start every process at once, each with its own batch of work, and have each one return its results to the master process as a list, you can reduce the overhead of using the Queue.
Solution 2:
Your for i in range() loop is waiting for the process to finish when you .join() it. So, basically, you're spawning one process, which consumes the queue and reports the result, and then you're spawning 9 more processes that each check an empty queue.
From the documentation for .join(): "Block the calling thread until the process whose join() method is called terminates or until the optional timeout occurs."
Pools are an easier way to do the same thing. Check this answer for using map_async() with a pool of workers:
Solution 3:
There is a simple rule in multiprocessing: if the cost of splitting the work (creating child tasks) plus joining it back together (collecting results, etc.) exceeds the sequential running time, then your "parallel" version will be slower than the sequential one. That is your case. Try generating one million numbers (keeping your number of processes at 10) and you will see the difference.
Good coding tips from @Sohcahtoa82. Keep them in mind as well.