Skip to content

Commit

Permalink
Rewite LocalExecutor to be simpler, and to shutdown cleanly on Python…
Browse files Browse the repository at this point in the history
… 3.10+ (apache#23944)

Something changed between Python 3.7 and 3.10 meaning that a limited
parallelism LocalExecutor scheduler now doesn't shutdown cleanly on
receiving a signal.

On closer inspection of the limited vs unlimited path it apepars to me that
the code was "over-generalized" and the entire concept of `self.impl` has been
removed hopefully making this code much more direct and easier to understand.

The key things are now:

- When a task needs to be run, we send the message on a mp.SimpleQueue object,
  and increment an internal counter.

  (We use our own counter, not qsize method as that is not portable)

- Inside _check_workers we see if we think there are any outstanding messages,
  and create a worker if there are.

  The reason we do this is the on macOS (where the default mp start method is
  "spawn") a process will be started via `exeucte_async`, but it will take a
  second or two to pull the message of the queue, by which time the scheduler
  will have called `executor.sync()` again, meaning we'd over create workers
  (but never above the limit).

  Avoiding that case is why we keep the internal `_unread_messages`
  counter -- `self.activity_queue.empty()` would return False when the worker
  is booting up.

- We remove the entire use of `multiprocessing.Manager` -- it doesn't seem to
  do anything other than create queue objects but for our use it just adds
  complexity to understanding

- Almost as a side-effect we now only create worker subprocesses on demand,
  instead of pre-launching them.

  We do not currently shut down idle processes, though adding it would be
  quite straight forward if we wanted to in the future

This branch name was "rewrite-local-exexc-concurrentfutures" (sic) as when
originally opened in 2022 for 3.10 that was the plan.

However since then 3.12 has come out and it now starts issuing warnings when
Fork and threads are used, and concurrent.futures uses a thread internally, so
a different approach was used.
  • Loading branch information
ashb authored Nov 20, 2024
1 parent ca971b3 commit 47cdb84
Show file tree
Hide file tree
Showing 2 changed files with 252 additions and 365 deletions.
Loading

0 comments on commit 47cdb84

Please sign in to comment.