Set ThreadPool as default executor #383

kyboi · 2024-11-27T12:51:21Z

Many workflows need to interleave async code with non-async, CPU-intensive blocking code. These workflows cannot be split into separate tasks because they depend on locally stored files. The best solution is therefore to offload the blocking parts to an executor so they do not block the asyncio loop.

If I understand correctly, each worker process starts a ThreadPoolExecutor in which sync tasks are run. Being able to access this thread pool, instead of creating another one, would be ideal. Currently we work around this with a custom receiver that grabs the thread pool instance and stores a reference to it in the application state:

from taskiq.receiver import Receiver


class CustomReceiver(Receiver):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Store the executor in the broker's state for global access
        # This allows us to run CPU-heavy code on the workers
        # without blocking the asyncio loop
        self.broker.state.executor = self.executor

But I believe a much better solution would be to simply set the created threadpool as the default executor for the asyncio loop so it can be used without passing the reference around:

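import asyncio
from concurrent.futures import ThreadPoolExecutor

# In the worker startup code, where the pool is created:
with ThreadPoolExecutor(args.max_threadpool_threads) as pool:
    loop = asyncio.get_event_loop()
    loop.set_default_executor(pool)

# Later, anywhere in task code:
await asyncio.get_running_loop().run_in_executor(None, func)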
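
To make the proposal concrete, here is a minimal standalone sketch using only the standard library, not taskiq itself. The function name `blocking_work`, the pool size, and the `worker-pool` thread name prefix are hypothetical and chosen for illustration. It shows that once a pool is registered with `set_default_executor`, passing `None` to `run_in_executor` dispatches to that pool without any reference being passed around.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor


def blocking_work(n: int) -> tuple[int, str]:
    """Stand-in for CPU-heavy sync code; also reports which thread ran it."""
    total = sum(i * i for i in range(n))
    return total, threading.current_thread().name


async def main() -> tuple[int, str]:
    loop = asyncio.get_running_loop()
    # Hypothetical pool; in a worker this would be the pool it already owns.
    pool = ThreadPoolExecutor(max_workers=2, thread_name_prefix="worker-pool")
    loop.set_default_executor(pool)
    # Passing None selects the loop's default executor, i.e. the pool above.
    return await loop.run_in_executor(None, blocking_work, 1000)


# asyncio.run() shuts down the default executor when the loop closes.
result, thread_name = asyncio.run(main())
print(result, thread_name)
```

`asyncio.to_thread` also submits to the loop's default executor, so task code could use it in the same way.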

Alternatively, or at a minimum, allow us to get the executor instance from the API.

kyboi · 2024-12-12T11:15:08Z

In fact, it doesn't seem like a good idea to have more than one thread in that thread pool if it is only used for genuinely CPU-intensive sync tasks, provided IO tasks are run with asyncio.

As far as I can tell, the default of having many threads in the thread pool predates the widespread use of asyncio. If the threads only run CPU-bound blocking tasks, the GIL makes more than one thread counter-productive; instead, the --workers option (multiprocessing) should be used to match the CPU count.

https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor
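
As a rough illustration of the single-thread setup described above, the following standalone sketch (standard library only, with a hypothetical `cpu_bound` function and `cpu` thread name prefix) registers a one-thread pool as the default executor. Several CPU-bound jobs submitted at once are queued and run one after another on that single thread, while the event loop itself stays free for IO.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor


def cpu_bound(n: int) -> str:
    """Hypothetical CPU-heavy job; returns the name of the thread that ran it."""
    sum(i * i for i in range(n))
    return threading.current_thread().name


async def main() -> set[str]:
    loop = asyncio.get_running_loop()
    # One thread per process: CPU-bound jobs are serialized, not interleaved.
    loop.set_default_executor(
        ThreadPoolExecutor(max_workers=1, thread_name_prefix="cpu")
    )
    jobs = [loop.run_in_executor(None, cpu_bound, 50_000) for _ in range(4)]
    return set(await asyncio.gather(*jobs))


threads_used = asyncio.run(main())
print(threads_used)
```

In this arrangement, parallelism across cores would come from running one such process per CPU rather than from adding threads.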
