
Research cancelation options #235

Open
tomkis opened this issue Mar 3, 2025 · 1 comment
Comments

@tomkis
Collaborator

tomkis commented Mar 3, 2025

We need to figure out what our options are for cancelling running agents.

Let's investigate the typical options users have in:

  • Bee Framework
  • CrewAI
  • AutoGen
  • LangChain

Let's also investigate all the out-of-the-box agents we support now (e.g. gpt-researcher).

Cancellation should be immediate, and no trailing processes or resources should be left behind.

@jezekra1
Contributor

jezekra1 commented Mar 3, 2025

Some comments:

  • Asyncio-native frameworks can be cancelled by cancelling all of their async tasks, provided no background synchronous work is offloaded to threads.
  • Threads cannot be terminated or cancelled in Python without the cooperation of the code running inside them, and we have no control over the threads that libraries spawn.
  • We should not require users to use asyncio; it is still not widely adopted in the research community.
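As a sketch of the first point: for an asyncio-native framework, cancellation amounts to calling `cancel()` on the agent's task, which only takes effect at an `await` point. The `agent_run` coroutine below is a hypothetical stand-in for a framework's agent loop, not any real framework's API:

```python
import asyncio

async def agent_run():
    """Hypothetical stand-in for a framework's agent loop."""
    try:
        while True:
            await asyncio.sleep(0.1)  # awaits are the cancellation points
    except asyncio.CancelledError:
        # cooperative cleanup would go here
        raise

async def run_with_cancel():
    task = asyncio.create_task(agent_run())
    await asyncio.sleep(0.05)  # let the agent start
    task.cancel()              # takes effect at the agent's next await
    try:
        await task
    except asyncio.CancelledError:
        return "cancelled"
    return "finished"

print(asyncio.run(run_with_cancel()))  # -> cancelled
```

Note this only works cleanly when the framework never offloads synchronous work to a thread; otherwise the second bullet applies.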

The most radical option is to use a subprocess for each agent run. The subprocess can easily be killed by force even if it does not cooperate; the tradeoff is higher memory usage and longer loading time, because the entire Python interpreter and all libraries are loaded separately in each subprocess. The official Python documentation marks the fork start method (which would mitigate both of these issues) as unsafe on macOS:

Changed in version 3.8: On macOS, the spawn start method is now the default. The fork start method should be considered unsafe as it can lead to crashes of the subprocess as macOS system libraries may start threads. See bpo-33725.

We can try it nonetheless, as we are out of options. Loading times are notoriously high for some machine learning libraries, and I assume also for agentic frameworks, since they often bring tons of dependencies. That would cause a delay of 1 to 10 seconds for every agent run unless we keep a pool of "ready" processes open (which would consume tons of memory).
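A minimal sketch of the force-kill guarantee, using `subprocess.Popen` rather than `multiprocessing` to sidestep the fork/spawn question entirely; the inline busy loop is a hypothetical stand-in for an uncooperative agent:

```python
import subprocess
import sys
import time

def run_and_kill():
    # Launch a stand-in "agent" in a fresh interpreter; this is where the
    # per-run interpreter/library loading cost would be paid.
    proc = subprocess.Popen(
        [sys.executable, "-c", "import time\nwhile True: time.sleep(1)"]
    )
    time.sleep(0.2)         # let it start
    proc.terminate()        # SIGTERM; escalate with proc.kill() if ignored
    return proc.wait(timeout=5)  # negative value = killed by that signal (POSIX)

print(run_and_kill())
```

The point is that `terminate()`/`kill()` work regardless of what the child is doing, which is exactly what threads and asyncio tasks cannot give us.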

Another valid choice IMO is to just cancel the tasks at the asyncio level and let any background work finish (that's how MCP does it at the moment). I think we can manage to stop the running process eventually, maybe after a tool call finishes, depending on the granularity of the framework's callbacks (CrewAI, for example, has no means of cancellation, and all related issues on their GitHub are closed as "not planned").
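A sketch of that tradeoff: cancelling at the asyncio level returns control to us immediately, but a blocking tool call already offloaded to a thread runs to completion in the background. `blocking_tool_call` below is hypothetical; the `threading.Event` is just there to observe that the thread finished on its own:

```python
import asyncio
import threading
import time

done = threading.Event()

def blocking_tool_call():
    """Hypothetical synchronous tool call we cannot interrupt."""
    time.sleep(0.3)
    done.set()

async def agent_run():
    await asyncio.to_thread(blocking_tool_call)

async def main():
    task = asyncio.create_task(agent_run())
    await asyncio.sleep(0.05)
    task.cancel()  # the await is cancelled right away...
    try:
        await task
    except asyncio.CancelledError:
        pass
    # ...but the worker thread keeps running until it finishes by itself
    done.wait(timeout=2)
    return done.is_set()

print(asyncio.run(main()))  # -> True
```

So "immediate" here means immediate from the caller's perspective; the trailing work still drains in the background, which is precisely the behaviour the issue description rules out.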
