Fix Airflow Scheduler #335

Open
wants to merge 3 commits into
base: lyft-stable-2.3.4

@luisglft commented Oct 21, 2024

By combining PostgreSQL's ROW_NUMBER window function with each DAG's real remaining limit (max_active_tasks - running_tasks), we can query only for the task_instances that can actually be executed, instead of fetching all of them and then filtering out the ones that can't.
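
A minimal sketch of the query shape described above, not the actual diff in this PR. It assumes a simplified, hypothetical schema (`dag` and `task_instance` with only a few columns), counts both running and queued TIs as `running_tasks`, and uses SQLite (3.25+) in place of PostgreSQL so it runs standalone. The `ROW_NUMBER() OVER (PARTITION BY ...)` syntax is the same in both databases.

```python
import sqlite3

# Hypothetical, simplified schema; the real Airflow tables have many more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dag (dag_id TEXT PRIMARY KEY, max_active_tasks INTEGER);
CREATE TABLE task_instance (
    dag_id TEXT, task_id TEXT, state TEXT, priority_weight INTEGER
);
""")
conn.executemany("INSERT INTO dag VALUES (?, ?)", [("big_dag", 2), ("small_dag", 4)])

rows = [("big_dag", "running_0", "running", 10)]
rows += [("big_dag", f"t{i}", "scheduled", 10) for i in range(5)]
rows += [("small_dag", f"s{i}", "scheduled", 1) for i in range(2)]
conn.executemany("INSERT INTO task_instance VALUES (?, ?, ?, ?)", rows)

QUERY = """
WITH active AS (
    -- TIs that already occupy a slot in their DAG (assumed: running or queued)
    SELECT dag_id, COUNT(*) AS running_tasks
    FROM task_instance
    WHERE state IN ('running', 'queued')
    GROUP BY dag_id
),
ranked AS (
    -- Number the scheduled TIs inside each DAG, highest priority first
    SELECT ti.dag_id, ti.task_id, ti.priority_weight,
           ROW_NUMBER() OVER (
               PARTITION BY ti.dag_id
               ORDER BY ti.priority_weight DESC, ti.task_id
           ) AS rn,
           d.max_active_tasks - COALESCE(a.running_tasks, 0) AS dag_limit
    FROM task_instance AS ti
    JOIN dag AS d ON d.dag_id = ti.dag_id
    LEFT JOIN active AS a ON a.dag_id = ti.dag_id
    WHERE ti.state = 'scheduled'
)
-- Keep only as many TIs per DAG as that DAG can still run
SELECT dag_id, task_id
FROM ranked
WHERE rn <= dag_limit
ORDER BY priority_weight DESC, dag_id, task_id
LIMIT :max_tis
"""


def executable_tis(connection, max_tis):
    """Return (dag_id, task_id) pairs that fit within each DAG's remaining limit."""
    return connection.execute(QUERY, {"max_tis": max_tis}).fetchall()


selected = executable_tis(conn, max_tis=4)
print(selected)
```

Here `big_dag` has one free slot (2 - 1), so only one of its five scheduled TIs is returned, and the rest of the batch is left for `small_dag`.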

This PR improves how the Airflow scheduler selects batches of TIs to execute, so we should no longer see this message:
Not executing %s since the number of tasks running or queued from DAG %s is >= to the DAG's max_active_tasks limit of %s.

This issue becomes more visible when we run a DAG that has many tasks (close to the max_tis_per_query param) and a high priority weight. Such a DAG takes all the scheduler slots most of the time, even when only 1 of its tasks can be executed, which leaves many lower-priority tasks stuck in the scheduled state.
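
The starvation scenario above can be illustrated with a small pure-Python simulation. This is a toy model under stated assumptions, not scheduler code: `old_batch` and `new_batch` are hypothetical helpers, TIs are plain `(dag_id, task_id, priority_weight)` tuples, and pools and other limits are ignored.

```python
from collections import Counter


def old_batch(scheduled, running, max_active_tasks, max_tis):
    """Take the top max_tis TIs by priority, then drop those whose DAG is full."""
    batch = sorted(scheduled, key=lambda ti: -ti[2])[:max_tis]
    active = Counter(running)
    picked = []
    for dag_id, task_id, _priority in batch:
        if active[dag_id] >= max_active_tasks[dag_id]:
            continue  # this is where the "Not executing ..." message would be logged
        active[dag_id] += 1
        picked.append((dag_id, task_id))
    return picked


def new_batch(scheduled, running, max_active_tasks, max_tis):
    """Cap each DAG at its remaining limit before filling the batch."""
    active = Counter(running)
    taken = Counter()
    picked = []
    for dag_id, task_id, _priority in sorted(scheduled, key=lambda ti: -ti[2]):
        if len(picked) == max_tis:
            break
        remaining = max_active_tasks[dag_id] - active[dag_id]
        if taken[dag_id] < remaining:
            taken[dag_id] += 1
            picked.append((dag_id, task_id))
    return picked


# A high-priority DAG with as many scheduled TIs as the batch size, but room for 1.
scheduled = [("big_dag", f"b{i}", 10) for i in range(8)]
scheduled += [("small_dag", f"s{i}", 1) for i in range(3)]
limits = {"big_dag": 1, "small_dag": 3}

old = old_batch(scheduled, running=[], max_active_tasks=limits, max_tis=8)
new = new_batch(scheduled, running=[], max_active_tasks=limits, max_tis=8)
print(old)
print(new)
```

With the old approach the whole batch is filled by `big_dag` and only one TI survives the filter, so `small_dag` gets nothing in that loop. With the per-DAG cap applied first, the same batch also picks up all three `small_dag` TIs.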
