Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-30960] Fix a corner case oom error #77

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

bs352
Copy link

@bs352 bs352 commented Nov 3, 2023

I fixed a corner case out-of-memory error when we use a timed flushing interval, where the scheduler thread could repeatly throw a new RuntimeException wrapping the previous one as cause. This could lead to an infinite object reference chain.

Copy link

boring-cyborg bot commented Nov 3, 2023

Thanks for opening this pull request! Please check out our contributing guidelines. (https://flink.apache.org/contributing/how-to-contribute.html)

* catch clause. In that case, we already have
* flushException cached, waiting for the next task
* manager thread's flush call which would throw a new
* FlushingRuntimeException causing job failure.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is another case in this class where there is a flush() and is issues a RuntimeException, should we use the new Exception and catch it there as well; or can this case never loop. There are other calls to flush() - do we think these are not effected?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, the close method calls flush as well. In this case, the scheduler is already shutdown before calling flush, so it will never loop.

It won't hurt if we throw the new FlushingRuntimeException here instead, as it will propagate up to be handled by TaskManager the same way as a regular RuntimeException.

@bs352 bs352 requested a review from davidradl January 4, 2024 15:17
@davidradl
Copy link
Contributor

is this the same as #5

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants