Pipeline stuck for user because TRIP_SEGMENTATION did not mark 'completed' nor 'failed'
#1075
Comments
This can also sometimes happen if the pipeline takes a long time to run and a new instance is launched before the first one is complete. At least on AWS, a new scheduled task should be launched only when the previous one completes, but apparently this does not always hold. I typically use |
I reset the pipeline for this user, and it looked like it failed again during |
We checked, and the pipeline for this user is still running, stuck in trip segmentation.
In parallel, even if another run is launched, maybe because AWS thought that this one was "stuck", that run will fail in the trip segmentation stage, but the
Some potential fixes:
|
The user has a 2-week backlog of draft trips, since May 7. I searched for error messages for the user in CloudWatch with the filter expression
%ERROR.*uuidOfTheUser%
I found this error, which occurs repeatedly (presumably on every pipeline run):
ERROR:139822526199616:Found error curr_state.curr_run_ts = 1715124566.4760673 while processing pipeline for user █████, skipping
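To reproduce this search on log lines that have already been exported, here is a minimal sketch. It assumes the filter expression behaves like an ordinary regular expression (ERROR, then anything, then the user's UUID on the same line). The function name `matching_lines` and the identifiers `abc123` and `def456` are made-up placeholders, and the sample lines are adapted from the messages quoted in this issue.

```python
import re


def matching_lines(log_lines, user_id):
    """Return the lines that contain ERROR followed later by user_id."""
    # re.escape guards against regex metacharacters in the identifier.
    pattern = re.compile(r"ERROR.*" + re.escape(user_id))
    return [line for line in log_lines if pattern.search(line)]


# Placeholder log lines; abc123 and def456 stand in for real UUIDs.
sample = [
    "ERROR:139822526199616:Found error curr_state.curr_run_ts = 1715124566.4760673 "
    "while processing pipeline for user abc123, skipping",
    "INFO:139822526199616:**********UUID abc123: segmenting into trips**********",
    "ERROR:139822526199616:Found error curr_state.curr_run_ts = 1715124566.4760673 "
    "while processing pipeline for user def456, skipping",
]

hits = matching_lines(sample, "abc123")
for line in hits:
    print(line)
```

Only the first sample line is selected: the INFO line mentions the user but has no ERROR prefix, and the last line is an error for a different user.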
The error occurs shortly after
INFO:139822526199616:**********UUID █████: segmenting into trips**********
The first time the error occurred was 2024-05-08T01:49:41.377Z.
My understanding is that there's a curr_run_ts left over from the previous time the TRIP_SEGMENTATION stage ran, which means that run exited improperly without being marked as 'completed' or 'failed'.
I checked the logs of the previous run of the pipeline, but I couldn't find anything out of the ordinary.
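To illustrate the failure mode as I understand it, here is a small in-memory model. It is not the actual pipeline code: `StageState`, `start_stage`, `mark_completed`, `mark_failed` and `run_stage` are hypothetical names, and the only behavior taken from the logs is that a stage is skipped when it finds a curr_run_ts already set.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class StageState:
    """Hypothetical stand-in for one user's state for a single stage."""
    curr_run_ts: Optional[float] = None        # set while a run is in progress
    last_processed_ts: Optional[float] = None  # advanced only on completion


def start_stage(state, now=None):
    """Return True if the stage may run, False if an earlier run never finished."""
    if state.curr_run_ts is not None:
        # Leftover marker: an earlier run was neither completed nor failed.
        return False
    state.curr_run_ts = time.time() if now is None else now
    return True


def mark_completed(state, processed_up_to):
    state.last_processed_ts = processed_up_to
    state.curr_run_ts = None


def mark_failed(state):
    # Clear the marker without advancing last_processed_ts.
    state.curr_run_ts = None


def run_stage(state, work, now=None):
    """Run one stage and report 'skipped', 'failed' or 'completed'."""
    if not start_stage(state, now):
        return "skipped"
    try:
        processed_up_to = work()
    except Exception:
        mark_failed(state)
        return "failed"
    mark_completed(state, processed_up_to)
    return "completed"


# A run starts and the process dies before either marker is written.
stuck = StageState()
start_stage(stuck, now=1715124566.4760673)
later_runs = [run_stage(stuck, lambda: 0.0) for _ in range(3)]
print(later_runs)
```

In this model every later run is skipped until something clears curr_run_ts, which matches the repeated error and the growing backlog. An exception raised inside the stage is handled and does not leave the marker behind, so the stuck state would need an exit that bypasses both markers, such as the process being killed.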