loki.source.docker: positions removed on transient docker error #5797

Closed
FerdinandvHagen opened this issue Nov 16, 2023 · 1 comment · Fixed by #5798
Labels
bug Something isn't working frozen-due-to-age Locked due to a period of inactivity. Please open new issues or PRs if more discussion is needed.

Comments

@FerdinandvHagen
Contributor

FerdinandvHagen commented Nov 16, 2023

What's wrong?

When a transient Docker error occurs, the container's position is removed from the positions file, so the agent re-ingests the Docker logs from the beginning.

According to this comment, a position is removed from the file if its target is missing from the "new targets".

This function is called from here, where the targets are created from the target input. The targets slice is filled inside the same loop that calls syncTargets, so syncTargets sees an incomplete slice on every iteration except the last. The targets missing from that partial slice are deregistered, which resets their entries in the positions file and restarts the entire log ingestion.

I would expect this line to be called after the for loop instead, once the targets slice is fully populated.

Steps to reproduce

It is unclear how to reproduce this. I am also not sure why this was never reported as a bug, since I would expect anybody using the docker source to see the docker log ingestion restart regularly.

System information

No response

Software version

At least since v0.37.1

Configuration

No response

Logs

ts=2023-11-16T15:22:16.382180877Z level=error msg="could not set up a wait request to the Docker client" target=b50f59e857184efafc04d640a74b10ee26a854e311c39a80c0fee6baa40ee251 component=loki.source.docker.default error=""
ts=2023-11-16T15:22:16.382302835Z level=warn msg="could not transfer logs" component=loki.source.docker.default target=docker/b50f59e857184efafc04d640a74b10ee26a854e311c39a80c0fee6baa40ee251 written=0 container=b50f59e857184efafc04d640a74b10ee26a854e311c39a80c0fee6baa40ee251 err="context canceled"
ts=2023-11-16T15:22:16.382410919Z level=info msg="removing entry from positions file" component=loki.source.docker.default path=cursor-b50f59e857184efafc04d640a74b10ee26a854e311c39a80c0fee6baa40ee251 labels="{__address__=\"172.17.0.2\", __meta_docker_container_id=\"b50f59e857184efafc04d640a74b10ee26a854e311c39a80c0fee6baa40ee251\", __meta_filepath=\"/etc/agent/loki.json\"}"
ts=2023-11-16T15:22:16.382527627Z level=info msg="finished node evaluation" controller_id="" node_id=loki.source.docker.default duration=712.333µs
@FerdinandvHagen FerdinandvHagen added the bug Something isn't working label Nov 16, 2023
@MichaelSasser

A reproducible example could be #4403.

@github-actions github-actions bot added the frozen-due-to-age Locked due to a period of inactivity. Please open new issues or PRs if more discussion is needed. label Feb 21, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 21, 2024