
feat (processor/k8sattributes): wait for synced when starting k8sattributes processor. #32622

Merged: 8 commits merged into open-telemetry:main from feat-k8sblock on Nov 17, 2024

Conversation

@h0cheung (Contributor) commented Apr 23, 2024

Description:

When starting the k8sattributes processor, block until an initial synchronization has been completed. This solves #32556
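
For illustration only: blocking at startup like this typically relies on the standard client-go cache-sync helpers. The sketch below shows that general pattern under assumed names and wiring; it is not the PR's actual code.

```go
// A minimal sketch of the standard client-go pattern for blocking until the
// informer caches finish their initial sync. Function name and wiring are
// illustrative assumptions, not this PR's implementation.
package sketch

import (
	"errors"
	"time"

	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

func startAndWaitForSync(client kubernetes.Interface, timeout time.Duration) error {
	factory := informers.NewSharedInformerFactory(client, 10*time.Minute)
	podInformer := factory.Core().V1().Pods().Informer()

	// Long-lived stop channel for the informers themselves; a real
	// processor would close it on shutdown.
	stopCh := make(chan struct{})
	factory.Start(stopCh)

	// Separate channel that only bounds the wait: WaitForCacheSync
	// returns false if this channel closes before every informer synced.
	timeoutCh := make(chan struct{})
	timer := time.AfterFunc(timeout, func() { close(timeoutCh) })
	defer timer.Stop()

	if !cache.WaitForCacheSync(timeoutCh, podInformer.HasSynced) {
		return errors.New("timed out waiting for initial k8s metadata sync")
	}
	return nil
}
```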

Link to tracking Issue:

fix #32556

Testing:

Tested in a cluster with constant, high span traffic; after this PR, no more spans appear without associated k8s metadata.

Documentation:

@github-actions (bot) commented:

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label May 10, 2024
@github-actions github-actions bot removed the Stale label May 17, 2024
@orloffv commented May 30, 2024

@h0cheung Hi there, I was wondering if you have the time to finalize and merge the pull request. Looking forward to your update. Thanks!

@fatsheep9146 (Contributor) left a review comment:

Thanks for your patience!

I have a suggestion about the two new options start_timeout and error_when_timeout.

The reason for this enhancement is to make sure the k8sattributes processor does not start consuming incoming traces or metrics until the initial synchronization has completed; otherwise, some traces/metrics/logs may be left unassociated with k8s metadata.

But I think some users may care more about how quickly the collector starts than about the completeness of the k8s metadata.

So how about we change these two fields to

  • force_wait_for_cache_synced bool, default false
  • wait_for_synced_timeout time.Duration

Users who care more about completeness could set force_wait_for_cache_synced to true; if the sync is still incomplete after wait_for_synced_timeout, the collector should return an error instead of running, since completeness cannot be guaranteed.

@h0cheung @TylerHelmuth
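
For illustration, the proposed semantics could be sketched as below. The option names force_wait_for_cache_synced and wait_for_synced_timeout come from this thread; everything else (the struct, the polling helper) is a hypothetical sketch, not the merged change.

```go
// A hypothetical sketch of the semantics proposed above; only the option
// names are taken from this thread, the rest is an assumption.
package sketch

import (
	"errors"
	"time"
)

type startOptions struct {
	ForceWaitForCacheSynced bool          // corresponds to force_wait_for_cache_synced
	WaitForSyncedTimeout    time.Duration // corresponds to wait_for_synced_timeout
}

// waitForInitialSync polls a synced predicate until it reports true or the
// timeout expires. Only when the force flag is set does a timeout become a
// hard startup error.
func waitForInitialSync(opts startOptions, synced func() bool) error {
	if !opts.ForceWaitForCacheSynced {
		return nil // start immediately; early data may lack k8s metadata
	}
	deadline := time.Now().Add(opts.WaitForSyncedTimeout)
	for time.Now().Before(deadline) {
		if synced() {
			return nil
		}
		time.Sleep(100 * time.Millisecond)
	}
	// Completeness cannot be guaranteed, so refuse to run.
	return errors.New("k8s metadata cache not synced within wait_for_synced_timeout")
}
```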

@h0cheung (Contributor, Author) commented Jun 11, 2024

> Thanks for your patience!
>
> I have a suggestion about the two new options start_timeout and error_when_timeout.
>
> The reason for this enhancement is to make sure the k8sattributes processor does not start consuming incoming traces or metrics until the initial synchronization has completed; otherwise, some traces/metrics/logs may be left unassociated with k8s metadata.
>
> But I think some users may care more about how quickly the collector starts than about the completeness of the k8s metadata.
>
> So how about we change these two fields to
>
>   • force_wait_for_cache_synced bool, default false
>   • wait_for_synced_timeout time.Duration
>
> Users who care more about completeness could set force_wait_for_cache_synced to true; if the sync is still incomplete after wait_for_synced_timeout, the collector should return an error instead of running, since completeness cannot be guaranteed.
>
> @h0cheung @TylerHelmuth

I agree. What are your thoughts? @TylerHelmuth

@orloffv commented Jun 24, 2024

@TylerHelmuth could you take a look, please?

@TylerHelmuth (Member) commented:

@fatsheep9146 I like that idea. I suggest naming them wait_for_metadata and wait_for_metadata_timeout to hide some of the details from the end user. I believe we already use the term metadata to refer to the pod data that is gathered and held.
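
For illustration, these names might map onto the processor's Go config struct roughly as follows; the field names, tags, and comments are assumptions about the eventual implementation, not the merged code.

```go
// A sketch of how the agreed option names might surface in the processor's
// Go config struct; layout and mapstructure tags are assumptions.
package k8sattributesprocessor

import "time"

type Config struct {
	// WaitForMetadata, when true, blocks processor startup until the
	// initial k8s metadata sync has completed.
	WaitForMetadata bool `mapstructure:"wait_for_metadata"`

	// WaitForMetadataTimeout bounds how long startup may block; if the
	// sync is still incomplete when it expires, startup fails.
	WaitForMetadataTimeout time.Duration `mapstructure:"wait_for_metadata_timeout"`
}
```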

@github-actions (bot) commented:

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jul 17, 2024
@h0cheung (Contributor, Author) commented:

> This PR was marked stale due to lack of activity. It will be closed in 14 days.

remove stale

@h0cheung (Contributor, Author) commented:

> @fatsheep9146 I like that idea. I suggest naming them wait_for_metadata and wait_for_metadata_timeout to hide some of the details from the end user. I believe we already use the term metadata to refer to the pod data that is gathered and held.

Thanks. These two configuration items look simple and clear. I think we can use them.

@mx-psi mx-psi removed the Stale label Jul 25, 2024
@valner commented Aug 7, 2024

@h0cheung Hi, could you take some time to finish the code so the pull request can be merged? I'd really appreciate an update on this. Thanks a lot!

@ChrsMark (Member) left a review comment:

One minor wording improvement :).

@h0cheung force-pushed the feat-k8sblock branch 4 times, most recently from 767326c to 73065a6, on November 7, 2024 at 03:08
@ChrsMark (Member) commented Nov 7, 2024

@fatsheep9146 @dmitryax could you take a look as well? Otherwise this should be good to go.

@TylerHelmuth (Member) commented:

@fatsheep9146 @dmitryax I plan to merge by the end of the week; please comment if you have objections.

@fatsheep9146 (Contributor) left a review comment:

LGTM! Thanks

@h0cheung force-pushed the feat-k8sblock branch 3 times, most recently from 26a7ef7 to 6db3f20, on November 15, 2024 at 11:33
@TylerHelmuth TylerHelmuth merged commit 97659b5 into open-telemetry:main Nov 17, 2024
158 checks passed
@github-actions github-actions bot added this to the next release milestone Nov 17, 2024
RutvikS-crest pushed a commit to RutvikS-crest/opentelemetry-collector-contrib that referenced this pull request on Dec 9, 2024:

feat (processor/k8sattributes): wait for synced when starting k8sattributes processor. (open-telemetry#32622)

**Description:**

When starting the `k8sattributes` processor, block until an initial synchronization has been completed. This solves open-telemetry#32556.

**Link to tracking Issue:**

fix open-telemetry#32556

**Testing:**

Tested in a cluster with constant, high span traffic; after this PR, no more spans appear without associated k8s metadata.

**Documentation:**

---------

Co-authored-by: Christos Markou <[email protected]>
sbylica-splunk pushed a commit to sbylica-splunk/opentelemetry-collector-contrib that referenced this pull request on Dec 17, 2024, with the same commit message as above.
Labels
processor/k8sattributes (k8s Attributes processor)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Some data couldn't be associated with k8s metadata when the agent was just started.
7 participants