Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

k8sattributes doesn't work with both pod UID and container name in sources #32104

Closed
paulius-valiunas opened this issue Apr 2, 2024 · 17 comments
Labels
bug Something isn't working closed as inactive help wanted Extra attention is needed processor/k8sattributes k8s Attributes processor Stale

Comments

@paulius-valiunas
Copy link

paulius-valiunas commented Apr 2, 2024

Component(s)

processor/k8sattributes

What happened?

Description

If I provide pod UID, container name, and namespace name, to the k8sattributes processor, it doesn't seem to work.

Steps to Reproduce

  k8sattributes/filelog:
    pod_association:
      - sources:
        - from: resource_attribute
          name: k8s.pod.uid
        - from: resource_attribute
          name: k8s.container.name
        - from: resource_attribute
          name: k8s.namespace.name
    extract:
      metadata:
        - k8s.node.name
        - k8s.pod.uid
        - k8s.container.name
        - k8s.namespace.name

Expected Result

Should add k8s.node.name to the resource

Actual Result

Doesn't add anything. Doesn't log anything useful either. With debug logs enabled:

2024-03-29T16:27:32.474Z	debug	[email protected]/processor.go:108	evaluating pod identifier	{"kind": "processor", "name": "k8sattributes/filelog", "pipeline": "logs", "value": [{"Source":{"From":"resource_attribute","Name":"k8s.pod.uid"},"Value":"814b693a-dbc8-44aa-b7ad-22f919e5569c"},{"Source":{"From":"resource_attribute","Name":"k8s.container.name"},"Value":"pod-619863854748671e213bb333e56981de795eeeb0"},{"Source":{"From":"resource_attribute","Name":"k8s.namespace.name"},"Value":"imodeljs"},{"Source":{"From":"","Name":""},"Value":""}]}

After removing k8s.container.name from sources, the processor started adding the node name to my resources.

Collector version

OpenTelemetry Operator 0.42.3

Environment information

Environment

OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")

OpenTelemetry Collector configuration

No response

Log output

No response

Additional context

No response

@paulius-valiunas paulius-valiunas added bug Something isn't working needs triage New item requiring triage labels Apr 2, 2024
Copy link
Contributor

github-actions bot commented Apr 2, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the processor/k8sattributes k8s Attributes processor label Apr 2, 2024
@swiatekm
Copy link
Contributor

swiatekm commented Apr 2, 2024

I think this is because k8s.container.name is not actually added to pod attributes that association can check. If so, this should either be documented, or ideally should result in an error.

@paulius-valiunas
Copy link
Author

Here's an excerpt from the documentation:

Additional container level attributes can be extracted provided that certain resource attributes are provided:

  1. If the container.id resource attribute is provided, the following additional attributes will be available:
    • k8s.container.name
    • container.image.name
    • container.image.tag
  2. If the k8s.container.name resource attribute is provided, the following additional attributes will be available:
    • container.image.name
    • container.image.tag
  3. If the k8s.container.restart_count resource attribute is provided, it can be used to associate with a particular container
    instance. If it's not set, the latest container instance will be used:
    • container.id (not added by default, has to be specified in metadata)

This led me to believe that k8s.container.name is supported, and that I can add it in addition to the other attributes. Sounds like that's incorrect and the README should be updated...

@TylerHelmuth
Copy link
Member

@paulius-valiunas condition 1. states that k8s.container.name can be extracted as long as the container.id resource attribute is already present. Does your data have the resource attribute container.id?

This feels like a possible duplicate of #31509, do you agree?

@TylerHelmuth
Copy link
Member

If you're trying to associate the telemetry to pod data using k8s.container.name it must be present in the resource attributes already - you cannot use the processor to add k8s.container.name and use it for association at the same time.

@TylerHelmuth TylerHelmuth added discussion needed Community discussion needed and removed needs triage New item requiring triage labels Apr 4, 2024
@swiatekm
Copy link
Contributor

swiatekm commented Apr 5, 2024

@TylerHelmuth I don't think this works either way. If you look at the implementation of Pod cache updates, it goes like so:

  1. Extract Pod attributes (but not container attributes) and put them in the Attributes field.
  2. Extract container data (but not attributes, all the data) and put it in the Containers field.
  3. Determine Pod identity based on association configuration. This can use arbitrary attributes, but only ones in the Attributes field.
  4. Save Pod data under all the determined identities.

Container attributes are not calculated during this process, but instead at runtime based on the Containers field:

kp.addContainerAttributes(resource.Attributes(), pod)
.

@paulius-valiunas
Copy link
Author

@paulius-valiunas condition 1. states that k8s.container.name can be extracted as long as the container.id resource attribute is already present. Does your data have the resource attribute container.id?

This feels like a possible duplicate of #31509, do you agree?

I'm not trying to extract the container name, I already have it. I'm just trying to get the image name and tag.

@paulius-valiunas
Copy link
Author

paulius-valiunas commented Apr 5, 2024

Also, the documentation says:

Only attribute names from metadata should be used for pod_association's resource_attribute, because empty or non-existing values will be ignored.

That's why I added the "already known" fields to both metadata and pod_association. Seems like this sentence is incorrect or in need of clarification, because I later removed everything except node name from metadata and it still worked.

@TylerHelmuth
Copy link
Member

@swiatekm-sumo you're right. We should add to the README that container fields can't be used for association.

Seems like this sentence is incorrect or in need of clarification, because I later removed everything except node name from metadata and it still worked.

Yes I agree, I think that sentence is wrong and could be removed.

I'm not trying to extract the container name, I already have it. I'm just trying to get the image name and tag.

Similar to #31509 (comment), can you check that the container name on the telemetry matches the container name the processor is recording? If they don't match the other container details cannot be extracted.

I also suggest you make the pod_association section look like this:

    pod_association:
    - sources:
      - from: resource_attribute
        name: k8s.pod.ip
    - sources:
      - from: resource_attribute
        name: k8s.pod.uid
    - sources:
      - from: connection

As @swiatekm-sumo pointed out k8s.container.name won't work as an association and k8s.namespace.name isn't unique to a pod.

@paulius-valiunas
Copy link
Author

We should add to the README that container fields can't be used for association.

In that case, you should remove this section too:

  1. If the k8s.container.name resource attribute is provided, the following additional attributes will be available:
    • container.image.name
    • container.image.tag

If I may ask, how could you possibly get container.image.name using this processor?

Similar to #31509 (comment), can you check that the container name on the telemetry matches the container name the processor is recording? If they don't match the other container details cannot be extracted.

Yes, I'm getting the correct container name, everything comes from Kubernetes log files.

I also suggest you make the pod_association section look like this:

    pod_association:
    - sources:
      - from: resource_attribute
        name: k8s.pod.ip
    - sources:
      - from: resource_attribute
        name: k8s.pod.uid
    - sources:
      - from: connection

I'm getting all my logs from a filelog receiver so the connection/IP-related sources won't work.

As @swiatekm-sumo pointed out k8s.container.name won't work as an association and k8s.namespace.name isn't unique to a pod.

You have a point, k8s.namespace.name isn't unique to a pod, it's only there because I was previously using it in combination with k8s.pod.name. Only the combination of both is unique. I later replaced k8s.pod.name with k8s.pod.uid but didn't remove the namespace.

@TylerHelmuth
Copy link
Member

TylerHelmuth commented Apr 10, 2024

@paulius-valiunas while container fields do nothing in the pod_association section, they have meaning in the extract.metadata section.

This section of the README is accurate:

Additional container level attributes can be extracted provided that certain resource attributes are provided:

1. If the `container.id` resource attribute is provided, the following additional attributes will be available:
   - k8s.container.name
   - container.image.name
   - container.image.tag
2. If the `k8s.container.name` resource attribute is provided, the following additional attributes will be available:
   - container.image.name
   - container.image.tag
3. If the `k8s.container.restart_count` resource attribute is provided, it can be used to associate with a particular container
   instance. If it's not set, the latest container instance will be used:
   - container.id (not added by default, has to be specified in `metadata`)

Although I think we should update the word provided to something clearer. The README is trying to convey that the listed attributes, when configured in extract.metadata, can only be extracted when the named resource attribute is present in the telemetry's resource.

In your situation, you''ll be able to associate the pod logs you're collecting via the filelog receiver to the pod metadata via k8s.pod.uid. If you're using the Collector Helm Chart logsCollection preset, the filelogreceiver configuration will extract this resource attribute for you: https://github.com/open-telemetry/opentelemetry-helm-charts/blob/abe4c7dc5a5f384d5f847a0aa7bfeb12dca9ef90/charts/opentelemetry-collector/templates/_config.tpl#L276-L278 (as well as the k8s.container.name.

Your k8sattributesprocessor config can look like this:

  k8sattributes/filelog:
    pod_association:
      - sources:
        - from: resource_attribute
          name: k8s.pod.uid
    extract:
      metadata:
        - "container.image.name"
        - "container.image.tag"

@paulius-valiunas
Copy link
Author

Yes, the documentation needs to be a lot clearer. What exactly does "resource attributes are provided" mean? Do they have to be in the resource but not referenced by pod_association?

In addition, I don't think "[they] do nothing in the pod_association section" is accurate - if you include them, the association doesn't work anymore, so they do something, just not what you'd expect.

Overall, some things might be hard to explain in words, but it would be great to have more examples in the documentation.

@TylerHelmuth TylerHelmuth added help wanted Extra attention is needed and removed discussion needed Community discussion needed labels Apr 11, 2024
Copy link
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Jun 11, 2024
@bacherfl
Copy link
Contributor

@TylerHelmuth if this issue is still relevant I would be happy to look into this

@bacherfl
Copy link
Contributor

I have created a PR for the readme in an attempt to make things a bit clearer and highlight potential limitations for association rules: #33800 please let me know if there is something you think needs further elaboration

TylerHelmuth pushed a commit that referenced this issue Aug 7, 2024
…tributes processor (#33800)

**Description:** This PR attempts to improve the documentation of the
k8sattributes processor. Being relatively new to this component I tried
to make the documentation a bit easier to understand for new users, and
highlight some limitations I found while going through the
implementation.

**Link to tracking Issue:** #32104

**Testing:** Trying out different configurations and association rules
to get a better understanding

**Documentation:** Extended the readme of this component

---------

Signed-off-by: Florian Bacher <[email protected]>
Co-authored-by: Tiffany Hrabusa <[email protected]>
Copy link
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Aug 28, 2024
Copy link
Contributor

This issue has been closed as inactive because it has been stale for 120 days with no activity.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Oct 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working closed as inactive help wanted Extra attention is needed processor/k8sattributes k8s Attributes processor Stale
Projects
None yet
Development

No branches or pull requests

4 participants