Support Sending raw logs via Elasticsearch exporter when format option is set #26647

Closed
raghu999 opened this issue Sep 12, 2023 · 9 comments · Fixed by #29619 or #32171

@raghu999

Component(s)

No response

Is your feature request related to a problem? Please describe.

The exporter currently converts logs into the plog format, which adds attribute prefixes to every field. This breaks existing dashboards and alerts for enterprises that use the Elasticsearch exporter. See the debug logs from the OTel Collector below.

Event received by otel

ObservedTimestamp: 2023-03-29 17:47:46.28812868 +0000 UTC
Timestamp: 2023-03-29 17:47:46.071000064 +0000 UTC
SeverityText:
SeverityNumber: Unspecified(0)
Body: Map({"message":"{\"host\":\"17.47.219.133\",\"user-identifier\":\"shaneIxD\",\"datetime\":\"29/Mar/2023:17:47:46\",\"method\":\"HEAD\",\"request\":\"/do-not-access/needs-work\",\"protocol\":\"HTTP/1.1\",\"status\":\"500\",\"bytes\":28854,\"referer\":\"https://for.us/user/booperbot124\"}","source_type":"demo_logs"})
Attributes:
     -> protocol: Str(HTTP/1.1)
     -> host: Str(17.47.219.133)
     -> datetime: Str(29/Mar/2023:17:47:46)
     -> method: Str(HEAD)
     -> request: Str(/do-not-access/needs-work)
     -> status: Str(500)
     -> bytes: Double(28854)
     -> referer: Str(https://for.us/user/booperbot124)
     -> user-identifier: Str(shaneIxD)
Trace ID:
Span ID:
Flags: 0
	{"kind": "exporter", "data_type": "logs", "name": "logging"}

Elastic view

  "_source": {
   "@timestamp": "2023-03-29T17:46:42.069999872Z",
   "Attributes.bytes": 39389,
   "Attributes.datetime": "29/Mar/2023:17:46:42",
   "Attributes.host": "48.247.3.152",
   "Attributes.method": "PATCH",
   "Attributes.protocol": "HTTP/1.1",
   "Attributes.referer": "https://some.net/controller/setup",
   "Attributes.request": "/observability/metrics/production",
   "Attributes.status": "550",
   "Attributes.user-identifier": "benefritz",
   "Body.message": "{\"host\":\"48.247.3.152\",\"user-identifier\":\"benefritz\",\"datetime\":\"29/Mar/2023:17:46:42\",\"method\":\"PATCH\",\"request\":\"/observability/metrics/production\",\"protocol\":\"HTTP/1.1\",\"status\":\"550\",\"bytes\":39389,\"referer\":\"https://some.net/controller/setup\"}",
   "Body.source_type": "demo_logs",
   "SeverityNumber": 0,
   "TraceFlags": 0
 }
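
To make the renaming concrete, here is a minimal Python sketch of how nested record fields could end up as dotted, prefixed keys like the ones in the document above. It is not the exporter's code (the exporter is written in Go); `flatten` and `to_prefixed_document` are hypothetical names used only for illustration.

```python
from typing import Any, Dict


def flatten(prefix: str, value: Any, out: Dict[str, Any]) -> None:
    """Flatten nested maps into dotted keys under the given prefix."""
    if isinstance(value, dict):
        for key, child in value.items():
            flatten(f"{prefix}.{key}", child, out)
    else:
        out[prefix] = value


def to_prefixed_document(record: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in for the default encoding described in this issue."""
    doc: Dict[str, Any] = {}
    flatten("Attributes", record.get("attributes", {}), doc)
    flatten("Body", record.get("body", ""), doc)
    return doc


record = {
    "attributes": {"method": "PATCH", "status": "550"},
    "body": {"source_type": "demo_logs"},
}
print(to_prefixed_document(record))
```

A dashboard that queries `method` or `status` no longer matches, because those fields now arrive as `Attributes.method` and `Attributes.status`.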

Describe the solution you'd like

According to https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/elasticsearchexporter/model.go#L51, the Elasticsearch exporter automatically converts the data into plog format before sending the event to Elasticsearch. The only way we can handle this today is to add an ingest pipeline in Elasticsearch that strips the `Attributes.` prefix.

We would like a raw format option, like the ones in the Loki and Kafka exporters, so that we can send raw logs to Elasticsearch.

https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/kafkaexporter

The following encodings are valid only for logs.
raw: if the log record body is a byte array, it is sent as is. Otherwise, it is serialized to JSON. Resource and record attributes are discarded.
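
The raw encoding rule quoted above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the Kafka exporter's implementation; the `record` dictionary layout and the `encode_raw` name are assumptions.

```python
import json
from typing import Any, Dict


def encode_raw(record: Dict[str, Any]) -> bytes:
    """Encode only the body; resource and record attributes are ignored."""
    body = record.get("body")
    if isinstance(body, (bytes, bytearray)):
        # Byte-array bodies are passed through unchanged.
        return bytes(body)
    # Any other body type is serialized to JSON.
    return json.dumps(body).encode("utf-8")
```

For example, a record with body `b"abc"` and any attributes encodes to `b"abc"`, while a map body is emitted as a JSON object with the attributes dropped.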

https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/lokiexporter

The following formats are supported:
logfmt: Write logs as logfmt lines.
json: Write logs as JSON objects. It is the default format if no hint is present.
raw: Write the body of the log message as string representation.

Describe alternatives you've considered

None found, other than adding ingest pipelines in Elasticsearch.
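
For reference, the renaming such a pipeline would have to perform looks roughly like the Python sketch below. It only emulates the transformation on a document dictionary and is not an ingest pipeline definition; `strip_attributes_prefix` is a hypothetical helper.

```python
from typing import Any, Dict

PREFIX = "Attributes."


def strip_attributes_prefix(source: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the document with the 'Attributes.' prefix removed from field names."""
    cleaned: Dict[str, Any] = {}
    for key, value in source.items():
        new_key = key[len(PREFIX):] if key.startswith(PREFIX) else key
        # If a stripped name collides with an existing field, the later one wins.
        cleaned[new_key] = value
    return cleaned
```

Applied to the document shown earlier, `Attributes.method` becomes `method`, while fields such as `Body.message` and `SeverityNumber` are left untouched.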

Additional context

No response

@raghu999 added the enhancement and needs triage labels Sep 12, 2023
@github-actions
Contributor

Pinging code owners for exporter/elasticsearch: @JaredTan95. See Adding Labels via Comments if you do not have permissions to add labels yourself.

@JaredTan95
Member

Makes sense. I think we can support it in the ES exporter.

@raghu999
Author

Hi @JaredTan95,

Thank you for your response. It's great to hear that you think we can support this feature in the Elasticsearch exporter. We're eager to see this feature implemented.

Is there any update on the progress of this feature request? If there's anything we can do to help or contribute to its development, please feel free to let us know. We're excited to see this feature added to the project and are willing to assist in any way we can.

Looking forward to your response and the future improvements to the project.

@JaredTan95
Member

> Hi @JaredTan95,
>
> Thank you for your response. It's great to hear that you think we can support this feature in the Elasticsearch exporter. We're eager to see this feature implemented.
>
> Is there any update on the progress of this feature request? If there's anything we can do to help or contribute to its development, please feel free to let us know. We're excited to see this feature added to the project and are willing to assist in any way we can.
>
> Looking forward to your response and the future improvements to the project.

Feel free to contribute if you have time~

@ycombinator
Contributor

Hi @JaredTan95, I'd like to work on this issue unless you're already working on it. Thanks!

@JaredTan95 assigned ycombinator and unassigned JaredTan95 Dec 1, 2023
@JaredTan95
Member

> Hi @JaredTan95, I'd like to work on this issue unless you're already working on it. Thanks!

It's yours~

@ycombinator
Contributor

@raghu999 @JaredTan95 I've implemented this feature, albeit via a more explicit setting name, mapping.omit_attributes_prefix, in #29619. Please let me know what you think. Thanks!

@raghu999
Author

raghu999 commented Dec 2, 2023

This is a cleaner approach than the one I was planning to implement. Thanks @ycombinator

Contributor

github-actions bot commented Feb 1, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions bot added the Stale label Feb 1, 2024
MovieStoreGuy pushed a commit that referenced this issue Feb 6, 2024
…#29619)

**Description:** 

This PR adds a new configuration option, `mapping.mode: raw`, to the
Elasticsearch exporter. When set, the Elasticsearch exporter will not
prefix log or span attributes with `Attributes.` when forming the
Elasticsearch document field names for these fields. Additionally, the
exporter will not prefix span events with `Events.*` when forming
the Elasticsearch document field names for these fields.
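
As a rough illustration of the difference described above, the Python sketch below builds a document with and without the prefix. It does not reproduce the exporter's Go code; `build_document` and its `raw` flag are hypothetical stand-ins for the `mapping.mode: raw` setting, and only log attributes are modeled.

```python
from typing import Any, Dict


def build_document(body: Any, attributes: Dict[str, Any], raw: bool = False) -> Dict[str, Any]:
    """Illustrative only: `raw` is a hypothetical flag standing in for `mapping.mode: raw`."""
    doc: Dict[str, Any] = {"Body": body}
    for key, value in attributes.items():
        # Without raw mode, attribute names are placed under the Attributes prefix;
        # with raw mode, they become top-level field names.
        field = key if raw else f"Attributes.{key}"
        doc[field] = value
    return doc


attrs = {"first_attribute": "one", "second_attribute": "two"}
default_doc = build_document("bar", attrs)
raw_doc = build_document("bar", attrs, raw=True)
```

One caveat this sketch makes visible: with the prefix removed, an attribute whose name matches a built-in field such as `Body` would overwrite it in this toy version.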

**Link to tracking Issue:** Resolves
#26647

**Testing:** 

Besides adding/updating relevant unit tests in this PR, I also tested
the changes in this PR against a local Elasticsearch cluster, using the
following collector configurations:

1. Without the new `mapping.mode: raw` setting.
   ```yaml
   receivers:
     tcplog:
       listen_address: "0.0.0.0:54545"
   
   processors:
     attributes:
       actions:
         - action: insert
           key: first_attribute
           value: one
         - action: insert
           key: second_attribute
           value: two
   
   exporters:
     debug:
       verbosity: detailed
     elasticsearch:
       endpoints: [ "https://localhost:9200" ]
       user: elastic
       password: XXXXXXXX
       logs_index: otel-logs
       tls:
         insecure_skip_verify: true
       flush:
         interval: 1s
   
   service:
     pipelines:
       logs:
         receivers: [tcplog]
         processors: [attributes]
         exporters: [debug,elasticsearch]
   ```

   _Resulting document in Elasticsearch:_
   ```json
   {
     "_index": "otel-logs",
     "_id": "l1E5J4wBD9bb2EmZJuDR",
     "_score": 1,
     "_source": {
       "@timestamp": "1970-01-01T00:00:00.000000000Z",
       "Attributes": {
         "first_attribute": "one",
         "second_attribute": "two"
       },
       "Body": "bar",
       "Scope": {
         "name": "",
         "version": ""
       },
       "SeverityNumber": 0,
       "TraceFlags": 0
     }
   }
   ```

2. With the new `mapping.mode: raw` setting.
   ```yaml
   receivers:
     tcplog:
       listen_address: "0.0.0.0:54545"
   
   processors:
     attributes:
       actions:
         - action: insert
           key: first_attribute
           value: one
         - action: insert
           key: second_attribute
           value: two
   
   exporters:
     debug:
       verbosity: detailed
     elasticsearch:
       endpoints: [ "https://localhost:9200" ]
       user: elastic
       password: XXXXXXXX
       logs_index: otel-logs
       tls:
         insecure_skip_verify: true
       flush:
         interval: 1s
       mapping:
         mode: raw
   
   service:
     pipelines:
       logs:
         receivers: [tcplog]
         processors: [attributes]
         exporters: [debug,elasticsearch]
   ```

   _Resulting document in Elasticsearch:_
   ```json
   {
     "_index": "otel-logs",
     "_id": "jlE4J4wBD9bb2EmZp-Cd",
     "_score": 1,
     "_source": {
       "@timestamp": "1970-01-01T00:00:00.000000000Z",
       "Body": "foo bar baz",
       "Scope": {
         "name": "",
         "version": ""
       },
       "SeverityNumber": 0,
       "TraceFlags": 0,
       "first_attribute": "one",
       "second_attribute": "two"
     }
   }
   ```

**Documentation:** Documented the new configuration option in the
Elasticsearch exporter's `README.md`.

---------

Co-authored-by: Andrzej Stencel
mx-psi pushed a commit that referenced this issue Apr 5, 2024
**Description:** 

This PR proposes adding @ycombinator as a codeowner for the
`elasticsearch` exporter component, being an [employee of
Elastic](https://www.linkedin.com/company/elastic-co/people/?keywords=shaunak)
and also meeting the codeowner
[requirements](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/CONTRIBUTING.md#requirements):

1. [Be a member of the OpenTelemetry
organization.](https://github.com/open-telemetry/community/blob/main/community-membership.md#member)
   * https://github.com/orgs/open-telemetry/people?query=ycombinator
2. (Code Owner Discretion) It is best to have resolved an issue related
to the component, contributed directly to the component, and/or review
component PRs. How much interaction with the component is required
before becoming a Code Owner is up to any existing Code Owners.
   * Resolved #26647 via #29619
   * Reviewed #31553
   * Contributed #31694 as follow up to #31553
   * Reviewed #31848