Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loki Exporter - Adding a feature for loki exporter to encode JSON for log entry #5846

Merged
merged 28 commits into from
Dec 7, 2021

Conversation

gillg
Copy link
Contributor

@gillg gillg commented Oct 20, 2021

Originaly from #3874

Description:

The PR creates a feature for the Loki Exporter that allows you to use a JSON representation of the LogRecord as described by https://developers.google.com/protocol-buffers/docs/proto3#json.

It creates a Loki LogEntry using JSON instead of the current body string.

The format option allows you to choose between json or the body string, the default is the body string.

This feature allows us to handle structured logging the Loki exporter and use structured logging in our logging pipeline.

Link to tracking Issue: This may fix the issue: #2529

Testing: TestConvert - Ensuring the json encoding is providing expected responses.

TestJsonLoadConfig - Ensuring the format config option is set correctly.

TestExporter_convertLogtoJsonEntry - Ensuring the Log Entry is converted to a JSON Entry.

Documentation: Comments added to the functions and README options updated.

@gillg
Copy link
Contributor Author

gillg commented Oct 20, 2021

@crearys I recreated your PR, added some little changes to "finish" a v1, so even if it's probably not perfect it covers probably almost 100% of use cases.
I would prefer merge it than wait for solving all conceptual questions in the other PR.

In the meantime I try to find a maintener for the exporter, and be able to merge !

@gillg
Copy link
Contributor Author

gillg commented Oct 20, 2021

PR in draft, I'm searching for a Loki maintainer on "loki-dev" slack

@crearys
Copy link
Contributor

crearys commented Oct 21, 2021

@gillg Apologies for the slow reply, i've been really busy recently. I totally agree as we are still using a forked version with this feature. It would be good to get this merged once we have a new maintainer.

@jpkrohling jpkrohling self-requested a review October 21, 2021 12:11
@gillg gillg marked this pull request as ready for review October 25, 2021 09:05
@gillg gillg requested a review from a team October 25, 2021 09:05
@gillg
Copy link
Contributor Author

gillg commented Oct 25, 2021

@jpkrohling proposed to adopt the orphan exporter Loki 👏

Copy link
Member

@jpkrohling jpkrohling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of minor items, but looks good to me overall!

exporter/lokiexporter/config_test.go Outdated Show resolved Hide resolved
exporter/lokiexporter/config_test.go Outdated Show resolved Hide resolved
exporter/lokiexporter/README.md Outdated Show resolved Hide resolved

extensions:
health_check:
pprof:
zpages:
memory_ballast:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you actually need this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but was to be consistent with the previous exemple. We can remove it if you think it's better.
Does memory ballast shouldn't always be used when we receive pushed data like logs ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We recommend that people always use things like the batch processor and memory ballast, but I find it confusing to add something not relevant to what is being demonstrated.

exporter/lokiexporter/exporter.go Outdated Show resolved Hide resolved
@@ -195,9 +210,20 @@ func (l *lokiExporter) convertAttributesToLabels(attributes pdata.AttributeMap,
return ls
}

func convertLogToLokiEntry(lr pdata.LogRecord) *logproto.Entry {
func convertLogToLokiEntry(lr pdata.LogRecord, res pdata.Resource) (*logproto.Entry, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function's name is now a bit strange. How about renaming it to convertLogToPlainEntry or convertLogToStringEntry?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, why not convertLogBodyToEntry ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like "Body" needs the reader to have a bigger context than simply "String", for instance.

Copy link
Contributor Author

@gillg gillg Oct 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, I don't understand you you mean, but by the way.... I'm thinking the body can be a complex structure in OTEL, I already had a case here #4955
I was trying to use the syslog receiver, but all structured data are put in the body instead of attributes, and attributes are empty.

As opposite, fluentforward receiver set an empty body (except if it finds an arbitrary key "message" or "log") and put all structured log data in attributes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, sounds good, convertLogBodyToEntry it is!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK !
But does lr.Body().StringVal() will handle a structured body ? (case where you use syslog receiver)
If not we need to serialize it anyway :/
For the JSON encoding case it's implicit, but for this "raw string" output it's not obvious. Does a logfmt serializer exists ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test and see what happens?

@gillg
Copy link
Contributor Author

gillg commented Oct 25, 2021

Last question @jpkrohling.
Maybe we can validate this PR to move forward (once review fixed), but I have some conceptual points from my previous review in the old PR.

With the format "body", the loki log entry will be something like : {"name":"name","body":"Example log","traceid":"01020304000000000000000000000000","spanid":"0506070800000000","severity":"error","attributes":{"attr1":"1","attr2":"2"},"resources":{"attr1":"1","attr2":"2"}} so we have nested structures for attributes and resources.
I didn't tested it, it seems supported by Loki (it flattens everything with json pipe) and Grafana (the rendering seems supported ?).
So in theory something like {...} | json | attributes_attr1="1" is possible, but is it a good practice to push a JSON with nested structures or should we flaten it before sending it ? The final query will be the same, but I was wondering for perfs reasons.

Note : Indexed labels are not related to the log entry, we create the mapping manualy from pure OTEL attributes and resources.

@cyriltovena
Copy link

Last question @jpkrohling. Maybe we can validate this PR to move forward (once review fixed), but I have some conceptual points from my previous review in the old PR.

With the format "body", the loki log entry will be something like : {"name":"name","body":"Example log","traceid":"01020304000000000000000000000000","spanid":"0506070800000000","severity":"error","attributes":{"attr1":"1","attr2":"2"},"resources":{"attr1":"1","attr2":"2"}} so we have nested structures for attributes and resources. I didn't tested it, it seems supported by Loki (it flattens everything with json pipe) and Grafana (the rendering seems supported ?). So in theory something like {...} | json | attributes_attr1="1" is possible, but is it a good practice to push a JSON with nested structures or should we flaten it before sending it ? The final query will be the same, but I was wondering for perfs reasons.

Note : Indexed labels are not related to the log entry, we create the mapping manualy from pure OTEL attributes and resources.

No problem with JSON, I prefer when logs are human readable but that seems to be a valid option to push json if we want to keep metadata inside each line.

May be we should add another format like logfmt too in the future which is more human readable.

@gillg
Copy link
Contributor Author

gillg commented Oct 26, 2021

No problem with JSON, I prefer when logs are human readable but that seems to be a valid option to push json if we want to keep metadata inside each line.

May be we should add another format like logfmt too in the future which is more human readable.

Thanks @cyriltovena !
I aggree JSON is not ideal, but it's not so bad at the end until we have something structured. I use format_line() on dashboards and the result is clear.
Maybe a YAML format could be a compromise between JSON and logfmt... As an exemple, a stacktrace in logfmt seems not more easy to read without carriage returns or indentation

@@ -150,6 +178,8 @@ func (l *lokiExporter) logDataToLoki(ld pdata.Logs) (pr *logproto.PushRequest, n
}
}

l.logger.Error("some logs has been dropped", zap.Error(errs))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be lost here. Shouldn't you wrap it in a conditional?

}
return &logproto.Entry{
Timestamp: time.Unix(0, int64(lr.Timestamp())),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious about what is the behavior when the timestamp is bigger than the upper boundary for int64, given that lr.Timestamp is a uint64, assuming that lr.Timestamp is user input. Not for this PR, as this pattern has been used before in this codebase already.

@github-actions
Copy link
Contributor

github-actions bot commented Nov 4, 2021

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Nov 4, 2021
@jpkrohling
Copy link
Member

@gillg, do you need any help progressing with this PR?

@github-actions github-actions bot removed the Stale label Nov 5, 2021
@gillg
Copy link
Contributor Author

gillg commented Nov 5, 2021

@gillg, do you need any help progressing with this PR?

Hello, yes I have a lack of time and... I'm sure in fact that the actual body serialization is not valid. It could work as a v1 but knowing it doesn't cover use cases where your body is a complex object and not a simple string. 😕

What do you want ? We assume I just fix remaining details or we target the "perfect" version ?

P.S: I can make functional changes, but I'm probably not confortable enought with Go to create many tests for different complex use tests with basic attributes, basic resources, string body, complex body, etc. I don't have any idea how to serialize in a generic way a pdata because it can contains string, AttributeMap, ints, etc.

@jpkrohling
Copy link
Member

I aggree JSON is not ideal, but it's not so bad at the end until we have something structured. I use format_line() on dashboards and the result is clear.

I think this is the key: you are adding a feature that solves a problem for you, and it seems reasonable to me. Let's get this first version done, and iterate from there.

About further testing: create an issue to track it and assign it to me. I can take a look and contribute some tests, tagging you to review my work. How does that sound?

@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Nov 16, 2021
@github-actions
Copy link
Contributor

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Nov 23, 2021
@gillg
Copy link
Contributor Author

gillg commented Dec 4, 2021

Hello !
Sorry I was completely unavailable these last weeks...
I just made a new commit to fix some remaining things, @jpkrohling can you reopen the PR please ? 😅

@gillg
Copy link
Contributor Author

gillg commented Dec 4, 2021

By the way have you some informations about this otep ? open-telemetry/oteps#188
It seems abandoned, but I don't really understand why.
Force some consistency seems not a bad idea... As exemple the syslog receiver is putting all metadata in a body map, but there is no processor able to work with body attributes, and probably 80% of exporters will don't handle them.

@mx-psi mx-psi reopened this Dec 6, 2021
Copy link
Member

@jpkrohling jpkrohling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good! Just a question, but I think it's ready to be merged.

exporter/lokiexporter/exporter.go Outdated Show resolved Hide resolved
@jpkrohling
Copy link
Member

@cyriltovena, would you like to take a final look?

Copy link
Member

@jpkrohling jpkrohling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I'll wait until tomorrow afternoon (UTC) to see if @cyriltovena wants to give a final review, otherwise I'll merge this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants