Add decoder for Docker Fluentd [wip] #1666

carlanton · 2015-08-06T10:46:39Z

Hi!

The upcoming Docker 1.8 release adds a logging driver for Fluentd, making it possible to send container logs over TCP encoded in the Fluentd format. This PR adds support for decoding those messages in the Heka sandbox. Heka can already pull logs directly from Docker with the DockerLogInput plugin, but doing it this way has some advantages:

It doesn't require direct access to the Docker endpoint (better from a security perspective)
Easier to run Heka and Docker on different hosts
Ability to use different logging settings for different containers

Here is some more info about the logging driver: https://github.com/docker/docker/blob/master/docs/reference/logging/fluentd.md

Example usage

First you need to upgrade to the latest release candidate of Docker:

curl -sSL https://test.docker.com/ | sh

or just download https://test.docker.com/builds/Linux/x86_64/docker-1.8.0-rc2

Then start Heka with the following config:

[FluentdInput]
type = "TcpInput"
address = ":24224"
splitter = "MessagePackSplitter"
decoder = "DockerFluentdDecoder"

[MessagePackSplitter]
# No config

[DockerFluentdDecoder]
type = "SandboxDecoder"
filename = "lua_decoders/docker_fluentd.lua"

    [DockerFluentdDecoder.config]
    type = "docker-fluentd"

[RstEncoder]

[LogOutput]
message_matcher = "TRUE"
encoder = "RstEncoder"

Start a hello world container:

docker run \
    --log-driver fluentd \
    --log-opt fluentd-address=YOUR_HEKA_HOST:24224 \
    -d busybox sh -c 'while true; do echo HELLO WORLD; sleep 2; done'

The log output should look something like this:

:Timestamp: 2015-08-03 22:37:21 +0000 UTC
:Type: docker-fluentd
:Hostname: 192.168.59.103:60095
:Pid: 0
:Uuid: a5ab07fd-f86c-4eed-a768-83e815503dfe
:Logger: stdout
:Payload: HELLO WORLD
:EnvVersion: 
:Severity: 7
:Fields:
    | name:"container_name" type:string value:"/high_mccarthy"
    | name:"tag" type:string value:"docker.b1991665e77b"
    | name:"container_id" type:string value:"b1991665e77be13bc799ac412c207bdadcf0b778521e90830ad5fce7a9011c85"

Implementation / Questions

The plugin seems to work pretty well, but I had some problems understanding Heka's internals:

The data that the logging driver sends is encoded in MessagePack ("It's like JSON, but fast and small"). Since I thought it made sense to reuse the TcpInput plugin, I've added a MessagePackSplitter plugin to find the boundaries of the received messages. Since the data is "raw messagepack", should the splitter use use_message_bytes = true? Is it unsafe to store those bytes in the Payload field?
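To illustrate what "finding the boundaries" involves, here is a minimal sketch. It is not the PR's MessagePackSplitter (which is written in Go); it is a standalone Python example, and the names `skip_object`, `split_stream` and `Incomplete` are made up for it. It walks the MessagePack type bytes without decoding any values, assumes records arrive back to back with no extra framing, and does not handle ext types.

```python
# Payload size in bytes for fixed-size scalar types, keyed by type byte.
_FIXED = {0xc0: 0, 0xc2: 0, 0xc3: 0, 0xca: 4, 0xcb: 8,
          0xcc: 1, 0xcd: 2, 0xce: 4, 0xcf: 8,
          0xd0: 1, 0xd1: 2, 0xd2: 4, 0xd3: 8}
# Width of the length prefix for bin8/16/32 and str8/16/32.
_LEN_PREFIXED = {0xc4: 1, 0xc5: 2, 0xc6: 4, 0xd9: 1, 0xda: 2, 0xdb: 4}
# array16/32 and map16/32: (count-prefix width, objects per entry).
_CONTAINERS = {0xdc: (2, 1), 0xdd: (4, 1), 0xde: (2, 2), 0xdf: (4, 2)}


class Incomplete(Exception):
    """Raised when the buffer ends before the current object does."""


def _read_uint(buf, pos, width):
    if pos + width > len(buf):
        raise Incomplete
    return int.from_bytes(buf[pos:pos + width], "big")


def skip_object(buf, pos=0):
    """Return the offset just past the MessagePack object starting at pos."""
    pending = 1  # number of objects that still have to be skipped
    while pending:
        if pos >= len(buf):
            raise Incomplete
        b = buf[pos]
        pos += 1
        pending -= 1
        if b <= 0x7f or b >= 0xe0:      # positive / negative fixint
            pass
        elif 0x80 <= b <= 0x8f:         # fixmap: one key and one value per entry
            pending += 2 * (b & 0x0f)
        elif 0x90 <= b <= 0x9f:         # fixarray
            pending += b & 0x0f
        elif 0xa0 <= b <= 0xbf:         # fixstr
            pos += b & 0x1f
        elif b in _FIXED:
            pos += _FIXED[b]
        elif b in _LEN_PREFIXED:
            width = _LEN_PREFIXED[b]
            pos += width + _read_uint(buf, pos, width)
        elif b in _CONTAINERS:
            width, per_entry = _CONTAINERS[b]
            pending += per_entry * _read_uint(buf, pos, width)
            pos += width
        else:
            raise ValueError("unsupported type byte 0x%02x" % b)
        if pos > len(buf):
            raise Incomplete
    return pos


def split_stream(buf):
    """Split buf into complete records; return (records, leftover_bytes)."""
    records, start = [], 0
    while start < len(buf):
        try:
            end = skip_object(buf, start)
        except Incomplete:
            break  # keep the partial record until more bytes arrive
        records.append(buf[start:end])
        start = end
    return records, buf[start:]
```

A caller would append each TCP read to the leftover bytes from the previous call and pass the result to `split_stream` again, so a record that straddles two reads is emitted only once it is complete.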

I presumed that you often want to use an additional decoder to parse the actual log line. Therefore, the decoder is implemented in Lua using lua-MessagePack, with the intent that it would be easier to chain decoders/modules without leaving the sandbox. I haven't done any benchmarks yet, so maybe it's better to do everything in Go and then use the MultiDecoder instead... Doing it in Go would also not require two different MessagePack libs. What do you think?

rafrombrc · 2015-08-19T20:20:09Z

pipeline/splitters.go

+
+func (m *MessagePackSplitter) ConfigStruct() interface{} {
+	return &MessagePackSplitterConfig{
+		UseMsgBytes: false,


The UseMsgBytes config option is automatically supported by Heka for every splitter, and you're not changing the default, so I don't think there's any reason for you to have a UseMsgBytes option. Which in turn means you don't even need a custom config struct.

salekseev · 2015-09-17T12:32:45Z

@carlanton this looks great. My only concern is that this is written specifically for Docker (docker_fluentd.lua), while most Fluentd messages share the same layout (1: string tag, 2: long? time, 3: object record; from https://github.com/fluent/fluentd/blob/master/lib/fluent/plugin/in_forward.rb). It would be great to have a generic Fluentd format decoder, as other projects use this format as well. For example https://github.com/saltstack/salt/blob/develop/salt/log/handlers/fluent_mod.py.
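As a rough illustration of that three-element layout, the sketch below decodes one [tag, time, record] entry and maps it onto Heka-style message fields. It is neither the PR's Lua decoder nor the Go version; the function names are hypothetical, the MessagePack reader covers only a subset of types (no ext types, and it expects a complete record), and the record keys `source` and `log` are an assumption inferred from the sample output earlier in this PR (Logger: stdout, Payload: HELLO WORLD).

```python
import struct

# struct formats for fixed-size numeric types, keyed by type byte.
_NUMS = {0xca: ">f", 0xcb: ">d", 0xcc: ">B", 0xcd: ">H", 0xce: ">I",
         0xcf: ">Q", 0xd0: ">b", 0xd1: ">h", 0xd2: ">i", 0xd3: ">q"}
# Types with an explicit size prefix: (prefix width, kind).
_SIZED = {0xc4: (1, "bin"), 0xc5: (2, "bin"), 0xc6: (4, "bin"),
          0xd9: (1, "str"), 0xda: (2, "str"), 0xdb: (4, "str"),
          0xdc: (2, "array"), 0xdd: (4, "array"),
          0xde: (2, "map"), 0xdf: (4, "map")}


def _body(buf, pos, kind, n):
    """Decode the n-sized body of a str, bin, array or map."""
    if kind in ("str", "bin"):
        raw = bytes(buf[pos:pos + n])
        return (raw.decode("utf-8") if kind == "str" else raw), pos + n
    if kind == "array":
        items = []
        for _ in range(n):
            item, pos = unpack(buf, pos)
            items.append(item)
        return items, pos
    result = {}
    for _ in range(n):
        key, pos = unpack(buf, pos)
        value, pos = unpack(buf, pos)
        result[key] = value
    return result, pos


def unpack(buf, pos=0):
    """Decode one MessagePack object at pos; return (value, next_pos)."""
    b = buf[pos]
    pos += 1
    if b <= 0x7f:
        return b, pos                       # positive fixint
    if b >= 0xe0:
        return b - 0x100, pos               # negative fixint
    if 0xa0 <= b <= 0xbf:
        return _body(buf, pos, "str", b & 0x1f)
    if 0x90 <= b <= 0x9f:
        return _body(buf, pos, "array", b & 0x0f)
    if 0x80 <= b <= 0x8f:
        return _body(buf, pos, "map", b & 0x0f)
    if b == 0xc0:
        return None, pos
    if b in (0xc2, 0xc3):
        return b == 0xc3, pos
    if b in _NUMS:
        fmt = _NUMS[b]
        return struct.unpack_from(fmt, buf, pos)[0], pos + struct.calcsize(fmt)
    if b in _SIZED:
        width, kind = _SIZED[b]
        n = int.from_bytes(buf[pos:pos + width], "big")
        return _body(buf, pos + width, kind, n)
    raise ValueError("unsupported type byte 0x%02x" % b)


def to_heka_fields(entry, msg_type="docker-fluentd"):
    """Map a decoded [tag, time, record] entry onto Heka-style fields."""
    tag, timestamp, record = entry
    fields = dict(record)                   # copy, so the input is not mutated
    logger = fields.pop("source", "")       # assumed key, e.g. "stdout"
    payload = fields.pop("log", "")         # assumed key holding the log line
    fields["tag"] = tag
    return {
        "Type": msg_type,
        "Timestamp": timestamp * 10**9,     # seconds to nanoseconds
        "Logger": logger,
        "Payload": payload,
        "Fields": fields,
    }
```

Only the two assumed keys are Docker specific here. A generic decoder could leave the whole record in Fields and let a later decoder decide what becomes the Payload.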

carlanton · 2015-09-17T17:21:31Z

@salekseev I haven't had time to finish all of this, but I did benchmark this version against another version written in Go: carlanton@930b2d8
The Go version was much faster (~5-6x), is not Docker specific, and uses https://github.com/fluent/fluent-logger-golang/ to decode messages.
One limitation is that it requires the "record" object to be a map[string]string. That's not a problem for the format that Docker uses, but maybe we need to be more flexible for the general case?
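One possible way to cope with non-string values, sketched here in Python rather than Go, is to coerce every value to a string before it reaches code that expects a string-to-string map. This is only an illustration of the idea, not something taken from carlanton@930b2d8; `stringify_record` is a hypothetical helper, and rendering nested values as JSON text is an arbitrary choice (it would fail on nested byte strings).

```python
import json


def stringify_record(record):
    """Return a copy of record whose keys and values are all strings."""
    flat = {}
    for key, value in record.items():
        if isinstance(value, bytes):
            # Raw bytes: decode leniently instead of failing on bad UTF-8.
            value = value.decode("utf-8", errors="replace")
        elif not isinstance(value, str):
            # Numbers, booleans, None, lists and maps become JSON text.
            value = json.dumps(value, sort_keys=True)
        flat[str(key)] = value
    return flat
```

The trade-off is that type information is lost: a downstream consumer can no longer tell the number 3 from the string "3".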

trink · 2015-09-17T17:57:25Z

Any reason you aren't using the native lib https://github.com/kengonakajima/lua-msgpack-native? The performance would be much better. Also, I would rather see an external dependency than have to maintain a copy-and-pasted version of the much slower pure-Lua msgpack implementation.

carlanton · 2015-09-17T19:15:41Z

@trink Hmm, I don't think so. Maybe that's the best option :-) Would that require a PR to mozilla-services/lua_sandbox?

trink · 2015-09-17T19:29:07Z

mozilla-services/lua_sandbox#97

No, it would just be an external dependency that the user would install, since it is not sandbox specific.

salekseev · 2015-09-18T12:40:04Z

@carlanton I'm loving the carlanton@930b2d8 implementation. But a generic Fluentd/MessagePack decoder in Lua would also be very useful: people could, for example, put a broker (like Kafka) between fluentd and hekad, so that the two wouldn't talk to each other directly. Great work. 👍

carlanton added 2 commits August 6, 2015 10:47

Add MessagePack splitter

50da42d

Add Docker Fluentd decoder

93d063f

salekseev mentioned this pull request Oct 24, 2015

Add Fluent protocol decoder and encoder #1775

Closed
