This repository has been archived by the owner on Apr 2, 2024. It is now read-only.

Add decoder for Docker Fluentd [wip] #1666

Open: carlanton wants to merge 2 commits into base: dev

carlanton
Contributor

Hi!

The upcoming Docker 1.8 release adds a logging driver for Fluentd, which makes it possible to send container logs over TCP encoded in the Fluentd format. This PR adds support for decoding those messages in the Heka sandbox. Heka can already pull logs directly from Docker with the DockerLogInput plugin, but doing it this way has some advantages:

  • It doesn't require direct access to the Docker endpoint (better from a security perspective)
  • Easier to run Heka and Docker on different hosts
  • Ability to use different logging settings for different containers

Here is some more info about the logging driver: https://github.com/docker/docker/blob/master/docs/reference/logging/fluentd.md

Example usage

First, upgrade to the latest release candidate of Docker, either by running curl -sSL https://test.docker.com/ | sh or by downloading https://test.docker.com/builds/Linux/x86_64/docker-1.8.0-rc2 directly.

Then start Heka with the following config:

[FluentdInput]
type = "TcpInput"
address = ":24224"
splitter = "MessagePackSplitter"
decoder = "DockerFluentdDecoder"

[MessagePackSplitter]
# No config

[DockerFluentdDecoder]
type = "SandboxDecoder"
filename = "lua_decoders/docker_fluentd.lua"

    [DockerFluentdDecoder.config]
    type = "docker-fluentd"

[RstEncoder]

[LogOutput]
message_matcher = "TRUE"
encoder = "RstEncoder"

Start a hello world container:

docker run \
    --log-driver fluentd \
    --log-opt fluentd-address=YOUR_HEKA_HOST:24224 \
    -d busybox sh -c 'while true; do echo HELLO WORLD; sleep 2; done'

The log output should look something like this:

:Timestamp: 2015-08-03 22:37:21 +0000 UTC
:Type: docker-fluentd
:Hostname: 192.168.59.103:60095
:Pid: 0
:Uuid: a5ab07fd-f86c-4eed-a768-83e815503dfe
:Logger: stdout
:Payload: HELLO WORLD
:EnvVersion: 
:Severity: 7
:Fields:
    | name:"container_name" type:string value:"/high_mccarthy"
    | name:"tag" type:string value:"docker.b1991665e77b"
    | name:"container_id" type:string value:"b1991665e77be13bc799ac412c207bdadcf0b778521e90830ad5fce7a9011c85"

Implementation / Questions

The plugin seems to work pretty well, but I had some problems understanding Heka's internals:

The logging driver encodes its data in MessagePack (it's like JSON, but fast and small). Since I thought it made sense to reuse the TcpInput plugin, I've added a MessagePackSplitter plugin to find the boundaries of the received messages. Since the data is "raw MessagePack", should the splitter use use_message_bytes = true? Is it unsafe to store those bytes in the Payload field?
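To make the boundary-finding idea concrete, here is a minimal sketch in Go of how a splitter could measure the first complete MessagePack object in a TCP buffer. It is not the code from this PR: objectLen, sampleMessage and errReserved are hypothetical names, the record keys in the sample are assumed, and the sketch relies only on the type tags defined by the MessagePack format.

```go
package main

import (
	"errors"
	"fmt"
)

var errReserved = errors.New("0xc1 is a reserved MessagePack type byte")

// objectLen returns the size in bytes of the first complete MessagePack
// object in buf, or 0 if buf does not yet hold a complete object.
func objectLen(buf []byte) (int, error) {
	pos, pending := 0, 1 // pending counts objects still to be skipped
	for pending > 0 {
		if pos >= len(buf) {
			return 0, nil // need more data
		}
		b := buf[pos]
		pos++
		pending--
		// width: size of a length field; kids: children per counted item;
		// extra: fixed bytes that follow the length field (ext type byte).
		width, kids, extra := 0, 0, 0
		switch {
		case b <= 0x7f || b >= 0xe0: // positive and negative fixint
		case b <= 0x8f: // fixmap: one key and one value per entry
			pending += 2 * int(b&0x0f)
		case b <= 0x9f: // fixarray
			pending += int(b & 0x0f)
		case b <= 0xbf: // fixstr
			pos += int(b & 0x1f)
		case b == 0xc0 || b == 0xc2 || b == 0xc3: // nil, false, true
		case b == 0xc1:
			return 0, errReserved
		case b <= 0xc6: // bin8, bin16, bin32
			width = 1 << (b - 0xc4)
		case b <= 0xc9: // ext8, ext16, ext32
			width, extra = 1<<(b-0xc7), 1
		case b <= 0xcb: // float32, float64
			pos += 4 << (b - 0xca)
		case b <= 0xcf: // uint8 .. uint64
			pos += 1 << (b - 0xcc)
		case b <= 0xd3: // int8 .. int64
			pos += 1 << (b - 0xd0)
		case b <= 0xd8: // fixext1 .. fixext16: type byte plus data
			pos += 1 + 1<<(b-0xd4)
		case b <= 0xdb: // str8, str16, str32
			width = 1 << (b - 0xd9)
		case b <= 0xdd: // array16, array32
			width, kids = 2<<(b-0xdc), 1
		default: // map16, map32
			width, kids = 2<<(b-0xde), 2
		}
		if width > 0 {
			if pos+width > len(buf) {
				return 0, nil
			}
			n := 0
			for _, c := range buf[pos : pos+width] {
				n = n<<8 | int(c) // big-endian length or count
			}
			pos += width
			if kids > 0 {
				pending += kids * n
			} else {
				pos += extra + n
			}
		}
	}
	if pos > len(buf) {
		return 0, nil // last payload is still incomplete
	}
	return pos, nil
}

// sampleMessage builds a hand-encoded [tag, time, record] array.
// The record keys ("source", "log") are assumptions for illustration.
func sampleMessage() []byte {
	msg := []byte{0x93, 0xb3}
	msg = append(msg, "docker.b1991665e77b"...)
	msg = append(msg, 0xce, 0x55, 0xbf, 0xed, 0x21)
	msg = append(msg, 0x82, 0xa6)
	msg = append(msg, "source"...)
	msg = append(msg, 0xa6)
	msg = append(msg, "stdout"...)
	msg = append(msg, 0xa3)
	msg = append(msg, "log"...)
	msg = append(msg, 0xab)
	return append(msg, "HELLO WORLD"...)
}

func main() {
	n, err := objectLen(sampleMessage())
	fmt.Println(n, err)
}
```

A splitter built this way would hand the first n bytes on as one record and keep the rest of the buffer for the next call. The sketch assumes a 64-bit int and does not cap nesting or lengths, which a real plugin reading untrusted input would need to do.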

I presumed that you often want to use an additional decoder to parse the actual log line. Therefore, the decoder is implemented in Lua using lua-MessagePack, with the intent that it would be easier to chain decoders/modules without leaving the sandbox. I haven't done any benchmarks yet, so maybe it's better to do everything in Go and then use the MultiDecoder instead... Doing it in Go would also not require two different MessagePack libs. What do you think?


func (m *MessagePackSplitter) ConfigStruct() interface{} {
	return &MessagePackSplitterConfig{
		UseMsgBytes: false,

The UseMsgBytes config option is automatically supported by Heka for every splitter, and you're not changing the default, so I don't think there's any reason for you to have a UseMsgBytes option. Which in turn means you don't even need a custom config struct.

@salekseev

@carlanton this looks great. My only concern is that this is written specifically for Docker (docker_fluentd.lua), while most Fluentd messages have the same structure (1: string tag, 2: long? time, 3: object record; see https://github.com/fluent/fluentd/blob/master/lib/fluent/plugin/in_forward.rb). It would be great to have a generic Fluentd format decoder, since other projects use this format as well, for example https://github.com/saltstack/salt/blob/develop/salt/log/handlers/fluent_mod.py.
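As an illustration of the three-element structure described above, here is a minimal sketch in Go that decodes one hand-built [tag, time, record] message. It is not taken from this PR or from any Fluentd library: Entry, reader, decodeEntry and sampleEntry are hypothetical names, the record keys are assumed, and the sketch only handles short strings (fixstr), a 32-bit timestamp and a small string-to-string map.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Entry is a hypothetical holder for one event: [tag, time, record].
type Entry struct {
	Tag    string
	Time   int64
	Record map[string]string
}

var errFormat = errors.New("truncated or unsupported MessagePack data")

// reader walks a buffer; it knows only the types this sketch needs.
type reader struct {
	buf []byte
	pos int
}

// take returns the next n bytes, or an error if the buffer is too short.
func (r *reader) take(n int) ([]byte, error) {
	if r.pos+n > len(r.buf) {
		return nil, errFormat
	}
	b := r.buf[r.pos : r.pos+n]
	r.pos += n
	return b, nil
}

// str reads a fixstr (up to 31 bytes, length in the low 5 bits).
func (r *reader) str() (string, error) {
	h, err := r.take(1)
	if err != nil || h[0]&0xe0 != 0xa0 {
		return "", errFormat
	}
	b, err := r.take(int(h[0] & 0x1f))
	return string(b), err
}

// decodeEntry parses a fixarray of three elements: tag, time, record.
func decodeEntry(buf []byte) (Entry, error) {
	r := &reader{buf: buf}
	e := Entry{Record: map[string]string{}}
	h, err := r.take(1)
	if err != nil || h[0] != 0x93 {
		return e, errFormat
	}
	if e.Tag, err = r.str(); err != nil {
		return e, err
	}
	// Accept uint32 (0xce) or int32 (0xd2); assumes a non-negative time.
	t, err := r.take(5)
	if err != nil || (t[0] != 0xce && t[0] != 0xd2) {
		return e, errFormat
	}
	e.Time = int64(binary.BigEndian.Uint32(t[1:]))
	// The record must be a fixmap (at most 15 pairs) in this sketch.
	m, err := r.take(1)
	if err != nil || m[0]&0xf0 != 0x80 {
		return e, errFormat
	}
	for i := 0; i < int(m[0]&0x0f); i++ {
		k, err := r.str()
		if err != nil {
			return e, err
		}
		if e.Record[k], err = r.str(); err != nil {
			return e, err
		}
	}
	return e, nil
}

// sampleEntry builds a hand-encoded message; the keys are assumptions.
func sampleEntry() []byte {
	msg := []byte{0x93, 0xb3}
	msg = append(msg, "docker.b1991665e77b"...)
	msg = append(msg, 0xce, 0x55, 0xbf, 0xed, 0x21)
	msg = append(msg, 0x82, 0xa6)
	msg = append(msg, "source"...)
	msg = append(msg, 0xa6)
	msg = append(msg, "stdout"...)
	msg = append(msg, 0xa3)
	msg = append(msg, "log"...)
	msg = append(msg, 0xab)
	return append(msg, "HELLO WORLD"...)
}

func main() {
	e, err := decodeEntry(sampleEntry())
	fmt.Println(e.Tag, e.Time, e.Record, err)
}
```

Nothing in the decoding step is Docker-specific: only the interpretation of the record keys would differ between senders, which is the case for a generic decoder.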

@carlanton
Contributor Author

@salekseev I haven't had time to finish all of this, but I did some benchmarks between this version and another version in Go: carlanton@930b2d8
The Go version was much faster (~5-6x), is not Docker-specific, and uses https://github.com/fluent/fluent-logger-golang/ to decode messages.
One limitation is that it requires the "record" object to be a map[string]string. It's not a problem for the format that Docker uses, but maybe we need to be more flexible for the general case?
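One possible way to relax the map[string]string restriction, sketched here under assumptions rather than taken from the linked commit: decode the record into a map[string]interface{} and flatten it into string fields afterwards. flattenRecord and the sample values are hypothetical, and a real plugin might prefer to keep numeric and boolean types instead of stringifying them.

```go
package main

import (
	"fmt"
)

// flattenRecord copies rec into out as string fields. Nested maps are
// flattened with dotted keys; other values are formatted with fmt.Sprint.
func flattenRecord(prefix string, rec map[string]interface{}, out map[string]string) {
	for k, v := range rec {
		key := k
		if prefix != "" {
			key = prefix + "." + k
		}
		switch t := v.(type) {
		case map[string]interface{}:
			flattenRecord(key, t, out)
		case string:
			out[key] = t
		case []byte:
			out[key] = string(t)
		default:
			out[key] = fmt.Sprint(t)
		}
	}
}

func main() {
	// Hypothetical record with mixed value types.
	rec := map[string]interface{}{
		"log":     "HELLO WORLD",
		"retries": int64(3),
		"ok":      true,
		"labels":  map[string]interface{}{"env": "prod"},
	}
	out := map[string]string{}
	flattenRecord("", rec, out)
	fmt.Println(out)
}
```

The trade-off is that type information is lost, so this would suit cases where the fields are only matched or displayed as text.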

@trink
Contributor

trink commented Sep 17, 2015

Any reason you aren't using the native lib https://github.com/kengonakajima/lua-msgpack-native? The performance would be much better. Also, I would rather see an external dependency than have to maintain a copy-and-paste version of the much slower pure-Lua MessagePack API.

@carlanton
Contributor Author

@trink Hmm, I don't think so. Maybe that's the best option :-) Would that require a PR to mozilla-services/lua_sandbox?

@trink
Contributor

trink commented Sep 17, 2015

mozilla-services/lua_sandbox#97

No, it would just be an external dependency that the user would install, since it is not sandbox-specific.


@salekseev

@carlanton I'm loving the carlanton@930b2d8 implementation. But a generic Fluentd/MessagePack decoder in Lua would also be very useful: people could use brokers (like Kafka) to connect fluentd and hekad, for example, where the two wouldn't talk directly to each other. Great work. 👍
