Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: kafka container #44

Merged
merged 14 commits into from
Dec 1, 2023
Merged

Conversation

Argonus
Copy link
Contributor

@Argonus Argonus commented Nov 10, 2023

Add support for kafka test container.

What was added

  • kafka supports three modes
    • embedded zookeeper where it starts zookeeper in the same container as kafka itself. Using that, its should be fairly easy to setup single broker connection.
    • external zookeeper where it uses external zookeeper provided. It should make easy to setup cases with n brokers & one zookeeper
    • kraft a little experimental, as I haven't used it yet. But it uses kafka as a source of quorum selection instead of zookeeper.

Future plans of mine.

  • add network support for docker. This is required to: connect multiple kafka nodes, add zookeeper, schema registry & kafka connect as containers.
  • zookeeper container support
  • schema registry container support
  • kafka-connect container support
  • kraft with multiple nodes quorum support

@Argonus Argonus marked this pull request as ready for review November 10, 2023 14:57
@Argonus Argonus marked this pull request as draft November 10, 2023 14:58
@jarlah
Copy link
Member

jarlah commented Nov 10, 2023

I have added a "couple of changes" since you forked, but I dont think it will be too much problems given your PR is only additions :) Best to update from upstream, where each container now implements a ContainerBuilder, which is necessary to work with start_container.

config :kafka_ex,
  disable_default_worker: true

in config/test.exs
will solve problem with starting tests.

test "provides a ready-to-use kafka container", %{kafka: kafka} do
  uris = [{"localhost", Container.mapped_port(kafka, 9092)}]

will fix problem with port being wrong, unless you set fixed port in KafkaContainer which shouldn't done.

but update from upstream first :) 🥳

@jarlah
Copy link
Member

jarlah commented Nov 10, 2023

updated branch, rebased with upstream main

@jarlah
Copy link
Member

jarlah commented Nov 10, 2023

@Argonus I see Testcontainers-java uses KRaft. Should we do the same ? if we only set up cp-kafka, we will only get a broker without the rest of the required facilities to work with fullscale kafka projects.

@Argonus
Copy link
Contributor Author

Argonus commented Nov 10, 2023

Hi, thanks for response. I will take a look at java project too later this weekend. But it depends what we want to achieve by 'full kafka project'
This one here, should be enough for kafka communication, where it lack connect and other additions.

But I'd rather create them as separate containers.

Ok, I've checked it. I think I'll add support for both to follow Java project style.

@jarlah
Copy link
Member

jarlah commented Nov 11, 2023

Dont follow them line by line. Not even sure if its possible. Best is to check how other functional libraries does. Like go(semi functional), Python (if they have Kafka) or some other lang. I think the point is to not reinvent the wheel :) awesome you are picking it up 🙌

@Argonus
Copy link
Contributor Author

Argonus commented Nov 11, 2023

@jarlah
It should be, its just a bunch of kafka libs that are allowing new functionalities like lack of zookeeper. In summary all in kafka.
I've checked other test containers for kafka, and it really depends on when it was created & by whom.
In my ideal world, I'd like to have both options as Java has. Sooner or later, it will be standard for confluent kafka distribution & it's also a little lighter as you don't need to run zookeeper.

@jarlah
Copy link
Member

jarlah commented Nov 14, 2023

Plans for new release is to wait for this Kafka module. Then we will have Cassandra, Minio and Kafka as new container modules. In addition we also now have the option to send :persistent_volume_name option to Ecto module functions (postgres_container/1, mysql_container/1), to enable using the Ecto module not only for test but also for dev ;)

@jarlah jarlah added this to the v1.4 milestone Nov 14, 2023
@jarlah jarlah added the enhancement New feature or request label Nov 14, 2023
@jarlah jarlah removed this from the v1.4 milestone Nov 15, 2023
@Argonus
Copy link
Contributor Author

Argonus commented Nov 15, 2023

@jarlah
I won't be able to finish it this week, personal stuff after work. I can try to finish to next week, but if you are in hurry feel free to use what is here.

@jarlah
Copy link
Member

jarlah commented Nov 15, 2023

No problem @Argonus i released v1.4 with a nice set of changes. Planning this to be in v1.5. In about one or two weeks. No stress. This is OS 💪

@Argonus Argonus force-pushed the kafka-container branch 3 times, most recently from e6d5dff to 2c2a394 Compare November 29, 2023 12:28
@Argonus Argonus marked this pull request as ready for review November 29, 2023 16:22
@jarlah
Copy link
Member

jarlah commented Nov 29, 2023

interesting. I will take a look soon. About fixed ports, "meh", but I will dig into it and see how you have solved it :) I have seen in other implementations that they update the broker host after container start. if that isnt done here this could explain the need for fixed ports. But I will know more when I have reviewed the code :)

@jarlah jarlah self-requested a review November 29, 2023 16:32
Copy link
Member

@jarlah jarlah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont fancy the fixed ports approach, so I have some suggestions on how we can solve that.

lib/container/kafka_container.ex Show resolved Hide resolved
config/test.exs Outdated Show resolved Hide resolved
@jarlah
Copy link
Member

jarlah commented Nov 30, 2023

Anyway, as mentioned in #44 (comment) I made a PR that adds put file functionality. Which can be used to upload a cmd file to the kafka container. With basically same content you have in the cmd script now. Then change cmd to wait for the uploaded file to exist and then run it.

If you make the script inside is_starting, you can get mapped ports on the container, and dont need fixed ports anymore.

@Argonus
Copy link
Contributor Author

Argonus commented Nov 30, 2023

Looks good, I'll add this later this week. With IP address, I'd rather leave it there. I've had some issues with connecting between zookeeper & kafka in older distributions & my lovely M1 machine... Don't really remember what, as java logs are just a little more pleasant to read than elrang ones but IP trick worked 🤷

In summary, kafka <7.0 is not really stable on M1

@Argonus Argonus requested a review from jarlah November 30, 2023 10:10
Copy link
Member

@jarlah jarlah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work 🥳 Its well structured and easy to read, and it works 🚀

However, I have been giving the zookeeper_external consensus strategy some thought, and I have landed on that I dont really want that strategy included due its need for fixed zookeeper port. A long term strategy of mine is to not have any examples or tests with fixed ports. Because this can cause a lot of issues, both locally when running or debugging tests, or in continuous integration.

We have two very good options for kafka, zookeeper_embedded and kraft consensus strategies, and I really think that's enough for a first version. I think trying to get external zookeeper working is like beating a dead horse. Plus, zookeeper should have a minimum of (N + 1) / 2 nodes. So having just one embedded or one external zookeeper doesn't really represent a valid production scenario. Its valid of course when there is only one kafka broker, but it doesn't add value.

So, great work :) But I would like you to remove support for the external zookeeper mode and its accompanying test, unless you have a very good reason to keep it. We can keep the ip_address on Container though and the mapping of it, even though there is not much need for it atm.

@jarlah
Copy link
Member

jarlah commented Nov 30, 2023

btw I will ask contributors in Testcontainers-java how they did it for zookeeper external, if they have it. But even if they do have it and use fixed port too, that wont budge me since the overall recommendation for testcontainers in general is to not use fixed ports :)

@Argonus
Copy link
Contributor Author

Argonus commented Nov 30, 2023

@jarlah There is already External Zookeeper
In java, and more or less it works the same, as here. You have to provide a single connection string there. Here, I've just splitted it into host & port variables.

Yep, i have a good reason, quite often you want to validate multiple kafka nodes, to make it work, you want to run zookeeper as a separated container or use zookeeper from other kafka container.
Tbf most common scenarios in docker compose I've seen was single zookeeper & multiple kafka nodes, so I'd rather keep it here.

@Argonus
Copy link
Contributor Author

Argonus commented Nov 30, 2023

@jarlah Fixed port removed, tbf I've just forgot about that one. Maybe we should mark fixed_port as deprecated or sth. It makes quite hard to contribute don't knowing long term plan

@Argonus Argonus requested a review from jarlah November 30, 2023 13:33
@jarlah
Copy link
Member

jarlah commented Nov 30, 2023

@jarlah Fixed port removed, tbf I've just forgot about that one. Maybe we should mark fixed_port as deprecated or sth. It makes quite hard to contribute don't knowing long term plan

yeah sorry about that. hasn't been that much contributors lately, since this repo is quite fresh :) So dont stress about that. Unless you are psychic you couldn't know ;)

@jarlah
Copy link
Member

jarlah commented Nov 30, 2023

@jarlah Fixed port removed, tbf I've just forgot about that one. Maybe we should mark fixed_port as deprecated or sth. It makes quite hard to contribute don't knowing long term plan

yeah sorry about that. hasn't been that much contributors lately, since this repo is quite fresh :) So dont stress about that. Unless you are psychic you couldn't know ;)

btw the tests doesn't pass because of the line

|> KafkaContainer.with_zookeeper_host("host.docker.internal")

This is whats causing issues when you need to talk cross containers. "host.docker.internal" isnt a host in GitHub actions or on my machine, because I dont use docker desktop. I use either orbstack or testcontainers desktop with embedded runtime. So any custom solutions for docker desktop wont be cross platform :)

@jarlah
Copy link
Member

jarlah commented Nov 30, 2023

personally I would have just removed the zookeeper container and the kafka tests for external zookeeper and called it a day. The way forward is not zookeeper, it's obsolete. Embedded and Kraft is yay :)
here is a patch for removing zookeeper form this PR:
remove_zookeeper_container.patch
I have learned the hard way to not mess with others PRs 🤣

@Argonus
Copy link
Contributor Author

Argonus commented Nov 30, 2023

Ok, I'll add it for now.
But stil, in future I'd like to add external zookeeper, just as a separated PR, in a week or two once I'll have a time.
I'm working with kafka everyday, and ability to have a multiple brokers is more than usefull.

And kraft, requires newest possible kafka version, not all libraries know who to work with that, so I'd like to cover most common scenarios here, instead of just newest ones :D
And I'll anyway have to find a solution for host to ad schema registry support.

Sounds good?

@jarlah
Copy link
Member

jarlah commented Nov 30, 2023

Ok, I'll add it for now. But stil, in future I'd like to add external zookeeper, just as a separated PR, in a week or two once I'll have a time. I'm working with kafka everyday, and ability to have a multiple brokers is more than usefull.

And kraft, requires newest possible kafka version, not all libraries know who to work with that, so I'd like to cover most common scenarios here, instead of just newest ones :D And I'll anyway have to find a solution for host to ad schema registry support.

Sounds good?

I have a fair suggestion: We can leave everything except for the zookeeper container and zookeeper container test. Then you can specify a generic zookeeper container in your tests (just make a module and defstruct and implement ContainerBuilder) and configure kafka to use this zookeeper. This way we dont have to maintain an extra container besides KafkaContainer. And there is no fixed port approach by default (and no test for it either, other than maybe asserting the container builder builds things correctly). We can make a readme on how to do this too. It's basically creating a test helper module zookeeper_container.ex in test support, and using it like any other container. The external zookeeper can even be actually external, like in docker compose or running on the computer.

@Argonus
Copy link
Contributor Author

Argonus commented Nov 30, 2023

Sounds good, let me do this then later today!

@jarlah
Copy link
Member

jarlah commented Nov 30, 2023

Sounds good, let me do this then later today!

btw, in slack on testcontainers they say we can use network alias for zookeeper. If we can do that, random port or fixed port works same.

https://testcontainers.slack.com/archives/C9EJ7TVT7/p1701348087842769

I am guessing we might need to adjust testcontainers for elixir a bit to allow for specifying a network alias, but it should be doable.

it doesn't need to change our plan of action however. But we can whip up a second PR that adds zookeeper with dynamic port and network alias ;)

@Argonus
Copy link
Contributor Author

Argonus commented Nov 30, 2023

@jarlah
Ok, for now I've dropped support for zookeeper as a container.
Network support will be required for other kafka famility containers, I'll think about it later. But let's make this work with basic setup for now.

lib/testcontainers.ex Outdated Show resolved Hide resolved
Copy link
Member

@jarlah jarlah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a tiny oversight that can be removed, otherwise good 👍

@jarlah
Copy link
Member

jarlah commented Nov 30, 2023

I love that you actually put it into test helper in testcontainers 😆 I was actually talking about making a generic container in the using app, but its much better having it documented in tests, even if its not recommended. And you actually will have to copy the zookeeper code from test code to use it in your app, since its not condoned.

@jarlah
Copy link
Member

jarlah commented Nov 30, 2023

btw @Argonus I usually squash PRs. Do you have any preferences on that ? In this case I dont care about my ninja commit in the middle of the commit history. So if it's fine for you this PR will squashed. Alternatively squash, rebase and force push locally.

@jarlah jarlah changed the title Kafka container feat: kafka container Nov 30, 2023
@jarlah jarlah merged commit af9b00d into testcontainers:main Dec 1, 2023
2 checks passed
@Argonus Argonus deleted the kafka-container branch December 1, 2023 08:50
@Argonus
Copy link
Contributor Author

Argonus commented Dec 1, 2023

@jarlah sorry, i was already off :)
For future, squash it! I've nth against it. Thank you for you help, and expect new things around kafka soon :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants