Lightweight transactions fail due to incorrect IP binding #168
Comments
Wow, this is hairy. This is partially Cassandra's fault for being so overzealous about exact explicit IP addresses everywhere (to the point of making them part of the protocols directly), but also partly Docker's fault for providing such complicated networking that it becomes somewhere between very hard and impossible to determine "the container's IP address".
Just to hopefully help anyone who wants to dig into this more, I've reproduced this with the following simple steps:
$ docker network create --driver overlay --attachable test
$ docker service create --network test --name test --publish 1234:1234 cassandra
Then I did:
root@d1d0dec303e6:/# ip address
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
191580: eth0@if191581: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
link/ether 02:42:0a:ff:00:04 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 10.255.0.4/16 brd 10.255.255.255 scope global eth0
valid_lft forever preferred_lft forever
191582: eth2@if191583: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 02:42:ac:14:00:03 brd ff:ff:ff:ff:ff:ff link-netnsid 2
inet 172.20.0.3/16 brd 172.20.255.255 scope global eth2
valid_lft forever preferred_lft forever
191584: eth1@if191585: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
link/ether 02:42:0a:00:00:04 brd ff:ff:ff:ff:ff:ff link-netnsid 1
inet 10.0.0.4/24 brd 10.0.0.255 scope global eth1
valid_lft forever preferred_lft forever
root@d1d0dec303e6:/# ip route
default via 172.20.0.1 dev eth2
10.0.0.0/24 dev eth1 proto kernel scope link src 10.0.0.4
10.255.0.0/16 dev eth0 proto kernel scope link src 10.255.0.4
172.20.0.0/16 dev eth2 proto kernel scope link src 172.20.0.3
Additionally, here's the relevant section from
{
  "Networks": {
    "ingress": {
      "IPAMConfig": {
        "IPv4Address": "10.255.0.4"
      },
      "Links": null,
      "Aliases": [
        "d1d0dec303e6"
      ],
      "NetworkID": "88zat916cnaig63d03u095rc5",
      "EndpointID": "d3290f8fd19a82c01cd4d29cffbaa87568a8ceed2d02821f3b8c2585704aefb4",
      "Gateway": "",
      "IPAddress": "10.255.0.4",
      "IPPrefixLen": 16,
      "IPv6Gateway": "",
      "GlobalIPv6Address": "",
      "GlobalIPv6PrefixLen": 0,
      "MacAddress": "02:42:0a:ff:00:04",
      "DriverOpts": null
    },
    "test": {
      "IPAMConfig": {
        "IPv4Address": "10.0.0.4"
      },
      "Links": null,
      "Aliases": [
        "d1d0dec303e6"
      ],
      "NetworkID": "lr0ptm8trd7ceh70277s2ggtm",
      "EndpointID": "aaca3019d270b6caf9f4b89023a0702be7f66d171f66c3b06efaf8d93fc2156f",
      "Gateway": "",
      "IPAddress": "10.0.0.4",
      "IPPrefixLen": 24,
      "IPv6Gateway": "",
      "GlobalIPv6Address": "",
      "GlobalIPv6PrefixLen": 0,
      "MacAddress": "02:42:0a:00:00:04",
      "DriverOpts": null
    }
  }
}
So in short, our container has no fewer than three candidate IP addresses, and there's really not any way I can see for us to differentiate them in an automated way:
10.255.0.4/16 (eth0, the ingress network)
172.20.0.3/16 (eth2, which carries the default route)
10.0.0.4/24 (eth1, the test network)
All of those CIDRs are configurable/modifiable, and there's not really anything "telling" about each one on the interfaces themselves (the
I'm not sure what we should do here. 😞 😕 |
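To make the ambiguity concrete, here is a minimal sketch that lists every non-loopback IPv4 candidate from `ip address` output. The function name list_candidate_ips is hypothetical (it is not part of the image's entrypoint), and the parsing assumes the output format shown in this comment, where the interface label is the last field of each inet line.

```shell
# list_candidate_ips (hypothetical helper): read `ip address` output on stdin
# and print "<interface> <address/prefix>" for each non-loopback IPv4 address.
# Assumes the interface label is the last field on every "inet" line.
list_candidate_ips() {
    awk '$1 == "inet" && $NF != "lo" { print $NF, $2 }'
}

# Typical use inside the container:
#   ip address | list_candidate_ips
```

Nothing in that output marks one address as "the" container address, which is the problem described above.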
I wish I had a good idea to contribute, but I haven't thought of one. We're going to work around it for now with some hackery in our Docker Compose file, since within our environment we can assume the Swarm network is using the default IP settings. It would be nice if Docker had a magic IP you could query, like the 169.254.169.254 one that Amazon has in EC2, to get metadata... |
I found this YAML to work:
Using tasks.cassandra-n as the service name (instead of just cassandra-n) does the trick. Not sure exactly why.
docker version
Server:
|
@gsliskov you are great. What does tasks.cassandra-n mean? How did you find this? |
On an overlay network each service is also assigned a virtual IP by Docker that is then load-balanced across all tasks (containers) of the service. If you don't want Docker to create this extra network abstraction, just change the endpoint mode to dnsrr:
version: '3.3'
services:
  cassandra-1:
    image: cassandra
    deploy:
      endpoint_mode: dnsrr
      placement:
        constraints:
          - node.labels.application==cassandra1
    environment:
      CASSANDRA_BROADCAST_ADDRESS: "cassandra-1"
    ports:
      - 7000
    volumes:
      - "/volume/cassandra:/var/lib/cassandra"
    networks:
      - cassandra
  cassandra-2:
    image: cassandra
    deploy:
      endpoint_mode: dnsrr
      placement:
        constraints:
          - node.labels.application==cassandra2
    environment:
      CASSANDRA_BROADCAST_ADDRESS: cassandra-2
      CASSANDRA_LISTEN_ADDRESS: cassandra-2
      CASSANDRA_SEEDS: "cassandra-1"
    depends_on:
      - "cassandra-1"
    ports:
      - 7000
    volumes:
      - "/volume/cassandra:/var/lib/cassandra"
    networks:
      - cassandra
networks:
  cassandra:
    external:
      name: cassandra-net
|
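One way to see the difference between a service's virtual IP and its task addresses is to resolve both names from inside a container attached to the overlay network. The sketch below is only an illustration: resolve_ips and compare_service_dns are hypothetical helper names, it assumes an image that ships `getent`, and the exact resolver behaviour may differ between setups. With the default endpoint mode the plain service name should return the virtual IP, while tasks.<service> returns the individual task addresses.

```shell
# resolve_ips NAME (hypothetical helper): print each address NAME resolves to.
# Assumes `getent` is available in the image.
resolve_ips() {
    getent hosts "$1" | awk '{ print $1 }'
}

# compare_service_dns SERVICE (hypothetical helper): show what the plain
# service name and the tasks.<service> name resolve to.
compare_service_dns() {
    echo "service name ($1):"
    resolve_ips "$1"
    echo "task names (tasks.$1):"
    resolve_ips "tasks.$1"
}

# Example, run inside a container on the overlay network:
#   compare_service_dns cassandra-1
```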
We also run Docker in swarm mode. I get the container's IP address from the hostname of the container. I replaced the _ip_address function in docker-entrypoint.sh with this: |
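The replacement function from that comment is not preserved in this thread. As a rough idea of what a hostname-based lookup could look like, here is a sketch under stated assumptions: ip_from_hostname is a hypothetical name, `getent` is assumed to be present in the image, and which address the hostname maps to depends on how Docker populates name resolution for the container, so the result should be checked before relying on it.

```shell
# ip_from_hostname [NAME] (hypothetical helper): print the first address that
# NAME resolves to. NAME defaults to this container's own hostname.
# This is NOT the commenter's original code, only one possible approach.
ip_from_hostname() {
    name="${1:-$(hostname)}"
    getent hosts "$name" | awk '{ print $1; exit }'
}

# Example, run inside the container:
#   ip_from_hostname
```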
I've been having the same issue. I have 3 Cassandra nodes on my development machine. This was helpful:
Adding the
Do I need to maintain two separate compose files, one for Docker Swarm and one for Docker Compose? |
This is very similar to #150, but #151 didn't fix it for this case.
I'm using Cassandra with docker stack deploy in swarm mode. If I specify a ports: section in the stack YAML, _ip_address gives a different address than when I don't have ports:. With ports:, lightweight transactions (LWT) fail; without it, they're fine. I locally hacked the _ip_address function to only find addresses that start with 10.0., and that seemed to fix things up for me, but I don't really know if that's the way to go.
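The prefix hack described above can be sketched as a small filter. This is an illustration only, not the image's actual _ip_address function: pick_ip_by_prefix is a hypothetical name, the parsing assumes `ip address` output in the format quoted in the comments, and the 10.0. prefix only works when the overlay network happens to use that subnet.

```shell
# pick_ip_by_prefix PREFIX (hypothetical helper): read `ip address` output on
# stdin and print the first IPv4 address that starts with PREFIX.
pick_ip_by_prefix() {
    awk -v prefix="$1" '
        $1 == "inet" {
            addr = $2
            sub(/\/.*$/, "", addr)           # drop the /NN prefix length
            if (index(addr, prefix) == 1) {  # literal "starts with" check
                print addr
                exit
            }
        }
    '
}

# Typical use inside the container:
#   ip address | pick_ip_by_prefix "10.0."
```

Because the subnets are configurable, a hard-coded prefix like this is fragile; it is a workaround for one environment rather than a general fix.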