We could use firewall rules & tags to ensure only outbound requests are allowed from the workers.
However, we don't have any way of enforcing that those remain in place. For example, someone could remove those rules or inadvertently change the tags, network name, etc., and we wouldn't know about it.
> However, we don't have any way of enforcing that those remain in place

Yeah, while I agree that's true and easy to misconfigure, I think our move to "protect the endpoints as if you were already compromised" is just inevitable, and this can be a motivator for getting better at it (w/ e.g. issues like concourse/concourse#2415 and not exposing endpoints w/out auth in general) 🤔
(my point being that by forcing ourselves to rely less on a "perimeter of protection", we can be even more motivated to get our infra better protected against any scenario)
Hey,

We've recently been receiving complaints that resources like `docker-image` and `registry-image` have been failing with "429 Too Many Requests".

While we did introduce retries at the resource-type level for `registry-image` (see concourse/registry-image-resource#69), those using `docker-image` (or trying to reach Docker Hub directly) would still suffer from the limit being placed on our IP.

My hypothesis is that by removing the NAT machine that we have in the `bosh` network (which makes every request from any of our 40+ machines go out from that single IP), we can get rid of the request-limit problems we're currently facing (aside from removing one hop and a single point of failure).

Last week, I naively tried just removing the routes that we have set at the network level

prod/iaas/bosh.tf
Lines 135 to 153 in 92cf177

but that didn't really work as expected, as the machines that we create in the `bosh` network are not assigned ephemeral external IPs (see https://cloud.google.com/vpc/docs/vpc#internet_access_reqs):

prod/bosh/cloud_config.yml
Lines 29 to 36 in 92cf177

Given that we're on GCP, we can overcome that by using the `ephemeral_external_ip` property - see https://bosh.io/docs/google-cpi/#networks.

Should we do that? I think so - if we don't have the requirement of having those machines completely unreachable (not really true in our case), I think we should just drop the NAT.
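
For reference, a minimal sketch of what that could look like in the cloud config, assuming a manual network. The network name, range, gateway, and the `network_name`/`subnetwork_name` values are made up for illustration and are not taken from prod/bosh/cloud_config.yml; only `ephemeral_external_ip` is the Google CPI property in question:

```yaml
# Hypothetical cloud config fragment; all names and addresses are placeholders.
networks:
- name: bosh                      # placeholder network name
  type: manual
  subnets:
  - range: 10.0.0.0/24            # placeholder CIDR
    gateway: 10.0.0.1             # placeholder gateway
    cloud_properties:
      network_name: example-network        # placeholder GCP network
      subnetwork_name: example-subnetwork  # placeholder GCP subnetwork
      # Google CPI property: give each VM its own ephemeral external IP,
      # so outbound traffic no longer has to leave through the NAT machine.
      ephemeral_external_ip: true
```

With something like this in place, each VM would go out from its own address, and inbound reachability would then depend on the firewall rules on that network.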
Thanks!