GTC-2958 Fix reference to non-existent ECR image #584
The container_registry Terraform module creates a new Docker image with the specified tag, but only if the Docker contents have changed (as computed by its hash script, which understands .dockerignore, etc.).

We use the git SHA as the container tag, which is always different when we deploy a change. But if only Terraform is being changed, no new Docker image is created, so no image with that tag exists, leading to the bug.

Since the ECR registries are separated by GFW account, and within GFW-dev by the terraform.workspace name (branch), I believe we can instead change our tag reference in data.tf:template_file.container_definition to "latest" (which always exists), rather than the git SHA tag, which may not exist.

Other possible solutions:
- always create a new Docker image for every git change
- make the container tag the MD5 of the Docker contents, which would mean replicating the hash script in gfw-data-api so we can compute it ahead of time. Then we would always refer to a tag of an image that already exists or was just created.
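The second alternative (a content-derived tag) can be sketched roughly as follows. This is a minimal illustration, not the container_registry module's actual hash script: the function names `read_ignore_patterns` and `content_tag` are hypothetical, and the `.dockerignore` handling is deliberately simplified to `fnmatch` patterns (no `!` negation and no special `**` rules).

```python
import fnmatch
import hashlib
from pathlib import Path


def read_ignore_patterns(context: Path) -> list[str]:
    """Return the non-empty, non-comment lines of .dockerignore, if present."""
    ignore_file = context / ".dockerignore"
    if not ignore_file.is_file():
        return []
    lines = (ln.strip() for ln in ignore_file.read_text().splitlines())
    return [ln for ln in lines if ln and not ln.startswith("#")]


def content_tag(context: Path) -> str:
    """Hash the relative path and bytes of every non-ignored file.

    Files are visited in sorted order so the result is deterministic.
    Assumption: simplified ignore matching, see the note above.
    """
    patterns = read_ignore_patterns(context)
    digest = hashlib.md5()
    for path in sorted(p for p in context.rglob("*") if p.is_file()):
        rel = path.relative_to(context).as_posix()
        if any(fnmatch.fnmatch(rel, pat) for pat in patterns):
            continue  # ignored files must not influence the tag
        digest.update(rel.encode())
        digest.update(b"\0")  # separator between file name and contents
        digest.update(path.read_bytes())
    return digest.hexdigest()
```

With a tag computed this way, a Terraform-only change leaves the tag unchanged, so the task definition would keep pointing at an image that already exists. This only holds if the value matches what the registry module computes, which is why the description mentions replicating the hash script.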
This is great but it'd be nice to keep a few versions of images in ECR which allows for a quick revert when needed. Should we push an image both with the git hash and latest tags to preserve the ability to revert?
I'm not changing anything about how the images are created, so there will be just as many images as before. Each image (created when the Docker files change) still has the git hash tag. All I did was change the reference in container_definition to use the 'latest' tag, which is always available and refers to the most recent image. If we want to revert, we would delete the current latest image, so the previous one becomes the latest. If we actually do a git revert, then presumably a new Docker image would be created if the contents had changed, so that new image would be latest.
Oh got it, I wasn't aware
By revert in this case, I meant manually pointing the ECS task definition to a different image in an emergency, when it may not be desirable to wait for a GitHub Action to complete the deployment. But the only time I did that was for this nonexistent-image issue. With that solved, this may no longer be a concern, and as you say we can revert on GitHub and redeploy.
Codecov Report

All modified and coverable lines are covered by tests ✅

Coverage Diff

| Metric   | develop | #584   |
|----------|---------|--------|
| Coverage | 81.12%  | 81.12% |
| Files    | 128     | 128    |
| Lines    | 5821    | 5821   |
| Hits     | 4722    | 4722   |
| Misses   | 1099    | 1099   |
OK, once I was able to deploy (after I destroyed a bunch of IAM roles), I see that it couldn't start the Fargate task because of a bad image URL. It looks like "latest" doesn't actually work as a tag in ECR, even though various examples (and ChatGPT) say it is usable. I'll try the other approach, where we use the Docker hash as the tag.
@danscales I had missed this. I think the problem is that the image hasn't been pushed with the latest tag and is still using container_tag (https://github.com/wri/gfw-data-api/blob/master/terraform/main.tf#L38), so it may be that you have to explicitly set the
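The point about tags only existing once something is pushed under them can be illustrated with a toy in-memory model. This is a sketch for reasoning only: `ToyRegistry` is a hypothetical class, not the ECR API, and the digests and tag values are made up.

```python
class ToyRegistry:
    """In-memory stand-in for an image repository: maps tag -> image digest."""

    def __init__(self) -> None:
        self.tags: dict[str, str] = {}

    def push(self, digest: str, *tags: str) -> None:
        # A push moves each named tag to this image; other tags are untouched.
        for tag in tags:
            self.tags[tag] = digest

    def resolve(self, tag: str) -> str:
        # A tag that was never pushed cannot be resolved.
        if tag not in self.tags:
            raise LookupError(f"no image tagged {tag!r}")
        return self.tags[tag]


repo = ToyRegistry()
repo.push("sha256:aaa", "abc123")            # pushed under the git SHA only
# repo.resolve("latest") would raise LookupError at this point
repo.push("sha256:bbb", "def456", "latest")  # pushed under both tags
```

In this model, "latest" resolves only after an image has been pushed under that name, while the older git-SHA tag stays available as a revert target.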