-
-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(assets): Use entity-tags to revalidate cached remote images #12426
base: main
Are you sure you want to change the base?
Conversation
🦋 Changeset detectedLatest commit: c63f48d The changes in this PR will be included in the next version bump. Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is blocked because it contains a minor
changeset. A reviewer will merge this at the next release if approved.
JSON.stringify({ | ||
data: Buffer.from(resultData.data).toString('base64'), | ||
expires: resultData.expires, | ||
etag: resultData.etag, | ||
}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JSON.stringify
and Buffer.from
can be both fail at runtime, we should handle the errors somehow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've wrapped this in a try catch block which will print a warning and no-op, rather than throwing an Error. The asset should just skip the cache in this case.
|
||
if (!res.ok && res.status != 304) { | ||
throw new Error( | ||
`Failed to revalidate cached remote image ${src}. The request did not return a 200 OK / 304 NOT MODIFIED response. (received ${res.status}))`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error isn't actionable. If a user sees that, they don't know what to do in order to fix it. We should provide a better error that gives the user an action.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this is a bad error, however this is handled the same way in the existing loadRemoteImage
function I based the revalidateRemoteImage
function off of, which provides similarly un-actionable errors.
I can potentially handle common cases like 404 with custom messages although there are many status codes where it would be difficult to give any actionable advice, like 5xx errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for providing more context. Looking at the current use case, we don't need to throw an error because we are attempting to revalidate the image, so I assume we already have the image.
Maybe we could log a warning instead—for any non-200 status code—and inform the user that Astro couldn't revalidate the image and will use the existing one (we could add more info e.g. status code). What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah a warning would be a better solution. I've added the status text (e.g. NOT FOUND/FORBIDDEN) to the error, which should make it more understandable without needing to look up the status code first.
I've added a try catch block in generate.ts
to handle this error and errors from Request itself, which should fall back to using the stale cache and print a warning in the log.
Example warning:
[WARN] An error was encountered while revalidating a cached remote asset. Proceeding with stale cache. Error: Failed to revalidate cached remote image https://<domain>/test.jpg. The request did not return a 200 OK / 304 NOT MODIFIED response. (received 403 Forbidden)
▶ /_astro/test_Z2qHSPG.webp (reused cache entry) (+295ms) (31/33)
CodSpeed Performance ReportMerging #12426 will not alter performanceComparing Summary
|
|
||
if (!res.ok && res.status != 304) { | ||
throw new Error( | ||
`Failed to revalidate cached remote image ${src}. The request did not return a 200 OK / 304 NOT MODIFIED response. (received ${res.status}))`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for providing more context. Looking at the current use case, we don't need to throw an error because we are attempting to revalidate the image, so I assume we already have the image.
Maybe we could log a warning instead—for any non-200 status code—and inform the user that Astro couldn't revalidate the image and will use the existing one (we could add more info e.g. status code). What do you think?
* @returns An ImageData object containing the asset data, a new expiry time, and the asset's etag. The data buffer will be empty if the asset was not modified. | ||
*/ | ||
export async function revalidateRemoteImage(src: string, etag: string) { | ||
const req = new Request(src, { headers: { 'If-None-Match': etag } }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be worth also supporting If-Modified-Since
, and storing the Last-Modified
or Date
for image responses that don't include an etag.
return await fs.promises.writeFile( | ||
cachedFileURL, | ||
JSON.stringify({ | ||
data: Buffer.from(resultData.data).toString('base64'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realise that this is what we were already doing, but it does seem a bit wasteful to be saving images as JSON-encoded base64 strings, and this could be a good time to fix it. I think it would be better to store the image in a binary, and then use a separate JSON file for metadata, probably with the same filename alongside it but with something like an appended .json
const remoteImage = await loadRemoteImage(path); | ||
return { | ||
data: remoteImage.data, | ||
expires: remoteImage.expires, | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no idea what happened here, but thank you for cleaning it, ha
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't try locally or anything, but the logic makes sense to me in the code. I agree with Matt's suggestions, but I'd be okay with those to be tackled separately personally.
I agree, the base64 encoding does seem inefficient. I would be happy to implement the sidecar file approach but I think it would be better off as a separate PR. I would also be happy to implement |
I think the best bet is to add the if-modified to this one as it's related and not much more code, and then you or someone else can follow up later with the other change |
Added revalidation with |
Changes
Store the entity tag of cached remote images, and use them to revalidate the cache when it goes stale to prevent re-downloading. Improves build time and bandwidth usage for sites with stale cached assets.
Build with fully stale cache:
Build with fully stale cache and etag revalidation:
Testing
Tested with the astro base template with a single remote image with 3 densities added, and successfully ran
pnpm --filter astro run test
.Docs
Current caching behaviour does not seem to be documented, so no documentation to update. This being said I think a new section in the astro asset docs outlining the behaviour of the asset cache would be useful.