feat(assets): Use entity-tags to revalidate cached remote images #12426

Open
wants to merge 10 commits into
base: main

Conversation

oliverlynch
Contributor

Changes

Store the entity tag (ETag) of each cached remote image and use it to revalidate the cache entry when it goes stale, so unchanged images are not re-downloaded. This improves build time and bandwidth usage for sites with stale cached assets.
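
The revalidation flow described above can be sketched roughly as follows. This is a minimal illustration rather than the code in this PR: the helper names `etagRequestHeaders` and `interpretRevalidationStatus` are hypothetical, and only the `If-None-Match` header and the 200 / 304 handling are taken from the discussion.

```typescript
// Hypothetical helper names; only the If-None-Match header and the
// 200 / 304 handling come from the PR discussion.
type RevalidationOutcome = 'reuse-cached' | 'replace-cached';

/** Build the conditional request headers for a stored entity tag. */
function etagRequestHeaders(etag: string): Record<string, string> {
  return { 'If-None-Match': etag };
}

/** Decide what to do with the cache entry based on the response status. */
function interpretRevalidationStatus(status: number): RevalidationOutcome {
  // 304 Not Modified: the cached bytes are still valid, no body was sent.
  if (status === 304) return 'reuse-cached';
  // Any 2xx response carries a fresh copy of the image.
  if (status >= 200 && status < 300) return 'replace-cached';
  throw new Error(`Unexpected revalidation status: ${status}`);
}
```

A caller would pass `etagRequestHeaders(entry.etag)` to `fetch` and feed `res.status` into `interpretRevalidationStatus`; a 304 response means only the expiry needs refreshing.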

Build with fully stale cache:
Screenshot 2024-11-14 at 15 33 24

Build with fully stale cache and etag revalidation:
Screenshot 2024-11-14 at 15 36 01

Testing

Tested with the Astro base template with a single remote image with 3 densities added, and successfully ran pnpm --filter astro run test.

Docs

The current caching behaviour does not seem to be documented, so there is no documentation to update. That said, I think a new section in the Astro asset docs outlining the behaviour of the asset cache would be useful.

changeset-bot bot commented Nov 14, 2024

🦋 Changeset detected

Latest commit: c63f48d

The changes in this PR will be included in the next version bump.

@github-actions github-actions bot added the pkg: astro Related to the core `astro` package (scope) label Nov 14, 2024
@github-actions github-actions bot added the semver: minor Change triggers a `minor` release label Nov 18, 2024
Contributor

@github-actions github-actions bot left a comment

This PR is blocked because it contains a minor changeset. A reviewer will merge this at the next release if approved.

packages/astro/src/assets/build/generate.ts Outdated Show resolved Hide resolved
Comment on lines 300 to 304
JSON.stringify({
data: Buffer.from(resultData.data).toString('base64'),
expires: resultData.expires,
etag: resultData.etag,
}),
Member

JSON.stringify and Buffer.from can both fail at runtime, so we should handle the errors somehow.

Contributor Author

I've wrapped this in a try/catch block that prints a warning and no-ops rather than throwing an Error. The asset should just skip the cache in this case.
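
A minimal sketch of that warn-and-skip behaviour, assuming a record shape like the one in the snippet under review. The function name `serializeCacheRecord`, the `CacheRecord` interface, and the `warn` callback are hypothetical stand-ins, not the PR's actual code or Astro's logger API.

```typescript
// Hypothetical record shape, mirroring the fields in the reviewed snippet.
interface CacheRecord {
  data: Uint8Array;
  expires: number;
  etag?: string;
}

/**
 * Serialize a cache record to JSON. Returns undefined (and warns) if
 * Buffer.from or JSON.stringify throws, so the caller can skip the cache
 * write instead of failing the build.
 */
function serializeCacheRecord(
  record: CacheRecord,
  warn: (message: string) => void,
): string | undefined {
  try {
    return JSON.stringify({
      data: Buffer.from(record.data).toString('base64'),
      expires: record.expires,
      etag: record.etag,
    });
  } catch (err) {
    warn(`Could not serialize remote image for caching; skipping cache. ${err}`);
    return undefined;
  }
}
```

The caller would write the file only when the return value is defined, which keeps the failure non-fatal.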

packages/astro/src/assets/build/remote.ts Outdated Show resolved Hide resolved

if (!res.ok && res.status !== 304) {
throw new Error(
`Failed to revalidate cached remote image ${src}. The request did not return a 200 OK / 304 NOT MODIFIED response. (received ${res.status})`,
Member

This error isn't actionable. If a user sees that, they don't know what to do in order to fix it. We should provide a better error that gives the user an action.

Contributor Author

I agree that this is a bad error. However, it is handled the same way in the existing loadRemoteImage function that I based revalidateRemoteImage on, which provides similarly unactionable errors.

I could handle common cases like 404 with custom messages, although for many status codes, such as 5xx errors, it would be difficult to give any actionable advice.

Member

Thank you for providing more context. Looking at the current use case, we don't need to throw an error because we are attempting to revalidate the image, so I assume we already have the image.

Maybe we could log a warning instead—for any non-200 status code—and inform the user that Astro couldn't revalidate the image and will use the existing one (we could add more info e.g. status code). What do you think?

Contributor Author

@oliverlynch oliverlynch Nov 20, 2024

Yeah, a warning would be a better solution. I've added the status text (e.g. NOT FOUND/FORBIDDEN) to the error, which should make it more understandable without needing to look up the status code first.

I've added a try/catch block in generate.ts to handle this error and errors from Request itself, which should fall back to using the stale cache and print a warning in the log.

Example warning:

[WARN] An error was encountered while revalidating a cached remote asset. Proceeding with stale cache. Error: Failed to revalidate cached remote image https://<domain>/test.jpg. The request did not return a 200 OK / 304 NOT MODIFIED response. (received 403 Forbidden)
  ▶ /_astro/test_Z2qHSPG.webp (reused cache entry) (+295ms) (31/33)
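
The fallback described in this comment could look roughly like the sketch below. It is an assumption-laden illustration: `revalidateOrFallback`, the `RemoteImage` interface, and the injected `revalidate` and `warn` callbacks are hypothetical. The warning prefix is copied from the example log line, and the empty-buffer convention follows the `revalidateRemoteImage` doc comment quoted later in the review.

```typescript
// Hypothetical shape; field names follow the snippets quoted in this PR.
interface RemoteImage {
  data: Uint8Array;
  expires: number;
  etag?: string;
}

/**
 * Try to revalidate a stale cache entry. On any error, warn and keep
 * using the stale entry instead of failing the build.
 */
async function revalidateOrFallback(
  cached: RemoteImage,
  revalidate: () => Promise<RemoteImage>,
  warn: (message: string) => void,
): Promise<{ image: RemoteImage; reusedCache: boolean }> {
  try {
    const fresh = await revalidate();
    // An empty buffer signals "not modified": keep the cached bytes,
    // but adopt the new expiry (and the new etag if one was returned).
    if (fresh.data.length === 0) {
      return {
        image: { ...cached, expires: fresh.expires, etag: fresh.etag ?? cached.etag },
        reusedCache: true,
      };
    }
    return { image: fresh, reusedCache: false };
  } catch (err) {
    warn(
      `An error was encountered while revalidating a cached remote asset. Proceeding with stale cache. ${err}`,
    );
    return { image: cached, reusedCache: true };
  }
}
```

Injecting `revalidate` as a callback keeps the fallback logic testable without a network request.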

codspeed-hq bot commented Nov 19, 2024

CodSpeed Performance Report

Merging #12426 will not alter performance

Comparing oliverlynch:remote-assets-use-etag (c63f48d) with main (b140a3f)

Summary

✅ 6 untouched benchmarks


packages/astro/src/assets/build/remote.ts Outdated Show resolved Hide resolved
.changeset/red-poems-pay.md Outdated Show resolved Hide resolved
* @returns An ImageData object containing the asset data, a new expiry time, and the asset's etag. The data buffer will be empty if the asset was not modified.
*/
export async function revalidateRemoteImage(src: string, etag: string) {
const req = new Request(src, { headers: { 'If-None-Match': etag } });
Contributor

It would be worth also supporting If-Modified-Since, and storing the Last-Modified or Date for image responses that don't include an etag.

return await fs.promises.writeFile(
cachedFileURL,
JSON.stringify({
data: Buffer.from(resultData.data).toString('base64'),
Contributor

@ascorbic ascorbic Nov 21, 2024

I realise that this is what we were already doing, but it does seem a bit wasteful to be saving images as JSON-encoded base64 strings, and this could be a good time to fix it. I think it would be better to store the image as a binary file and use a separate JSON file for the metadata, probably with the same filename alongside it but with something like an appended .json.
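
One way the suggested sidecar layout might look, as a rough sketch only. Nothing here is implemented in this PR: `sidecarPaths`, `metadataJson`, `writeCachedImage`, and the `CachedImage` interface are hypothetical names, and the `.json` suffix is just the convention floated in the comment above.

```typescript
import { writeFile } from 'node:fs/promises';

// Hypothetical shape of a cached image plus its revalidation metadata.
interface CachedImage {
  data: Uint8Array;
  expires: number;
  etag?: string;
  lastModified?: string;
}

/** The binary lives at cachePath; metadata sits next to it with ".json" appended. */
function sidecarPaths(cachePath: string): { dataPath: string; metaPath: string } {
  return { dataPath: cachePath, metaPath: `${cachePath}.json` };
}

/** Serialize everything except the image bytes. */
function metadataJson(image: CachedImage): string {
  return JSON.stringify({
    expires: image.expires,
    etag: image.etag,
    lastModified: image.lastModified,
  });
}

/** Write raw bytes and metadata as two separate files. */
async function writeCachedImage(cachePath: string, image: CachedImage): Promise<void> {
  const { dataPath, metaPath } = sidecarPaths(cachePath);
  await Promise.all([
    writeFile(dataPath, image.data),
    writeFile(metaPath, metadataJson(image)),
  ]);
}
```

Storing raw bytes avoids the roughly one-third size overhead of base64, and the metadata file can be read on its own to check expiry before touching the image.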

Comment on lines -282 to -286
const remoteImage = await loadRemoteImage(path);
return {
data: remoteImage.data,
expires: remoteImage.expires,
};
Member

I have no idea what happened here, but thank you for cleaning it, ha

Member

@Princesseuh Princesseuh left a comment

I didn't try it locally or anything, but the logic in the code makes sense to me. I agree with Matt's suggestions, but personally I'd be okay with those being tackled separately.

@oliverlynch
Contributor Author

I agree, the base64 encoding does seem inefficient. I would be happy to implement the sidecar file approach but I think it would be better off as a separate PR.

I would also be happy to implement If-Modified-Since, either in this PR or a separate one. The main changes to revalidate assets are written already so adding another header shouldn't be too much effort.

@ascorbic
Contributor

I think the best bet is to add If-Modified-Since support to this one, as it's related and not much more code, and then you or someone else can follow up later with the other change.

@oliverlynch
Contributor Author

Added revalidation with If-Modified-Since, and also a tweak to revalidateRemoteImage to use the stored etag/last-modified header if the server returns 304 Not Modified but does not include the headers.
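
A sketch of how the two validators and the 304 fallback might fit together. This is an illustration under assumptions, not the merged code: `revalidationHeaders`, `validatorsAfterResponse`, and the `Validators` interface are hypothetical names, and preferring the ETag over Last-Modified when both are stored is an assumption about ordering.

```typescript
// Hypothetical container for the stored validators of a cache entry.
interface Validators {
  etag?: string;
  lastModified?: string;
}

/** Prefer the entity tag; fall back to Last-Modified when no etag was stored. */
function revalidationHeaders(stored: Validators): Record<string, string> {
  const headers: Record<string, string> = {};
  if (stored.etag) {
    headers['If-None-Match'] = stored.etag;
  } else if (stored.lastModified) {
    headers['If-Modified-Since'] = stored.lastModified;
  }
  return headers;
}

/**
 * Work out which validators to store after a revalidation response.
 * On 304, a server may omit ETag / Last-Modified, so keep the stored ones.
 */
function validatorsAfterResponse(
  status: number,
  responseHeaders: Headers,
  stored: Validators,
): Validators {
  const etag = responseHeaders.get('ETag') ?? undefined;
  const lastModified = responseHeaders.get('Last-Modified') ?? undefined;
  if (status === 304) {
    return {
      etag: etag ?? stored.etag,
      lastModified: lastModified ?? stored.lastModified,
    };
  }
  // A full response replaces the validators entirely.
  return { etag, lastModified };
}
```

Keeping the stored validators on a header-less 304 means the next stale build can still send a conditional request instead of downloading the image again.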
