Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update all image urls to base64 encoded images #1875

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

gileswells
Copy link

In theory this should close matryer/xbar#859

I'm tossing this up now but I haven't gone through file by file to make sure my changes didn't change more than expected. If this approach is welcomed by the maintainers I can spend a little more time on it to do some additional due diligence. The main workhorse here is the process-images.sh file that I've included in my commit. It's not necessary but good to see how this was batched.

@leaanthony
Copy link

Thanks so much for taking the time to do this @gileswells! I think it's not a bad solution but would like input from @matryer too 👍

@sprak3000
Copy link
Collaborator

It's an interesting discussion. You are trading off "tracking" via an image URL with the potential for overflow attacks decoding the image string. Also a trade-off of a slight increase to the burden of a plugin developer. Rather than just upload a screenshot image somewhere and drop a link to it, they have to now base64 encode it. Could have the desired effect of people not bothering with an image at all thus eliminating tracking / overflow altogether. 😄

@Raboo
Copy link
Contributor

Raboo commented Nov 24, 2022

@sprak3000 You seem to raise red flags where there are none. @gileswells is not adding a feature to as you put it "potential for overflow attacks decoding the image string", that feature is already there. If you want to do an overflow attack, you simply publish your malicious plugin with your malicious base64 encoded code.
If you are worried that @gileswells has sneak in malicious base64 code then you can simply use his process-images.sh script to reproduce his result and compare it with this pull request.

What @gileswells is doing is creating a very visible way of taking all the image links and converting them to embedded base64 encoded images. Also he is not burdening any of the plugin developers.

Personally I think this is a good idea, hosted images on the Internet tend to disappear over time plus you avoid having a tool making unnecessary network calls all over the Internet.
My plugins images is hosted on imgur. What if I decide to remove my account there in 2 years time or they decide to shut down?

As a improvement this could be made into a github action that sends a new base64 encoded image PR whenever someone publishes a plugin with an image link, thus not burdening any developers in the future.

@sprak3000
Copy link
Collaborator

@sprak3000 You seem to raise red flags where there are none. @gileswells is not adding a feature to as you put it "potential for overflow attacks decoding the image string",

If you are worried that @gileswells has sneak in malicious base64 code then you can simply use his process-images.sh script to reproduce his result and compare it with this pull request.

Let me provide a bit of context around my comment. This PR is spun off of an issue where I found the URL in question was an image reference in one plugin's metadata. I expressed my opinion it was not a malicious attempt at tracking, just a simple hosting of a screen shot for their plugin. There was discussion around avoiding this concern entirely by embedding the image in the plugin potentially via base64 encoding. @gileswells came across the issue and opened this PR.

My comment here was born out of that initial issue thread. That thread was expressing concerns around potential security and tracking issues with images in the metadata. I am in no way implying @gileswells is attempting something malicious with this PR. If anything, I commend them for doing yeoman's work to address the concerns of a particular segment of this community.

My comment was pointing out base64 libraries have had security issues (c.f., CVE-2017-1000430 as one example from the Rust community). Most encoding / decoding libraries have had these sort of issues. Perhaps my original comment should have included my thought the potential for attack is low. However, it is still a concern / trade-off to consider.

Personally I think this is a good idea, hosted images on the Internet tend to disappear over time plus you avoid having a tool making unnecessary network calls all over the Internet.

This is a wonderful point in favor of this change, the sort of discussion I was hoping to spark with my original comment.

As a improvement this could be made into a github action that sends a new base64 encoded image PR whenever someone publishes a plugin with an image link, thus not burdening any developers in the future.

Again, a wonderful suggestion... I would suggest one change to it and this PR. Rather than have the existing metadata tag changed to be base64, introduce a new metadata tag for the base64 version of the image. The GitHub action would flow like this:

  • Image metadata tag detected. Download the image and convert to base64.
  • If no base64 metadata tag, add one.
  • If there is a base64 metadata tag already in the plugin, compare against downloaded version.
    • If they are the same, nothing to do.
    • If they differ, update the tag with the downloaded version.

This would allow plugin developers to just use a URL, something they already have at hand. xbar though would have an encoded version to use avoiding concerns around network calls and tracking. Best of both worlds perhaps?

Hope this clarifies things.

@gileswells
Copy link
Author

I don't think I've had so many @ messages in a single PR before. 🎉

Moving this processing from process-images.sh to a GitHub Action would absolutely be something that I'd recommend building out. Make sure it's one process that's building them out and have it be automated for future PRs sounds like a big win.

I would suggest one change to it and this PR. Rather than have the existing metadata tag changed to be base64, introduce a new metadata tag for the base64 version of the image.

I agree with @sprak3000 with my only exception being to name it something like imageBase64 so that it's not ambiguous to anyone just looking at the code for the first time.

If you are worried that @gileswells has sneak in malicious base64 code then you can simply use his process-images.sh script to reproduce his result and compare it with this pull request.
Personally I think this is a good idea, hosted images on the Internet tend to disappear over time plus you avoid having a tool making unnecessary network calls all over the Internet.

It's the missing images that caused the most issues with my script though larger images did too. Some of the missing images, due to DNS or other issues, I had to manually set to a 404 page that would actually respond with the 404 before re-processing and have it pull in the default no image found base64 string. There were also some other slight variations with how the line was commented that was had to be manually corrected. Overall, the script I wrote got me most of the way there across the 474 plugins it had to process through.

Also he is not burdening any of the plugin developers.

A GitHub Action would absolutely be key here and I have a GitHub Action I've implemented on a repo for work which already gets me halfway there. Basically use a GitHub Action to grab a list of the PR's changed files and simply find which files have an image property but don't include the imageBase64 property and create it similarly to how process-images.sh does but obviously take the time to make it less of a sledgehammer solution and more of a proper solution that will always work.

The GitHub action would flow like this:

I don't fully agree with @sprak3000's flow here because it doesn't account for when an image returns a 404. Especially if we already have an imageBase64 tag then we wouldn't want to replace that with the image not found base64 string as the previous image is likely better. Also, do we fully need to compare what we get via downloading their image vs what the developer themselves provided? Couldn't they bypass this "validity check" of sorts by simply not including the image url?

What if the images were just included within the plugins repo from the start? Has that been explored?

@sprak3000
Copy link
Collaborator

I don't fully agree with @sprak3000's flow here because it doesn't account for when an image returns a 404. Especially if we already have an imageBase64 tag then we wouldn't want to replace that with the image not found base64 string as the previous image is likely better. Also, do we fully need to compare what we get via downloading their image vs what the developer themselves provided? Couldn't they bypass this "validity check" of sorts by simply not including the image url?

It's not necessarily a validity check but more of a "is the meta tag data the latest?" check. Now that I type that out loud, I think my assumption was authors manually updating the base64 data themselves and not doing that for a later version despite using a new image. Would probably make more sense just to say "Our GH action will take your latest image URL and do the rest every single time." A 404 going forward would mean the action strips the base64 tag out.

What if the images were just included within the plugins repo from the start? Has that been explored?

I would be interested to hear the opinions of the original issue commenters on that. What is the boundary on their tracking concerns? I don't particularly have an issue with the images being hosted in general -- repo or not. YMMV 🤷🏽

@gileswells
Copy link
Author

The biggest concern that I saw in the original issue was the calling out to other domains, especially the one that had a subdomain related to it of https://bigdata.bihell.com/. Hosting the images within the repo or requiring use of a base64 string should both be viable options from my original read through as neither calls out to a remote domain.

@nekoniaow
Copy link

nekoniaow commented May 29, 2023

What if the images were just included within the plugins repo from the start? Has that been explored?

I would be interested to hear the opinions of the original issue commenters on that. What is the boundary on their tracking concerns? I don't particularly have an issue with the images being hosted in general -- repo or not. YMMV 🤷🏽

As far as I am concerned this would be great. My concern was indeed that those external URLs allowed for tracking of xbar user on a per-installation basis.
With the images stored on a dedicated repository accessed mostly through GitHub actions from the main repository the effectiveness of any possible tracking would be seriously decreased.

@leaanthony
Copy link

@matryer you good for this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

What is it? Why xbar connect to bihell.com on startup?
5 participants