Remove distracting and unnecessary tags #122

Open
rien333 opened this issue Aug 31, 2019 · 6 comments
Comments


rien333 commented Aug 31, 2019

SVGs often render far too large on many websites (see e.g. GitHub and the Mozilla docs, and the screenshot below), which is quite distracting. Moreover, they are generally non-informative parts of a page (I'm no web developer, but one common use is providing structural rather than content-bearing information, such as scalable buttons/UI elements). If someone knows of websites that use a lot of non-distracting SVGs, then removing or altering them is obviously not a good idea.

[Screenshot: broken SVG]

Other HTML tags that don't add anything to the readability of a page are <input> and <button> tags. I therefore propose to delete them. I'm willing to create a PR, given that I've already tried this for a script in qutebrowser.
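
For illustration, here is a minimal sketch of what deleting these tags from already-extracted HTML could look like. It uses only Python's standard `html.parser` and is not this library's code; `TagStripper`, `strip_tags` and `DROP_TAGS` are hypothetical names made up for the example. Note that it also drops comments and doctypes as a side effect.

```python
from html.parser import HTMLParser

# Hypothetical defaults for this sketch: the tags proposed for removal.
DROP_TAGS = {"svg", "button", "input"}
# Dropped tags that never have a closing tag.
VOID_TAGS = {"input"}


class TagStripper(HTMLParser):
    """Re-emit HTML while skipping dropped elements and everything inside them."""

    def __init__(self, drop=DROP_TAGS):
        # Keep character references as written so they can be passed through.
        super().__init__(convert_charrefs=False)
        self.drop = set(drop)
        self.depth = 0  # > 0 while inside a dropped element
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in self.drop:
            if tag not in VOID_TAGS:
                self.depth += 1
            return
        if self.depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <path/> or <svg/>.
        if tag in self.drop or self.depth:
            return
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.drop:
            if tag not in VOID_TAGS and self.depth:
                self.depth -= 1
            return
        if self.depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth == 0:
            self.out.append(data)

    def handle_entityref(self, name):
        if self.depth == 0:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if self.depth == 0:
            self.out.append(f"&#{name};")


def strip_tags(html, drop=DROP_TAGS):
    parser = TagStripper(drop)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Calling `strip_tags(html)` on the extracted article HTML would return the same markup without `<svg>`, `<button>` and `<input>` elements; passing a different `drop` set changes which tags are removed.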

Alternatively, SVGs could be scaled to make them less distracting. Looking at the SVG documentation, I suppose one could do something with the width attribute of SVGs. Not really sure how you should infer a good width value for a given SVG and browser window size though (someone who knows something about web development might, however). For reference, I've included a screenshot of the SVG above with a changed width value. To be honest, even though the SVG looks much better, I do not see in what way it enhances readability (at best, it's not distracting, but without much purpose).
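
A rough sketch of the scaling idea, assuming the SVG is available as a standalone XML string and has either a `viewBox` or plain numeric `width`/`height` attributes. The function name `cap_svg_width` and the 160-pixel cap are placeholders picked for the example, not a recommendation for what a good width would be. It uses only the standard library's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def _number(value):
    """Parse '900', '900.5' or '900px'; return None for anything else (e.g. '100%')."""
    if value is None:
        return None
    if value.endswith("px"):
        value = value[:-2]
    try:
        return float(value)
    except ValueError:
        return None


def cap_svg_width(svg_text, max_width=160):
    """Limit the rendered width of an SVG while keeping its aspect ratio."""
    ET.register_namespace("", SVG_NS)  # serialize without "ns0:" prefixes
    root = ET.fromstring(svg_text)

    view_box = root.get("viewBox")
    if view_box:
        # viewBox is "min-x min-y width height", separated by spaces or commas.
        vb_w, vb_h = (float(v) for v in view_box.replace(",", " ").split()[2:4])
    else:
        vb_w, vb_h = _number(root.get("width")), _number(root.get("height"))
    if not vb_w or not vb_h:
        return svg_text  # intrinsic size unknown: leave the SVG untouched

    if not view_box:
        # Without a viewBox, a smaller width would crop instead of scale.
        root.set("viewBox", f"0 0 {vb_w:g} {vb_h:g}")

    current = _number(root.get("width")) or vb_w
    width = min(current, max_width)
    root.set("width", f"{width:g}")
    root.set("height", f"{width * vb_h / vb_w:g}")
    return ET.tostring(root, encoding="unicode")
```

This only answers the "how to change the width" part; choosing `max_width` from the browser window size is still an open question here.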

[Screenshot: smaller SVG]

Author

rien333 commented Sep 7, 2019

@buriy No opinion?

Owner

buriy commented Sep 7, 2019

Well, this is not the goal of the project.
I mean, you can take the HTML output and strip off any part you don't like, or set the resolution for the images.
If you need it to display correctly, maybe it's even better to use CSS for that.
The library can't guess what is and isn't needed for a specific use case, which is why it shouldn't have an opinion on that.

Author

rien333 commented Sep 7, 2019

Understood, sorry for bothering. Still, I don't think SVGs enhance readability in the general case.

rien333 closed this as completed Sep 7, 2019

Owner

buriy commented Sep 7, 2019

Sorry, maybe I haven't expressed my thoughts clearly.
If you think, based on your experience, that this is a typical case, we can have an option for it: fix_svg='remove' / 'leave' / '160x120', or just strip_svg=True/False.
If you make a PR, I'll merge it.
But if it's a really rare case, or if there are obvious counterexamples and most users wouldn't like it defaulting to True, then maybe this is not the proper way of solving it.
However, you can still have an option strip_svg= that defaults to False.
Alternatively, we can have strip_images=True, strip_images=['png', 'svg', 'ico'], etc.
So please do some analysis of the use cases and make a PR; I'll accept it.
At the moment you have shown one example and made no analysis of how often it happens or whether True is a good default. You have only argued that you think (meaning it's true only for you) that SVGs don't enhance readability in the general case.
I think a good library shouldn't have opinions.
But sensible defaults (that you can override) make the library easier to use for end users, so it's a good idea to have them.
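
The `strip_images` option floated above could be sketched roughly as follows. Nothing here exists in the library: `filter_images`, `should_strip` and the exact semantics of the `strip_images` parameter are assumptions made for the example, and the regex-based tag matching is a simplification (it only looks at `<img>` tags with a quoted `src`, not inline `<svg>` elements).

```python
import re
from posixpath import splitext
from urllib.parse import urlparse

IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
SRC_ATTR = re.compile(r"""\bsrc\s*=\s*["']([^"']*)["']""", re.IGNORECASE)


def should_strip(src, strip_images):
    """Decide whether one image goes, given the (hypothetical) option value.

    strip_images=False          -> keep everything (the default)
    strip_images=True           -> strip every image
    strip_images=['svg', 'ico'] -> strip only images with these extensions
    """
    if strip_images is True:
        return True
    if not strip_images:
        return False
    # Take the extension from the URL path so query strings are ignored.
    ext = splitext(urlparse(src).path)[1].lstrip(".").lower()
    return ext in {e.lower() for e in strip_images}


def filter_images(html, strip_images=False):
    """Remove <img> tags from html according to strip_images."""

    def replace(match):
        tag = match.group(0)
        src = SRC_ATTR.search(tag)
        url = src.group(1) if src else ""
        return "" if should_strip(url, strip_images) else tag

    return IMG_TAG.sub(replace, html)
```

With the default of `False` the output is unchanged, so existing users would see no difference unless they opt in.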

Author

rien333 commented Sep 7, 2019

Fair, I suppose I could craft a more thorough PR. Your suggestions on how to implement this seem fine.

The reason for initially proposing to remove SVGs came from googling around for typical use cases of SVGs. The results made it seem as if they are not that informative: icons, logos, lines and animations seem to be the most typical uses. So my argument for removing them was mostly "a priori", so to speak. I really tried to find counterexamples, but SVGs in the bodies of documents seem to be fairly rare, I guess? That's also part of the reason why I've listed only two examples.

Also note that this issue mentions <button> and <input> tags, two tags I don't really see a place/logical role for in this library.

rien333 reopened this Sep 7, 2019

Owner

buriy commented Sep 7, 2019

In real projects I just use https://bleach.readthedocs.io/en/latest/
and remove all tags but [b, i, a, h1-h6, em, s, p, br] (after processing with readability). You can also specify what attributes to keep.
It doesn't have an image-extension filter, but I would use it to filter out button and input if I needed to.
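
To make the allowlist idea concrete without depending on an installed package, here is a standard-library sketch of the same approach: keep only a fixed set of tags and attributes, and drop every other tag while keeping its text. This is not bleach's API or implementation, and unlike bleach it is not a security sanitizer (for example, it does not check URL schemes in `href`). `AllowlistFilter`, `keep_allowed`, `ALLOWED_TAGS` and `ALLOWED_ATTRS` are names invented for the example.

```python
from html import escape
from html.parser import HTMLParser

# The tag list from the comment above, with h1-h6 spelled out.
ALLOWED_TAGS = {"b", "i", "a", "em", "s", "p", "br",
                "h1", "h2", "h3", "h4", "h5", "h6"}
# Assumption for this sketch: only links keep an attribute.
ALLOWED_ATTRS = {"a": {"href"}}


class AllowlistFilter(HTMLParser):
    """Re-emit only allowlisted tags/attributes; other tags lose their markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # drop the tag itself, keep any text inside it
        keep = ALLOWED_ATTRS.get(tag, set())
        attr_text = "".join(
            f' {name}="{escape(value or "")}"'
            for name, value in attrs
            if name in keep
        )
        self.out.append(f"<{tag}{attr_text}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS and tag != "br":  # <br> has no closing tag
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text arrives unescaped (convert_charrefs=True), so escape it again.
        self.out.append(escape(data, quote=False))


def keep_allowed(html):
    parser = AllowlistFilter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Note the difference from deleting elements outright: a `<button>` disappears here but its label text stays in the output, and `<input>` vanishes entirely because it has no text content.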
