[FR] (possibly heresy) Shared Resources #858
Comments
Thanks for the suggestion. I think this issue would be more suited to SingleFileZ than SingleFile. To be honest, I have a problem with the idea that Single File saves web pages in several files...
Really? I would've guessed the opposite. The extension seems more suited for backing up things that one simply comes across, while the CLI tool seems more suited for long-term use, as backups keep building up.
Well, that might depend on one's interpretation. Making multiple page saves share resources without clobbering each other is difficult without using something like WARC. So, in essence, rather than having a copy of the same resource for each page, each instance of a resource is a single file that is shared throughout.
Having thought about it, particularly due to Windows filename restrictions and, potentially, maximum filepath length issues with long URLs, it would probably be better to have all the files named after only their hash and extension, and for them all to be in the same folder. SingleFile already offers the option to indicate what the filename of Base64 images originally was, so perhaps the same could be done for every other resource downloaded in this manner. I don't know if every tag has a supported property like that, but it wouldn't necessarily have to be a supported property so long as the user could look at the source or use inspect element to figure it out. This would also potentially prevent duplicates from cropping up due to the website changing where the image is being retrieved from. This can be a real issue with sites that make heavy use of CDNs especially.
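To make the flat layout concrete, here is a minimal Python sketch of what I mean. It is not how SingleFile works; `store_resource`, the `manifest.json` file, and the example URLs are all hypothetical names made up for illustration. Each resource is written once under `<md5><extension>`, and the original URLs are recorded in a side manifest (one possible stand-in for noting the original name on the tag itself).

```python
import hashlib
import json
from pathlib import Path, PurePosixPath
from urllib.parse import urlsplit


def store_resource(url: str, content: bytes, store_dir: Path) -> str:
    """Write content once under '<md5><ext>' and return that file name."""
    store_dir.mkdir(parents=True, exist_ok=True)

    # Name the file after its content hash plus the URL's extension only,
    # which keeps names short and free of characters Windows rejects.
    digest = hashlib.md5(content).hexdigest()
    suffix = PurePosixPath(urlsplit(url).path).suffix
    name = f"{digest}{suffix}"

    # Identical bytes map to the same name, so they are stored only once.
    target = store_dir / name
    if not target.exists():
        target.write_bytes(content)

    # Hypothetical manifest: remember every URL that produced this file.
    manifest_path = store_dir / "manifest.json"
    manifest = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    urls = manifest.setdefault(name, [])
    if url not in urls:
        urls.append(url)
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return name
```

With this layout, the same image served from two different CDN hosts ends up as a single file, and the manifest still shows both places it came from.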
I'm closing this issue because it's out of scope. However, I recommend you take a look at this issue: gildas-lormeau/SingleFileZ#119.
There are certain pages I like to make backups of whenever my script detects they've been changed. However, over the past couple of years, the amount of disk space my backup folder is taking up is a lot larger than I originally anticipated.
While it's antithetical to the primary intent of the project, I'm wondering if you'd be willing to implement an option to instead 'localize' resources into a 'SingleFile' subfolder, with an MD5 hash of each resource appended to its filename (using SparkMD5 for fast generation). For instance, a resource like
https://ajax.googleapis.com/ajax/libs/jquery/1.12.4/jquery.min.js
would be saved to: SingleFile/ajax.googleapis.com/ajax/libs/jquery/jquery.min-4F252523D4AF0B478C810C2547A63E19.js
This would allow multiple pages to all utilize the same resources without causing any file collisions due to changes in the resources.
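A rough sketch of the naming scheme, in Python rather than with SparkMD5, just to show the idea. The function name `localized_path` and the `root` parameter are made up for illustration, and this version keeps every segment of the URL path, which is an assumption on my part about how the mapping would work.

```python
import hashlib
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def localized_path(url: str, content: bytes, root: str = "SingleFile") -> str:
    """Map a resource URL and its bytes to '<root>/<host>/<dirs>/<stem>-<MD5><ext>'."""
    parts = urlsplit(url)
    path = PurePosixPath(parts.path.lstrip("/"))

    # The hash is taken over the resource bytes, so a changed resource
    # gets a new file name instead of overwriting the old copy.
    digest = hashlib.md5(content).hexdigest().upper()

    # Insert the hash between the file stem and its final extension.
    hashed_name = f"{path.stem}-{digest}{path.suffix}"
    return str(PurePosixPath(root, parts.netloc, path.parent, hashed_name))
```

Two saved pages that reference the same unchanged script would resolve to the same path, while an updated script at the same URL would land next to the old one under a different hash.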