Seeking remote file with HTTP Range header #479
So I just went ahead and tried it. The script below extracts libwidevinecdm.so directly in memory using HTTP Range requests. It allows trading off the metrics "number of HTTP requests to Google" and "memory consumption". With minor modifications it would also allow caching on disk instead of in memory.

Running the code with different cache sizes gives the following results (all with a 0MB free disk space requirement, except of course for the final libwidevinecdm.so; peak memory usage is measured by fil-profile, https://pythonspeed.com/fil/docs/fil/trying.html):

Cache size 3MB: 365 HTTP Range requests to Google in 26 seconds, 86MB peak memory usage

I have very fast Internet, though. I'm not entirely sure that making "more HTTP requests" to Google is really an issue, because the TCP response size is also large when requesting one big file; the overhead of the HTTP requests is negligible compared to the download size.

I would say this is at least worth a try for users who don't have 1GB of disk space left, but you could also consider it for all users. I think this would really be worth it, because we are searching for an 8MB file in a 1GB remote zip.

I wonder if we could change the ext2 parsing code to further optimize the reads on the file (and therefore the HTTP requests). It currently looks like the file is read twice, so the optimal approach would be to additionally cache the chunks on disk (if there is space; if there is no space, just proceed with downloading chunks via Range).

Note that the main point of the script is the HTTPFile class; the rest is more or less glue code I borrowed from your project to demonstrate how it works (I changed a couple of things so it runs standalone, because I don't have a proper InputStreamHelper dev environment).
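The script itself was attached to the comment and is not reproduced here. As a rough sketch of the idea behind such an HTTPFile class (a reconstruction, not the author's code; the default chunk size and the urllib-based fetching are illustrative assumptions), it could look like this:

```python
import urllib.request

class HTTPFile:
    """File-like object that reads a remote file via HTTP Range requests,
    keeping a single chunk cached in memory (a sketch, not the PoC code)."""

    def __init__(self, url, chunk_size=3 * 1024 * 1024):  # e.g. a 3MB cache
        self.url = url
        self.chunk_size = chunk_size
        self.pos = 0
        self.cache = b''
        self.cache_start = None  # remote offset of the cached chunk
        # Assumes the server reports Content-Length on a HEAD request
        head = urllib.request.Request(url, method='HEAD')
        with urllib.request.urlopen(head) as resp:
            self.size = int(resp.headers['Content-Length'])

    def _fetch(self, start):
        """Download one chunk starting at remote offset `start` into the cache."""
        end = min(start + self.chunk_size, self.size) - 1
        req = urllib.request.Request(
            self.url, headers={'Range': 'bytes=%d-%d' % (start, end)})
        with urllib.request.urlopen(req) as resp:  # expects 206 Partial Content
            self.cache = resp.read()
        self.cache_start = start

    def read(self, size=-1):
        if size < 0:
            size = self.size - self.pos
        result = b''
        while size > 0 and self.pos < self.size:
            # Refill the cache whenever the current position falls outside it
            if (self.cache_start is None or not
                    self.cache_start <= self.pos < self.cache_start + len(self.cache)):
                self._fetch(self.pos)
            offset = self.pos - self.cache_start
            piece = self.cache[offset:offset + size]
            result += piece
            self.pos += len(piece)
            size -= len(piece)
        return result

    def seek(self, offset, whence=0):
        if whence == 0:    # absolute
            self.pos = offset
        elif whence == 1:  # relative to current position
            self.pos += offset
        else:              # whence == 2: relative to end of file
            self.pos = self.size + offset
        return self.pos

    def tell(self):
        return self.pos

    def seekable(self):
        return True
```

ZipFile(HTTPFile(image_url)) would then behave like ZipFile over a local file, with each cache miss translating into one Range request.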
Thanks for coming up with this interesting proof of concept! However, I see some problems with making this the main approach in our add-on for getting the Widevine CDM on ARM devices:
Feel free to come up with a PR implementing this as an option for "expert users". After some more testing, I guess this can be merged.
Thanks for considering. It is currently slower than downloading a single 1GB image only because I haven't implemented caching on disk or multi-chunk caching (it currently caches just one chunk). Therefore it downloads large parts of the zip file twice, which is of course not optimal.

I have a different view on it: since the implementation allows deciding what happens (use memory, disk, or more connections), we can just make the default behave the same as now. How about:
That would probably make things faster compared to now for people who have enough memory (e.g. a Raspberry Pi 4 with 4 or 8GB of RAM). I guess it should be no issue to resort to the next strategy as a fallback if something goes wrong with the chosen approach. I have a couple of questions:
I'm currently still thinking about how I could visualize which chunks of the zip file are needed at all.
This is not implemented, but it should be possible using a standard Linux command and executing it with
There is no easy development environment.
No, I use a text editor and a symlink from a local git repo to a real Kodi installation.
I also enabled debug logging in advancedsettings.xml in ~/.kodi/userdata/.
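For reference, a minimal advancedsettings.xml that turns on debug logging looks like this:

```xml
<!-- ~/.kodi/userdata/advancedsettings.xml -->
<advancedsettings>
  <loglevel>1</loglevel>  <!-- 1 = debug logging -->
</advancedsettings>
```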
To speed up testing on a real Kodi installation, you can automatically execute add-on functions on startup:
Automatically install Widevine with autoexec
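A sketch of what that could look like, assuming your Kodi version still executes userdata/autoexec.py at startup and that the add-on exposes a widevine_install script entry point (the route name is a guess; verify it against the add-on's api module):

```python
# ~/.kodi/userdata/autoexec.py
import xbmc

# Hypothetical entry point name; check the add-on's api for the exact route.
xbmc.executebuiltin('RunScript(script.module.inputstreamhelper, widevine_install)')
```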
Hi there,
Awesome project! I was nerd-sniped when my disk didn't have enough space to download the ChromeOS image.
I was wondering if we could work around the need to download the ChromeOS image to disk or to memory, and instead only download the parts we need on each "read".
Technical feasibility
As I see it, the main mechanism used in inputstreamhelper is feeding a file-like object into ZipFile (see script.module.inputstreamhelper/lib/inputstreamhelper/widevine/arm_chromeos.py, line 322 at commit b21b228).
ZipFile will only read some metadata, such as the end of central directory, using seek/tell/read:
https://github.com/python/cpython/blob/ffa505b580464d9d90c29e69bd4db8c52275280a/Lib/zipfile.py#L1343
You then call the open() function on the ZipFile object, which returns an object of type ZipExtFile. Again, ZipExtFile only reads some metadata from the file at this point.
On the ZipExtFile object your code calls seek/read/close etc.; ZipExtFile does the zip-specific work, but in turn it only calls seek/tell/read on the originally given file-like object.
Summarised: Any file-like object should work with the current ZipFile approach.
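A tiny demonstration of that summary (plain Python, no network involved): ZipFile is happy with any object that provides seek/tell/read, e.g. an in-memory buffer:

```python
import io
import zipfile

# Write a zip archive into an in-memory buffer instead of a file on disk
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('hello.txt', 'hello world')

# Reopen the same buffer for reading; ZipFile never needs a real file
with zipfile.ZipFile(buf) as zf:
    with zf.open('hello.txt') as member:  # returns a ZipExtFile
        print(member.read())              # b'hello world'
```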
Proposal
Create a file-like HttpFile class that implements seek/tell/read etc. and uses the HTTP Range feature to fetch only the needed parts of the zip file from the Google servers. I guess the class will need to be clever about the chunks it caches (e.g. always keep a 100MB chunk in memory), so that not every read() call results in an HTTP request to the Google servers. Instead of downloading the ChromeOS image to disk, pass the HttpFile object into ZipFile.
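Hypothetically, the wiring would then be a small change (HttpFile as sketched earlier in this thread; image_url and image_filename are placeholders, not names from the add-on):

```python
from zipfile import ZipFile

# Instead of downloading the recovery image to disk first and opening that,
# pass a Range-backed file object straight into ZipFile:
with ZipFile(HttpFile(image_url)) as zf:
    with zf.open(image_filename) as image:  # a ZipExtFile; reads become Range requests
        ...
```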
I just checked, and the Google servers that the ChromeOS images are downloaded from do support HTTP Range.
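One quick way to check this is to request a single byte and look for a 206 Partial Content response (image_url is a placeholder):

```python
import urllib.request

# Ask for the first byte only; a Range-capable server answers with 206
req = urllib.request.Request(image_url, headers={'Range': 'bytes=0-0'})
with urllib.request.urlopen(req) as resp:
    print(resp.status)                        # 206 means Range is supported
    print(resp.headers.get('Content-Range'))  # e.g. 'bytes 0-0/1234567890'
```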
Obviously this would need some testing (e.g. with a proxy, to see how many HTTP requests go out and what a good cache chunk size is).
Pros/Cons
Pro:
Con:
Alternatively, it would also be possible to use this approach only when less than the necessary disk space is available.
Has something like this been attempted before? What do you think?