yaydl can't download from YouTube playlists yet. #6
Comments
Hmm. Yes, indeed. There are several TODOs for this:
I hope I'll find the time to work on this soon. Contributions are welcome.
I had the idea of making a Rust downloader like this that is fully interoperable with the extractors from youtube-dl. The main draw of youtube-dl is the community that is constantly adding extractors for new sites. The rest of the code, especially the CPU-bound parts, can and should be rewritten in Rust. But in the meantime, before we have extractors for every site, we could take advantage of the existing solutions. Support for that is something I would be interested in looking into if it fits within the goals of the project.
That would require Python support in yaydl, wouldn’t it?
Hmm, it appears that it might. I envisioned using PyO3 to call the Python extractors through Rust bindings, and it looks like PyO3 uses an embedded interpreter to make that happen. I think the harder part would be that the extractors usually call other Python utilities that the youtube-dl library provides. I guess we'd have to make Python bindings that call our Rust code the other way? That would probably take a lot longer than just updating the regex in the existing Rust extractor, but if it worked it would add a lot of functionality that could later be oxidized.
I would actually like to see a “generic” extractor like youtube-dl’s in yaydl which would solve most problems if done right…?
Looking at the youtube-dl generic extractor, the main guts of it seem to start at the _real_extract function. It checks a bunch of common places to look for the video and then checks for playlist files like m3u and xspf. Funnily enough, it also does a bunch of fallback checks for embedded videos using the existing extractors for other sites. It would be interesting to see what the upper limit is for generic extractor effectiveness.
Hmm. A generic "look for anything m3u(8) and fetch everything in it" extractor should already be doable with yaydl's built-in methods and the site scraper crate. I'll be (mostly) off the keyboard over the weekend, so I probably won't look at this ticket before next week (presumably, also the weekend). Thank you for your ideas so far!
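For illustration, here is a rough sketch of the "look for anything m3u(8)" step. It is not yaydl's code and does not use the scraper crate mentioned above; it only relies on the standard library, and the function name find_playlist_urls is made up for this example. It scans a page body for absolute URLs whose path ends in .m3u or .m3u8; fetching and parsing the playlists themselves is left out.

```rust
/// Hypothetical helper (not part of yaydl): collect absolute URLs in a page
/// body whose path ends in .m3u or .m3u8.
fn find_playlist_urls(html: &str) -> Vec<String> {
    let mut found: Vec<String> = Vec::new();

    // Split on characters that typically delimit a URL inside HTML attributes
    // or inline scripts. This is a heuristic, not an HTML parser.
    let is_delimiter =
        |c: char| c.is_whitespace() || matches!(c, '"' | '\'' | '<' | '>' | '(' | ')');

    for token in html.split(is_delimiter) {
        if !(token.starts_with("http://") || token.starts_with("https://")) {
            continue;
        }
        // Check the extension on the path only, so a query string such as
        // ?token=... does not hide it.
        let path = token
            .split(|c: char| c == '?' || c == '#')
            .next()
            .unwrap_or(token);
        let is_playlist = path.ends_with(".m3u8") || path.ends_with(".m3u");

        // Keep the first occurrence of each URL, in page order.
        if is_playlist && !found.iter().any(|u| u.as_str() == token) {
            found.push(token.to_string());
        }
    }
    found
}
```

A real generic extractor would presumably also need to resolve relative URLs and look inside escaped JSON, which this sketch ignores.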
PSA: Playlists are still left as an exercise to ... uh ... me, I guess.
A similar error occurs if we use a Shorts link (e.g. https://youtube.com/shorts/<video_id>). However, if we convert it into a normal video link (https://youtube.com/watch/?v=<video_id>), then it works. So maybe you can handle that in the regex as well, because otherwise it panics due to an unwrap() on the capture groups.
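To make the panic concern concrete, here is a minimal sketch of ID extraction that returns an Option instead of unwrapping. It is an assumption about how this could be structured, not the code in yaydl: the function name extract_video_id is invented, and it uses plain string searching from the standard library rather than the regex crate so that it stands alone.

```rust
/// Hypothetical helper (not yaydl's actual code): pull the video ID out of a
/// YouTube URL, returning None instead of panicking on unknown link shapes.
fn extract_video_id(url: &str) -> Option<&str> {
    // Markers that directly precede the ID in the link styles discussed here:
    // watch links (?v= or &v=), short links (youtu.be/), and Shorts (shorts/).
    const MARKERS: [&str; 4] = ["?v=", "&v=", "youtu.be/", "shorts/"];

    for marker in MARKERS {
        if let Some(pos) = url.find(marker) {
            let rest = &url[pos + marker.len()..];
            // The ID runs until the next query, fragment, or path delimiter,
            // or to the end of the string if there is none.
            let end = rest
                .find(|c: char| matches!(c, '&' | '?' | '#' | '/'))
                .unwrap_or(rest.len());
            let id = &rest[..end];
            if !id.is_empty() {
                return Some(id);
            }
        }
    }
    None
}
```

The caller can then report an unsupported URL with a normal error message, for example by matching on the returned Option, rather than crashing.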
Try this: let id_regex = Regex::new(r"(?:v=|\.be/|shorts/)(.*?)(&.*)*$").unwrap(); Does it work? If so, I'll push an update...
No, it does not work. I tried with the following link: https://www.youtube.com/shorts/HVcVhfq1SVY
I pushed a 0.12.0 upstream that seems to detect the URL, at least...?
Here's a link I tried downloading:
yaydl can't parse this. It's a bit of a complicated link because it's part of a playlist and there's a timestamp attached. However, it can parse this link
I think with a little bit of regex, the extractor could parse these links better. Probably in the future, it would be good to parse it and recognize it's part of a list and give the option to download the whole playlist, but an easy solution for now is to throw everything past the watch ID out and send it to the downloader. The timestamp can probably be thrown out in almost all cases.
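As a sketch of the "throw everything past the watch ID out" idea: the code below parses the query string, keeps only v, and notes whether a list parameter was present so a later version could offer the whole playlist. The names WatchLink, parse_watch_link, and canonical_watch_url are hypothetical, the example URL uses placeholder IDs, and standard-library string handling stands in for the regex, so this is only one possible shape for the fix.

```rust
/// Hypothetical result type: the ID to download, plus whether the link
/// was part of a playlist.
#[derive(Debug, PartialEq)]
struct WatchLink {
    video_id: String,
    in_playlist: bool,
}

/// Hypothetical parser for watch links that carry extra parameters.
fn parse_watch_link(url: &str) -> Option<WatchLink> {
    // Drop any #fragment, then take the query string after the first '?'.
    let without_fragment = url.split('#').next()?;
    let (_, query) = without_fragment.split_once('?')?;

    let mut video_id = None;
    let mut in_playlist = false;

    for pair in query.split('&') {
        match pair.split_once('=') {
            Some(("v", value)) if !value.is_empty() => video_id = Some(value.to_string()),
            Some(("list", value)) if !value.is_empty() => in_playlist = true,
            _ => {} // t=, index=, and anything else is discarded
        }
    }

    video_id.map(|video_id| WatchLink { video_id, in_playlist })
}

/// Rebuild a plain watch URL containing only the video ID.
fn canonical_watch_url(link: &WatchLink) -> String {
    format!("https://www.youtube.com/watch?v={}", link.video_id)
}
```

With a placeholder link such as https://www.youtube.com/watch?v=VIDEO_ID&list=PLAYLIST_ID&index=3&t=42s, parse_watch_link yields VIDEO_ID with in_playlist set, and canonical_watch_url gives back the bare watch URL to hand to the downloader.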