yaydl can't download from YouTube playlists yet. #6

wkrettek · 2022-06-26T18:26:15Z

Here's a link I tried downloading:

https://www.youtube.com/watch?v=F8sZRBdmqc0&list=WL&index=10&t=1040s

yaydl can't parse this. It's a somewhat complicated link because it's part of a playlist and has a timestamp attached. However, it can parse this link:

https://www.youtube.com/watch?v=F8sZRBdmqc0

I think with a little bit of regex, the extractor could parse these links better. Probably in the future, it would be good to parse it and recognize it's part of a list and give the option to download the whole playlist, but an easy solution for now is to throw everything past the watch ID out and send it to the downloader. The timestamp can probably be thrown out in almost all cases.
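The "throw everything past the watch ID out" idea could be sketched roughly like this. This is only an illustration using the standard library, not yaydl's actual extractor code (which uses a regex); the function names `watch_id` and `plain_watch_url` are made up for this example.

```rust
/// Hypothetical helper: returns the value of the `v` query parameter and
/// ignores everything else (playlist, index, timestamp).
fn watch_id(url: &str) -> Option<&str> {
    // Drop a trailing fragment, then keep only what follows the first '?'.
    let without_fragment = url.split('#').next()?;
    let (_, query) = without_fragment.split_once('?')?;
    query
        .split('&')
        .filter_map(|pair| pair.split_once('='))
        .find(|(key, value)| *key == "v" && !value.is_empty())
        .map(|(_, value)| value)
}

/// Hypothetical helper: rebuilds the short form of the link that is
/// already known to parse, or returns None if no watch ID was found.
fn plain_watch_url(url: &str) -> Option<String> {
    watch_id(url).map(|id| format!("https://www.youtube.com/watch?v={}", id))
}
```

Returning an `Option` instead of unwrapping means a link without a `v` parameter yields `None` rather than a panic, and the caller can decide how to report it.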

dertuxmalwieder · 2022-06-26T18:38:38Z

Hmm. Yes, indeed. There are several TODOs for this:

Add a flag to yaydl to switch between playlists and non-playlists (e.g. -p).
If -p is not supplied, the &list part will be skipped.
Otherwise, youtube.rs needs playlist support.
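A minimal sketch of the second TODO, assuming the flag ends up as a plain boolean: the name `apply_playlist_flag` and the `want_playlist` parameter are hypothetical and do not exist in yaydl. Only the `list` parameter is dropped here; whether `index` and `t` should go as well is left open.

```rust
/// Hypothetical helper: `want_playlist` stands in for the proposed `-p` flag.
/// Without the flag, the `list=...` query parameter is removed from the URL.
fn apply_playlist_flag(url: &str, want_playlist: bool) -> String {
    if want_playlist {
        // Playlist mode: hand the URL on untouched.
        return url.to_string();
    }
    let (base, query) = match url.split_once('?') {
        Some(parts) => parts,
        None => return url.to_string(), // no query string, nothing to strip
    };
    let kept: Vec<&str> = query
        .split('&')
        .filter(|pair| !pair.starts_with("list="))
        .collect();
    if kept.is_empty() {
        base.to_string()
    } else {
        format!("{}?{}", base, kept.join("&"))
    }
}
```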

I hope I'll find the time to work on this soon. Contributions are welcome.

wkrettek · 2022-06-30T00:54:18Z

I had the idea of making a Rust downloader like this that is fully interoperable with the extractors from youtube-dl. The main draw of youtube-dl is the community that is constantly adding extractors for new sites. The rest of the code, especially the cpu-bound stuff, can and should be rewritten in Rust. But in the meantime, before we have extractors for every site, we could take advantage of the existing solutions. Support for that is something I would be interested in looking into if it fits within the goals of the project.

dertuxmalwieder · 2022-06-30T07:50:25Z

That would require Python support in yaydl, wouldn’t it?

wkrettek · 2022-06-30T14:40:11Z

Hmm, it appears that it might. I envisioned using PyO3 to call the Python extractors through Rust bindings, and it looks like it uses an embedded interpreter to make that happen. I think the harder part would be that the extractors usually call other Python utilities that the youtube-dl library provides. I guess we'd have to make Python bindings that call our Rust code the other way? It would probably take a lot longer than just updating the regex in the existing Rust extractor, but if it worked, it would add a lot of functionality that could later be oxidized.

dertuxmalwieder · 2022-06-30T15:33:30Z

I would actually like to see a “generic” extractor like youtube-dl’s in yaydl which would solve most problems if done right…?

wkrettek · 2022-06-30T22:22:15Z

Looking at the youtube-dl generic extractor it looks like the main guts of it start at the _real_extract function. Looks like it checks a bunch of common things to look for the video and then checks for playlist files like m3u and xspf. Funnily enough, it will do a bunch of fallback checks for embedded videos using the existing extractors for other sites. Would be interesting to see what the upper limit is for generic extractor effectiveness.

dertuxmalwieder · 2022-06-30T22:26:08Z

Hmm. A generic "look for anything m3u(8) and fetch everything in it" extractor should already be doable with yaydl's built-in methods and the site scraper crate.
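The "fetch everything in it" half of that idea could start from something like the following. It is a sketch under assumptions: it only handles the playlist text once it has already been downloaded, it ignores all `#EXT...` metadata, and `m3u_entries` is a made-up name, not part of yaydl.

```rust
/// Hypothetical helper: collects the media entries from M3U/M3U8 playlist
/// text. Lines starting with '#' are tags or comments; blank lines are skipped.
fn m3u_entries(playlist: &str) -> Vec<&str> {
    playlist
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .collect()
}
```

Entries may be relative paths, so a real extractor would still have to resolve each one against the URL the playlist was fetched from before downloading it.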

I'll be (mostly) off the keyboard over the weekend, so I probably won't look at this ticket before next week (presumably, also the weekend). Thank you for your ideas so far!

dertuxmalwieder · 2022-06-30T22:41:35Z

PSA:
I pushed yaydl 0.10.1 to crates.io; this repository will be updated in a minute or two. The release only addresses the "broken" (= incomplete) regex in your original bug report.

Playlists are still left as an exercise to ... uh ... me, I guess.
(Actually, to anyone.)

akshettrj · 2023-02-27T07:39:25Z

A similar error occurs if we use a shorts link (e.g. https://youtube.com/shorts/<video_id>).

However, if we convert it into a normal video link (https://youtube.com/watch/?v=<video_id>), then it works.

So maybe you can handle that in the regex as well, because otherwise it panics due to an unwrap() on the capture groups.
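To illustrate the non-panicking variant: a rough standard-library sketch that recognizes both the `v=` and the `shorts/` forms and returns `None` when neither is present. The name `video_id` is hypothetical, and the plain substring search is a simplification of what a proper regex or URL parser would do (it could be fooled by another parameter that happens to end in `v=`).

```rust
/// Hypothetical helper: extracts the video ID from a watch or shorts URL.
/// Returns None instead of panicking when the URL has neither form.
fn video_id(url: &str) -> Option<&str> {
    const MARKERS: [&str; 2] = ["v=", "shorts/"];
    for marker in MARKERS.iter() {
        if let Some(pos) = url.find(*marker) {
            let rest = &url[pos + marker.len()..];
            // The ID ends at the next delimiter or at the end of the URL.
            let end = rest
                .find(|c: char| matches!(c, '&' | '?' | '#' | '/'))
                .unwrap_or(rest.len());
            let id = &rest[..end];
            if !id.is_empty() {
                return Some(id);
            }
        }
    }
    None
}
```

The caller can then turn `None` into a readable "unsupported URL" error instead of crashing.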

dertuxmalwieder · 2023-02-27T09:58:39Z

Try this:

let id_regex = Regex::new(r"(?:v=|\.be/|shorts/)(.*?)(&.*)*$").unwrap();

Does it work? If so, I'll push an update...

akshettrj · 2023-02-27T12:58:21Z

No, it does not work.

I tried with the following link: https://www.youtube.com/shorts/HVcVhfq1SVY

dertuxmalwieder · 2023-02-27T14:40:36Z

I pushed a 0.12.0 upstream that seems to detect the URL, at least...?

dertuxmalwieder changed the title ~~yaydl can't parse youtube links with extended URLs~~ yaydl can't download from YouTube playlists yet. Jun 30, 2022
