I wrote a downloader using yt-dlp, but many of the IPs get blocked after roughly 10K downloads. I'm surprised people are successfully downloading the dataset with the provided download script on a single machine, as I would strongly expect YouTube to block an IP after a few gigabytes of data have been downloaded.
Are there any proxies, tools, or tricks for downloading the entire dataset without getting blocked by YouTube?
Hi @vedantroy,
Thanks for your interest in this dataset!
Unfortunately, this is quite a common issue. You can check some discussions like this one.
The best solution is to use a VPN and switch to a different IP as soon as you detect that your current IP has been banned.
If you don't have a VPN, you can try slowing down the download by reducing processes_count and thread_count in the config file, and by adding a sleep counter that pauses after every few download steps.
Hope this information is helpful!
@tsaishien-chen I have been troubled by this IP block issue for quite some time. Is there a template available for implementing a 'sleep counter' after a few download steps?
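For anyone looking for a starting point, here is a minimal sketch of what a "sleep counter" could look like. It is not taken from the repository's download script. The function name, the `download_one` callback, and the default values for `batch_size`, `pause_seconds`, and `jitter_seconds` are all assumptions made for illustration, so tune them to your own setup.

```python
import random
import time
from typing import Callable, Iterable, List


def download_with_sleep_counter(
    video_ids: Iterable[str],
    download_one: Callable[[str], bool],  # hypothetical callback: True on success
    batch_size: int = 50,                 # assumed value: downloads between pauses
    pause_seconds: float = 60.0,          # assumed value: base pause length
    jitter_seconds: float = 10.0,         # assumed value: random extra pause
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Download ids sequentially, pausing after every `batch_size` attempts.

    Returns the ids whose download failed so they can be retried later.
    """
    failed: List[str] = []
    for count, video_id in enumerate(video_ids, start=1):
        try:
            ok = download_one(video_id)
        except Exception:
            # Treat any error as a failed download and keep going.
            ok = False
        if not ok:
            failed.append(video_id)
        # The "sleep counter": pause once every `batch_size` attempts,
        # with a little random jitter so the pattern is less regular.
        if count % batch_size == 0:
            sleep(pause_seconds + random.uniform(0, jitter_seconds))
    return failed
```

To use it, wrap whatever you already call to fetch a single video (for example a yt-dlp invocation) in a function that returns True or False and pass it as `download_one`. The returned list of failed ids can be retried later, ideally from a different IP. Pausing only lowers the request rate, so it may delay a block rather than prevent one.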