Does "DownloadDuplicatedMedia" work? #658

Open
lukaeber opened this issue Dec 23, 2024 · 0 comments

First, I just want to say thank you to @sim0n00ps and everyone else who contributes here and is willing to help. This tool is amazing and a huge timesaver.

I'm trying to scrape an account with a lot of media, and I'm running into a problem I've had before with other large accounts and never found a great solution to. The account has 929 videos, but when I tried to scrape the whole account, about 200 videos failed to download. I figured most of those 200 skipped files were duplicates, but I've had some issues in the past with the downloader skipping non-duplicate files, so I deleted all the media and metadata from the first scrape and ran it again with "DownloadDuplicatedMedia" set to "true". I got the exact same result: about 200 video files were skipped.

I then thought maybe there were just too many files to handle, so I picked a date around the middle of the period the model has been posting and ran two scrapes (one before that date and one after). That did pick up a few of the video files that were skipped when I scraped the whole account, but close to 200 video files were still missing.

Then I went through all the videos posted since the beginning of this year and found 46 that had not been downloaded. So I ran the scrape again, set to download only media posted after 1/1/2024, and it picked up all but 14 of those videos. I then did another re-scrape set to content after 3/1/2024 and got the missing videos from 2024 down to 8.
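
A minimal sketch of how the count of missing videos could be checked against the download folder, rather than by eye. It assumes a hand-collected list of expected media IDs and that each downloaded file name ends with its media ID, as a file name format whose last token is `{mediaid}` would produce. The function name `find_missing_media` and its parameters are hypothetical and are not part of the tool. Because the match is a plain suffix comparison, an ID that happens to be the tail of a longer ID would be counted as present.

```python
from pathlib import Path


def find_missing_media(expected_ids, download_dir, extensions=(".mp4",)):
    """Return the expected media IDs that have no matching file on disk.

    Assumption: a file matches when its name (without the extension)
    ends with the media ID. This is a guess based on a file name format
    whose last token is {mediaid}; it is not the tool's own logic.
    """
    # Collect the stem of every file with a wanted extension, recursively.
    stems = [
        p.stem
        for p in Path(download_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    ]
    # Keep the IDs for which no stem ends with that ID.
    return [
        media_id
        for media_id in expected_ids
        if not any(stem.endswith(str(media_id)) for stem in stems)
    ]
```

Running this after each date-limited re-scrape would show which specific IDs are still missing, which is more useful in a bug report than a count alone.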

It is clear that some of these skipped files are, indeed, duplicated media (I haven't verified that they all are, but they could be), even though I have "DownloadDuplicatedMedia" set to "true." Is this feature broken, or am I misunderstanding something about how it is supposed to work?

Here is the config from the last scrape attempt, if it is helpful:

"DownloadAvatarHeaderPhoto": false,
"DownloadPaidPosts": true,
"DownloadPosts": true,
"DownloadArchived": true,
"DownloadStreams": true,
"DownloadStories": true,
"DownloadHighlights": true,
"DownloadMessages": true,
"DownloadPaidMessages": true,
"DownloadImages": true,
"DownloadVideos": true,
"DownloadAudios": true,
"IncludeExpiredSubscriptions": false,
"IncludeRestrictedSubscriptions": false,
"SkipAds": false,
"DownloadPath": "D:/DRM",
"PaidPostFileNameFormat": "{postedAt}{username}{text}{mediaid}",
"PostFileNameFormat": "{postedAt}{username}{text}{mediaid}",
"PaidMessageFileNameFormat": "{createdAt}{username}{text}{mediaid}",
"MessageFileNameFormat": "{createdAt}{username}{text}{mediaid}",
"RenameExistingFilesWhenCustomFormatIsSelected": true,
"Timeout": -1,
"FolderPerPaidPost": false,
"FolderPerPost": false,
"FolderPerPaidMessage": false,
"FolderPerMessage": false,
"LimitDownloadRate": false,
"DownloadLimitInMbPerSec": 10,
"DownloadOnlySpecificDates": true,
"DownloadDateSelection": "after",
"CustomDate": "2024-03-01",
"ShowScrapeSize": true,
"DownloadPostsIncrementally": true,
"NonInteractiveMode": false,
"NonInteractiveModeListName": "",
"NonInteractiveModePurchasedTab": false,
"FFmpegPath": "D:/Scrapers/! OF/OFDLV1.7.83/ffmpeg.exe",
"BypassContentForCreatorsWhoNoLongerExist": false,
"CreatorConfigs": {},
"DownloadDuplicatedMedia": true,
"IgnoredUsersListName": "",
"LoggingLevel": "Debug",
"IgnoreOwnMessages": false

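A short sketch for pulling out just the settings from the config above that seem relevant to this question, so they can be compared between scrape attempts. Which keys actually influence skipping is an assumption here, not something confirmed by the tool's documentation, and the helper name `relevant_settings` is hypothetical. It expects the full config as JSON text, including the enclosing braces, which the excerpt above omits.

```python
import json

# Keys from the config above that might affect which media is skipped.
# This selection is an assumption, not taken from the tool's documentation.
RELEVANT_KEYS = (
    "DownloadDuplicatedMedia",
    "DownloadOnlySpecificDates",
    "DownloadDateSelection",
    "CustomDate",
    "DownloadPostsIncrementally",
)


def relevant_settings(config_text):
    """Parse config JSON text and return only the keys listed above.

    Keys missing from the config are reported as None.
    """
    config = json.loads(config_text)
    return {key: config.get(key) for key in RELEVANT_KEYS}
```

Because `json.loads` rejects a raw line break inside a string value, this also serves as a quick check that the file name format entries each sit on a single line in the real config file.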