
[FEATURE] save Unsupported_URLs as csv instead of txt and preserve old values #864

baccccccc opened this issue Mar 30, 2024 · 1 comment

baccccccc commented Mar 30, 2024

Is your feature request related to a problem? Please describe.
After I finish downloading a particular forum thread, I usually take some time to go through Unsupported_URLs.txt manually. For each link, I decide what to do. Often the solution is to open it in a browser and download it manually. Sometimes the link is simply not valuable or helpful.

This is, of course, time-consuming, and I would very much prefer not to have to do it again and again for links I have already investigated.

But at some point, I will re-attempt downloading the same forum thread, and there will be new entries in Unsupported_URLs.txt. I will try to investigate those as well, but it's hard to remember where I stopped last time and what I did with each link. So I end up either doubting myself or going through the same list from the top, re-checking the links I already investigated.

Describe the solution you'd like

  1. Save Unsupported_URLs as a CSV instead of a TXT file.
  2. Each time I do something with a particular link, I would add a free-form comment to a separate column in this file. (It can be anything, but think of examples such as “downloaded manually on yyyy-mm-dd” or “dead link”, etc.)
  3. Each time the script runs, it appends to the file instead of overwriting it (see the sketch after this list).
    3a. Each time the script encounters an unsupported URL, it should check whether there's already an entry for the same URL in the CSV.
    3b. If there's an existing entry for the same URL, keep it intact and do not add a new one. (More specifically, keep both the URL and whatever other columns might follow it.)
    3c. If no existing entry is found, append the URL at the end and leave the comment blank.
  4. Either way, the script should only care about the first column in the file (the URL) and ignore any other columns, since they may contain user comments.
  5. Under no circumstances should the script remove entries from this file. (Unless, I guess, a previously unsupported URL becomes supported and is downloaded successfully. Then it's fine to remove it.)
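
For illustration, here's a minimal sketch of what the merge logic in points 3–5 could look like in Python. The function name and file handling are just my assumptions for the example, not the project's actual code:

```python
import csv
from pathlib import Path

def update_unsupported_urls(csv_path: Path, new_urls: list[str]) -> None:
    """Merge newly encountered unsupported URLs into the CSV,
    preserving existing rows and any user comment columns."""
    rows: list[list[str]] = []
    seen: set[str] = set()

    if csv_path.exists():
        with csv_path.open(newline="", encoding="utf-8") as f:
            for row in csv.reader(f):
                if row:                  # skip blank lines
                    rows.append(row)     # keep the row exactly as the user left it
                    seen.add(row[0])     # only the first column (the URL) matters

    for url in new_urls:
        if url not in seen:              # 3b: existing entries stay intact
            rows.append([url, ""])       # 3c: new entry with a blank comment
            seen.add(url)

    with csv_path.open("w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)    # old rows first, new ones appended
```

Rewriting the whole file this way keeps the row order stable: existing rows (with their comments) come first, and new URLs are appended at the end with a blank comment column.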
baccccccc added the enhancement (New feature or request) label Mar 30, 2024
ClonedBoy commented
You can just import the TXT as a CSV and do it separately. What I usually do is create a separate file and then remove dups (with LibreOffice, for example). +1 for the appending option, but the other points defeat the purpose of a downloader vs. a dead-links tracker, imho. But if it's easy for the dev to code, then why not.
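
For example, a one-off conversion of the current TXT into a de-duplicated CSV (file names assumed from the issue) could look like this:

```python
import csv

# Read the current TXT output and de-duplicate while preserving order.
with open("Unsupported_URLs.txt", encoding="utf-8") as src:
    urls = dict.fromkeys(line.strip() for line in src if line.strip())

# Write a CSV with a blank second column left free for manual notes.
with open("Unsupported_URLs.csv", "w", newline="", encoding="utf-8") as dst:
    csv.writer(dst).writerows([url, ""] for url in urls)
```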
