fix x3vid filenames (remove colons) because Windows is fragile #72

Open
youngdarwin opened this issue Apr 14, 2020 · 3 comments

@youngdarwin

GG gets images from x3vid.com "full image" pages, but the filenames I see on Windows are goofy, I think because they contain colons. [I don't know whether this affects all filenames on x3vid. FWIW, the URL is: https://x3vid.com/gallery_pics/3424569/Public_nudity_35?page=1]

The HTML snippet is below. GG saves the image under the name used by the website as-is, but Windows displays this image as 'HQEQHE~F.JPG'. OTOH, Chrome silently removes the colon, and then the filename is sensible. Any clues on how I could hack GG to silently remove the colons too?

<a href="/i42727683/Public_nudity_35?page=1&amp;source=gallery">
  <figure>
  <img id="42727683" data-p="1" class="img-box thumb" alt="Public nudity 35 (1/10)" src="/images/14242/https:__ep5.xhcdn.com_000_146_605_573_1000.jpg" />
  </figure>
</a>
@youngdarwin
Author

This one-line change (below, find the line marked "added THIS LINE") seems to be working. My first attempt to modify the write_to_file() method caused copy_image() to crash. I'm still not sure why.

    def copy_image(self, info):
        info.attempts += 1

        file_name = info.destination_filename()
        file_name = re.sub(r"[:]", "", file_name) # added THIS LINE
        try:
            file_info = urlopen_safe(info.path)
        except:
            return False

        try:
            modtimestr = file_info.headers['last-modified']
            modtime = time.strptime(modtimestr, '%a, %d %b %Y %H:%M:%S %Z')
        except:
            modtime = None

        if self.can_skip(file_name, file_info):
            print("Skipping existing file: " + info.path)
            return True

        if info.attempts == 1:
            print("%s -> %s" % (info.path, file_name))

        if not info.write_to_file(file_info, file_name):
            return False

        if modtime is not None:
            lastmod = calendar.timegm(modtime)
            os.utime(file_name, (lastmod, lastmod))
        return os.path.getsize(file_name) > 4096
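
One caveat with the one-line change above: if destination_filename() returns a full path rather than a bare name (an assumption, the thread does not say), removing every colon would also damage a Windows drive prefix such as `C:`. A minimal sketch that touches only the last path component follows; `strip_colons_from_basename` is a hypothetical helper, not part of GG.

```python
import os


def strip_colons_from_basename(path):
    """Remove ':' from the final path component only.

    Hypothetical helper, not part of GG. The directory part (and any
    drive prefix it contains) is left untouched.
    """
    directory, name = os.path.split(path)
    return os.path.join(directory, name.replace(":", ""))
```

In copy_image() this could stand in for the re.sub() line, e.g. `file_name = strip_colons_from_basename(info.destination_filename())`.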

@regosen
Owner

regosen commented Apr 16, 2020

If x3vid.com doesn't use filenames worth preserving, I would recommend this instead:

  1. Go to the gallery_plugins directory, make a copy of plugins_generic.py, and name it plugins_x3vid.py (keep it in the same folder).
  2. Change the last line in plugins_x3vid.py so that it reads: same_filename = False

Now when you re-run, it should say "Using x3vid plugin", and you should get filenames that look like 001.jpg, 002.jpg, etc.

If you're happy with the result, feel free to open a pull request with your addition of the plugin!

@youngdarwin
Author

youngdarwin commented Apr 16, 2020

This is a good suggestion, but I have a ticket open about how limited that functionality is: on some sites it recycles the numbers across pages. (I posted a patch that fixes this, but it's probably kludgy; for example, if there are two threads, the numbering starts from 3 (0003, 0004, ...) and I don't know why.)

I was looking for a way to just remove the characters that Windows considers illegal in filenames. I wish it were easier to extend the plugins, so that I could set remove_colons = true in a plug-in and then add code to GG to handle that special feature of a website.
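
For reference, removing everything Windows rejects in a filename (not just colons) can be sketched as below. The character set is the documented Windows one (`< > : " / \ | ? *` plus control characters 0-31); the function name `sanitize_windows_name` and its `replacement` parameter are hypothetical and not part of GG. It expects a bare filename, not a full path, since path separators are stripped too.

```python
import re

# Characters Windows does not allow in a file name:
# < > : " / \ | ? * and the control characters 0-31.
_WINDOWS_ILLEGAL = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_windows_name(name, replacement=""):
    """Return `name` with Windows-illegal characters replaced.

    Hypothetical helper, not part of GG. Pass a bare file name,
    not a path, because '/' and '\\' are replaced as well.
    """
    cleaned = _WINDOWS_ILLEGAL.sub(replacement, name)
    # Windows also strips trailing dots and spaces from names.
    cleaned = cleaned.rstrip(" .")
    # Never return an empty name.
    return cleaned or "_"
```

A per-site switch like remove_colons would then only need to decide whether this function is called on the name before the file is written; how GG's plugins would expose such a switch is not covered here.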

A significant problem is that I can code in other languages, but I'm almost completely ignorant of Python 3...
