HTML conversion clears image alt text and original image filename #10

Closed
nvillahermosa opened this issue Oct 31, 2017 · 6 comments

@nvillahermosa

Converting to HTML clears any indication of the original image filename. The renamed images also don't follow the "image0, image1, etc." convention that Google Docs uses when exporting to HTML, so there isn't a clean way to figure out which image file goes with which broken element.

For example, with a Google Docs filename starting with "Deploying":

<img src="images/Deploying4.png" width="" alt="alt_text" title="image_tooltip">

The HTML export directly from Google Docs contains the alt text:

<img alt="object_id.png" src="images/image3.png" ...

Unless I'm overlooking something, preserving the alt attribute seems necessary for fixing an exported file.
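
To illustrate why the alt attribute matters here, the following is a minimal sketch of how the alt text in a Google Docs HTML export could be used to look up which exported file belongs to which original image. The function names `readAttr` and `mapAltToSrc` and the sample markup are hypothetical, and the regex-based parsing assumes simple double-quoted attributes; a real HTML parser would be more robust.

```javascript
// Sketch only: regex parsing assumes double-quoted attributes and no ">"
// inside attribute values. Function names are hypothetical.

// Return the value of one attribute from a single tag string, or null.
function readAttr(tag, name) {
  const match = new RegExp('\\s' + name + '="([^"]*)"', 'i').exec(tag);
  return match ? match[1] : null;
}

// Build a Map from alt text (the original file name in a Docs export)
// to the exported image path.
function mapAltToSrc(exportedHtml) {
  const result = new Map();
  const tags = exportedHtml.match(/<img\b[^>]*>/gi) || [];
  for (const tag of tags) {
    const alt = readAttr(tag, 'alt');
    const src = readAttr(tag, 'src');
    if (alt && src) {
      result.set(alt, src);
    }
  }
  return result;
}

// Hypothetical sample modeled on the export markup quoted above.
const sampleExport =
    '<p><img alt="object_id.png" src="images/image3.png"></p>';
const altToSrc = mapAltToSrc(sampleExport);
```

With a map like this, a broken reference can be matched to its file by the original name, which is only possible if the converter keeps some alt text to begin with.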

@evbacher
Owner

evbacher commented Nov 1, 2017

Ahh, I hadn't considered using the HTML download to export the images. I can change it so that the image markup would be:

<img src="images/image1.png" width="" alt="image1.png" title="image1.png">

so that it would match up with the files in the zipped image directory.
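
As a rough sketch of that proposal, the markup could be generated from an image counter so that alt and title both repeat the exported file name. The function `imageTag` and its parameters `index` and `extension` are hypothetical and are not taken from the add-on's source.

```javascript
// Sketch only: `imageTag`, `index`, and `extension` are hypothetical names.
// Builds the proposed markup, reusing the file name for alt and title.
function imageTag(index, extension) {
  const fileName = 'image' + index + '.' + extension;
  return '<img src="images/' + fileName + '" width="" alt="' + fileName +
      '" title="' + fileName + '">';
}

const firstImageTag = imageTag(1, 'png');
```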

Thanks for filing the bug! I'll try to get to this reasonably soon.

@evbacher evbacher self-assigned this Nov 1, 2017
@romanruns

romanruns commented Nov 1, 2017 via email

@evbacher
Owner

evbacher commented Nov 2, 2017

nvillahermosa,

Apparently, when you download a Doc as a zipped web page (along with images), Docs assigns a filename to each image and drawing as it creates the file and puts it in the images/ dir of the zip file.

Unfortunately, when we're converting a Doc to Markdown or HTML, the images and drawings do not have any file name assigned -- getAltTitle() and getAltDescription() generally return null. I'll investigate some more, but it's not going to be as easy as I had hoped :(
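
One way to cope with the null values, sketched below, is to fall back to a generated file name whenever the image has neither an alt description nor an alt title. The function `altTextFor`, the parameter `fallbackName`, and the stand-in object are hypothetical; only `getAltTitle()` and `getAltDescription()` come from the comment above, and the sketch makes no claim about how the add-on handles this.

```javascript
// Sketch only: `altTextFor` and `fallbackName` are hypothetical names.
// `image` is any object exposing getAltTitle() and getAltDescription().
function altTextFor(image, fallbackName) {
  const description = image.getAltDescription();
  const title = image.getAltTitle();
  // Null or empty values fall through to the generated name.
  return description || title || fallbackName;
}

// Stand-in object for trying the function outside Apps Script.
const imageWithoutAlt = {
  getAltTitle: () => null,
  getAltDescription: () => null,
};
const fallbackAlt = altTextFor(imageWithoutAlt, 'image1.png');
```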

@nvillahermosa
Author

nvillahermosa commented Nov 3, 2017

Well nuts! I know the default File > Download as > Web page operation assigns alt text based on the original image file name, but it's unfortunate that the information isn't in there under the covers already. Thanks for taking the time to look into this; the script is a huge help.

@evbacher
Owner

evbacher commented May 7, 2020

Note that issue 49 (#49) also addresses this problem with images and the image path.

@evbacher
Owner

As of today's update (1.0β25), the image path is now in the form images/image1.png, images/image2.jpg, etc. However, beware that the exported zip file from Google Docs does not always have the images in the same order, so you'll still need to check that the images are referenced in the correct place in your converted doc. There does not seem to be any rhyme or reason to the order of the images in the exported zip file that you get from Download > Web Page.
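
Part of that check can be automated. The sketch below compares the image paths referenced in the converted HTML with a list of file names from the unzipped images/ directory and reports names that appear on only one side, such as a .png reference where the zip holds a .jpg. It cannot detect images that exist under the expected names but in a different order; that still needs a visual check. The function `findImageMismatches` and both of its parameters are hypothetical, and the regex assumes double-quoted `src` attributes.

```javascript
// Sketch only: `findImageMismatches` and its parameters are hypothetical.
// Compares images/ references in converted HTML against exported file names.
function findImageMismatches(convertedHtml, exportedFiles) {
  const referenced = new Set();
  const pattern = /<img\b[^>]*\ssrc="images\/([^"]+)"/gi;
  let match;
  while ((match = pattern.exec(convertedHtml)) !== null) {
    referenced.add(match[1]);
  }
  const exported = new Set(exportedFiles);
  return {
    // Referenced in the document but absent from the zip.
    missingFromZip: [...referenced].filter((name) => !exported.has(name)),
    // Present in the zip but never referenced in the document.
    unreferenced: [...exported].filter((name) => !referenced.has(name)),
  };
}

// Hypothetical inputs for illustration.
const convertedSample =
    '<img src="images/image1.png" width="">' +
    '<img src="images/image2.jpg" width="">';
const report = findImageMismatches(convertedSample,
    ['image1.png', 'image2.png']);
```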
