Rendering issues for GD fieldnotes converted through pandoc #22

Open
njr2128 opened this issue Feb 4, 2022 · 3 comments
Labels
GD-fieldnotes Fieldnotes crafted and converted from Google Drive (GD) rather than the wikidump

Comments


njr2128 commented Feb 4, 2022

Problem

For fieldnotes created in GD (rather than from the wiki dump), the files need to be converted from gdocs to html.
NJR has recently been doing this by downloading from GD as .docx, then using pandoc to convert to html, e.g.:
pandoc --extract-media=. -o fa17_fld_mk_team_lifecasting-burnout-and-pouring.html -f docx -t html fa17_fld_mk_team_lifecasting-burnout-and-pouring.docx

This has worked well to simplify the html and strip the clunky, unnecessary markup that GD's built-in "download as html" option produces (see cu-mkp/fieldnotes-restructuring#4). HOWEVER, a new problem has appeared: some features, namely tables, do not render correctly in the converted html.

Tables render as all centered, bolded lines of text.

Example

ORIGINAL:
image

CONVERTED HTML RENDER (https://fieldnotes.makingandknowing.org/pre-2018-Fall/sp18_rosenkranz-uchacz_naomi-tianna_varnishes-in-the-rain/sp18_rosenkranz-uchacz_naomi-tianna_varnishes-in-the-rain-2/sp18_rosenkranz-uchacz_naomi-tianna_varnishes-in-the-rain-2-varnish-making-testing-application.html):
image

HTML:
image

Potential Solutions

  1. add some cookie-cutter CSS to the top of the html file to dictate the rendering of the "table" element and the "header" and "odd" row classes --> @gschare
  2. try a different process to convert the file with pandoc: .docx to .md to .html (since pandoc seems to handle docx to md better) --> @njr2128
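
A minimal sketch of what option 1 might look like, assuming the converted file is an HTML fragment and that the table rows carry the "header", "odd", and "even" classes mentioned in this thread. The function name `add_table_css`, the `fieldnotes-table-css` id, and every style value are hypothetical placeholders, not an agreed design:

```shell
#!/bin/sh
# Hypothetical helper: prepend a small <style> block to a converted html file.
# The class names (header/odd/even) come from this thread; the rules are illustrative.
add_table_css() {
  html_file="$1"
  # Skip files that already carry the block, so reruns are harmless.
  if grep -q 'id="fieldnotes-table-css"' "$html_file"; then
    return 0
  fi
  tmp_file="$(mktemp)" || return 1
  {
    cat <<'CSS'
<style id="fieldnotes-table-css">
table { border-collapse: collapse; margin: 1em 0; }
th, td { border: 1px solid #999; padding: 0.3em 0.6em; text-align: left; vertical-align: top; font-weight: normal; }
tr.header th { font-weight: bold; }
tr.even td { background: #f6f6f6; }
</style>
CSS
    cat "$html_file"
  } > "$tmp_file" || return 1
  # Write back in place (rather than mv) so the file keeps its permissions.
  cat "$tmp_file" > "$html_file"
  rm -f "$tmp_file"
}
```

Calling `add_table_css some-fieldnote.html` after the pandoc step would put the block at the top of the file; running it a second time leaves the file unchanged.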
njr2128 added the GD-fieldnotes label Feb 4, 2022
njr2128 added 3 commits that referenced this issue Feb 11, 2022

njr2128 commented Feb 11, 2022

Downloaded .docx from GD, then used pandoc to convert to md (and extract media):
pandoc --extract-media=. -o sp18_rosenkranz+uchacz_naomi+tianna_varnishes-in-the-rain-2-varnish-making-testing-application.md -f docx -t gfm sp18_rosenkranz+uchacz_naomi+tianna_varnishes-in-the-rain-2-varnish-making-testing-application.docx

Then did some cleanup of the md doc (lbs interpreted as quotations). Then used pandoc to convert to html:
pandoc -o sp18_rosenkranz+uchacz_naomi+tianna_varnishes-in-the-rain-2-varnish-making-testing-application.html -f gfm -t html sp18_rosenkranz+uchacz_naomi+tianna_varnishes-in-the-rain-2-varnish-making-testing-application.md
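
The two steps above could be wrapped in a pair of small shell functions so the output names are derived from the input file. This is only a sketch: the function names `docx_to_gfm` and `gfm_to_html` and the `PANDOC` override variable are hypothetical, and the pandoc flags are the ones used in the commands in this comment.

```shell
#!/bin/sh
# Hypothetical helpers; the pandoc flags mirror the commands in this comment.
# PANDOC may be set to another command for a dry run; it defaults to pandoc.

# Step 1: .docx -> GitHub-flavored Markdown, extracting media into the current directory.
docx_to_gfm() {
  docx="$1"
  md="${docx%.docx}.md"
  "${PANDOC:-pandoc}" --extract-media=. -o "$md" -f docx -t gfm "$docx"
}

# Step 2 (after any manual cleanup of the .md): Markdown -> html.
gfm_to_html() {
  md="$1"
  html="${md%.md}.html"
  "${PANDOC:-pandoc}" -o "$html" -f gfm -t html "$md"
}
```

Run `docx_to_gfm notes.docx`, clean up `notes.md` by hand, then run `gfm_to_html notes.md`. Keeping the steps separate leaves room for the manual cleanup in between.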


njr2128 commented Feb 11, 2022

Result is not that much better:
image

but perhaps the underlying html is? See "even"/"odd" rather than "header"/"odd":
image


njr2128 commented Feb 25, 2022
