Hello. This program is great. Setting it all up on Windows with no previous Python experience was an adventure, but once I got everything in place, it's fantastic. Thank you very much for making this.
Recently I've been running into a problem with write_html.py on one particular subreddit capture. Here's the error:
Traceback (most recent call last):
File "write_html.py", line 774, in <module>
generate_html(args.min_score, args.min_comments, hide_deleted_comments)
File "write_html.py", line 119, in generate_html
write_link_page(subs, l, sub, hide_deleted_comments)
File "write_html.py", line 288, in write_link_page
'###BODY###': snudown.markdown(c['body'].replace('&gt;','>')),
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe2 in position 62: invalid continuation byte
It actually creates the posts in /r/ right up to the point where it crashes, and with a little trial and error I was able to isolate the problem to a specific line in a specific .csv file: a comment that used a "U+2019 Right Single Quotation Mark" (UTF-8 encoding: 0xE2 0x80 0x99) as an apostrophe. When I replaced that character with a normal straight single quotation mark in the .csv file, it parsed the file fine. (I don't quite get "position 62", though; the apostrophe was the 45th character on the line.) The really puzzling thing is that other comments from the same user have the same character elsewhere in the same .csv file, but those don't cause a problem.
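For anyone trying to reproduce this, here is a minimal sketch of what that error message means. It only illustrates the general UTF-8 behaviour and is not a diagnosis of what write_html.py or snudown is doing: the sample strings and the helper name describe_decode_failure are made up for illustration. It shows that U+2019 is three bytes, that a byte position can differ from a character position once multi-byte characters are involved, and that "invalid continuation byte" is what Python reports when 0xE2 is not followed by the two bytes it expects.

```python
# U+2019 encodes to three bytes in UTF-8: one lead byte (0xE2)
# followed by two continuation bytes (0x80, 0x99).
apostrophe = "\u2019"
encoded = apostrophe.encode("utf-8")  # b'\xe2\x80\x99'

# The decoder reports byte offsets, not character indexes. Any earlier
# multi-byte character pushes the byte offset past the character index.
text = "caf\u00e9 isn\u2019t open"  # hypothetical sample text
char_index = text.index(apostrophe)
byte_offset = text.encode("utf-8").index(encoded)


def describe_decode_failure(raw: bytes) -> str:
    """Hypothetical helper: say where and why UTF-8 decoding fails."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return f"byte 0x{raw[exc.start]:02x} at position {exc.start}: {exc.reason}"
    return "ok"


# A lead byte 0xE2 with its continuation bytes missing (made-up damaged input).
damaged = b"isn\xe2t"
print(encoded, char_index, byte_offset)
print(describe_decode_failure(damaged))
print(describe_decode_failure("isn\u2019t".encode("utf-8")))
```

If the same character decodes fine elsewhere in the file, one possibility (an assumption, not something confirmed here) is that the three bytes are intact in those places and are damaged or split apart only in the failing comment. The reported position is also an offset into whatever byte string was being decoded at that moment, which need not line up with the line in the .csv file.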
Well, it crashed again after I fixed that, this time on a different comment from eight months later, at "position 159". I guess I have another buggy character to hunt down. I don't have time right now, but I will update later if this second one reveals any further clues.
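Hunting these down by hand is slow, so here is a rough sketch of a scanner that could help. It assumes the capture is stored as ordinary files on disk and that the bad bytes are really present in the file itself, which this thread has not established. The function name find_invalid_utf8 and the file name in the usage comment are hypothetical.

```python
def find_invalid_utf8(path):
    """Return (line_number, byte_offset, context) for each physical line
    in the file that is not valid UTF-8.

    Only the first bad byte on each line is reported. A quoted CSV field
    can span several physical lines, so line numbers here are file lines,
    not CSV rows.
    """
    problems = []
    with open(path, "rb") as handle:  # binary mode: no decoding happens yet
        for line_number, raw in enumerate(handle, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError as exc:
                # Keep a few bytes on either side to make the spot easy to find.
                context = raw[max(0, exc.start - 10): exc.start + 10]
                problems.append((line_number, exc.start, context))
    return problems


# Usage (the file name is a placeholder):
# for line_number, offset, context in find_invalid_utf8("comments.csv"):
#     print(line_number, offset, context)
```

If this reports nothing for a file that still crashes write_html.py, that would suggest the bytes on disk are fine and the problem appears later, somewhere between reading the comment and the snudown.markdown call.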
I should have said that. Yes, with or without doing that, same issue. I even re-fetch_links.py'd the entire thing because I hadn't done the 65001/utf-8 thing the first time. Didn't help.
I also came back here and grabbed the current copy of write_html.py in case some update since my original download changed things. Nope, same problem.
It's very mysterious! Haven't had time to poke at it more, maybe next week.