pyth 0.7, with improved partial Python 3 port #44
Open: prechelt wants to merge 24 commits into brendonh:master from prechelt:master
modernize automates most of the changes required to make Python 2 code compatible with Python 3. The resulting code relies on one additional package: six. More changes will be required for code such as this, which handles both binary data and text data without the original Python 2 code always being explicit about which is which. The file modernize_output.txt captures the diff of these changes as shown by the modernize call. The file modernize_output_strippeddown.txt is a subset of that file (as described in its own header).
pyth\plugins\rtf15\reader.py, pyth\plugins\xhtml\writer.py: The former in particular was tricky because most strings have to be handled as bytestrings, but not all of them. See http://pythonhosted.org/six/. These two now appear to work for ASCII and 8-bit non-ASCII characters in the RTF file, at least for a simple RTF file. Complex files and true Unicode have not been tested yet.
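To illustrate why the bytes/text distinction matters when reading RTF, here is a minimal Python 3 sketch. It is not pyth's actual reader code: the function name `decode_rtf_text` and the default code page `cp1252` are assumptions made for this example (a real RTF file declares its code page itself, for instance via `\ansicpg`). The point is that the input stays a bytestring while the 8-bit hex escapes are resolved, and becomes text only in the final step.

```python
import re

# Matches RTF hex escapes (backslash, apostrophe, two hex digits)
# in the raw byte stream.
HEX_ESCAPE = re.compile(rb"\\'([0-9a-fA-F]{2})")


def decode_rtf_text(raw, codepage="cp1252"):
    """Turn a run of raw RTF text bytes into a str (hypothetical helper).

    The data stays bytes until every hex escape has been replaced by the
    single byte it denotes; only then is the result decoded as text.
    """
    unescaped = HEX_ESCAPE.sub(
        lambda match: bytes([int(match.group(1), 16)]), raw)
    return unescaped.decode(codepage)
```

Decoding before the escapes are resolved would treat the escape characters as text and lose the 8-bit value they stand for, which is one reason a reader cannot simply switch everything to text strings.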
It should now work correctly.
The tests are based on pairs of files with the same basename in directories tests/rtfs (input files) and tests/rtf-as-html (reference output files). The unittest test method for each pair is created dynamically by test_readrtf15.py and writes the actual output to yet another file with that same basename in directory tests/currentoutput (files holding actual outputs). Those files are deleted only if the test succeeds, so they can be used for analysis if the test fails.
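The scheme described above can be sketched roughly as follows. This is not the code in test_readrtf15.py: `convert`, `make_test`, `add_tests` and `ReadRtfTest` are hypothetical names, and `convert` is a trivial stand-in for the real RTF-to-HTML conversion. The sketch only shows the mechanism of attaching one test method per file pair and deleting the output file on success.

```python
import os
import unittest


def convert(rtf_path):
    """Hypothetical stand-in for the real RTF-to-HTML conversion."""
    with open(rtf_path, encoding="ascii") as f:
        return "<p>" + f.read().strip() + "</p>"


def make_test(input_path, reference_path, output_path):
    """Build one test method for a single input/reference file pair."""
    def test(self):
        actual = convert(input_path)
        # Write the actual output first so it survives a failing assertion.
        with open(output_path, "w", encoding="utf-8") as f:
            f.write(actual)
        with open(reference_path, encoding="utf-8") as f:
            expected = f.read()
        self.assertEqual(expected, actual)
        # Only reached on success: the output file is no longer needed.
        os.remove(output_path)
    return test


def add_tests(testcase_class, input_dir, reference_dir, output_dir):
    """Attach one test_<basename> method per .rtf file in input_dir."""
    os.makedirs(output_dir, exist_ok=True)
    for filename in sorted(os.listdir(input_dir)):
        base, ext = os.path.splitext(filename)
        if ext != ".rtf":
            continue
        method = make_test(
            os.path.join(input_dir, filename),
            os.path.join(reference_dir, base + ".html"),
            os.path.join(output_dir, base + ".html"),
        )
        setattr(testcase_class, "test_" + base.replace("-", "_"), method)


class ReadRtfTest(unittest.TestCase):
    """Test methods are attached dynamically by add_tests()."""
```

Using a factory function (`make_test`) rather than defining the test inside the loop avoids the late-binding pitfall where every generated method would see the paths of the last file.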
- performed changes needed to run under Python 3
- fixed one mistake in decodeTable
- added blank to the codes list
- improved the error message
- added two more test cases:
  - msword-symbol.rtf
  - wordpad-symbol.rtf
  - neither of them works correctly, see 'Limitations' section in top-level README
  - therefore, the corresponding 'correct' outputs are empty files (to make the automated tests fail)
I have thrown out the complex test files (except the interesting zh-cn.rtf) as well as the overly simple or redundant test files, and have introduced a set of simple tests (not covering too much functionality at once) and a regular naming of the test files.
The file names in tests/rtfs now consist of two parts: program-testcontent.rtf, where program is one of
- msword: Microsoft Word from Microsoft Office 2013 on Windows 7
- librewriter: Writer from LibreOffice 4.4 on Windows 7
- wordpad: Microsoft Wordpad on Windows 7
and testcontent is a name describing roughly what functionality is tested in the file.
The corresponding found-to-be-correct test outputs are in tests/rtf-as-html/*.html, where these outputs have been truncated to zero bytes for test cases with errors in their output.
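As an illustration of the naming convention, the following sketch splits a test file name into its two parts and checks whether a reference output has been truncated to zero bytes. Both helper names (`parse_test_filename`, `has_known_errors`) are hypothetical and do not appear in the pull request. Files that do not follow the convention, such as zh-cn.rtf, simply yield `None` here.

```python
import os

# The three producing programs named in the convention.
KNOWN_PROGRAMS = {"msword", "librewriter", "wordpad"}


def parse_test_filename(path):
    """Return (program, testcontent), or None if the name does not conform."""
    base, ext = os.path.splitext(os.path.basename(path))
    if ext != ".rtf":
        return None
    program, separator, testcontent = base.partition("-")
    if not separator or not testcontent or program not in KNOWN_PROGRAMS:
        return None
    return program, testcontent


def has_known_errors(reference_path):
    """A zero-byte reference output marks a test case with known errors."""
    return os.path.getsize(reference_path) == 0
```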
more py3 updates
…-py3 unfortunately, I am surprised by why this merge is necessary.
merge robertour PR of change by Petro-Viron: pyth.plugins.plaintext.writer seems to be working now
No argument is allowed for \super (like for \sub, contrary to \up and \dn). (This has already been fixed in the python2 version.)
Fix signature of handle_super in rtf parser
Hi Brendon,
this replaces PR #33 and contains the same commits plus some more.
I needed to move from the `pyth.zip` I had used for a long time in my setup to a more proper git dependency, and decided to take this as a hint that I should consolidate my pyth work some more.

`test_readrtf15.py` is now a reasonable set of tests with proper skipping and expectedFailures to indicate the state of the RTF reading functionality. This also exercises, and hence co-tests, the `XHTMLWriter` and `PlaintextWriter`, and so also shows their functionality. This involves 18 new reference output test data files in `tests/rtf-as-txt`.

I have also reworked `README` (and turned it into `README.md`).

For my purposes, this is a satisfactory package. I have therefore set the version to `0.7` in `setup.py` and set the `url` to point to my fork.

Feel free to adopt this version and release it to PyPI, although some compatibility testing on Python 2 would probably be a good idea first; I have not done that.
There is plenty left to do before the package as a whole supports Python 3.
There is still more left to do in terms of functionality. No boredom anywhere in sight.
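For readers unfamiliar with the skipping and expectedFailure markers mentioned above, here is a small sketch of how the standard `unittest` module expresses them. The test class and its test bodies are placeholders invented for this example; they are not taken from `test_readrtf15.py`.

```python
import sys
import unittest


class RtfReaderStatus(unittest.TestCase):
    """Placeholder tests showing how unittest records known states."""

    def test_works(self):
        self.assertEqual("abc".upper(), "ABC")

    @unittest.expectedFailure
    def test_known_broken(self):
        # Stand-in for a case whose output is still wrong: the failure is
        # recorded as expected and does not make the run unsuccessful.
        self.assertEqual("actual output", "reference output")

    @unittest.skipIf(sys.version_info[0] < 3, "only relevant on Python 3")
    def test_py3_only(self):
        self.assertIsInstance(b"abc".decode("ascii"), str)

    @unittest.skip("reference output not yet available")
    def test_not_ready(self):
        self.fail("never executed")
```

An expected failure that starts passing is reported as an unexpected success, so the markers double as a record of which limitations have since been fixed.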