Add a parser #50

Open: wants to merge 2 commits into base: master

fake-name commented Jul 27, 2017

Fair bit of changes here.

First, there's the ability to optionally (default off) use a proper parser for formatting. This is MUCH more robust, and 10+ times faster than the pure-regex solution.
It should also be much more correct, since it comes much closer to how an actual browser understands HTML (it does full parsing).
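To illustrate why a parser-driven formatter tracks nesting more reliably than regexes, here is a minimal sketch. It deliberately uses the standard library's html.parser rather than bs4 so it runs without extra dependencies, and it is not the code in this PR. The names IndentingParser, format_html, and VOID_TAGS are hypothetical.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag, so they must not increase depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class IndentingParser(HTMLParser):
    """Hypothetical helper: one output line per tag or text node."""

    def __init__(self, indent_with="    "):
        super().__init__(convert_charrefs=True)
        self.indent_with = indent_with
        self.depth = 0
        self.lines = []

    def _emit(self, text):
        # Prefix the line with one indent unit per nesting level.
        self.lines.append(self.indent_with * self.depth + text)

    def handle_starttag(self, tag, attrs):
        # Re-emit the tag exactly as written in the source.
        self._emit(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: emit it, depth unchanged.
        self._emit(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        self.depth = max(self.depth - 1, 0)
        self._emit("</%s>" % tag)

    def handle_data(self, data):
        # Skip whitespace-only text between tags.
        if data.strip():
            self._emit(data.strip())


def format_html(source, indent_with="    "):
    parser = IndentingParser(indent_with)
    parser.feed(source)
    parser.close()
    return "\n".join(parser.lines)


print(format_html("<ul><li>a</li></ul>", indent_with="\t"))
```

The nesting depth comes from the parser's tag events instead of pattern matches on raw text, which is the robustness argument made above. A real implementation would also need to handle comments, doctypes, and whitespace-sensitive elements such as pre.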

I made some other minor changes that should improve the normal performance slightly. Mostly, the original codebase was using rawcode_flat_list = re.split('\n', rawcode_flat) in several places. I don't see any reason to use re for that, since the pattern is a single fixed character. I replaced all such instances with the plain old rawcode_flat_list = rawcode_flat.split('\n'), which should push it down to the pure C layer.
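A quick way to check that the two calls are interchangeable for a fixed single-character separator, and to compare their cost, is sketched below. The sample input is made up for illustration, and the timings will vary by machine, so no figures are claimed here.

```python
import re
import timeit

# Made-up sample input: 200 repetitions of a three-line HTML fragment.
rawcode_flat = "<div>\n<p>text</p>\n</div>\n" * 200

# Both calls produce the same list when the pattern is a literal newline.
via_re = re.split('\n', rawcode_flat)
via_str = rawcode_flat.split('\n')
assert via_re == via_str

# Time each approach; absolute numbers depend on the interpreter and machine.
re_time = timeit.timeit(lambda: re.split('\n', rawcode_flat), number=2000)
str_time = timeit.timeit(lambda: rawcode_flat.split('\n'), number=2000)
print("re.split:  %.4fs" % re_time)
print("str.split: %.4fs" % str_time)
```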

Lastly, I added the ability to change the indentation character. It was previously hard-coded to a literal tab character, and while I personally agree with that, it's probably better to not start another holy war, at least unintentionally.

If the indent_with parameter is specified, each indentation level is created with one instance of the parameter string. Otherwise, it defaults to 4 spaces.
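The behaviour described above can be sketched as follows. Only the indent_with parameter name and the 4-space default come from this description; the indent_line helper, the DEFAULT_INDENT constant, and the use of None to mean "not specified" are assumptions made for illustration.

```python
# Assumed default, per the description above: 4 spaces per level.
DEFAULT_INDENT = " " * 4


def indent_line(line, level, indent_with=None):
    """Hypothetical helper: prefix `line` with `level` copies of the indent unit."""
    unit = DEFAULT_INDENT if indent_with is None else indent_with
    return unit * level + line


print(repr(indent_line("<p>", 2)))                    # default unit
print(repr(indent_line("<p>", 2, indent_with="\t")))  # tab per level
```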

Caveats

This change also requires introducing a dependency, namely on the bs4 package. I don't think this is a big deal, but that's just me.


Hopefully, this contains 100% less asshole bug reports.

@rareyman
Owner

Awesome! I am looking forward to checking it out! I'm inundated with work project stuff, but I hope to check it out when I have some time to focus on all this. Thanks so much!
