-
-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update collation generator #356
Update collation generator #356
Conversation
… update_collation_generator
if value not in results[key]: | ||
results[key][value] = set() | ||
results[key][value].add(label) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be simplified by only keeping the last line. This works because of the previous change you made to initialize results
using defaultdict
like so: results = defaultdict(lambda : defaultdict(list))
. See code above for an example, see here for an explanation.
if value not in results[key]: | |
results[key][value] = set() | |
results[key][value].add(label) | |
results[key][value].add(label) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
if key == 'input_list': | ||
if 'input_size' not in results: | ||
results['input_size'] = {} | ||
else: | ||
results['input_size'].add(len(value)) | ||
if key == 'rules': | ||
value = 'RULE' # A special case to avoid over-characterization | ||
if key not in results: | ||
results[key] = {} | ||
try: | ||
if not results[key].get(value, None): | ||
results[key][value] = set() | ||
results[key][value].add(label) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See if you can assure that wherever the initialization of input_data
occurs uses defaultdict
, and then simplify the code here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying this now.
@@ -142,7 +142,7 @@ def check_parse_compare(self, line_index, lines): | |||
raw_string2 = is_comparison_match.group(3) | |||
string2 = '' | |||
try: | |||
string2 = raw_string2.encode().decode("unicode_escape") | |||
string2 = raw_string2 # Don't do any unescaping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, if we're not unescaping the escape sequences, then how are we comparing the intended characters in the string?...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will look at handling these in the test driver, making sure that we do indeed send the right characters to the executors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please fix
Co-authored-by: Elango Cheran <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, filing #358 to capture the unaddressed comments for later follow up
Also update characterizations for collation.