Refactor duplicate code #7

probinso · 2021-03-26T01:53:08Z

I am working with Grammatech to use the DIRE model, and found a fair amount of duplicate code. I am working to refactor this project so that duplicate code is removed and instead the shared modules are used as libraries. In standardizing the duplicate code, I found a few discrepancies in the code and was hoping you could help me resolve them before I refactor further.

I also used isort and black to standardize code formatting.

I simplified all differences that were not operational, however there are some operational differences in the model and utils directories under prediction-plugin and neural-model.

model/attentional_recurrent_subtoken_decocder.py:AttentionalRecurrentSubtokenDecoder.predict() differs by

                # Take `len(beam)` more candidates to account for possible duplicate
                k = min(beam_cont_cand_hyp_scores.numel(), cont_beam_size + len(beam))

and the use of a beam_cnt variable

                beam_cnt = 0
                for i in range(len(beam_new_hyp_positions)):
                    ...
                    beam_cnt += 1
                    ...
                    if beam_cnt >= cont_beam_size:
                        break

model/graph_encoder.py:GraphASTEncoder.unpack_encoding() differs by the specified layer being of bool() or byte() type

            (1.0 - batch_tree_node_masks).bool().unsqueeze(-1), 0.0

utils/evalution.py:Evaluator.average() differs by value set to avg_results["corpus_cer"]

        avg_results["corpus_cer"] = sum(agg_results["edit_distance"]) / sum(
            agg_results["ref_len"]
        )

utils/preprocess.py:is_valid_example() differs by a more complicated is_valid check

        is_valid = (
            example.ast.size < 300
            and len(example.variable_name_map) > 0
            and any(k != v for k, v in example.variable_name_map.items())
        )

If you have any input on these discrepancies, then I would be very appreciative, and can move forward with further refactoring :). I am very excited to be working with this code

edmcman · 2021-03-26T13:26:31Z

Hi @probinso,

That's great to hear someone is working with the code. I've done quite a bit of refactoring on the ed-wip branch. I think we should probably merge this into master and you should probably start from there to avoid duplicating the effort. Plus, I think it will resolve (most?) of the discrepancies you noted.

So, I would think you can start by re-running your code formatters on ed-wip and see if there are any unresolved questions.

I'll make a PR for ed-wip.

probinso · 2021-04-02T19:28:48Z

Sorry for the slow follow up. We had a project come up that needed attention. I will be back on this next week. I'm excited to look at the new branch :)

standardize code

0d0c73a

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor duplicate code #7

Refactor duplicate code #7

probinso commented Mar 26, 2021 •

edited

Loading

edmcman commented Mar 26, 2021

probinso commented Apr 2, 2021

Refactor duplicate code #7

Are you sure you want to change the base?

Refactor duplicate code #7

Conversation

probinso commented Mar 26, 2021 • edited Loading

edmcman commented Mar 26, 2021

probinso commented Apr 2, 2021

probinso commented Mar 26, 2021 •

edited

Loading