Replies: 3 comments
-
Oh, I just saw your paper/repo on hierarchical models! You should definitely report those results separately to demonstrate the effectiveness of your technique! 😄
-
Hi @samhavens, the hierarchical models are trained end-to-end (BERT encoder plus 2x cross-segment Transformer layers). Using hierarchical variants to overcome the text length limitation of BERT has been a well-established technique since Chalkidis et al. (2019). There are preliminary results in the appendix with standard BERT (first 512 tokens only) to showcase the importance of the extension. In the near future, the benchmark will be expanded with new tasks (UKLEX, ILDC, ContractNLI, and others), and we will most probably classify the tasks into low/mid/long-document tasks, in which case BERT and the hierarchical variants will not co-exist in the very same tables 😄
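For readers unfamiliar with the pattern, the architecture described above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the benchmark's actual implementation: a small stand-in Transformer plays the role of the pretrained BERT segment encoder (so the sketch runs without downloading weights), each segment is reduced to its first-token representation, and 2 cross-segment Transformer layers mix information across segments before classification. All dimensions and names here are made up for the example.

```python
import torch
import torch.nn as nn

class HierarchicalEncoder(nn.Module):
    """Hierarchical sketch: per-segment encoder (BERT in the real setup,
    a tiny stand-in Transformer here) + 2 cross-segment Transformer layers,
    all trained end-to-end."""
    def __init__(self, vocab_size=1000, hidden=64, num_labels=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        seg_layer = nn.TransformerEncoderLayer(hidden, nhead=4, batch_first=True)
        self.segment_encoder = nn.TransformerEncoder(seg_layer, num_layers=2)
        cross_layer = nn.TransformerEncoderLayer(hidden, nhead=4, batch_first=True)
        self.cross_segment = nn.TransformerEncoder(cross_layer, num_layers=2)
        self.classifier = nn.Linear(hidden, num_labels)

    def forward(self, input_ids):
        # input_ids: (batch, num_segments, segment_len)
        b, s, l = input_ids.shape
        x = self.embed(input_ids.view(b * s, l))   # encode every segment independently
        x = self.segment_encoder(x)[:, 0, :]       # keep the [CLS]-style first token
        x = x.view(b, s, -1)                       # (batch, num_segments, hidden)
        x = self.cross_segment(x)                  # 2 layers mixing across segments
        return self.classifier(x.mean(dim=1))      # pool segments, then classify

model = HierarchicalEncoder()
# e.g. 2 documents, each split into 8 segments of 32 token ids
logits = model(torch.randint(0, 1000, (2, 8, 32)))
print(logits.shape)  # torch.Size([2, 3])
```

Because the whole stack is one `nn.Module`, backpropagating a loss through the classifier updates the segment encoder as well, which is what "trained end-to-end" means here (as opposed to freezing the BERT encoder and training only the cross-segment layers).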
-
Thank you. Since I haven't seen those variants outside of legal NLP, I feel like it would be helpful for the rest of the NLP community if the non-hierarchical variants were included on the leaderboard. For example, both BERT and RoBERTa are included on the board, and I feel like the difference between those two is smaller than the difference between a model and its hierarchical variant. I also just noticed that in a recent paper, BudgetLongformer: Can we Cheaply Pretrain a SotA Legal Language Model From Scratch?, the authors seem to have been confused in the same way I was; see section 6.3 for an example of what I think is the authors' confusion between models and their hierarchical variants.
-
I feel like it would make sense to report results for e.g. both `BERT` and `hierarchical-BERT`, as the model used is substantially different. That is, `hierarchical-BERT` isn't even really a BERT model; it is a transformer model that uses BERT representations. Also, for the hierarchical models, are the parameters of the encoding model trained as well, or are they frozen?