Long documents #3

Open
timsuchanek opened this issue Jun 3, 2020 · 1 comment

@timsuchanek

Would it be possible to summarize documents with length > 758 tokens?
Using https://github.com/allenai/longformer could be interesting for that use-case.

@jiacheng-xu
Owner

Hi! Thanks for your suggestion. Longformer is great for the long-document scenario.
In this project, max_len can actually be changed to any value; see

self.tokenizer.max_len = 768

What I do is randomly initialize the extended part and fine-tune on the downstream tasks. You can also change 768 to any number you want.
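
To illustrate the "randomly initialize the extended part" idea, here is a minimal, framework-free sketch. It assumes the "extended part" means the extra rows of a position-embedding table beyond the pretrained length. The function name `extend_position_table`, the toy table, and the standard deviation of 0.02 are hypothetical choices for illustration and are not taken from this repository.

```python
import random


def extend_position_table(pretrained, new_len, std=0.02, seed=0):
    """Copy the pretrained rows and append randomly initialized rows.

    pretrained: list of rows (one per position), each a list of floats.
    new_len:    desired number of positions after extension.
    std:        standard deviation for the new rows (an assumed value).
    """
    old_len = len(pretrained)
    if new_len < old_len:
        raise ValueError("new_len must be at least the pretrained length")
    dim = len(pretrained[0])
    rng = random.Random(seed)
    # Keep the pretrained positions unchanged.
    extended = [list(row) for row in pretrained]
    # Draw the additional positions from a zero-mean Gaussian.
    for _ in range(new_len - old_len):
        extended.append([rng.gauss(0.0, std) for _ in range(dim)])
    return extended


# Toy example: 4 "pretrained" positions of width 3, extended to 6 positions.
pretrained = [[0.1 * i + 0.01 * j for j in range(3)] for i in range(4)]
extended = extend_position_table(pretrained, new_len=6)
print(len(extended), len(extended[0]))
```

In a real model the new rows would only become useful after fine-tuning on the downstream task, since they start out carrying no positional information.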
