Goal: Add clinical-domain tokens to the tokenizer to prevent it from splitting them apart.
I am using the following code to add a few tokens to the tokenizer:
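The original snippet was not preserved in this issue. Below is a minimal sketch of the usual approach with HuggingFace `transformers`, assuming a BioBERT checkpoint from the Hub; the exact checkpoint name, token list, and label count here are placeholders, not the reporter's actual values:

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Hypothetical checkpoint and clinical tokens -- stand-ins for the ones used in the issue.
MODEL_NAME = "dmis-lab/biobert-base-cased-v1.1"
new_tokens = ["q6h", "po", "prn"]  # example clinical abbreviations

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME, num_labels=5)

# Add the domain tokens so the tokenizer stops splitting them into subwords.
num_added = tokenizer.add_tokens(new_tokens)

# The embedding matrix must grow to match the new vocabulary size; the added
# rows are randomly initialized and only get meaningful values during fine-tuning.
model.resize_token_embeddings(len(tokenizer))
```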
After fine-tuning, the tokenizer no longer splits these tokens into single characters and keeps each one as a single token. However, the model no longer assigns the correct NER tag to the token. I have double-checked my training data and there are no issues there, so this appears to be an error in the BioBERT code.
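One way to narrow this down (a hedged sketch, continuing the hypothetical setup above) is to confirm that the added token survives tokenization as a single piece and then inspect the tag the fine-tuned model actually predicts for it:

```python
import torch

sentence = "Take medication po q6h"
enc = tokenizer(sentence, return_tensors="pt")

# Confirm the added tokens are kept whole rather than split into subwords.
print(tokenizer.convert_ids_to_tokens(enc["input_ids"][0]))

with torch.no_grad():
    logits = model(**enc).logits

# Predicted label index per token; map through your own id2label to read the tags.
print(logits.argmax(dim=-1))
```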