understanding input construction and special tokens: #3

VikasRajashekar · 2020-11-11T09:18:51Z

I see that you define five new special tokens: <bos>, <eos>, <persona_token>, <speaker1>, and <speaker2>.

And the input is constructed like:

input_ids:
<persona_token> (persona sentence1) <persona_token> (persona sentence2) ... <speaker1> (history sentence1) <speaker2> (history sentence2) <bos> (response) <eos>

token_type_ids:
<persona_token> for all persona sentences, <speaker1> and <speaker2> for their respective history sentences, and <bos> for the response

lm_labels:
-1 for all tokens except the response tokens
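
To make the layout above concrete, here is a minimal sketch of how I understand the three sequences to be built. It is based only on my reading described in this issue, not on the repository's actual code. `build_inputs` is a hypothetical helper, tokens are kept as strings instead of vocabulary ids, and two details are assumptions on my side: that history turns alternate starting with <speaker1>, and that <bos> itself is masked while <eos> is kept as a label.

```python
from itertools import chain

# Special tokens named above; plain strings stand in for vocabulary ids.
BOS, EOS = "<bos>", "<eos>"
PERSONA, SPEAKER1, SPEAKER2 = "<persona_token>", "<speaker1>", "<speaker2>"
IGNORE = -1  # label for positions excluded from the LM loss, as described above


def build_inputs(persona, history, response):
    """Hypothetical helper. persona and history are lists of token lists;
    response is a single token list."""
    segments = []  # (segment tokens, segment type) pairs
    for sentence in persona:
        segments.append(([PERSONA] + sentence, PERSONA))
    for i, utterance in enumerate(history):
        # Assumption: turns alternate, starting with <speaker1>.
        speaker = SPEAKER1 if i % 2 == 0 else SPEAKER2
        segments.append(([speaker] + utterance, speaker))
    response_segment = [BOS] + response + [EOS]
    segments.append((response_segment, BOS))

    input_ids = list(chain.from_iterable(tokens for tokens, _ in segments))
    # Every token inherits the type of the segment it belongs to.
    token_type_ids = [seg_type for tokens, seg_type in segments for _ in tokens]

    context_len = len(input_ids) - len(response_segment)
    # Assumption: <bos> is masked too; labels cover the response tokens and <eos>.
    lm_labels = [IGNORE] * (context_len + 1) + response + [EOS]
    return input_ids, token_type_ids, lm_labels


ids, types, labels = build_inputs(
    persona=[["i", "like", "cats"]],
    history=[["hi"], ["hello", "there"]],
    response=["nice"],
)
print(ids)
print(types)
print(labels)
```

All three lists have the same length, and in this sketch every response position carries <bos> as its type, which is the part my second question below is about.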

My questions are:

1. Is my understanding correct?
2. In token_type_ids, why is the <bos> token used as the type for all the response tokens?

Attaching the file I used for this analysis.
xyz_new.txt
