xiaoda99 changed the title from "How to deal with logits from position embeddings in the output layer?" to "How to deal with logits from position indices in the output layer?" on Aug 24, 2018
Dear guys,
I found that the position embeddings are concatenated with the word embeddings in the embedding layer (finetune-transformer-lm/train.py, line 411 at commit bd1cf7d), and the output layer also shares weights with this embedding layer, so it outputs logits for both word indices and position indices (finetune-transformer-lm/train.py, line 176 at commit bd1cf7d).
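To make the setup concrete, here is how I picture the shared embedding / output weights in the PyTorch port (just a sketch with made-up sizes, not the exact values or names from the repo):

```python
import torch
import torch.nn as nn

# Illustrative sizes only (not the exact values from train.py).
n_vocab, n_special, n_ctx, n_embd = 40478, 3, 512, 768

# One embedding table whose rows cover word ids, special-token ids, and position ids,
# i.e. the position embeddings are appended after the word embeddings.
embed = nn.Embedding(n_vocab + n_special + n_ctx, n_embd)

# Each input token carries a word id and a position id; their embeddings are summed.
word_ids = torch.randint(0, n_vocab, (1, 5))
pos_ids = torch.arange(n_vocab + n_special, n_vocab + n_special + 5).unsqueeze(0)
h = embed(word_ids) + embed(pos_ids)        # (1, 5, n_embd); a real model would run
                                            # transformer blocks on h before the output layer

# Tied output projection: hidden states are multiplied by the same embedding matrix,
# so the logits have one column per row of the table, including the n_ctx position slots.
logits = h @ embed.weight.t()               # (1, 5, n_vocab + n_special + n_ctx)
```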
My questions are:

1. During LM pretraining, did you mask out the logits from those position indices when computing the loss?
2. If I use the pretrained model as an LM to generate text, do I need to mask out these position indices' logits before the softmax when sampling the next word?
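To illustrate what I mean by masking in question 2, I have something like this in mind (again just a sketch; the sizes, and the choice to also mask the special-token slots, are my own assumptions):

```python
import torch
import torch.nn.functional as F

n_vocab, n_special, n_ctx = 40478, 3, 512   # illustrative sizes

def sample_next_word(logits):
    """Mask every logit that is not a real word (special-token and position slots)
    by setting it to -inf, then softmax and sample the next token id."""
    masked = logits.clone()
    masked[..., n_vocab:] = float("-inf")
    probs = F.softmax(masked, dim=-1)
    return torch.multinomial(probs, num_samples=1)

# Last-step logits for a batch of one sequence.
logits = torch.randn(1, n_vocab + n_special + n_ctx)
next_id = sample_next_word(logits)          # always a word index < n_vocab
```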
BTW, I used the PyTorch code ported by huggingface:
https://github.com/huggingface/pytorch-openai-transformer-lm
FYI, I also posted an issue there describing some details of my experiments:
huggingface/pytorch-openai-transformer-lm#36