
Autoregressive mode and embedding calculation addition #62

Merged
merged 10 commits into main from the autoregressive-model branch on Oct 11, 2023

Conversation

@metric-space (Contributor) commented Oct 3, 2023:

Changes

  1. Add an autoregressive flag to notify AutoModelSequenceEmbeddings and AutoModelForRagE2E that an autoregressive model is being used
  2. Add an autoregressive mode of computing embeddings based on the last hidden state (sketched below)
  3. Move logic around so mean_pooling applies to both scenarios (CLM and MLM)
  4. Update CLI args
  5. Set pad_token = eos_token if autoregressive
  6. Add is_autoregressive flag to the eval portion
  7. Fix if to elif in the save hook
  8. Remove the default save statement
  9. Unwrap the model properly while saving
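A minimal sketch of what the autoregressive path described above might look like; the method and argument names are illustrative, not necessarily the exact code in this PR:

def compute_embeddings(self, input_ids, attention_mask):
    if self.is_autoregressive:
        # Causal LM (change 2): request hidden states and keep the last
        # layer, shape (batch, seq_len, hidden_size).
        token_embeddings = self.model(
            input_ids,
            attention_mask=attention_mask,
            output_hidden_states=True,
        ).hidden_states[-1]
    else:
        # Masked LM: the first element of the output holds all token embeddings.
        token_embeddings = self.model(input_ids, attention_mask)[0]
    # Change 3: mean_pooling is shared by both the CLM and MLM paths.
    return self.mean_pooling(token_embeddings, attention_mask)

# Change 5: causal LMs usually ship without a pad token, so reuse EOS.
if is_autoregressive and tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token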

    ).hidden_states[-1]
else:
    # First element of model_output contains all token embeddings
    token_embeddings = self.model(input_ids, attention_mask)[0]
Member commented:

I don't think the first element is useful, because the attention is from left to right.

@metric-space (Contributor, author) replied Oct 3, 2023:

I think that is the existing code; I moved the selection of the zeroth item out of the mean_pooling function and put it here.

else:
    # First element of model_output contains all token embeddings
    token_embeddings = self.model(input_ids, attention_mask)[0]

embeddings = self.mean_pooling(token_embeddings, attention_mask)
Member commented:

Why do we need a pooling step if we select only a single embedding?

I guess the two methods should be:

  1. Selecting the EOS embedding as the representation, since it has seen all the previous tokens.

  2. Getting the embeddings for every token and pooling them.

(Both options are sketched below.)
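A rough sketch of the two options, assuming last_hidden is the causal LM's last hidden state of shape (batch, seq_len, hidden_size) and sequences are right-padded (variable names are illustrative):

import torch

# Option 1: take the embedding at the last real (EOS) position; for a
# left-to-right model this is the only position that has seen the whole input.
last_index = attention_mask.sum(dim=1) - 1                          # (batch,)
eos_embeddings = last_hidden[torch.arange(last_hidden.size(0)), last_index]

# Option 2: mean-pool every token embedding, weighted by the attention mask.
mask = attention_mask.unsqueeze(-1).to(last_hidden.dtype)           # (batch, seq_len, 1)
mean_embeddings = (last_hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)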

Contributor (author) replied:

I believe the shape of the input token embeddings is (1, <length of tokenized inputs>, <number of features>).

With the output_hidden_states flag, the output is of shape [number of layers, 1, length of tokenized inputs, number of features].

So I think we're doing number 2 here?
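For reference, a small shape check with a standard Hugging Face causal LM (the gpt2 checkpoint is just an example, not the model used in this PR):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("a short example sentence", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# hidden_states is a tuple with one tensor per layer (plus the embedding
# layer), each of shape (1, seq_len, hidden_size); [-1] is the last layer.
print(len(outputs.hidden_states), outputs.hidden_states[-1].shape)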

Member commented:

Got it. I guess we can do the same trick for encoder-only models as well.

@shamanez (Member) commented:

I guess we can remove the mean-embedding addition, with a note saying we got better results by taking the EOS token as the representation.

@metric-space marked this pull request as ready for review October 11, 2023 03:31
@metric-space requested a review from shamanez October 11, 2023 03:31
@metric-space changed the title from "[WIP]: Autoregressive mode and embedding calculation addition" to "Autoregressive mode and embedding calculation addition" Oct 11, 2023
@shamanez (Member) left a review comment:

Looks good

@metric-space merged commit 4ac31f2 into main Oct 11, 2023
@metric-space deleted the autoregressive-model branch October 11, 2023 08:27
@Serega6678 commented:
BGE models require CLS pooling, not mean pooling.

https://huggingface.co/BAAI/bge-large-en#frequently-asked-questions
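For comparison, CLS pooling takes the first token's vector rather than a masked mean; a sketch, assuming last_hidden comes from a BERT-style encoder (the linked model card also suggests normalising the embeddings):

import torch.nn.functional as F

# last_hidden: (batch, seq_len, hidden_size); the [CLS] token sits at position 0.
cls_embeddings = last_hidden[:, 0]
cls_embeddings = F.normalize(cls_embeddings, p=2, dim=1)  # unit-length vectors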
