Addressed missed out comments in PR #1 #3
Conversation
Signed-off-by: Yu Chin Fabian Lim <[email protected]>
Signed-off-by: Yu Chin Fabian Lim <[email protected]>
Signed-off-by: Yu Chin Fabian Lim <[email protected]>
@mayank31398 I have addressed the comments you left in #1. Could you take a look and approve if satisfactory?
put some comments, looks good otherwise
from ...enums import AttentionHeadType
from ...models.gpt_dolomite.config import GPTDolomiteConfig
I think this might give a circular import error, due to stuff from this file being imported in the model class
It was fine, I have tested training the refactored code
But if you like, I have moved the config to a top-level config.py in 8afe81d
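For context, a rough sketch of the import cycle being discussed; the layout below is an assumption inferred from the imports quoted in this diff, not the actual repository structure:

# Hypothetical layout (assumption, inferred from the diff above):
#
#   modeling_utils/attention.py
#       from ...models.gpt_dolomite.config import GPTDolomiteConfig
#   models/gpt_dolomite/model.py
#       imports attention/linear helpers from modeling_utils
#
# Importing models.gpt_dolomite can then re-enter itself through
# modeling_utils -> models.gpt_dolomite.config, which is the circular-import
# risk raised above. With the config moved to a top-level config.py, both
# sides import only the config module and the cycle disappears:
#
#   config.py                      # defines the shared config, imports no model code
#   modeling_utils/attention.py
#       from ..config import GPTDolomiteConfig   # relative depth depends on the actual layout
#   models/gpt_dolomite/model.py
#       from ..config import GPTDolomiteConfig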
from ...enums import AttentionHeadType, PositionEmbeddingType
from ..linear import Linear
from ...models.gpt_dolomite.config import GPTDolomiteConfig
same as above
see comment above
add_bias: bool = True,
position_embedding_type: str = "learned_absolute",
rope_theta: int = 10000,
rope_scaling: dict = None,
we can drop rope_scaling
I'm just worried to drop it, as we have checkpoints trained on https://github.com/ibm-granite/dolomite-engine/tree/main/dolomite_engine, and we may lose backward compatibility if I delete this key
Are you ok if I just keep it for now, and put a comment to eventually drop it at some point?
added the comment in 8afe81d
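As an illustration only, a minimal sketch of the compromise described in this thread; the class below is a hypothetical stand-in for the real config, using just the arguments quoted in the diff above:

class GPTDolomiteConfig:
    # Hypothetical stand-in for the real config class; arguments copied from the diff above.
    def __init__(
        self,
        add_bias: bool = True,
        position_embedding_type: str = "learned_absolute",
        rope_theta: int = 10000,
        # TODO: eventually drop rope_scaling. Kept for now so that checkpoints trained
        # with dolomite-engine (https://github.com/ibm-granite/dolomite-engine) still
        # load, i.e. for backward compatibility.
        rope_scaling: dict = None,
    ):
        self.add_bias = add_bias
        self.position_embedding_type = position_embedding_type
        self.rope_theta = rope_theta
        self.rope_scaling = rope_scaling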
Signed-off-by: Yu Chin Fabian Lim <[email protected]>
@mayank31398 I have moved the config file. If it's ok, can you approve?
There were comments from @mayank31398 that were missed out in #1; we address them here: linear.py and embedding.py, CommonConfig, and moved the config to the top-level wrapper.py.