Lm eval upgraded to 0.4.7 #1692
base: main
Conversation
@12010486 please run make style
Co-authored-by: Yaser Afshar <[email protected]>
Hi @yafshar, apologies for the missing make style. From a formatting point of view, everything should now be there. Btw, in reality 51 more files were edited by make style, but I've seen that Iman already took care of those in #1693. Shall I merge his PR into mine already, or is it best to wait for it to be merged into main?
    args: argparse.Namespace,
    options: GenerationConfig,
) -> None:
    super().__init__(pretrained=args.model_name_or_path, device=args.device)
Here we initialize the base class with the pretrained model as a string, and HFLM instantiates this model. After that we just override this instance:
self._model = model
To me, it looks like overhead.
We could try passing our model instance instead of a str, or use another class as a parent.
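For context, the pattern being discussed looks roughly like this (a simplified sketch; the adapter class name and the exact constructor arguments are assumptions, not a copy of the PR code):

```python
import argparse

from lm_eval.models.huggingface import HFLM
from transformers import GenerationConfig, PreTrainedModel, PreTrainedTokenizer


class HabanaModelAdapter(HFLM):  # class name assumed for illustration
    def __init__(
        self,
        tokenizer: PreTrainedTokenizer,
        model: PreTrainedModel,
        args: argparse.Namespace,
        options: GenerationConfig,
    ) -> None:
        # HFLM.__init__ receives the checkpoint name as a string and builds its
        # own copy of the model (plus device/accelerate setup aimed at CUDA).
        super().__init__(pretrained=args.model_name_or_path, device=args.device)
        # ...and that freshly built self._model is then immediately thrown away
        # and replaced by the model we already instantiated ourselves.
        self._model = model
        self.tokenizer = tokenizer
        self.options = options
```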
I agree, it sounds like the parent constructor HFLM executes some extra code that is not necessary for Habana. Maybe instead of the full parent class HFLM(TemplateLM), only use TemplateLM (from lm_eval.models.huggingface), and instead of
super().__init__(pretrained=args.model_name_or_path, device=args.device)
call
TemplateLM.__init__(self)
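A rough sketch of that alternative (illustrative only; the adapter name is assumed, and TemplateLM is an abstract base, so the methods previously inherited from HFLM would need to be implemented in the adapter):

```python
import argparse

from lm_eval.api.model import TemplateLM  # also re-exported via lm_eval.models.huggingface
from transformers import PreTrainedModel, PreTrainedTokenizer


class HabanaModelAdapter(TemplateLM):  # class name assumed for illustration
    def __init__(
        self,
        tokenizer: PreTrainedTokenizer,
        model: PreTrainedModel,
        args: argparse.Namespace,
    ) -> None:
        # TemplateLM's constructor does very little, so none of HFLM's
        # CUDA-oriented setup runs; all Habana-specific wiring stays here.
        TemplateLM.__init__(self)
        self._model = model
        self.tokenizer = tokenizer
        self._device = args.device

    # Note: TemplateLM declares abstract members (e.g. eot_token_id, tok_encode,
    # _loglikelihood_tokens, loglikelihood_rolling, generate_until), so this
    # class cannot be instantiated until those are implemented.
```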
If you do this, you should also call _get_backend.
So, two options here, based on your input.
- Instead of a string (args.model_name_or_path), pass the model in the init, like:
  super().__init__(pretrained=model, device=args.device)
  (a fuller sketch of this option follows below this comment). This automatically excludes all the CUDA code, and it is working on both 1 HPU and 8 HPU. Downside: warnings during the initialization, like this.
- Change the parent class from HFLM to TemplateLM. This might be lengthier, with the re-definition of multiple functions, but I'm working on this as well.

@yafshar, @alexey-belyakov, I appreciate any further input on it.
With both options, I can prune the code that might now be redundant. For now I'm moving forward on both in parallel.
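For reference, a minimal sketch of the first option, assuming the adapter keeps HFLM as its parent (class and argument names are illustrative). In the 0.4.x harness, HFLM accepts an already-instantiated transformers model for pretrained and reuses it instead of loading its own, which is also where the initialization warnings come from:

```python
import argparse

from lm_eval.models.huggingface import HFLM
from transformers import PreTrainedModel, PreTrainedTokenizer


class HabanaModelAdapter(HFLM):  # class name assumed for illustration
    def __init__(
        self,
        tokenizer: PreTrainedTokenizer,
        model: PreTrainedModel,
        args: argparse.Namespace,
    ) -> None:
        # Passing the model object (instead of args.model_name_or_path) makes
        # HFLM reuse it and skip the CUDA-oriented loading path; lm-eval only
        # warns that some of the other init arguments are ignored.
        super().__init__(pretrained=model, device=args.device)
        self.tokenizer = tokenizer
```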
     raise NotImplementedError()

-    def find_bucket(self, length):
+    def find_bucket(self, length: int) -> list[int]:
Suggested change:
-    def find_bucket(self, length: int) -> list[int]:
+    def find_bucket(self, length: int) -> int:
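For context, a bucketing helper of this kind usually returns a single bucket size, which is why int is the expected return type. A minimal sketch under that assumption (not the implementation from the PR):

```python
def find_bucket(buckets: list[int], length: int) -> int:
    """Return the smallest bucket size that can hold a sequence of `length` tokens."""
    # `buckets` is assumed to be sorted in ascending order, e.g. [128, 256, 512, 1024];
    # padding the input up to that size keeps the set of compiled graph shapes small.
    return next(size for size in buckets if size >= length)


print(find_bucket([128, 256, 512, 1024], 200))  # -> 256
```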
-    json.dump(results, open(args.output_file, "w"), indent=2)
-    print(json.dumps(results, indent=2))
+    json.dump(
Suggested change:
-    json.dump(
+    json_str = json.dumps(results, indent=2, default=utils.handle_non_serializable, ensure_ascii=False)
+    with open(args.output_file, "w") as output_file:
+        output_file.write(json_str)
+    if args.show_config:
+        print(json_str)
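A self-contained illustration of the suggested pattern (handle_non_serializable comes from lm_eval.utils in the 0.4.x harness; the file name and the show_config stand-in are assumptions for the example):

```python
import json

from lm_eval import utils

results = {"results": {"piqa": {"acc,none": 0.79}}, "config": {"batch_size": 8}}

# Serializing once with a `default` hook keeps objects that json cannot handle
# (e.g. torch dtypes inside the config) from raising a TypeError.
json_str = json.dumps(results, indent=2, default=utils.handle_non_serializable, ensure_ascii=False)

with open("results.json", "w") as output_file:
    output_file.write(json_str)

show_config = True  # stand-in for args.show_config
if show_config:
    print(json_str)
```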
@@ -86,22 +87,35 @@ def setup_lm_eval_parser():
        default=["hellaswag", "lambada_openai", "piqa", "winogrande"],
    )
    parser.add_argument("--limit_iters", type=int, help="limit examples to run that many iterations", default=None)
    parser.add_argument(
There is https://github.com/EleutherAI/lm-evaluation-harness/blob/4c26a9c176ecfb40b369503ce211e356bb800fdc/lm_eval/evaluator.py#L388 in the interface. Is it somehow similar to what you did here? If it is, maybe we can eliminate this and the related code?
I believe we should look at simple_evaluate rather than evaluator, but in any case, in the latest version of the test harness there are multiple args related to what to print and how; see from this line onwards: https://github.com/EleutherAI/lm-evaluation-harness/blob/4c26a9c176ecfb40b369503ce211e356bb800fdc/lm_eval/__main__.py/#L410
I took only one of them and tried to use it the same way it is expected in the original repo, if I did it right. The real reason I added it is that the script now prints a lot of info, and I was not able to see the HPU-related information without this flag.
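For illustration, this is roughly how such a flag is wired up and used to gate the extra output (only the --show_config name mirrors the upstream harness; the rest of this snippet is a made-up stand-in):

```python
import argparse
import json

parser = argparse.ArgumentParser()
parser.add_argument(
    "--show_config",
    action="store_true",
    default=False,
    help="If set, print the full task/model config at the end of the evaluation.",
)
args = parser.parse_args(["--show_config"])

results = {"results": {}, "config": {"model": "example", "device": "hpu"}}
if args.show_config:
    # The config block can be very long; printing it only on request keeps the
    # HPU-related summary readable in the console output.
    print(json.dumps(results, indent=2))
```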
Upgraded run_lm_eval.py

Results
SW: 1.19.0
Optimum-habana: 1.16.0dev
HW: Gaudi2

From Examples
Previous results:
This PR:
Previous results:
This PR:

From Running with FP8
Previous results:
This PR: