Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update clip.cpp #9482

Closed
wants to merge 7 commits into from
Closed

Conversation

Tejaakshaykumar
Copy link

@Tejaakshaykumar Tejaakshaykumar commented Sep 14, 2024

Added a comprehensive validation check to ensure that critical hyperparameters (hidden_size, n_head, n_layer, image_size, patch_size, projection_dim, eps) are not set to invalid or zero values.

a throw statement to handle invalid hyperparameter values by raising an invalid_argument exception.

Enhanced the error message within the catch block to log the specific exception message, aiding in debugging.

This is my first contribution in GitHub, and I'm excited to be part of the community!

#7073

Added a comprehensive validation check to ensure that critical hyperparameters (hidden_size, n_head, n_layer, image_size, patch_size, projection_dim, eps) are not set to invalid or zero values.

 a throw statement to handle invalid hyperparameter values by raising an invalid_argument exception.

Enhanced the error message within the catch block to log the specific exception message, aiding in debugging.
Copy link
Collaborator

@Galunid Galunid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Positive hparams.patch_size will cause the program to fail, same for other fields without == 0.

throw std::invalid_argument("Invalid hyperparameter values");
}
} catch (const std::exception& e) {
std::cerr << "Error while loading hyperparameters: " << e.what() << std::endl;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Welcome to the project -- we're happy to have you here!

For consistency with the rest of the codebase, please use fprintf instead of output streams.

Suggested change
std::cerr << "Error while loading hyperparameters: " << e.what() << std::endl;
fprintf(stderr, "Error while loading hyperparameters: %s\n", e.what());

Tejaakshaykumar and others added 2 commits September 17, 2024 15:13
Used fprintf instead of output streams.

Co-authored-by: Clint Herron <[email protected]>
Removed the try-catch block and throw statements
Added validation checks for all hyperparameters, ensuring they are not set to invalid or zero values.
Used fprintf to log error messages when invalid hyperparameters are encountered.
@Tejaakshaykumar
Copy link
Author

Positive hparams.patch_size will cause the program to fail, same for other fields without == 0.

I wanted to clarify your comment about hparams.patch_size and the other fields. Should these fields always be non-zero, or are there cases where a positive value might cause issues?

I've added the == 0 checks but want to ensure I'm handling them correctly. Any further insight would be appreciated!

@HanClinto
Copy link
Collaborator

So generally, one thing that strikes me about this PR is that the new failure mode doesn't add a ton of information. Whether the program fails early, or whether it fails late, it still is a pretty generic fail message, and while it does give you a bit of new information (the user has an invalid hyperparameter), it doesn't tell you which parameter is out of bounds, nor what the valid range is.

A robust solution would perhaps add bounds-checks to individual parameters (to add min and/or max values), and check each of them (perhaps with a #define macro to save time), but as it is, I wonder if there's an elegant way to inform the user which parameter is causing the problem and assist them more in fixing it.

I'm not interested in adding a ton of code, and I'm willing to admit that my dream of an elegant and generic one-size-fits-all-cases solution to this problem may not exist.

As someone who isn't terribly familiar with clip models, would you be willing to share some of the motivation behind this change? I.E., what situation did you run into that prompted this change?

Thank you!

Copy link
Collaborator

@Galunid Galunid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to clarify your comment about hparams.patch_size and the other fields. Should these fields always be non-zero, or are there cases where a positive value might cause issues?

The issue was that in most languages positive values evaluate to true in if statements. In your previous code

if (hparams.hidden_size == 0 || hparams.n_head == 0 || hparams.n_layer == 0 || hparams.n_intermediate || hparams.image_size == 0 || hparams.patch_size || hparams.projection_dim || hparams.eps)

hparams.n_intermediate for example was lacking any sort of comparison, which means it was converted to boolean value. That means if it was bigger than 0 it would become true, and if it was lower or equal to 0 it would become false. Obviously it is incorrect. I think what you want to check is whether all of those values are <= 0.

n_head can be a zero for non-transformer models, but clip doesn't support those so it's fine.

You also have a leftover bracket from try/catch block you removed, and the indentation is incorrect. Please note that in this case try/catch is necessary since models can miss those values, in which case get_u32 will throw an exception.

Another thing is replacing fprintf with LOG_TEE (that's what's used in clip.cpp, llama.cpp uses fprintf) ;)

Just in case, the "request changes" is there to let other maintainers know not to merge this yet and is not meant as a critique of you or the code.

TL;DR
Use <= and bring back try/catch but without throws, replace fprintf with LOG_TEE

@HanClinto
Copy link
Collaborator

Another thing is replacing fprintf with LOG_TEE (that's what's used in clip.cpp, llama.cpp uses fprintf) ;)

Aaah, I was wondering when one should be used vs. the other. Thank you, @Galunid ! :) TIL

@Tejaakshaykumar
Copy link
Author

Thank you both for your feedback!

I appreciate @HanClinto your comments on the clarity of error messages.The motivation for this change was to handle cases where invalid hyperparameters were leading to ambiguous failures, and I’ll focus on making the error reporting more informative to aid in debugging.

Thank you @Galunid for clarifying the issue with the condition checks. I’ll correct the checks to use <= 0 as needed and address the indentation and leftover bracket issues. I’ll also ensure that fprintf is replaced with LOG_TEE to maintain consistency with the rest of the codebase.

I’ll make these updates and resubmit the code. Thanks again for your help :)

used try-catch block.
validation is made by checking all the hyperparameters are are positive and logs an error if they are not.
included detailed logging of hyperparameters if needed for debugging.
@Tejaakshaykumar
Copy link
Author

Hi @HanClinto, @Galunid,

I wanted to follow up on the commit I submitted a few days ago with the changes we discussed. Could you please review it when you have time and let me know if any further updates are needed?

examples/llava/clip.cpp Outdated Show resolved Hide resolved
Copy link
Author

@Tejaakshaykumar Tejaakshaykumar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleted optional logs

Copy link
Collaborator

@Galunid Galunid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any more comments, let's see if someone else has something to add and if not I'll merge

examples/llava/clip.cpp Outdated Show resolved Hide resolved
examples/llava/clip.cpp Outdated Show resolved Hide resolved
Tejaakshaykumar and others added 2 commits September 24, 2024 07:27
added space

Co-authored-by: slaren <[email protected]>
added return type in catch block
Copy link
Collaborator

@slaren slaren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even ignoring the numerous errors in the very few lines of code in this PR, it doesn't even solve the linked issue since there are still many cases that can throw an exception.

@Tejaakshaykumar thank you for taking an interest on contributing to llama.cpp, but this is not the place to find a mentor. These kinds of PRs do not contribute anything of value to the project, but still waste a lot of contributor time. Please, take your time to understand the language and the code that you are working with before submitting a PR. If you are looking for somebody to mentor you, please do that in a discussion or an issue instead of a PR.

Comment on lines +1283 to +1289
LOG_TEE("Error: Invalid hyperparameter values\n");
return false;
}
} catch (const std::exception& e) {
LOG_TEE("Error while loading hyperparameters: %s\n", e.what());
return false;
     }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function returns a pointer, not a boolean.

hparams.patch_size = get_u32(ctx, KEY_PATCH_SIZE);
hparams.projection_dim = get_u32(ctx, format(KEY_PROJ_DIM, "vision"));
hparams.eps = get_f32(ctx, format(KEY_LAYER_NORM_EPS, "vision"));
if (hparams.hidden_size <= 0 || hparams.n_head <= 0 || hparams.n_layer <= 0 || hparams.n_intermediate <= 0 || hparams.image_size <= 0 || hparams.patch_size <= 0 || hparams.projection_dim <= 0 || hparams.eps <= 0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that these checks are necessary.

@slaren slaren closed this Sep 26, 2024
@Tejaakshaykumar Tejaakshaykumar deleted the patch-2 branch September 26, 2024 14:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants