[Trace_rate] If trace_rate value is more than 10 digits it crashes the server. #6153

Closed
Kanupriyagoyal opened this issue Aug 7, 2023 · 6 comments

@Kanupriyagoyal

Kanupriyagoyal commented Aug 7, 2023

Description
When a large “trace_rate” value is sent in the POST body of “/v2/models/gbm_model/trace/setting”, the container stops or crashes, taking the server down with it.

If the application goes down, users cannot access their data.
Such a request can also overload the server and consume all of its resources, which makes the service unavailable.

Triton Information
r23.04

Are you using the Triton container or did you build it yourself?
Using same instructions as build.py

To Reproduce
Steps to reproduce the behavior.

  1. Invoke the "POST /v2/models/gbm_model/trace/setting" API with the "trace_rate" parameter set to "1000" and observe that the response shows "trace_rate" updated to "1000".
  2. Change the "trace_rate" value to "10000000000" and send the request. Because the value has more than 10 digits, the request fails.
  3. After this, the container breaks and the APIs become inaccessible.
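
The steps above can be sketched as a small script. This is only a minimal sketch, not the reporter's actual client: the host and port (`localhost:8000`) and the helper name `build_trace_request` are assumptions, while the endpoint path, model name, and values come from the steps. The requests are constructed but not sent.

```python
import json
import urllib.request

# Assumed address of the Triton HTTP endpoint; adjust for your deployment.
BASE_URL = "http://localhost:8000"


def build_trace_request(model: str, trace_rate: str) -> urllib.request.Request:
    """Build (but do not send) a POST that updates trace_rate for a model.

    Hypothetical helper written for this illustration.
    """
    body = json.dumps({"trace_rate": trace_rate}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v2/models/{model}/trace/setting",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Step 1: a value the server accepts.
ok_request = build_trace_request("gbm_model", "1000")
# Step 2: the value reported to bring the server down.
bad_request = build_trace_request("gbm_model", "10000000000")

# To actually send one of them: urllib.request.urlopen(ok_request)
```

Sending `bad_request` against an affected build is expected to reproduce the crash, so try it only on a disposable container.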

Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well).

Expected behavior
A clear and concise description of what you expected to happen.
The application should validate incoming requests. If the server has a limit, it should return a proper error description to the user and should not process the request further.
If number of zeros == 9:
Screenshot 2023-08-07 at 3 34 44 PM
If number of zeros == 10 or number of digits more than 10:
Screenshot 2023-08-07 at 3 34 56 PM

@pradghos

pradghos commented Aug 7, 2023

@dyastremsky: Any suggestions would help. Thank you!

@GuanLuo
Contributor

GuanLuo commented Aug 7, 2023

Probably due to overflow: 10^10 takes more than 32 bits, and the type for storing the rate is uint32_t. CC @oandreeva-nv on whether it can be a quick fix.
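
The arithmetic behind that remark can be checked quickly. This is a rough sketch in Python rather than the server's C++; the variable names are made up for illustration, and the wraparound line only shows what a silent modulo-2^32 truncation would give. The thread does not say whether the server wraps or fails in some other way.

```python
UINT32_MAX = 2**32 - 1  # largest value a uint32_t can hold: 4294967295

accepted = 10**9   # 1 followed by 9 zeros, the case that worked
rejected = 10**10  # 1 followed by 10 zeros, the case that crashed

fits_accepted = accepted <= UINT32_MAX  # True: 10^9 fits in 32 bits
fits_rejected = rejected <= UINT32_MAX  # False: 10^10 does not fit
bits_needed = rejected.bit_length()     # 34 bits are needed for 10^10

# Illustration only: the value a silent wraparound modulo 2**32 would produce.
wrapped = rejected % 2**32              # 1410065408
```

This matches the two screenshots in the report: nine zeros stays under the uint32_t maximum, ten zeros goes over it.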

@oandreeva-nv
Contributor

I'll take a look

@oandreeva-nv
Contributor

Hi @pradghos and @Kanupriyagoyal, could you please briefly describe your workload and what ranges you would like to see supported by our Trace APIs? See https://github.com/triton-inference-server/server/blob/main/docs/user_guide/trace.md#global-settings

I can certainly fix this issue for trace_rate alone, but I would like to make sure that the other parameters can also support your workflow, if needed.

@pradghos

pradghos commented Aug 8, 2023

@oandreeva-nv Thanks for looking! The workload and ranges will be very generic, as with any usage. We would expect out-of-range cases to be handled gracefully.
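
One way to read "handled gracefully" is a range check that runs before the value is stored. The sketch below is hypothetical and is not the code from the eventual fix: `parse_trace_rate` is an invented name, and both the uint32_t upper bound (taken from the comment above) and the lower bound of 0 are assumptions.

```python
UINT32_MAX = 2**32 - 1  # assumed upper bound, based on the uint32_t storage type


def parse_trace_rate(raw: str) -> int:
    """Return raw as an int in [0, UINT32_MAX], or raise ValueError.

    Hypothetical helper: a caller would turn the ValueError into an
    error response instead of letting the process terminate.
    """
    text = raw.strip()
    # Accept plain ASCII digits only; signs, decimals, and empty input are rejected.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"trace_rate must be a non-negative integer, got {raw!r}")
    value = int(text)
    if value > UINT32_MAX:
        raise ValueError(
            f"trace_rate {value} is out of range (maximum {UINT32_MAX})"
        )
    return value
```

With a check like this, "1000" is accepted and "10000000000" produces an error message the client can read, rather than a stopped container.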

@oandreeva-nv
Contributor

Fixed with this PR: #6173
