Global cost tracking #28
Conversation
This is a great feature. I've added a few comments requesting docstrings so future users can easily understand how to use it, and I'm approving it.
It works well for now, but it could become misleading if we forget to include the cost_tracker in any new features.
llmclient/cost_tracker.py (Outdated)

```python
TRACK_COSTS = contextvars.ContextVar[bool]("track_costs", default=False)
REPORT_EVERY_USD = 1.0
```
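For readers less familiar with contextvars, a toggle of this shape is typically flipped for a scoped block of work. This is a usage sketch, not code from the PR:

```python
# Enable cost tracking inside a scope, then restore the previous value.
token = TRACK_COSTS.set(True)
try:
    ...  # API calls made here will see TRACK_COSTS.get() == True
finally:
    TRACK_COSTS.reset(token)
```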
A few comments here:
- What do you think of making it a ClassVar of CostTracker? We still get global state, but it's less awkward than the global and setters/getters
- Can you make this name more intuitive, and add units to it? It's unclear whether:
  - It's a frequency (Hz)
  - It's a dollar threshold (USD)
I'm confused on the last bullet - _USD is the units - how would you describe it?
I've renamed set_reporting_frequency -> set_reporting_threshold to remove the ambiguity.
And made both ClassVars.
Actually I went further and made them instance variables - no reason not to.
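To make the thread above concrete, here is a minimal sketch of the shape CostTracker might end up with after these changes. The attribute names and the threshold check come from the diffs quoted in this conversation; the record method name and constructor details are assumptions:

```python
import logging

logger = logging.getLogger(__name__)


class CostTracker:
    def __init__(self) -> None:
        self.lifetime_cost_usd = 0.0
        self.last_report = 0.0
        # A dollar threshold between log reports, not a frequency in Hz
        self.report_every_usd = 1.0

    def set_reporting_threshold(self, threshold_usd: float) -> None:
        self.report_every_usd = threshold_usd

    def record(self, cost_usd: float) -> None:  # method name is an assumption
        self.lifetime_cost_usd += cost_usd
        if self.lifetime_cost_usd - self.last_report > self.report_every_usd:
            logger.info(
                f"Cumulative llmclient API call cost: ${self.lifetime_cost_usd:.8f}"
            )
            self.last_report = self.lifetime_cost_usd
```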
```diff
@@ -73,3 +74,10 @@ def fixture_reset_log_levels(caplog) -> Iterator[None]:
         logger = logging.getLogger(name)
         logger.setLevel(logging.NOTSET)
         logger.propagate = True
+
+
+class CILLMModelNames(StrEnum):
```
We ought to just put this in llmclient as one enum:

```python
class CommonLLMNames(StrEnum):
    # Use these for model defaults
    OPENAI_GENERAL = "gpt-4o-2024-08-06"  # Cheap, fast, and decent

    # Use these in unit testing
    OPENAI_TEST = "gpt-4o-mini-2024-07-18"  # Cheap and not OpenAI's cutting edge
    ANTHROPIC_TEST = "claude-3-haiku-20240307"  # Cheap and not Anthropic's cutting edge
```

Then both the app and unit tests will just use CommonLLMNames.
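Since StrEnum members are themselves strings, call sites would not need .value. A hypothetical usage, assuming the enum lands in llmclient:

```python
from llmclient import CommonLLMNames  # import path is an assumption

# StrEnum members compare equal to (and can be passed as) plain strings
assert CommonLLMNames.OPENAI_TEST == "gpt-4o-mini-2024-07-18"
```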
I'll leave that for another PR
I did this in #30
```python
if self.lifetime_cost_usd - self.last_report > self.report_every_usd:
    logger.info(
        f"Cumulative llmclient API call cost: ${self.lifetime_cost_usd:.8f}"
```
f"Cumulative llmclient API call cost: ${self.lifetime_cost_usd:.8f}" | |
f"Cumulative client API call cost: ${self.lifetime_cost_usd:.8f}" |
We will eventually maybe rename from llmclient
, and maybe do things besides just LLMs (e.g. embeddings), so let's just be generally worded here
This was intentional wording - the cost tracker only tracks llmclient costs right now.
```python
image = np.zeros((32, 32, 3), dtype=np.uint8)
image[:] = [255, 0, 0]
```
Suggested change:

```diff
-image = np.zeros((32, 32, 3), dtype=np.uint8)
-image[:] = [255, 0, 0]
+image = np.full((32, 32, 3), [255, 0, 0], dtype=np.uint8)
```
Please feel free to ignore this too
Yeah I didn't add this code - just mirroring what's in the other tests.
text="What color is this square? Show me your chain of reasoning.", | ||
images=image, | ||
), | ||
] # TODO: It's not decoding the image. It's trying to guess the color from the encoded image string. |
Can you:
- Move this TODO to not be at the end of a list
- Clarify what "it" is, when you say "it's not decoding"
As above - I copied this from test_llms.py
```python
{
    "model_list": [
        {
            "model_name": "gpt-4o-mini",
```
Can we use CILLMModelNames.OPENAI here?
As above - I copied this from test_llms.py
I'd like to have a way to track costs across all LLM API usage - including agents and environments. This PR implements that.

I made a few semi-arbitrary choices here - there are other ways of implementing this and I'm open to suggestions.

First: I decided to capture costs at the lowest level (i.e. all calls to acompletion, achat, etc.). We could alternatively only support functions that return LLMResult and use LLMResult.cost, but we'd run the risk of not catching everything.

Second: litellm async streaming calls don't return AsyncGenerators (fixed our type hints), but instead CustomStreamWrappers, which implement __aiter__ and __anext__. I had to create a shim called TrackedStreamWrapper to handle this - open to alternative suggestions (see the comment in the code about why async for ... yield doesn't work).
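Since TrackedStreamWrapper is only described in prose here, a minimal sketch of the shim idea follows. Everything except the class name and the __aiter__/__anext__ protocol is an assumption, and the PR's actual code comment should be consulted for why an async for ... yield generator doesn't work:

```python
from typing import Any


class TrackedStreamWrapper:
    """Wrap litellm's CustomStreamWrapper so streamed chunks can be cost-tracked."""

    def __init__(self, stream: Any) -> None:
        self.stream = stream  # the underlying litellm CustomStreamWrapper

    def __aiter__(self) -> "TrackedStreamWrapper":
        return self

    async def __anext__(self) -> Any:
        try:
            return await self.stream.__anext__()
        except StopAsyncIteration:
            # Hypothetical hook: record the accumulated cost once the stream ends
            raise
```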