
Global cost tracking #28

Merged: 6 commits into main, Jan 2, 2025
Conversation

@sidnarayanan (Contributor) commented on Dec 31, 2024:

I'd like to have a way to track costs across all LLM API usage - including agents and environments. This PR implements that.

I made a few semi-arbitrary choices here - there are other ways of implementing this and I'm open to suggestions.

First: I decided to capture costs at the lowest level (i.e., all calls to acompletion, achat, etc.). We could alternatively support only functions that return LLMResult and use LLMResult.cost, but we'd run the risk of not catching everything.
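The lowest-level capture described above could look roughly like this. This is a minimal sketch, not the PR's actual code: `TRACK_COSTS` is quoted from the diff below, but `CostTracker`'s internals, `GLOBAL_COST_TRACKER`, `track_costs`, and the `get_cost` callback are illustrative assumptions.

```python
import contextvars
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")

# Context variable gating whether costs are recorded (name from the PR diff).
TRACK_COSTS = contextvars.ContextVar[bool]("track_costs", default=False)


class CostTracker:
    """Accumulates API spend across all calls (sketch; internals assumed)."""

    def __init__(self) -> None:
        self.lifetime_cost_usd = 0.0

    def record(self, cost_usd: float) -> None:
        if TRACK_COSTS.get():
            self.lifetime_cost_usd += cost_usd


GLOBAL_COST_TRACKER = CostTracker()


def track_costs(
    func: Callable[..., Awaitable[T]],
    get_cost: Callable[[T], float],
) -> Callable[..., Awaitable[T]]:
    """Wrap a low-level async call (acompletion, achat, ...) to record its cost."""

    async def wrapped(*args, **kwargs) -> T:
        result = await func(*args, **kwargs)
        GLOBAL_COST_TRACKER.record(get_cost(result))
        return result

    return wrapped
```

Because a ContextVar is used rather than a plain global flag, tracking can be enabled per-context without leaking across unrelated async tasks.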

Second: litellm async streaming calls don't return AsyncGenerators (I fixed our type hints accordingly), but rather CustomStreamWrappers, which implement __aiter__ and __anext__. I had to create a shim called TrackedStreamWrapper to handle this; open to alternative suggestions (see the comment in the code about why async for ... yield doesn't work).
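A minimal sketch of such a shim. Only the name TrackedStreamWrapper and the __aiter__/__anext__ protocol are from the PR; how the cost is obtained from the underlying stream (here a hypothetical `cost_usd` attribute) is an assumption, and the PR's actual rationale for avoiding `async for ... yield` lives in its code comments.

```python
class TrackedStreamWrapper:
    """Async-iterable shim around a stream such as litellm's CustomStreamWrapper.

    A plain `async for ... yield` wrapper would return an async generator,
    changing the declared return type and hiding the wrapped object's
    attributes, so a class implementing the async-iterator protocol is used.
    """

    def __init__(self, stream, tracker):
        self.stream = stream  # the wrapped CustomStreamWrapper
        self.tracker = tracker

    def __aiter__(self):
        return self

    async def __anext__(self):
        try:
            return await self.stream.__anext__()
        except StopAsyncIteration:
            # Stream exhausted: record the accumulated cost, then re-raise.
            # `cost_usd` is a placeholder for however cost is extracted.
            self.tracker.record(getattr(self.stream, "cost_usd", 0.0))
            raise
```

Callers iterate the wrapper exactly as they would the raw stream, and the cost is recorded once, when the stream is exhausted.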

@sidnarayanan sidnarayanan requested review from maykcaldas and a team December 31, 2024 04:25
@maykcaldas (Collaborator) left a comment:

This is a great feature. I’ve added a few comments requesting docstrings to ensure future users can easily understand how to implement it. And I’m approving it.

It works well for now, but it could become misleading if we overlook including the cost_tracker in any new features.

llmclient/cost_tracker.py (resolved review threads)


TRACK_COSTS = contextvars.ContextVar[bool]("track_costs", default=False)
REPORT_EVERY_USD = 1.0
Contributor:

A few comments here:

  • What do you think of making it a ClassVar of CostTracker? We still get global state, but it's less awkward than a module-level global with setters/getters.
  • Can you make this name more intuitive, and add units to it? It's unclear whether:
    • It's a frequency (Hz)
    • It's a dollar threshold (USD)

@sidnarayanan (Author):

I'm confused about the last bullet - _USD is the units - how would you describe it?

@sidnarayanan (Author):

I've renamed set_reporting_frequency -> set_reporting_threshold to remove the ambiguity.

@sidnarayanan (Author):

And made both ClassVars.

@sidnarayanan (Author):

Actually I went further and made them instance variables - no reason not to.
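Putting this thread together, the instance-variable version might look something like the following. It is a sketch assembled from the snippets quoted in this PR; `set_reporting_threshold`, `lifetime_cost_usd`, `last_report`, and the reporting condition appear in the thread, while the `record` method name and the rest of the wiring are assumptions.

```python
import logging

logger = logging.getLogger(__name__)


class CostTracker:
    def __init__(self) -> None:
        self.lifetime_cost_usd = 0.0
        self.last_report = 0.0
        # A dollar threshold (USD) between log reports, not a frequency.
        self.report_every_usd = 1.0

    def set_reporting_threshold(self, threshold_usd: float) -> None:
        self.report_every_usd = threshold_usd

    def record(self, cost_usd: float) -> None:
        self.lifetime_cost_usd += cost_usd
        if self.lifetime_cost_usd - self.last_report > self.report_every_usd:
            logger.info(
                f"Cumulative llmclient API call cost: ${self.lifetime_cost_usd:.8f}"
            )
            self.last_report = self.lifetime_cost_usd
```

With instance variables, separate trackers can coexist (e.g. one per test), which is harder with ClassVars or module-level globals.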

llmclient/cost_tracker.py (resolved review thread)
@@ -73,3 +74,10 @@ def fixture_reset_log_levels(caplog) -> Iterator[None]:
logger = logging.getLogger(name)
logger.setLevel(logging.NOTSET)
logger.propagate = True


class CILLMModelNames(StrEnum):
Contributor:

We ought to just put this in llmclient as one enum:

class CommonLLMNames(StrEnum):
    # Use these for model defaults
    OPENAI_GENERAL = "gpt-4o-2024-08-06"  # Cheap, fast, and decent

    # Use these in unit testing
    OPENAI_TEST = "gpt-4o-mini-2024-07-18" # Cheap and not OpenAI's cutting edge
    ANTHROPIC_TEST = "claude-3-haiku-20240307" # Cheap and not Anthropic's cutting edge

Then both the app and unit tests will just use CommonLLMNames

@sidnarayanan (Author):

I'll leave that for another PR.

Contributor:

I did this in #30


if self.lifetime_cost_usd - self.last_report > self.report_every_usd:
logger.info(
f"Cumulative llmclient API call cost: ${self.lifetime_cost_usd:.8f}"
Contributor:

Suggested change:
- f"Cumulative llmclient API call cost: ${self.lifetime_cost_usd:.8f}"
+ f"Cumulative client API call cost: ${self.lifetime_cost_usd:.8f}"

We may eventually rename from llmclient, and may do things besides just LLMs (e.g. embeddings), so let's use more general wording here.

@sidnarayanan (Author):

This was intentional wording - the cost tracker only tracks llmclient costs right now.

Comment on lines +95 to +96
image = np.zeros((32, 32, 3), dtype=np.uint8)
image[:] = [255, 0, 0]
Contributor:

Suggested change:
- image = np.zeros((32, 32, 3), dtype=np.uint8)
- image[:] = [255, 0, 0]
+ image = np.full((32, 32, 3), [255, 0, 0], dtype=np.uint8)

Please feel free to ignore this too
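For reference, the two forms are equivalent: `np.full` accepts an array-like fill value and broadcasts it across the output shape in a single call.

```python
import numpy as np

# Original: allocate zeros, then broadcast-assign the red pixel value.
a = np.zeros((32, 32, 3), dtype=np.uint8)
a[:] = [255, 0, 0]

# Suggested: same result in one call.
b = np.full((32, 32, 3), [255, 0, 0], dtype=np.uint8)

assert np.array_equal(a, b)
```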

@sidnarayanan (Author):

Yeah I didn't add this code - just mirroring what's in the other tests.

text="What color is this square? Show me your chain of reasoning.",
images=image,
),
] # TODO: It's not decoding the image. It's trying to guess the color from the encoded image string.
Contributor:

Can you:

  • Move this TODO to not be at the end of a list
  • Clarify what "it" is, when you say "it's not decoding"

@sidnarayanan (Author):

As above - I copied this from test_llms.py

{
"model_list": [
{
"model_name": "gpt-4o-mini",
Contributor:

Can we use CILLMModelNames.OPENAI here?

@sidnarayanan (Author):

As above - I copied this from test_llms.py

@sidnarayanan sidnarayanan merged commit ab90c31 into main Jan 2, 2025
6 checks passed
@sidnarayanan sidnarayanan deleted the cost-tracking branch January 2, 2025 19:08