
feat(anthropic): update model params + better max_token handling #151

Merged (2 commits) on Dec 16, 2024

0xMochan
Contributor

Updates the model constants to point to the latest models (which didn't exist before). Also adds a function that calculates a default max token size from the model name when none is provided.

Rant

Anthropic unfortunately has two annoying thorns:

  • Anthropic's model docs only list the "latest" snapshot for each available model. They also only recently added -latest as an alias that overrides the specific snapshot number, and only for the top models. This means the constants can't reliably keep a history of all available snapshots, even though the docs recommend using snapshots for stability.

    • The current recommendation from Rig should be to use -latest for testing but pin a specific snapshot of a model in production for stability.
  • Anthropic's messages endpoint requires a max_tokens argument, unlike other providers. This is even more frustrating because the required max_tokens has a different cap for each model (and specifying too high a number causes the request to fail).

    • Hardcoding a max_tokens value for each exact model name is a non-starter, since users on specific snapshot models (as the docs recommend) wouldn't match.
    • Using a lower cap like 4096 everywhere would cut off half of the available token space for the most common models.
    • Requiring max_tokens to be specified at compile time (on AgentBuilder, and manually when creating CompletionRequestBuilders) is also tough, because it would require some really ugly refactoring to enforce that these builders can only build for Anthropic clients (basically a custom AnthropicAgentBuilder and an AnthropicCompletionRequestBuilder).
    • That approach is theoretically better (it trades code duplication for the best compile-time DX), but this area might get refactored soon and I'd rather not add more trouble to that implementation.

The solution to the last thorn is to match the beginning of the model string against the known model names and hardcode a default max token size based on that. The user can override this by specifying max_tokens on AgentBuilder, etc. agent.completion would return an error if a default max token size cannot be determined, most likely because of an invalid Anthropic model name (one that probably doesn't exist).
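A rough sketch of the prefix-matching idea, for illustration only. The function names (`default_max_tokens`, `resolve_max_tokens`), the prefix table, and the 8192/4096 caps are assumptions made for this example and may not match the code in the PR or Anthropic's current limits:

```rust
/// Hypothetical helper: pick a default `max_tokens` from the model-name prefix.
/// Prefixes and caps are illustrative assumptions, not confirmed values.
fn default_max_tokens(model: &str) -> Option<u64> {
    // (model-name prefix, assumed output-token cap)
    const DEFAULTS: &[(&str, u64)] = &[
        ("claude-3-5-sonnet", 8192),
        ("claude-3-5-haiku", 8192),
        ("claude-3-opus", 4096),
        ("claude-3-sonnet", 4096),
        ("claude-3-haiku", 4096),
    ];

    // Prefix matching means both `-latest` and dated snapshot names resolve.
    DEFAULTS
        .iter()
        .find(|(prefix, _)| model.starts_with(*prefix))
        .map(|(_, cap)| *cap)
}

/// Hypothetical helper: an explicit user value wins; otherwise fall back to
/// the prefix-based default, and fail if neither is available.
fn resolve_max_tokens(model: &str, user_override: Option<u64>) -> Result<u64, String> {
    user_override
        .or_else(|| default_max_tokens(model))
        .ok_or_else(|| {
            format!(
                "max_tokens must be set explicitly for unrecognized model `{}`",
                model
            )
        })
}
```

With this shape, a snapshot name such as `claude-3-5-sonnet-20241022` and the `claude-3-5-sonnet-latest` alias fall under the same prefix, an explicit `max_tokens` always takes precedence, and an unrecognized model name only fails when no explicit value was given.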


There might be a better solution, but I deemed this "good enough" after going in circles a bit.

@cvauclair added this to the v0.6 milestone on Dec 16, 2024
@cvauclair merged commit 5dfa93b into main on Dec 16, 2024
5 checks passed
@github-actions bot mentioned this pull request on Dec 17, 2024