feat(anthropic): update model params + better max_token handling #151

0xMochan · 2024-12-14T02:02:17Z

Updates the model constants to point to the latest models (these constants didn't exist before). Also adds a function that calculates a default max_tokens value from the model name when one is not provided.

Rant

Anthropic unfortunately has two annoying thorns:

1. The Anthropic docs about models only list the "latest" snapshot for each available model. They also only recently added the -latest alias as an override for the specific snapshot number, and only for the top models. This means the constants can't reliably keep a history of all available snapshots, even though the docs recommend using snapshots for stability.
   - The current recommendation by Rig should be to use -latest for testing but pin a specific model snapshot for stability in production.
2. The Anthropic messages endpoint requires a max_tokens argument, unlike other providers. This is even more frustrating because the cap on max_tokens differs per model, and specifying too high a number causes the request to fail.
   - Hardcoding max_tokens for specific model names is a non-starter, since users on specific snapshot models (as the docs recommend) wouldn't match.
   - Using a lower cap like 4096 would cut off half of the available token space for the most common models.
   - Requiring max_tokens to be specified at compile time (on AgentBuilder and manually when creating CompletionRequestBuilders) is also tough, because it would take some really ugly refactoring to enforce that these builders can only build for Anthropic clients (basically a custom AnthropicAgentBuilder and an AnthropicCompletionRequestBuilder).
     - This is theoretically better, since it trades code duplication for the best compile-time DX, but this area might get refactored soon and I'd rather not add more trouble to that implementation.

The solution to the last thorn is to match the beginning of the model string against the known model names and hardcode a default max_tokens value based on that. The user can override this by specifying max_tokens on AgentBuilder, etc. The call errors at agent.completion if a default max_tokens cannot be determined, most likely because of an invalid Anthropic model name (one that probably doesn't exist).
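For reviewers who want the shape of this without opening the diff, here is a rough sketch of the prefix-matching idea. It is not the code from this PR: `default_max_tokens` and `resolve_max_tokens` are hypothetical names, and the prefixes and caps in the table are illustrative assumptions that should be checked against the current Anthropic docs.

```rust
/// Hypothetical helper: pick a default `max_tokens` from the model-name prefix.
/// The prefixes and caps are illustrative assumptions, not values taken from this PR.
fn default_max_tokens(model: &str) -> Option<u64> {
    // (prefix, assumed output-token cap). Matching on the prefix lets dated
    // snapshots and `-latest` aliases share a single entry.
    const DEFAULTS: &[(&str, u64)] = &[
        ("claude-3-5-sonnet", 8192),
        ("claude-3-5-haiku", 8192),
        ("claude-3-opus", 4096),
        ("claude-3-sonnet", 4096),
        ("claude-3-haiku", 4096),
    ];

    DEFAULTS
        .iter()
        .find(|(prefix, _)| model.starts_with(*prefix))
        .map(|&(_, cap)| cap)
}

/// Hypothetical helper: a user-supplied value always wins; otherwise fall back
/// to the prefix-based default, and return an error if neither is available.
fn resolve_max_tokens(model: &str, user_override: Option<u64>) -> Result<u64, String> {
    user_override
        .or_else(|| default_max_tokens(model))
        .ok_or_else(|| {
            format!("no default max_tokens known for model `{model}`; set it explicitly")
        })
}
```

With this shape, `resolve_max_tokens("claude-3-5-sonnet-20241022", None)` falls back to the table entry, while passing `Some(1024)` returns the user's value unchanged. An unknown model name with no override produces the error case described above.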

There might be a better solution, but I deemed this "good enough" after going in circles a bit.

0xMochan added 2 commits December 13, 2024 17:47

feat(anthropic): update model params + better max_token handling

b57c002

test(anthropic): remove max_tokens argument

81ca1e0

cvauclair added this to the v0.6 milestone Dec 16, 2024

cvauclair approved these changes Dec 16, 2024

cvauclair merged commit 5dfa93b into main Dec 16, 2024
5 checks passed

github-actions bot mentioned this pull request Dec 17, 2024

chore: release #144

Merged
