Eval utils #1250

Triggered via pull request December 12, 2024 22:21
Status Success
Total duration 47m 59s
Artifacts

bench.yml

on: pull_request

Annotations

1 warning and 2 notices
benchmark
ubuntu-latest pipelines will use ubuntu-24.04 soon. For more details, see https://github.com/actions/runner-images/issues/10636
Benchmark results: libs/langgraph/eval.py#L1
fanout_to_subgraph_10x: Mean +- std dev: 60.7 ms +- 1.3 ms
fanout_to_subgraph_10x_sync: Mean +- std dev: 51.9 ms +- 0.8 ms
fanout_to_subgraph_10x_checkpoint: Mean +- std dev: 73.9 ms +- 1.2 ms
fanout_to_subgraph_10x_checkpoint_sync: Mean +- std dev: 94.1 ms +- 1.2 ms
fanout_to_subgraph_100x: Mean +- std dev: 617 ms +- 29 ms
fanout_to_subgraph_100x_sync: Mean +- std dev: 509 ms +- 7 ms
fanout_to_subgraph_100x_checkpoint: Mean +- std dev: 778 ms +- 33 ms
fanout_to_subgraph_100x_checkpoint_sync: Mean +- std dev: 934 ms +- 17 ms
react_agent_10x: Mean +- std dev: 30.5 ms +- 0.6 ms
react_agent_10x_sync: Mean +- std dev: 22.8 ms +- 0.4 ms
react_agent_10x_checkpoint: Mean +- std dev: 37.6 ms +- 0.6 ms
react_agent_10x_checkpoint_sync: Mean +- std dev: 36.6 ms +- 0.4 ms
react_agent_100x: Mean +- std dev: 337 ms +- 6 ms
react_agent_100x_sync: Mean +- std dev: 273 ms +- 2 ms
react_agent_100x_checkpoint: Mean +- std dev: 836 ms +- 5 ms
react_agent_100x_checkpoint_sync: Mean +- std dev: 828 ms +- 6 ms
wide_state_25x300: Mean +- std dev: 22.7 ms +- 0.5 ms
wide_state_25x300_sync: Mean +- std dev: 14.6 ms +- 0.1 ms
wide_state_25x300_checkpoint: Mean +- std dev: 273 ms +- 13 ms
wide_state_25x300_checkpoint_sync: Mean +- std dev: 274 ms +- 13 ms
wide_state_15x600: Mean +- std dev: 26.5 ms +- 0.5 ms
wide_state_15x600_sync: Mean +- std dev: 16.8 ms +- 0.3 ms
wide_state_15x600_checkpoint: Mean +- std dev: 458 ms +- 14 ms
wide_state_15x600_checkpoint_sync: Mean +- std dev: 460 ms +- 14 ms
wide_state_9x1200: Mean +- std dev: 25.9 ms +- 0.7 ms
wide_state_9x1200_sync: Mean +- std dev: 16.8 ms +- 0.3 ms
wide_state_9x1200_checkpoint: Mean +- std dev: 302 ms +- 15 ms
wide_state_9x1200_checkpoint_sync: Mean +- std dev: 300 ms +- 12 ms
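
Each result above has the shape `name: Mean +- std dev: <mean> <unit> +- <std> <unit>`. For post-processing, a minimal sketch of a parser for that shape might look like the following. The regex, the `parse_results` helper, and the unit table are assumptions made for illustration (only `ms` appears in this run); none of this is part of pyperf's API.

```python
import re

# Hypothetical pattern (not from pyperf): matches one result entry such as
# "fanout_to_subgraph_10x: Mean +- std dev: 60.7 ms +- 1.3 ms".
RESULT_RE = re.compile(
    r"(?P<name>\w+): Mean \+- std dev: "
    r"(?P<mean>[\d.]+) (?P<mean_unit>\w+) \+- "
    r"(?P<std>[\d.]+) (?P<std_unit>\w+)"
)

# Assumed unit labels and their conversion factors to milliseconds.
TO_MS = {"ns": 1e-6, "us": 1e-3, "ms": 1.0, "sec": 1e3}


def parse_results(text):
    """Return {benchmark name: (mean in ms, std dev in ms)} for every entry in text."""
    results = {}
    for m in RESULT_RE.finditer(text):
        results[m["name"]] = (
            float(m["mean"]) * TO_MS[m["mean_unit"]],
            float(m["std"]) * TO_MS[m["std_unit"]],
        )
    return results


# Two entries copied from the results above.
sample = (
    "fanout_to_subgraph_10x: Mean +- std dev: 60.7 ms +- 1.3 ms\n"
    "react_agent_100x_sync: Mean +- std dev: 273 ms +- 2 ms\n"
)
parsed = parse_results(sample)
```

Because the pattern is applied with `finditer`, it does not depend on each entry sitting on its own line, so it also tolerates the progress dots that the benchmark runner prints between entries.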
Comparison against main: libs/langgraph/eval.py#L1
+-----------------------------------------+---------+-----------------------+
| Benchmark                               | main    | changes               |
+=========================================+=========+=======================+
| wide_state_15x600_checkpoint            | 487 ms  | 458 ms: 1.06x faster  |
+-----------------------------------------+---------+-----------------------+
| wide_state_9x1200_checkpoint            | 317 ms  | 302 ms: 1.05x faster  |
+-----------------------------------------+---------+-----------------------+
| wide_state_9x1200                       | 26.9 ms | 25.9 ms: 1.04x faster |
+-----------------------------------------+---------+-----------------------+
| wide_state_25x300_checkpoint            | 283 ms  | 273 ms: 1.04x faster  |
+-----------------------------------------+---------+-----------------------+
| wide_state_15x600_checkpoint_sync       | 475 ms  | 460 ms: 1.03x faster  |
+-----------------------------------------+---------+-----------------------+
| wide_state_15x600_sync                  | 17.3 ms | 16.8 ms: 1.03x faster |
+-----------------------------------------+---------+-----------------------+
| wide_state_9x1200_checkpoint_sync       | 307 ms  | 300 ms: 1.03x faster  |
+-----------------------------------------+---------+-----------------------+
| wide_state_9x1200_sync                  | 17.2 ms | 16.8 ms: 1.02x faster |
+-----------------------------------------+---------+-----------------------+
| fanout_to_subgraph_100x_checkpoint_sync | 950 ms  | 934 ms: 1.02x faster  |
+-----------------------------------------+---------+-----------------------+
| fanout_to_subgraph_10x_checkpoint       | 75.1 ms | 73.9 ms: 1.02x faster |
+-----------------------------------------+---------+-----------------------+
| wide_state_15x600                       | 26.8 ms | 26.5 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| react_agent_100x                        | 342 ms  | 337 ms: 1.01x faster  |
+-----------------------------------------+---------+-----------------------+
| react_agent_10x                         | 30.9 ms | 30.5 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| fanout_to_subgraph_10x_checkpoint_sync  | 95.2 ms | 94.1 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| wide_state_25x300                       | 23.0 ms | 22.7 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| react_agent_10x_checkpoint              | 38.1 ms | 37.6 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| react_agent_10x_checkpoint_sync         | 37.0 ms | 36.6 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| fanout_to_subgraph_10x                  | 61.3 ms | 60.7 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| react_agent_100x_checkpoint_sync        | 835 ms  | 828 ms: 1.01x faster  |
+-----------------------------------------+---------+-----------------------+
| wide_state_25x300_sync                  | 14.7 ms | 14.6 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| react_agent_100x_sync                   | 275 ms  | 273 ms: 1.01x faster  |
+-----------------------------------------+---------+-----------------------+
| fanout_to_subgraph_10x_sync             | 52.2 ms | 51.9 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| react_agent_100x_checkpoint             | 840 ms  | 836 ms: 1.00x faster  |
+-----------------------------------------+---------+-----------------------+
| react_agent_10x_sync                    | 22.9 ms | 22.8 ms: 1.00x faster |
+-----------------------------------------+---------+-----------------------+
| fanout_to_subgraph_100x_checkpoint      | 769 ms  | 778 ms: 1.01x slower  |
+---------------------------------------