Eval utils #1250
Triggered via pull request: December 12, 2024 22:21
Status: Success
Total duration: 47m 59s
Artifacts: –
Annotations
1 warning and 2 notices
benchmark
ubuntu-latest pipelines will use ubuntu-24.04 soon. For more details, see https://github.com/actions/runner-images/issues/10636
Benchmark results:
libs/langgraph/eval.py#L1
fanout_to_subgraph_10x: Mean +- std dev: 60.7 ms +- 1.3 ms
fanout_to_subgraph_10x_sync: Mean +- std dev: 51.9 ms +- 0.8 ms
fanout_to_subgraph_10x_checkpoint: Mean +- std dev: 73.9 ms +- 1.2 ms
fanout_to_subgraph_10x_checkpoint_sync: Mean +- std dev: 94.1 ms +- 1.2 ms
fanout_to_subgraph_100x: Mean +- std dev: 617 ms +- 29 ms
fanout_to_subgraph_100x_sync: Mean +- std dev: 509 ms +- 7 ms
fanout_to_subgraph_100x_checkpoint: Mean +- std dev: 778 ms +- 33 ms
fanout_to_subgraph_100x_checkpoint_sync: Mean +- std dev: 934 ms +- 17 ms
react_agent_10x: Mean +- std dev: 30.5 ms +- 0.6 ms
react_agent_10x_sync: Mean +- std dev: 22.8 ms +- 0.4 ms
react_agent_10x_checkpoint: Mean +- std dev: 37.6 ms +- 0.6 ms
react_agent_10x_checkpoint_sync: Mean +- std dev: 36.6 ms +- 0.4 ms
react_agent_100x: Mean +- std dev: 337 ms +- 6 ms
react_agent_100x_sync: Mean +- std dev: 273 ms +- 2 ms
react_agent_100x_checkpoint: Mean +- std dev: 836 ms +- 5 ms
react_agent_100x_checkpoint_sync: Mean +- std dev: 828 ms +- 6 ms
wide_state_25x300: Mean +- std dev: 22.7 ms +- 0.5 ms
wide_state_25x300_sync: Mean +- std dev: 14.6 ms +- 0.1 ms
wide_state_25x300_checkpoint: Mean +- std dev: 273 ms +- 13 ms
wide_state_25x300_checkpoint_sync: Mean +- std dev: 274 ms +- 13 ms
wide_state_15x600: Mean +- std dev: 26.5 ms +- 0.5 ms
wide_state_15x600_sync: Mean +- std dev: 16.8 ms +- 0.3 ms
wide_state_15x600_checkpoint: Mean +- std dev: 458 ms +- 14 ms
wide_state_15x600_checkpoint_sync: Mean +- std dev: 460 ms +- 14 ms
wide_state_9x1200: Mean +- std dev: 25.9 ms +- 0.7 ms
wide_state_9x1200_sync: Mean +- std dev: 16.8 ms +- 0.3 ms
wide_state_9x1200_checkpoint: Mean +- std dev: 302 ms +- 15 ms
wide_state_9x1200_checkpoint_sync: Mean +- std dev: 300 ms +- 12 ms
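The result lines above share one fixed shape, so they can be pulled into a dictionary for further analysis. The following is a minimal sketch, not part of the CI job: the helper names `parse_results` and `relative_spread` are hypothetical, and the unit table assumes the tool prints only `ns`, `us`, `ms`, or `sec`.

```python
import re

# Matches lines such as "react_agent_10x: Mean +- std dev: 30.5 ms +- 0.6 ms".
LINE_RE = re.compile(
    r"^(?P<name>\w+): Mean \+- std dev: "
    r"(?P<mean>[\d.]+) (?P<mean_unit>\w+) \+- "
    r"(?P<std>[\d.]+) (?P<std_unit>\w+)$"
)

# Conversion factors to milliseconds (assumed set of unit suffixes).
TO_MS = {"ns": 1e-6, "us": 1e-3, "ms": 1.0, "sec": 1e3}


def parse_results(text):
    """Return {benchmark name: (mean_ms, std_ms)} for every result line."""
    results = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a result line
        mean_ms = float(match["mean"]) * TO_MS[match["mean_unit"]]
        std_ms = float(match["std"]) * TO_MS[match["std_unit"]]
        results[match["name"]] = (mean_ms, std_ms)
    return results


def relative_spread(mean_ms, std_ms):
    """Standard deviation as a fraction of the mean."""
    return std_ms / mean_ms


sample = """\
fanout_to_subgraph_100x: Mean +- std dev: 617 ms +- 29 ms
react_agent_10x_sync: Mean +- std dev: 22.8 ms +- 0.4 ms
"""

results = parse_results(sample)
for name, (mean_ms, std_ms) in results.items():
    spread = relative_spread(mean_ms, std_ms)
    print(f"{name}: {mean_ms:g} ms, std/mean = {spread:.1%}")
```

The std/mean ratio gives a rough sense of run-to-run noise for each benchmark; differences between two runs that are much smaller than this spread are hard to tell apart from noise.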
Comparison against main:
libs/langgraph/eval.py#L1
+-----------------------------------------+---------+-----------------------+
| Benchmark | main | changes |
+=========================================+=========+=======================+
| wide_state_15x600_checkpoint | 487 ms | 458 ms: 1.06x faster |
+-----------------------------------------+---------+-----------------------+
| wide_state_9x1200_checkpoint | 317 ms | 302 ms: 1.05x faster |
+-----------------------------------------+---------+-----------------------+
| wide_state_9x1200 | 26.9 ms | 25.9 ms: 1.04x faster |
+-----------------------------------------+---------+-----------------------+
| wide_state_25x300_checkpoint | 283 ms | 273 ms: 1.04x faster |
+-----------------------------------------+---------+-----------------------+
| wide_state_15x600_checkpoint_sync | 475 ms | 460 ms: 1.03x faster |
+-----------------------------------------+---------+-----------------------+
| wide_state_15x600_sync | 17.3 ms | 16.8 ms: 1.03x faster |
+-----------------------------------------+---------+-----------------------+
| wide_state_9x1200_checkpoint_sync | 307 ms | 300 ms: 1.03x faster |
+-----------------------------------------+---------+-----------------------+
| wide_state_9x1200_sync | 17.2 ms | 16.8 ms: 1.02x faster |
+-----------------------------------------+---------+-----------------------+
| fanout_to_subgraph_100x_checkpoint_sync | 950 ms | 934 ms: 1.02x faster |
+-----------------------------------------+---------+-----------------------+
| fanout_to_subgraph_10x_checkpoint | 75.1 ms | 73.9 ms: 1.02x faster |
+-----------------------------------------+---------+-----------------------+
| wide_state_15x600 | 26.8 ms | 26.5 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| react_agent_100x | 342 ms | 337 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| react_agent_10x | 30.9 ms | 30.5 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| fanout_to_subgraph_10x_checkpoint_sync | 95.2 ms | 94.1 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| wide_state_25x300 | 23.0 ms | 22.7 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| react_agent_10x_checkpoint | 38.1 ms | 37.6 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| react_agent_10x_checkpoint_sync | 37.0 ms | 36.6 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| fanout_to_subgraph_10x | 61.3 ms | 60.7 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| react_agent_100x_checkpoint_sync | 835 ms | 828 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| wide_state_25x300_sync | 14.7 ms | 14.6 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| react_agent_100x_sync | 275 ms | 273 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| fanout_to_subgraph_10x_sync | 52.2 ms | 51.9 ms: 1.01x faster |
+-----------------------------------------+---------+-----------------------+
| react_agent_100x_checkpoint | 840 ms | 836 ms: 1.00x faster |
+-----------------------------------------+---------+-----------------------+
| react_agent_10x_sync | 22.9 ms | 22.8 ms: 1.00x faster |
+-----------------------------------------+---------+-----------------------+
| fanout_to_subgraph_100x_checkpoint | 769 ms | 778 ms: 1.01x slower |
+---------------------------------------