Commit

clarify fume score
Ray Myers committed Apr 18, 2024
1 parent 9a7ce4e commit d567e70
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions src/pages/leaderboards.md
@@ -15,9 +15,9 @@ There are many LLM benchmarks, but for the purposes of evaluating Autonomous Dev
 *Last checked: 2024-04-16*
 | Rank | Agent | Score | Score (lite) | Status | Group | License |
 | ---- | -------------------- | ------ | ------------ | ----------------- | ------------ | ----------------------- |
-| 1 | [Fume](https://twitter.com/aegucer/status/1780319507845988538) | 18.3% | - | Reported | Fume Technologies | Proprietary |
+| 1 | [Fume](https://twitter.com/aegucer/status/1780319507845988538) | 18.3% | - | Reported (5% sample) | Fume Technologies | Proprietary |
 | 2 | [auto-code-rover](https://github.com/nus-apr/auto-code-rover) | 15.95% | 22.3% | Reported | APR@NUS | GPL-3 |
-| 3 | Devin | 13.48% | - | Reported (sample) | Cognition | Proprietary |
+| 3 | Devin | 13.48% | - | Reported (25% sample) | Cognition | Proprietary |
 | 4 | [SWE-agent](https://swe-agent.com/) + GPT 4 | 12.29% | 17% | Official | Princeton NLP | MIT |

