Agent of Code

My first attempt at building an AI "agent". The agent iteratively tries to solve Advent of Code puzzles. It runs as a Temporal workflow, largely so that I can go back and inspect the full process the agent takes in working through a problem, since Temporal conveniently logs history. I also use Temporal's ability to configure timeouts and schedule the timing of activity execution, so I can be confident that the agent won't just pound the AoC server with guesses.
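To make the "don't pound the server" idea concrete, here is a minimal sketch of fixed-interval retries in plain Python. It is a stand-in for illustration only: it does not use Temporal's API and is not the code in this repo. The function name `call_with_throttled_retries` and its parameters are hypothetical.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_throttled_retries(
    action: Callable[[], T],
    max_attempts: int = 3,
    min_interval_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `action`, waiting a fixed interval after each failed attempt.

    Hypothetical helper: `sleep` is injectable so the wait can be
    observed in tests without actually pausing.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception as error:
            last_error = error
            # Only wait if another attempt is still allowed.
            if attempt < max_attempts:
                sleep(min_interval_seconds)

    # Every attempt failed, so re-raise the most recent error.
    assert last_error is not None
    raise last_error
```

In a workflow engine the waiting would normally be handled by the engine's own retry settings rather than an in-process sleep, which also means each wait shows up in the recorded history.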

Advent of Code Automation Guidelines Acknowledgement

This agent does follow the automation guidelines on the /r/adventofcode community wiki. Specifically:

  • Puzzle inputs/solutions are not tracked by git (see: .gitignore).
  • Outbound call retries are throttled to one per minute in agent/temporal/workflow.py.
  • Once inputs are downloaded, they are cached locally (see: agent/adventofcode/scrape_problems.py).
  • If you suspect a day's input is corrupted, you can manually request a fresh copy by deleting the cached problem.html file for that day.
  • The User-Agent header in agent/adventofcode/_HEADERS.py is set to me since I maintain this tool :)
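The caching behavior described in the list above can be sketched roughly as follows. This is an illustration under assumptions, not the contents of agent/adventofcode/scrape_problems.py: the function name `load_problem_html`, the directory layout, and the injected `fetch` callable are all hypothetical. Only the `problem.html` file name comes from the list above.

```python
from pathlib import Path
from typing import Callable


def load_problem_html(
    cache_dir: Path,
    year: int,
    day: int,
    fetch: Callable[[int, int], str],
) -> str:
    """Return a day's problem page, downloading it only on a cache miss.

    Hypothetical layout: <cache_dir>/<year>/dayNN/problem.html
    """
    cached_file = cache_dir / str(year) / f"day{day:02d}" / "problem.html"

    # Cache hit: no outbound request is made at all.
    if cached_file.exists():
        return cached_file.read_text(encoding="utf-8")

    # Cache miss: fetch once, then store the result for later runs.
    html = fetch(year, day)
    cached_file.parent.mkdir(parents=True, exist_ok=True)
    cached_file.write_text(html, encoding="utf-8")
    return html
```

With this shape, deleting the cached `problem.html` for a day is enough to force a fresh download on the next run, which matches the manual refresh step described above.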

Status Updates

On day 4 I realized it'd be fun to keep an explicit log of how things are going.

❗ *Hypothetical Global Leaderboard Rank* ❗

I'm not a rule-breaker so I'm intentionally not running the agent until AFTER the global leaderboard is full. But it's interesting to see what ranking I COULD'VE gotten.

| Day | Part 1 | Part 2 | Global Rank* | Total Agent Workflow Time |
|---|---|---|---|---|
| 1 | | | ? | - |
| 2 | | | ? | - |
| 3 | | | ? | - |
| 4 | | | #83 | |
| 5 | | | #31 | |
| 6 | | | N/A | Took Muuultiple Attempts<br>...maybe ~5 other untracked attempts... |
| 7 | | | #3 | |
| 8 | | | N/A | Multiple Attempts all Failed Part 2 |
| 9 | | | #22 | |
| 10 | | | #41 | |
| 11 | | | N/A | Part 1 finished in <45sec on the first workflow run, but the agent failed to extract examples for part 2. Took a bit of tweaking the example extraction prompting to get this to work. |
| 12 | | | N/A | Muuuultiple Attempts all Failed Part 2 |
| 13 | | | N/A | - |
| 14 | | | N/A | |
| 15 | | | N/A | Multiple attempts and never even got a solution to part 1! |
| 16 | | | N/A | |
| 17 | | | N/A | |
| 18 | | | #8* | *First attempt failed part 2 due to a logging-related bug I recently introduced...so I'm gonna count the time of the second run that passed both. |
| 19 | | | #48 | |
| 20 | | | N/A | |
| 21 | | | N/A | |
| 22 | | | #17 | |
| 23 | | | N/A | Agent passed unit tests on part 2 but timed out on full input (and the agent doesn't handle debugging timeouts on final solutions.) |
| 24 | | | N/A | |
| 25 | | | N/A | Agent for some reason really struggled to even extract examples from this problem. |

Advent of Code - Stars