-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement ForgettingPeriodic agent #12
base: main
Are you sure you want to change the base?
Conversation
We'll borrow a lot of the functionality when making the forgetting version, so we just let the concrete class define it's own UCB method and pull the rest of the functionality into the base. For the forgetting agent, we'll need the full history of the conversions (which is also sufficient for the regular Periodic agent), so we refactor to allow that.
ie, a Periodic agent that forgets observations from the past, implemented via setting the chance of including an observation as an exponential decay. The default value of the exponential decay constant is chosen to include observations from a year ago with probability 0.05.
Codecov Report
@@ Coverage Diff @@
## main #12 +/- ##
=====================================
Coverage 0.00% 0.00%
=====================================
Files 2 2
Lines 54 73 +19
=====================================
- Misses 54 73 +19
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea nice. Good work on the refactor for the generalising/abstraction, did the right amount (not too much) of it (imo).
Just minors/enhancement suggestions.
I guess we need to figure out how we are going to add agent results/analysis in the repo so we can discuss it (without having to be in the same room) ay? Possibly we have a convention branch, like agent_performance
(name tbd), in which we commit the main run notebook, that outputs the history plots and summary df etc. in nb for all the agent-envs. Thoughts?
] | ||
# Set conversion rate to infinity for unchosen actions | ||
# to ensure all actions are tested at least once | ||
# Set conversion rate to infinity for unchosen actions to ensure all actions are tested at |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: small bit of imprecision in comment that could cause a confusion: we're not "set"ting the conversion rate (that's set by the env), we are setting the agents initial belief in the conversion rate. Maybe just adding "beliefs"/"estimates"/"inferences" after the word "rate" would help.
# A list containing a tuple of (num_successes, timestep) for each time an action was picked, | ||
# representing the full history of that action's conversions | ||
ConversionHistory = List[Tuple[int, int]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Enhancement: would it be worth defining types like:
NumberSuccesses = int
Timestep = int
to enable
ConversionHistory = List[Tuple[NumberSuccesses, Timestep]]
Given the importance of that variable for the agents etc., I found myself having to refer back to the commend on line 64 repeatedly, so I kept forgetting which was with the 2uple. Having it in the actual type def instead of having to stitch that with the comment would help a lot I think.
Thoughts?
ie, a Periodic agent that forgets observations from the past, implemented
via setting the chance of including an observation as an exponential decay.
The default value of the exponential decay constant is chosen to include
observations from a year ago with probability 0.05. I think this is probably good for environments
longer than a year, but could probably be higher for environments only of a single year.