Refactor & standardize evaluation with Evaluator
#287
Merged
107 commits
bbfbf76
v0: move configs and base API
vict0rsch d56af9e
add `eval_config` arg to `GFlowNetAgent` init
vict0rsch 505b402
WIP: towards evaluator
vict0rsch a856791
rename to `eval_top_k`
vict0rsch 3216ae9
Update for new `evaluator.eval()` api
vict0rsch 294e9b5
quote in print
vict0rsch a24efa3
no figs as empty dicts instead of `(None,)`
vict0rsch 70252e5
fix `self` to `gfn` in `eval_top_k`
vict0rsch 9769016
`load_gflow_net_from_run_path` returns a tuple
vict0rsch bd212ad
move legacy code
vict0rsch 73b1b15
`load_gflow_net_from_run_path` returns a tuple
vict0rsch cddc5d6
DOOOCSTRIIIINGS
vict0rsch 54f2f23
`@classmethod`
vict0rsch f546506
unused `log_iter`
vict0rsch 1225034
GFNA init docstring
vict0rsch f7662d3
refactor `should_` `train/eval/checkpoint`etc.
vict0rsch 924dc18
don't log `None` values
vict0rsch 274ff79
No need for a dedicated `log_test_metrics`
vict0rsch f75c876
move figs to `plot(...)`
vict0rsch 1fdb401
setup `requires` system
vict0rsch baf701a
allow for custom `require`
vict0rsch 00fc1b5
typo returned dict
vict0rsch c1a5dc8
move log prob metrcis to `compute_log_prob_metrics(...)`
vict0rsch 2f08b0d
improve `make_metrics` and `make_requires`
vict0rsch 4659707
refactor `requires`
vict0rsch 987ab7b
typo -> `should_log_train`
vict0rsch 06ccc61
`compute_density_metrics` for `eval()`
vict0rsch ee38802
add `eval:base` default
vict0rsch 1468755
update configs
vict0rsch b670c14
move evaluator init later in gfna init
vict0rsch 06d4d06
remove legacy `.test.` references
vict0rsch 5809254
debug print
vict0rsch 4a3b049
fix logdir exists logic and `exit(1)`
vict0rsch 0cd21e0
trailing whitespace
vict0rsch 4802ab2
remove `oracle` references
vict0rsch d8c9a7b
add `eval` default
vict0rsch 272310c
`_self_` last to allow for overrides in `_self_` to other name spaces
vict0rsch 5160759
`name` -> `display_name`
vict0rsch 975024e
`ALL_REQS` and `ValueError`s
vict0rsch a532eab
missing tensor `.item()`
vict0rsch b3b7ff2
move `kde_pred` to continuous density metrics only
vict0rsch 610dcfc
store pkl & csv paths as `Buffer` attributes
vict0rsch c4294f6
Imrpove robustness and allow `dict` metrics to `make_metrics`
vict0rsch 0af5719
utils for tests file
vict0rsch d51b4b6
+ `gflownet_from_config`
vict0rsch 25db913
generic fixtures
vict0rsch acc56f8
first tests for `gflownet.eval.base.GFlowNetEvaluator`
vict0rsch 93acac1
clean up `oracle` files and `legacy.py`
vict0rsch e740e62
refactor `active_learning` to `use_context`
vict0rsch bb0486c
Remove `sample_only` gflownet arg (and config) and `make_train_test` …
vict0rsch 0edbc6b
revert standardize `main` with `gflownet_from_config`
vict0rsch db4f1dc
trailing breakpoint
vict0rsch 829cf4a
Update docstring
vict0rsch 4347939
remove unused
vict0rsch 25cf720
improve example
vict0rsch 9f50b45
move `from_agent` and `from_dir` methods
vict0rsch 3c5186b
use `gflownet_from_config` in `load_gflow_net_from_run_path`
vict0rsch df6d9af
`empty_ok=False` arg
vict0rsch 626d34c
clean up example
vict0rsch a7018bb
document constants
vict0rsch b6832da
eval top k uses dict data structure
vict0rsch b6527f8
improve docstrings
vict0rsch 9e09b63
add `update_all_metrics_and_requirements`
vict0rsch 47c6992
have dedicated `plot_kwargs`
vict0rsch 4985d9f
standardize `{"metrics": {}, "data": {}}` return pattern
vict0rsch 054aa61
work on docstrings example
vict0rsch 0ac39c8
towards abstract / base pattern
vict0rsch d73de32
update example docstring
vict0rsch cdb66b3
more docs
vict0rsch 473edf9
allow `init` instantiation + more tutorial
vict0rsch 670bdb8
`define_new_metrics`
vict0rsch 82f0781
test `.`
vict0rsch 34082c1
no `.` ?
vict0rsch 43015d1
update links
vict0rsch d0cdb27
always use `evaluator`
vict0rsch 21663da
reference logger
vict0rsch 33fa8ac
more doc polih
vict0rsch 1939e2a
Improve docs and refactor to `AbstractEvaluator` and `BaseEvaluator`
vict0rsch ef99765
comment-out trailing dev docs rendering filter
vict0rsch 02e31ef
improve logging
vict0rsch 64794ad
use `evaluator` namesmace
vict0rsch b3be860
adapt jay
vict0rsch 1d883d3
move metrics to base
vict0rsch 92182b1
fix tests
vict0rsch 8139ab0
clean up prints
vict0rsch dce8abf
use evaluator
vict0rsch 0322a41
evaluator in tests instantiate
vict0rsch 8b27ac3
improve init docs
vict0rsch 4bed143
outline
vict0rsch 66e457d
typo
vict0rsch aed7110
add note
vict0rsch 918fc7c
docs `plot` and `eval_top_k`
vict0rsch fd3fabd
test code-include
vict0rsch ece173e
Update gflownet/evaluator/__init__.py
carriepl 93baba4
Apply suggestions from code review - Improve docstrings
carriepl 43ef77e
Update gflownet/evaluator/__init__.py
carriepl 7a1f97a
Complete GFlowNetAgent docstring
carriepl-mila 0b0b0fc
Remove unused variable
carriepl-mila 5e0f0fa
Fix merge conflicts
carriepl-mila 800314c
Fix pytest filename conflicts
carriepl-mila a9bea9a
Re-integrate changes from main lost in merge
carriepl-mila 81e4745
Update comments
carriepl-mila 689ed6a
Improve comments in Logger
carriepl-mila d689270
Adjust sanity check runs (CTorus) to Evaluator config
alexhernandezgarcia 4f4fcad
Fix typo
alexhernandezgarcia 25a45e2
Evaluator: add samples_topk to plot(); add TODOs
alexhernandezgarcia 6efaa19
Update evaluator config of Tetris sanity runs
alexhernandezgarcia
@@ -0,0 +1,26 @@
_target_: gflownet.evaluator.base.BaseEvaluator

# config formerly from logger.test
first_it: True
period: 100
n: 100
kde:
  bandwidth: 0.1
  kernel: gaussian
n_top_k: 5000
top_k: 100
top_k_period: -1
# Number of backward trajectories to estimate the log likelihood of each test data point
n_trajs_logprobs: 10
logprobs_batch_size: 100
logprobs_bootstrap_size: 10000
# Maximum number of test data points to compute log likelihood probs.
max_data_logprobs: 1e5
# Number of points to obtain a grid to estimate the reward density
n_grid: 40000
train_log_period: 1
checkpoints_period: 1000
# List of metrics as per gflownet/eval/evaluator.py:METRICS_NAMES
# Set to null for all of them
# Values must be comma separated like `metrics: "l1, kl, js"` (spaces are optional)
metrics: all
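The comments in this config describe two conventions: `metrics` is a comma-separated string (with `null` or `all` selecting everything), and several actions run every `period` iterations. A minimal sketch of how such values could be interpreted follows. The names `KNOWN_METRICS`, `parse_metrics` and `is_due` are hypothetical and not taken from the PR, and treating a non-positive period as "disabled" is an assumption based on `top_k_period: -1`.

```python
from typing import List, Optional, Union

# Hypothetical subset of metric names; the real list lives in the repository.
KNOWN_METRICS = ("l1", "kl", "js")


def parse_metrics(value: Union[str, List[str], None]) -> List[str]:
    """Normalize a `metrics` config value into a list of metric names."""
    # Assumption: both `null` and the string "all" select every known metric.
    if value is None or value == "all":
        return list(KNOWN_METRICS)
    if isinstance(value, str):
        # Split on commas; surrounding spaces are optional and ignored.
        names = [name.strip() for name in value.split(",") if name.strip()]
    else:
        names = list(value)
    unknown = [name for name in names if name not in KNOWN_METRICS]
    if unknown:
        raise ValueError(f"Unknown metrics: {unknown}")
    return names


def is_due(step: int, period: int, first_it: bool = True) -> bool:
    """Return True when a periodic action should run at 1-indexed `step`."""
    # Assumption: a non-positive period (e.g. -1) disables the action.
    if period <= 0:
        return False
    if step == 1 and first_it:
        return True
    return step % period == 0
```

With these definitions, `parse_metrics("l1, kl,js")` yields the three names regardless of spacing, and `is_due(step, period=-1)` is never true.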
@@ -3,5 +3,5 @@ defaults:

_target_: gflownet.utils.logger.Logger

tags:
tags:
- gflownet
@@ -6,6 +6,7 @@ defaults:
- proxy: corners
- logger: wandb
- user: default
- evaluator: base

# Device
device: cuda
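The new `- evaluator: base` entry selects the evaluator config group, whose `_target_` key names the class to build. As a rough illustration of that mechanism, here is a standard-library-only sketch of resolving a dotted `_target_` path and calling it with the remaining keys. `instantiate_from_target` is a hypothetical helper written for this example; Hydra's own `hydra.utils.instantiate` does considerably more (recursion, partials, interpolation), and `fractions.Fraction` is only a stand-in target so that the snippet runs without the gflownet package.

```python
import importlib
from typing import Any, Mapping


def instantiate_from_target(config: Mapping[str, Any]) -> Any:
    """Build an object from a mapping that carries a dotted `_target_` path."""
    # Every key except `_target_` is passed as a keyword argument.
    kwargs = {key: value for key, value in config.items() if key != "_target_"}
    # Split "package.module.Name" into the module path and the attribute name.
    module_path, _, attr_name = config["_target_"].rpartition(".")
    target = getattr(importlib.import_module(module_path), attr_name)
    return target(**kwargs)


# Stand-in target from the standard library so the sketch runs anywhere.
example = instantiate_from_target(
    {"_target_": "fractions.Fraction", "numerator": 1, "denominator": 4}
)
```

In the actual project the mapping would come from the composed Hydra config rather than a literal dict.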
@@ -47,17 +47,13 @@
    "sphinx_design",
    "sphinx_copybutton",
    "sphinxext.opengraph",
    "code_include.extension",
]

# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates"]
exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]


# -- Options for HTML output -------------------------------------------------

@@ -122,6 +118,7 @@
# sphinx.ext.intersphinx
intersphinx_mapping = {
    "torch": ("https://pytorch.org/docs/stable", None),
    "omegaconf": ("https://omegaconf.readthedocs.io/en/latest", None),
}

# sphinx.ext.autodoc & autoapi.extension
@@ -179,3 +176,13 @@
    "enable": True,
    "image": "./_static/images/gflownet-logo.png",
}


# def skip_util_classes(app, what, name, obj, skip, options):
#     return any(
#         name.startswith(f"gflownet.{p}") for p in ["envs", "proxy", "policy", "utils"]
#     )


# def setup(sphinx):
#     sphinx.connect("autoapi-skip-member", skip_util_classes)
Not sure what this part is meant to do. Is that something outdated that should be removed from the PR? Or is this a work in progress that should be finished and then uncommented?
The format for these evaluator arguments is different than in the icml23/ctorus.yaml config file. Is that a problem?
I am unsure what this comment refers to exactly, but I would just say that the icml23/ctorus.yaml file is really old (January 2023) so it would be fine to deprecate it / adapt it if needed. Yes, it contains the experiments of a paper, but I believe it's ok to adapt it to the new state of the repo.
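If an older experiment file does need adapting, one possible approach is a small migration step that moves the legacy `logger.test` block into the new `evaluator` namespace. The sketch below works on plain nested dicts. `move_test_to_evaluator` is a hypothetical helper, not part of the PR, and it assumes the legacy key names carry over unchanged, which may not hold for every option.

```python
import copy
from typing import Any, Dict


def move_test_to_evaluator(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `config` with `logger.test` merged into `evaluator`."""
    migrated = copy.deepcopy(config)
    # Remove the legacy block if present; a missing `logger` key is tolerated.
    legacy = migrated.get("logger", {}).pop("test", None)
    if legacy:
        evaluator = migrated.setdefault("evaluator", {})
        for key, value in legacy.items():
            # Values already set under `evaluator` win over legacy ones.
            evaluator.setdefault(key, value)
    return migrated


# Illustrative legacy layout; the key values are made up for the example.
old_config = {"logger": {"test": {"period": 500, "n": 1000}, "tags": ["gflownet"]}}
new_config = move_test_to_evaluator(old_config)
```

The input dict is left untouched, so the old and new layouts can be compared side by side before committing to the change.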