ALEXP: Anytime Model Selection for Linear Bandits

This repository contains the code for our paper Anytime Model Selection for Linear Bandits (NeurIPS 2023). We propose ALEXP, an algorithm for simultaneous online optimization and feature/model selection. ALEXP performs probabilistic aggregation for online model selection: it maintains an adaptive probability distribution over candidate models and iteratively samples features/models from it. The sampled model is then used to estimate the objective function, serving as a proxy for the optimization objective. For a brief overview, you can watch the NeurIPS teaser video or read the slides of our RSS talk.
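As a rough illustration of the aggregation idea, the sketch below runs a toy exponential-weights loop over a fixed set of candidate models. It is purely schematic: the actual ALEXP estimators, learning-rate schedule, and exploration terms are those in algorithms/model_selection.py and the paper, and everything below (fixed reward tables, the greedy action choice, the update rule) is a simplifying assumption.

```python
# Schematic sketch of probabilistic model aggregation (illustration only);
# the real ALEXP update rule lives in algorithms/model_selection.py.
import numpy as np

rng = np.random.default_rng(0)
n_models, n_actions, horizon, eta = 5, 20, 200, 0.1

model_estimates = rng.uniform(size=(n_models, n_actions))  # each model's reward estimates (dummy)
true_reward = rng.uniform(size=n_actions)                  # unknown objective (dummy)

log_weights = np.zeros(n_models)                           # adaptive distribution over models
for t in range(horizon):
    probs = np.exp(log_weights - log_weights.max())
    probs /= probs.sum()
    m = rng.choice(n_models, p=probs)                      # sample a model from the distribution
    action = int(model_estimates[m].argmax())              # act using the sampled model as a proxy
    reward = true_reward[action] + 0.1 * rng.standard_normal()
    log_weights[m] += eta * reward / probs[m]              # importance-weighted exponential update
```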

Contents and Setup

This is a torch-based repository; requirements.txt lists the required packages and the versions used.

Algorithms

The repository implements several model selection algorithms. The classes are organized as follows:

  • Adaptive algorithms: ALEXP and CORRAL (based on an earlier implementation by Aldo Pacchiano), located in algorithms/model_selection.py.
  • Non-adaptive algorithms: ETC/ETS (Explore then Commit / Explore then Select), located in algorithms/acquisition.py. Naturally, we have also implemented bandit optimization algorithms, mainly GP-UCB and Greedy, in the same file algorithms/acquisition.py; a hypothetical import sketch follows this list.
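
A hypothetical usage sketch, assuming the class names match the module layout above; the constructor arguments are omitted as placeholders, and the actual signatures are in the respective files.

```python
# Hypothetical sketch: class names follow the repository layout, but the
# constructor arguments are placeholders -- check the class definitions for
# the actual signatures.
from algorithms.model_selection import ALEXP, CORRAL   # adaptive model selection
from algorithms import acquisition                      # base bandit solvers (GP-UCB, Greedy) and ETC/ETS

# model_selector = ALEXP(...)    # placeholder args, e.g. the candidate feature maps / models
# baseline      = CORRAL(...)    # placeholder args, alternative adaptive baseline
```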

Dataset

In the paper, we present experiments based on synthetic data, where for every run of the algorithm we sample a new random environment. This repository also includes the code to generate such random environments. The folder environment contains the classes for the domain, reward, and kernel functions, which together make up the environment. To sample an environment, create an instance of MetaEnvironment and call its sample_envs() method. For more details, see environment/reward_generator.py. Alternatively, you can write a data loader that maps any given dataset into the domain and reward classes defined in the repository.
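A minimal sketch of sampling environments, assuming the default constructor arguments; see environment/reward_generator.py for the actual parameters (kernel, domain, number of environments, etc.).

```python
# Sketch only: MetaEnvironment and sample_envs() are described above, but the
# constructor arguments (here left at their defaults) are assumptions.
from environment.reward_generator import MetaEnvironment

meta_env = MetaEnvironment()      # placeholder: pass kernel/domain settings as needed
envs = meta_env.sample_envs()     # draws random reward environments for the experiment runs
```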

Running experiments

The main script for running and testing ALEXP is experiments/run_lexp_hparam.py. The same script can be used to run CORRAL or ETC/ETS by changing the input arguments. Similarly, you can switch between the base bandit solvers (UCB or Greedy). To inspect the training dynamics of the algorithm, you can alternatively run run_probs.py, which runs ALEXP, keeps track of the selected models, and saves the data used to create plots such as Figure 4 in the paper.
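For example, a run might be launched roughly as follows; the flag names here are hypothetical, and the authoritative argument list is the argparse setup in experiments/run_lexp_hparam.py.

```python
# Hypothetical invocation: the flag names below are placeholders; check the
# argparse arguments of experiments/run_lexp_hparam.py for the real ones.
import subprocess

subprocess.run([
    "python", "experiments/run_lexp_hparam.py",
    "--method", "alexp",     # placeholder: switch to corral / etc / ets here
    "--solver", "ucb",       # placeholder: base bandit solver (ucb or greedy)
    "--seed", "1",
])
```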

Launching large-scale experiments: The launcher scripts in experiments allow you to run different instances of the experiments (e.g. different hyper-parameters or random seeds) in parallel on a cluster. The bash commands are defined in experiments/utils.py; modify them to match your cluster's job scheduling system.
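As an illustration of what adapting those commands might look like, the sketch below wraps a run command for a Slurm scheduler; this is an assumption for illustration only, not the template actually used in experiments/utils.py.

```python
# Illustrative only: the real command templates live in experiments/utils.py.
def wrap_for_scheduler(cmd: str, hours: int = 4, mem_gb: int = 8) -> str:
    # Example for a Slurm cluster; replace with the bsub/qsub syntax of your scheduler.
    return f'sbatch --time={hours}:00:00 --mem={mem_gb}G --wrap="{cmd}"'

print(wrap_for_scheduler("python experiments/run_lexp_hparam.py --seed 1"))
```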

Tests: The folder sanity_checks includes scripts for simple tests that verify the correctness of individual modules. For instance, test_vanilla_bandits.py checks the bandit optimization algorithms given oracle knowledge of the model, and run_online_lasso.py checks the quality of our online lasso regression oracle.

Saving the results: The experiment scripts, i.e. run_*.py and launcher_*.py, automatically save results as JSON files containing dictionaries (often called results_dict) with all relevant experiment parameters, the action-reward pairs, the runtime, etc. File names are hashed from the experiment parameters and the time of the run, and the files are stored in a results folder. To read the results, use the collect_exp_results() method in experiments/utils.py, which searches the results folder and builds a dataframe from these dictionaries. You can then filter this dataframe to omit the results of some experiments.
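A sketch of reading the results back, assuming collect_exp_results() can be called as below and that the dataframe has a column identifying the method; both are assumptions, so check experiments/utils.py and the keys of your saved results_dict.

```python
# Sketch: the call signature of collect_exp_results() and the column name
# "method" are assumptions -- inspect experiments/utils.py and the saved
# results_dict keys for the actual names.
from experiments.utils import collect_exp_results

df = collect_exp_results()                    # may require e.g. an experiment-name argument
df_alexp = df[df["method"] == "alexp"]        # filter the dataframe to one algorithm
print(df_alexp.head())
```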

Plotting: The folder plotting includes scripts for plotting the results; for instance, plot_regret.py should give you an idea of how to read the results and plot them.
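A generic regret-plot sketch with dummy data and an assumed (runs x horizon) layout; it is independent of the actual schema used by plotting/plot_regret.py.

```python
# Generic regret-curve sketch with dummy data; plotting/plot_regret.py works on
# the real results dataframe instead.
import numpy as np
import matplotlib.pyplot as plt

regret = np.cumsum(np.random.rand(10, 200), axis=1)   # dummy cumulative regret, shape (runs, horizon)
mean, std = regret.mean(axis=0), regret.std(axis=0)
t = np.arange(regret.shape[1])
plt.plot(t, mean, label="ALEXP (dummy)")
plt.fill_between(t, mean - std, mean + std, alpha=0.3)
plt.xlabel("round t")
plt.ylabel("cumulative regret")
plt.legend()
plt.show()
```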

The pipeline for launching large-scale experiments and for storing and reading the result files is based on the work of Jonas Rothfuss (e.g. in this repository).

Reference and Contact

You can contact Parnian if you have questions. If you find the code useful, please cite our paper:

@inproceedings{kassraie2023anytime,
  title={Anytime Model Selection in Linear Bandits},
  author={Kassraie, Parnian and Emmenegger, Nicolas and Krause, Andreas and Pacchiano, Aldo},
  booktitle={Advances in Neural Information Processing Systems},
  year={2023}
}
