Review #2 #10
We want to thank the reviewer for reviewing the article with such depth. We went through every sentence and have addressed the points raised in our updated article. We would like to point out to the editors that we had received an unofficial review from the reviewer before the official reviews. We addressed the issues raised in the unofficial review, but the updated article could not be made available to the official reviewer due to a technical issue. All official reviews posted, therefore, are on an earlier version of the article without these initial inputs. We will now address the specific comments, which fall under a few main categories of issues.

Communication

Verbose and Lengthy Article
Having received similar suggestions earlier, we agreed with the points raised and updated our article. The updated article is significantly condensed and uses collapsible sections where we felt we were losing track of the article's main point. The difference is best conveyed by comparing both variants of the article. Please see the older and newer articles here and here, respectively. Concretely:

FROM: Initially, we had the following sections on different acquisition functions.

TO: We compressed the following sections and formed collapsible sections for them:

FROM: Furthermore, we had three real-life examples where we showed the BO framework being used for hyperparameter optimization.

TO: Below we see a collapsible in action.

Interactive figure without context
This is an excellent point; it refers to the figure that tries to explain the effect of ϵ on a particular acquisition function (Probability of Improvement). The main issue, which the initial review also pointed out, was that the reader could not tell what the figure was trying to show. We have moved the figure after the section where we explain Probability of Improvement, which gives the reader better context for it. Concretely:

FROM:

TO:
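For context, ϵ enters the standard Probability of Improvement rule as PI(x) = Φ((μ(x) − f(x⁺) − ϵ) / σ(x)). The minimal sketch below (illustrative only, not code from the article) shows how a larger ϵ raises the bar for what counts as an improvement and so favors exploration:

```python
import numpy as np
from scipy.stats import norm

def probability_of_improvement(mu, sigma, f_best, epsilon=0.01):
    """PI(x) = Phi((mu(x) - f_best - epsilon) / sigma(x)).

    A larger epsilon raises the threshold for improvement,
    steering the search away from the incumbent, i.e. toward exploration.
    """
    sigma = np.maximum(sigma, 1e-12)  # guard against zero predictive variance
    return norm.cdf((mu - f_best - epsilon) / sigma)

# Two candidates with the same posterior mean but different uncertainty:
mu = np.array([1.0, 1.0])
sigma = np.array([0.1, 1.0])
print(probability_of_improvement(mu, sigma, f_best=1.0, epsilon=0.5))
# With epsilon > 0, the high-variance candidate scores higher.
```

Too many non-interactive simulations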
This point is similar to the one raised under “Verbose and Lengthy Article”. We have addressed it by removing some sections and forming the collapsible sections mentioned above. We further introduced a slider to give the reader better control: initially, all the non-interactive figures were GIFs; now these figures can be stepped through by the user.

Bird's eye view missing
This is a great brief on BO, and we have addressed this point by incorporating a Bayesian Optimization primer at the end of our updated article.

FROM:

TO:

Scientific Correctness & Integrity

Minor Points
We believe that a truly random strategy is equivalent to the random acquisition function: both select the next query point uniformly at random.
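As a quick illustration of this equivalence (our own sketch, with hypothetical names): an acquisition function that assigns i.i.d. uniform scores and is then maximized picks each candidate with equal probability, which is exactly random search.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_acquisition(candidates):
    """Score every candidate with an i.i.d. uniform draw."""
    return rng.uniform(size=len(candidates))

# Maximizing i.i.d. uniform scores selects each candidate with equal
# probability -- precisely what a truly random strategy does.
candidates = np.linspace(0.0, 1.0, 100)
next_x = candidates[np.argmax(random_acquisition(candidates))]
print(next_x)
```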
After inputs from Jasper Snoek, the BO framework now performs substantially better than it did in the earlier version of the article, so we feel this point is no longer an issue. For comparison, please see the difference in performance in the plot below.
We believe the comparison to be fair, given that our task is to compare the different acquisition functions for optimizing a black-box function in the fewest iterations.

Major Points
This section raises the absence of any distinction between the theoretical exposition of BO and practical tips for using BO in real life without provable theoretical results. Upon receiving similar inputs from the reviewer earlier, we updated the newer article to focus explicitly on the practical use case of BO. We again want to thank the reviewer for their valuable suggestions; the article improved significantly as we moved forward with them.
The following peer review was solicited as part of the Distill review process.
The reviewer chose to keep anonymity. Distill offers reviewers a choice between anonymous review and offering reviews under their name. Non-anonymous review allows reviewers to get credit for the service they offer to the community.
Distill employs a reviewer worksheet as an aid for reviewers.
The first three parts of this worksheet ask reviewers to rate a submission along certain dimensions on a scale from 1 to 5. While the scale meaning is consistently "higher is better", please read the explanations for our expectations for each score—we do not expect even exceptionally good papers to receive a perfect score in every category, and expect most papers to be around a 3 in most categories.
Any concerns or conflicts of interest that you are aware of?: No known conflicts of interest
What type of contributions does this article make?: The article provides an exposition of Bayesian optimization methods. It motivates the use of Bayesian Optimization (BO), and gives several examples of applying BO for different objectives.
Comments
I think the main contribution of the current article is in the simulations, which illustrate BO in practice. However, I believe the article does not do a great job of explaining the setup and foundations of BO, and of unifying the various examples under a common framework. In this sense, I don't believe its exposition is a significant contribution.
For example, I think the following short note (which the authors cite) does an excellent job of briefly introducing the BO formalism, and presenting different instantiations of BO (for different objective functions) under the same underlying framework: https://www.cse.wustl.edu/~garnett/cse515t/spring_2015/files/lecture_notes/12.pdf
Comments
1. Assume a Gaussian Process prior on the ground-truth function F.
2. Formalize your objective (e.g. sampling a point 'x' with maximum expected value of F(x), or maximizing the probability that F(x) > F(x_j) for all previously-sampled points x_j).
3. Use the existing samples {(x, F(x))} to compute the posterior of F given the samples (under the GP prior), and maximize your objective function under the posterior. This yields a choice of new point to sample.

(Different "acquisition functions" simply correspond to different objectives in step 2; a minimal code sketch of this loop follows below.)
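Concretely, one step of this loop might look like the following sketch (illustrative only; it uses scikit-learn's GP regressor, with expected improvement standing in as the step-2 objective, and all names are ours):

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def expected_improvement(mu, sigma, f_best):
    # One possible objective for step 2; Probability of Improvement or
    # UCB would slot into the same place.
    sigma = np.maximum(sigma, 1e-12)
    z = (mu - f_best) / sigma
    return (mu - f_best) * norm.cdf(z) + sigma * norm.pdf(z)

def bo_step(X_seen, y_seen, candidates):
    # Steps 1 and 3: a GP prior on F, conditioned on the samples seen so far.
    gp = GaussianProcessRegressor(normalize_y=True).fit(X_seen, y_seen)
    mu, sigma = gp.predict(candidates, return_std=True)
    # Step 3, continued: maximize the objective under the posterior
    # to choose the next point to sample.
    return candidates[np.argmax(expected_improvement(mu, sigma, y_seen.max()))]

# Usage on a toy 1-D maximization problem:
f = lambda x: -(x - 0.3) ** 2
X_seen = np.array([[0.0], [1.0]])
y_seen = f(X_seen).ravel()
candidates = np.linspace(0.0, 1.0, 200).reshape(-1, 1)
print(bo_step(X_seen, y_seen, candidates))
```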
Comments
Minor points:
Major points:
With respect to the scientific content, my main issue is that there is no clear distinction made between:

(A) the formal, theoretical treatment of BO, in which methods come with provable guarantees; and
(B) the practical use of BO as a heuristic, without provable theoretical results.
These two viewpoints are conflated throughout the article. For example, in the section "Formalizing Bayesian Optimization", the points described are actually heuristics about setting (B), not formalisms in the sense of (A).
This confusion also makes it difficult to see how different acquisition functions relate to each other, and what our actual objective is in choosing between different acquisition functions.