Fix the test_accuracy function by modifying the assertion logic #867
The original code tests the accuracy of the different estimators by checking whether the true Average Treatment Effect (ATE) falls within the calculated confidence interval. However, the check is done only once, using a single point estimate (`ate`), which is not sufficient to validate an estimator's performance. The test failed whenever the proportion of true ATE values within the confidence interval was not greater than 0.5 (50%).
The new logic: the old assertion was meant to check that 50% of the values are in the 90% confidence interval (which makes sense), but it tested this with `ate`, which returns a single value. The threshold therefore doesn't matter: a single point is either in the interval or not. Instead, the test should generate W, D, and Y several times and check that the true ATE is within the bounds most of the time (for example, generate 10 sets of W, D, Y and check that the true ATE was inside the interval in at least 8 of them).
The same logic is also applied to `test_accuracy_iv`, and the sample size was reduced (n=1000) to improve test time.
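The repeated-sampling check described above can be sketched as follows. This is only an illustration, not the code in this PR: the data-generating process, the difference-in-means estimator, and the names `simulate`, `ate_interval`, and `coverage_count` are all hypothetical stand-ins for the project's own estimators and fixtures.

```python
import numpy as np


def simulate(n, true_ate, rng):
    """Hypothetical data generator: covariates W, random binary treatment D, outcome Y."""
    W = rng.normal(size=(n, 2))
    D = rng.integers(0, 2, size=n)  # randomized treatment, independent of W
    Y = true_ate * D + W @ np.array([0.5, -0.3]) + rng.normal(size=n)
    return W, D, Y


def ate_interval(D, Y, z=1.645):
    """Difference-in-means ATE with a normal-approximation interval (z=1.645 ~ 90%)."""
    treated, control = Y[D == 1], Y[D == 0]
    estimate = treated.mean() - control.mean()
    se = np.sqrt(treated.var(ddof=1) / treated.size
                 + control.var(ddof=1) / control.size)
    return estimate - z * se, estimate + z * se


def coverage_count(n_trials=10, n=1000, true_ate=2.0, seed=0, z=1.645):
    """Count how many of n_trials simulated datasets yield an interval covering true_ate."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_trials):
        _, D, Y = simulate(n, true_ate, rng)
        lower, upper = ate_interval(D, Y, z)
        hits += int(lower <= true_ate <= upper)
    return hits
```

A test built on this would assert something like `coverage_count() >= 8`. Because the interval is nominally a 90% interval, such a threshold can still fail occasionally for an unlucky seed, so fixing the seed (as assumed here) keeps the test deterministic.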