Commit
Update beta distribution parameters in 3.4-Multi-Armed-Bandit-Experiment.ipynb (#643)

Update the beta distribution parameters in the `simulate_experiment` function to avoid bias towards lower success probabilities.

The current specification of the beta distribution:

```
theta = np.random.beta(conversions + 1, exposures + 1)
```

treats every exposure as a failure. This overstates the failure count and therefore undervalues the success probability of each variation. The effect is pronounced for variations with very high baseline conversion rates and less severe for variations with extremely low conversion rates.

The standard Thompson Sampling algorithm for the Bernoulli bandit is:

```math
\begin{align*}
1: & \text{for } t = 1, 2, \ldots \text{ do:} \\
2: & \quad \quad \text{Sample model:} \\
3: & \quad \quad \text{for } k = 1 \text{ to } K \text{ do:} \\
4: & \quad \quad \quad \text{Sample } \theta_k \sim \text{beta}(\alpha_k, \beta_k) \\
5: & \quad \quad \text{end for} \\
6: \\
7: & \quad \quad \text{Select and apply action:} \\
8: & \quad \quad x_t \leftarrow \operatorname{argmax}_k \theta_k \\
9: & \quad \quad \text{Apply } x_t \text{ and observe } r_t \\
10: \\
11: & \quad \quad \text{Update distribution:} \\
12: & \quad \quad (\alpha_{x_t}, \beta_{x_t}) \leftarrow (\alpha_{x_t} + r_t, \beta_{x_t} + 1 - r_t) \\
13: & \text{end for}
\end{align*}
```

Here α and β are the parameters of each arm, i.e. the success and failure counts, or equivalently the number of `conversions` and `non-conversions`, respectively:

```
non-conversions (or beta) = exposures - conversions
```

Co-authored-by: James Jory <[email protected]>
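The size of the bias described in the commit message can be checked by comparing the posterior means of the two parameterizations. The sketch below is illustrative only: the helper names `posterior_mean` and `thompson_step` and the example counts are hypothetical and not taken from the notebook, and a uniform Beta(1, 1) prior is assumed, which is what the `+ 1` terms suggest.

```python
import numpy as np


def posterior_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def thompson_step(conversions, exposures, rng):
    """Sample one theta per arm and return the index of the arm to show next.

    Assumes a uniform Beta(1, 1) prior, so alpha = conversions + 1 and
    beta = non-conversions + 1 (hypothetical helper, not the notebook's code).
    """
    conversions = np.asarray(conversions)
    exposures = np.asarray(exposures)
    non_conversions = exposures - conversions
    theta = rng.beta(conversions + 1, non_conversions + 1)
    return int(np.argmax(theta))


# Hypothetical arms: (conversions, exposures) with observed rates 0.9 and 0.01.
arms = {"high rate": (900, 1000), "low rate": (10, 1000)}

means = {}
for name, (conv, exp) in arms.items():
    # Old form: every exposure is counted as a failure.
    old = posterior_mean(conv + 1, exp + 1)
    # Corrected form: only non-conversions are counted as failures.
    new = posterior_mean(conv + 1, exp - conv + 1)
    means[name] = (old, new)
    print(f"{name}: old mean = {old:.3f}, corrected mean = {new:.3f}")
```

With these made-up counts the high-rate arm's posterior mean drops to roughly half of its observed rate under the old form, while the low-rate arm is barely affected, which matches the asymmetry the commit message describes.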