- m = Number of training examples
- x’s = “input” variable / features
- y’s = “output” variable / “target” variable
- (x, y) = one training example
- $(x^{(i)}, y^{(i)})$ = $i^{\text{th}}$ training example
- $h(x)$ = hypothesis function
- $h_\theta(x) = \theta_0 + \theta_1 x$ (shorthand: $h(x)$)
- Squared error cost function: $$J(\theta_0, \theta_1) = \dfrac{1}{2m} \displaystyle\sum_{i=1}^{m} \left( \hat{y}_{i} - y_{i} \right)^2 = \dfrac{1}{2m} \displaystyle\sum_{i=1}^{m} \left( h_\theta(x_{i}) - y_{i} \right)^2$$
- Goal: $\underset{\theta_0, \theta_1}{\text{minimize}}\; J(\theta_0, \theta_1)$
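A minimal NumPy sketch of this cost function (the function name `compute_cost` and the toy data are illustrative, not from the notes):

```python
import numpy as np

def compute_cost(theta0, theta1, x, y):
    """Squared error cost J(theta0, theta1) for h(x) = theta0 + theta1 * x."""
    m = len(y)                            # number of training examples
    predictions = theta0 + theta1 * x     # h_theta(x^(i)) for every example
    return np.sum((predictions - y) ** 2) / (2 * m)

# Toy data generated from y = 1 + 2x (hypothetical example)
x = np.array([1.0, 2.0, 3.0])
y = np.array([3.0, 5.0, 7.0])
print(compute_cost(1.0, 2.0, x, y))   # 0.0 -- this line fits the data exactly
print(compute_cost(0.0, 0.0, x, y))   # larger cost for a worse fit
```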
Gradient descent: repeat until convergence {
$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$ (for j = 0 and j = 1) }
All $\theta_j$ must be updated simultaneously; otherwise, updating one parameter first would affect the updates of the terms that follow.
$\begin{align*} \text{repeat until convergence: } \lbrace & \newline \theta_0 := & \theta_0 - \alpha \frac{1}{m} \sum\limits_{i=1}^{m}(h_\theta(x_{i}) - y_{i}) \newline \theta_1 := & \theta_1 - \alpha \frac{1}{m} \sum\limits_{i=1}^{m}\left((h_\theta(x_{i}) - y_{i}) x_{i}\right) \newline \rbrace& \end{align*}$
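A minimal NumPy sketch of the batch gradient descent loop above (the function name `gradient_descent`, `alpha = 0.1`, and the toy data are illustrative assumptions); both gradients are computed before either parameter is overwritten, which is the simultaneous update required above:

```python
import numpy as np

def gradient_descent(x, y, alpha=0.1, num_iters=1000):
    """Batch gradient descent for h_theta(x) = theta0 + theta1 * x."""
    m = len(y)
    theta0, theta1 = 0.0, 0.0
    for _ in range(num_iters):
        errors = (theta0 + theta1 * x) - y    # h_theta(x_i) - y_i for all i
        # Compute both gradients first, then update both parameters: simultaneous update
        grad0 = np.sum(errors) / m
        grad1 = np.sum(errors * x) / m
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

# Toy data scattered around y = 1 + 2x (hypothetical example)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 4.9, 7.2, 8.8])
print(gradient_descent(x, y))   # converges near the least-squares fit (~1.15, ~1.94)
```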