Built site for gh-pages
Quarto GHA Workflow Runner committed Nov 19, 2024
1 parent 0798a9e commit 549efa9
Showing 5 changed files with 91 additions and 51 deletions.
2 changes: 1 addition & 1 deletion .nojekyll
@@ -1 +1 @@
396fd312
0cb2f234
35 changes: 34 additions & 1 deletion schedule/slides/23-nnets-other.html
@@ -398,7 +398,7 @@
<h2>23 Neural nets - generalization</h2>
<p><span class="secondary">Stat 406</span></p>
<p><span class="secondary">Geoff Pleiss, Trevor Campbell</span></p>
<p>Last modified – 13 November 2024</p>
<p>Last modified – 18 November 2024</p>
<p><span class="math display">\[
\DeclareMathOperator*{\argmin}{argmin}
\DeclareMathOperator*{\argmax}{argmax}
@@ -634,6 +634,39 @@ <h2>Understanding Double Descent (Hand-Wavy)</h2>
</ul></li>
</ul>
</section>
<section id="understanding-double-descent-less-hand-wavy" class="slide level2">
<h2>Understanding Double Descent (Less Hand-Wavy)</h2>
<div class="flex">
<div class="w-60">
<p>(From <a href="https://arxiv.org/abs/1903.08560">Hastie et al., 2020</a>)</p>
<ul>
<li><p><span class="math inline">\(\gamma = D / N\)</span> (ratio of features / data)</p></li>
<li><p><span class="math inline">\(\sigma^2 = \mathrm{Var}[Y|X]\)</span> (observational noise)</p></li>
<li><p>When basis features are uncorrelated, we have (asymptotically)</p></li>
</ul>
<p><span class="math display">\[
\begin{aligned}
\mathrm{Bias}^2 &amp;= \begin{cases}
0 &amp; \gamma &lt; 1 \text{ (underparam.)} \\
1 - \tfrac{1}{\gamma} &amp; \gamma \geq 1 \text{ (overparam.)}
\end{cases} \\
&amp; \\
\mathrm{Var} &amp;= \begin{cases}
\sigma^2 \tfrac{\gamma}{1 - \gamma} &amp; \gamma &lt; 1 \text{ (underparam.)} \\
\sigma^2 \tfrac{1}{\gamma - 1} &amp; \gamma \geq 1 \text{ (overparam.)}
\end{cases} \\
\end{aligned}
\]</span></p>
</div>
<div class="w-38">
<div class="quarto-figure quarto-figure-center">
<figure>
<p><img data-src="gfx/hastie_double_descent.png" class="quarto-figure quarto-figure-center" style="width:100.0%" data-fig-caption="Double descent curve theoretical."></p>
</figure>
</div>
</div>
</div>
</section>
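The piecewise formulas on this slide can be evaluated directly to see where the risk peaks. The following is a minimal Python sketch, not part of the course materials: the function names are illustrative, the signal strength is assumed to be normalized to 1 (which is what the slide's bias term implies), and the point gamma = 1 is treated as the location where the variance diverges.

```python
import math


def asymptotic_bias_variance(gamma, sigma2=1.0):
    """Return (bias^2, variance) from the slide's asymptotic formulas.

    gamma  : ratio D / N of features to data points (must be > 0)
    sigma2 : observational noise variance
    Assumes uncorrelated basis features and unit signal strength.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    if gamma < 1:
        # Underparameterized: no bias, variance grows as gamma -> 1.
        return 0.0, sigma2 * gamma / (1 - gamma)
    if gamma == 1:
        # Interpolation threshold: the variance term diverges.
        return 0.0, math.inf
    # Overparameterized: bias appears, variance shrinks as gamma grows.
    return 1 - 1 / gamma, sigma2 / (gamma - 1)


def asymptotic_risk(gamma, sigma2=1.0):
    """Bias^2 + variance (excludes the irreducible noise term)."""
    bias2, var = asymptotic_bias_variance(gamma, sigma2)
    return bias2 + var


if __name__ == "__main__":
    for gamma in (0.25, 0.5, 0.9, 1.0, 1.1, 2.0, 10.0):
        bias2, var = asymptotic_bias_variance(gamma)
        print(f"gamma={gamma:5.2f}  bias^2={bias2:.3f}  var={var:.3f}")
```

Sweeping gamma through 1 shows the risk rising towards the interpolation threshold and falling again beyond it, which is the second descent.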
<section id="do-we-need-to-worry-about-variance" class="slide level2">
<h2>Do we need to worry about variance?</h2>
<p><em>Regularizing</em> a neural network (adding a complexity penalty to the loss) is a common practice to prevent overfitting to the noise.</p>
Binary file added schedule/slides/gfx/hastie_double_descent.png
9 changes: 8 additions & 1 deletion search.json
@@ -1390,7 +1390,7 @@
"href": "schedule/slides/23-nnets-other.html#section",
"title": "UBC Stat406 2024W",
"section": "23 Neural nets - generalization",
"text": "23 Neural nets - generalization\nStat 406\nGeoff Pleiss, Trevor Campbell\nLast modified – 13 November 2024\n\\[\n\\DeclareMathOperator*{\\argmin}{argmin}\n\\DeclareMathOperator*{\\argmax}{argmax}\n\\DeclareMathOperator*{\\minimize}{minimize}\n\\DeclareMathOperator*{\\maximize}{maximize}\n\\DeclareMathOperator*{\\find}{find}\n\\DeclareMathOperator{\\st}{subject\\,\\,to}\n\\newcommand{\\E}{E}\n\\newcommand{\\Expect}[1]{\\E\\left[ #1 \\right]}\n\\newcommand{\\Var}[1]{\\mathrm{Var}\\left[ #1 \\right]}\n\\newcommand{\\Cov}[2]{\\mathrm{Cov}\\left[#1,\\ #2\\right]}\n\\newcommand{\\given}{\\ \\vert\\ }\n\\newcommand{\\X}{\\mathbf{X}}\n\\newcommand{\\x}{\\mathbf{x}}\n\\newcommand{\\y}{\\mathbf{y}}\n\\newcommand{\\P}{\\mathcal{P}}\n\\newcommand{\\R}{\\mathbb{R}}\n\\newcommand{\\norm}[1]{\\left\\lVert #1 \\right\\rVert}\n\\newcommand{\\snorm}[1]{\\lVert #1 \\rVert}\n\\newcommand{\\tr}[1]{\\mbox{tr}(#1)}\n\\newcommand{\\brt}{\\widehat{\\beta}^R_{s}}\n\\newcommand{\\brl}{\\widehat{\\beta}^R_{\\lambda}}\n\\newcommand{\\bls}{\\widehat{\\beta}_{ols}}\n\\newcommand{\\blt}{\\widehat{\\beta}^L_{s}}\n\\newcommand{\\bll}{\\widehat{\\beta}^L_{\\lambda}}\n\\newcommand{\\U}{\\mathbf{U}}\n\\newcommand{\\D}{\\mathbf{D}}\n\\newcommand{\\V}{\\mathbf{V}}\n\\]"
"text": "23 Neural nets - generalization\nStat 406\nGeoff Pleiss, Trevor Campbell\nLast modified – 18 November 2024\n\\[\n\\DeclareMathOperator*{\\argmin}{argmin}\n\\DeclareMathOperator*{\\argmax}{argmax}\n\\DeclareMathOperator*{\\minimize}{minimize}\n\\DeclareMathOperator*{\\maximize}{maximize}\n\\DeclareMathOperator*{\\find}{find}\n\\DeclareMathOperator{\\st}{subject\\,\\,to}\n\\newcommand{\\E}{E}\n\\newcommand{\\Expect}[1]{\\E\\left[ #1 \\right]}\n\\newcommand{\\Var}[1]{\\mathrm{Var}\\left[ #1 \\right]}\n\\newcommand{\\Cov}[2]{\\mathrm{Cov}\\left[#1,\\ #2\\right]}\n\\newcommand{\\given}{\\ \\vert\\ }\n\\newcommand{\\X}{\\mathbf{X}}\n\\newcommand{\\x}{\\mathbf{x}}\n\\newcommand{\\y}{\\mathbf{y}}\n\\newcommand{\\P}{\\mathcal{P}}\n\\newcommand{\\R}{\\mathbb{R}}\n\\newcommand{\\norm}[1]{\\left\\lVert #1 \\right\\rVert}\n\\newcommand{\\snorm}[1]{\\lVert #1 \\rVert}\n\\newcommand{\\tr}[1]{\\mbox{tr}(#1)}\n\\newcommand{\\brt}{\\widehat{\\beta}^R_{s}}\n\\newcommand{\\brl}{\\widehat{\\beta}^R_{\\lambda}}\n\\newcommand{\\bls}{\\widehat{\\beta}_{ols}}\n\\newcommand{\\blt}{\\widehat{\\beta}^L_{s}}\n\\newcommand{\\bll}{\\widehat{\\beta}^L_{\\lambda}}\n\\newcommand{\\U}{\\mathbf{U}}\n\\newcommand{\\D}{\\mathbf{D}}\n\\newcommand{\\V}{\\mathbf{V}}\n\\]"
},
{
"objectID": "schedule/slides/23-nnets-other.html#this-lecture",
@@ -1483,6 +1483,13 @@
"section": "Understanding Double Descent (Hand-Wavy)",
"text": "Understanding Double Descent (Hand-Wavy)\nLet \\(\\boldsymbol Z \\in \\R^{n \\times d}\\) be the matrix of basis expansions for our \\(n\\) training points.\nBasis regression is just OLS with the basis expansion \\(\\boldsymbol Z\\): \\[ \\min_{\\boldsymbol \\beta} \\left\\Vert \\boldsymbol Z \\boldsymbol \\beta - \\boldsymbol y \\right\\Vert_2^2. \\]\n\nWhen \\(d &lt; n\\), the regressor is underparameterized.\nI.e. there is no \\(\\boldsymbol \\beta\\) that perfectly explains our training responses given our basis-expanded training inputs.\nWhen \\(d = n\\), there is a value of \\(\\boldsymbol \\beta\\) that fits our training data perfectly.\nI.e. \\(\\Vert \\boldsymbol Z \\boldsymbol \\beta - \\boldsymbol y \\Vert = 0\\).\n\nWe are fitting both the noise and the signal (leading to a high variance predictor).\n\nWhen \\(d &gt; n\\), we can also fit the data (noise + signal) perfectly. 👋 However, more features implies that the noise gets “spread out” over all of the parameters. 👋\n\n👋 Since each parameter only captures “some” of the noise, we are less likely to make predictions based on it. 👋\nThis explanation is overly simplified, and there is a lot more at play."
},
{
"objectID": "schedule/slides/23-nnets-other.html#understanding-double-descent-less-hand-wavy",
"href": "schedule/slides/23-nnets-other.html#understanding-double-descent-less-hand-wavy",
"title": "UBC Stat406 2024W",
"section": "Understanding Double Descent (Less Hand-Wavy)",
"text": "Understanding Double Descent (Less Hand-Wavy)\n\n\n(From Hastie et al., 2020)\n\n\\(\\gamma = D / N\\) (ratio of features / data)\n\\(\\sigma^2 = \\mathrm{Var}[Y|X]\\) (observational noise)\nWhen basis features are uncorrelated, we have (asymptotically)\n\n\\[\n\\begin{aligned}\n \\mathrm{Bias}^2 &= \\begin{cases}\n 0 & \\gamma &lt; 1 \\text{ (underparam.)} \\\\\n 1 - \\tfrac{1}{\\gamma} & \\gamma \\geq 1 \\text{ (overparam.)}\n \\end{cases} \\\\\n & \\\\\n \\mathrm{Var} &= \\begin{cases}\n \\sigma^2 \\tfrac{\\gamma}{1 - \\gamma} & \\gamma &lt; 1 \\text{ (underparam.)} \\\\\n \\sigma^2 \\tfrac{1}{\\gamma - 1} & \\gamma \\geq 1 \\text{ (overparam.)}\n \\end{cases} \\\\\n\\end{aligned}\n\\]"
},
{
"objectID": "schedule/slides/23-nnets-other.html#do-we-need-to-worry-about-variance",
"href": "schedule/slides/23-nnets-other.html#do-we-need-to-worry-about-variance",
96 changes: 48 additions & 48 deletions sitemap.xml
@@ -2,194 +2,194 @@
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/00-r-review.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/handouts/keras-nnet.html</loc>
<lastmod>2024-11-14T06:01:46.002Z</lastmod>
<lastmod>2024-11-19T03:03:30.566Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/11-kernel-smoothers.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/02-lm-example.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/07-greedy-selection.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/10-basis-expansions.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/13-gams-trees.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/24-pca-intro.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.578Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/15-LDA-and-QDA.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.578Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/20-boosting.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.578Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/00-classification-losses.html</loc>
<lastmod>2024-11-14T06:01:46.006Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/16-logistic-regression.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.578Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/23-nnets-other.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.578Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/19-bagging-and-rf.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.578Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/00-cv-for-many-models.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/01-lm-review.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/12-why-smooth.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/22-nnets-estimation.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.578Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/00-intro-to-class.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/handouts/lab00-git.html</loc>
<lastmod>2024-11-14T06:01:46.002Z</lastmod>
<lastmod>2024-11-19T03:03:30.566Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/course-setup.html</loc>
<lastmod>2024-11-14T06:01:45.978Z</lastmod>
<lastmod>2024-11-19T03:03:30.546Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/computing/windows.html</loc>
<lastmod>2024-11-14T06:01:45.978Z</lastmod>
<lastmod>2024-11-19T03:03:30.546Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/computing/mac_x86.html</loc>
<lastmod>2024-11-14T06:01:45.978Z</lastmod>
<lastmod>2024-11-19T03:03:30.546Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/computing/index.html</loc>
<lastmod>2024-11-14T06:01:45.978Z</lastmod>
<lastmod>2024-11-19T03:03:30.546Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/index.html</loc>
<lastmod>2024-11-14T06:01:45.978Z</lastmod>
<lastmod>2024-11-19T03:03:30.546Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/computing/mac_arm.html</loc>
<lastmod>2024-11-14T06:01:45.978Z</lastmod>
<lastmod>2024-11-19T03:03:30.546Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/computing/ubuntu.html</loc>
<lastmod>2024-11-14T06:01:45.978Z</lastmod>
<lastmod>2024-11-19T03:03:30.546Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/syllabus.html</loc>
<lastmod>2024-11-14T06:01:46.050Z</lastmod>
<lastmod>2024-11-19T03:03:30.618Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/index.html</loc>
<lastmod>2024-11-14T06:01:46.006Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/00-course-review.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/00-version-control.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/faq.html</loc>
<lastmod>2024-11-14T06:01:45.978Z</lastmod>
<lastmod>2024-11-19T03:03:30.546Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/21-nnets-intro.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.578Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/03-regression-function.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/06-information-criteria.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/04-bias-variance.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/14-classification-intro.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.578Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/27-kmeans.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.578Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/08-ridge-regression.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/00-quiz-0-wrap.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/26-pca-v-kpca.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.578Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/25-pca-issues.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.578Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/05-estimating-test-mse.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/28-hclust.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.578Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/09-l1-penalties.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/17-nonlinear-classifiers.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.578Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/18-the-bootstrap.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.578Z</lastmod>
</url>
<url>
<loc>https://UBC-STAT.github.io/stat-406/schedule/slides/00-gradient-descent.html</loc>
<lastmod>2024-11-14T06:01:46.010Z</lastmod>
<lastmod>2024-11-19T03:03:30.574Z</lastmod>
</url>
</urlset>
