Update index.html
GzyAftermath authored Oct 6, 2023
1 parent 0784788 commit cb01f20
Showing 1 changed file with 42 additions and 47 deletions.

<body>

<!-- <nav class="navbar" role="navigation" aria-label="main navigation">
<div class="navbar-brand">
<a role="button" class="navbar-burger" aria-label="menu" aria-expanded="false">
<span aria-hidden="true"></span>
<span aria-hidden="true"></span>
<span aria-hidden="true"></span>
</a>
</div>
<div class="navbar-menu">
<div class="navbar-start" style="flex-grow: 1; justify-content: center;">
<a class="navbar-item" href="https://keunhong.com">
<span class="icon">
<i class="fas fa-home"></i>
</span>
</a>
<div class="navbar-item has-dropdown is-hoverable">
<a class="navbar-link">
More Research
</a>
<div class="navbar-dropdown">
<a class="navbar-item" href="https://hypernerf.github.io">
HyperNeRF
</a>
<a class="navbar-item" href="https://nerfies.github.io">
Nerfies
</a>
<a class="navbar-item" href="https://latentfusion.github.io">
LatentFusion
</a>
<a class="navbar-item" href="https://photoshape.github.io">
PhotoShape
</a>
</div>
</div>
</div>
</div>
</nav> -->


<section class="hero">
<h2 class="title is-3">Overview</h2>
<b>Dataset Distillation</b> aims at synthesizing a small synthetic dataset such that a model trained on
this synthetic set will perform <b>equally well</b> as a model trained on the full, real dataset.
Until now, no Dataset Distillation method has reached this completely lossless goal, in part due to the
fact that previous methods only remain effective when the size of the synthetic dataset is <b>extremely
small</b>.
</p>
<p>
In this work, we elucidate why existing methods fail to generate larger, high-quality synthetic sets,
taking trajectory matching (TM) based distillation methods as an example.
First, we empirically find that the training stage of the trajectories we choose to match (<i>i.e.</i>,
<b>early</b> or <b>late</b>) greatly affects the effectiveness of the distilled dataset.
</p>
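Concretely, TM-based methods such as MTT optimize the synthetic set so that a student briefly trained on it lands close to a later checkpoint of an expert trained on real data. The following is a minimal numpy sketch of the normalized matching objective, with toy one-dimensional "parameters" for illustration; names and values are ours, not the paper's implementation:

```python
import numpy as np

def matching_loss(theta_student, theta_expert_future, theta_expert_start):
    """Normalized trajectory-matching loss (MTT-style): squared distance from
    the student's parameters to the expert's future checkpoint, normalized by
    how far the expert itself moved over the matched segment."""
    num = np.sum((theta_student - theta_expert_future) ** 2)
    den = np.sum((theta_expert_start - theta_expert_future) ** 2)
    return num / den

# Toy example: the expert moves from 0.0 to 1.0 over the matched segment.
# A student that reaches 0.9 incurs a small loss; one stuck at 0.2, a large one.
start, future = np.array([0.0]), np.array([1.0])
good = matching_loss(np.array([0.9]), future, start)  # (0.1^2)/(1.0^2) = 0.01
bad = matching_loss(np.array([0.2]), future, start)   # (0.8^2)/(1.0^2) = 0.64
```

The denominator normalizes by the expert's own displacement, so losses are comparable across segments taken from different phases of training.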
<p>
Based on our findings, we propose to align the difficulty of the generated patterns with the size of the
synthetic dataset.
In doing so, we successfully scale TM-based methods to larger synthetic datasets,
achieving <b>lossless dataset distillation</b> for the very first time.
</p>
</div>
<h2 class="title is-3">Video</h2>
</div>
</section>

<!-- Motivation and Findings. -->
<section class="section">
<div class="container">
<div class="columns is-centered">
<div class="container is-max-desktop content">
<h2 class="title is-3">Motivation and Findings</h2>

<div class="content has-text-justified">
<p>According to <a href="https://arxiv.org/abs/2104.09125">Cazenavette et al.</a> and <a href="https://arxiv.org/abs/1706.05394">Arpit et al.</a>:</p>
<p>
1. TM-based methods embed informative patterns into synthetic data by matching expert training trajectories.
</p>
<p>
2. DNNs tend to learn <b>easy</b> patterns <b>early</b> in training.
</p>
<p>
Based on this, we infer that patterns generated by matching trajectories from <b>earlier</b> training phases are <b>easier</b> for DNNs to learn.
We then explore the effect of matching trajectories from different training phases:
</p>
<p></p>
<img src="./static/images/IPC.png" id="aa" />
<p></p>
<p>As can be observed, matching early trajectories, which generates <b>easy</b> patterns, performs well when IPC is low but tends to be harmful as IPC increases.
Conversely, matching late trajectories is beneficial in the regime of high IPC.
</p>
<p>To keep dataset distillation effective in various IPC cases, we propose to <b>align the difficulty of the generated patterns with the size of the synthetic dataset</b>.</p>
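One simple way to realize this alignment is to map the IPC budget to a window of expert epochs to match: small IPC selects early (easy) trajectories, large IPC selects late (hard) ones. This is a hypothetical sketch; the window bounds and the log-interpolation schedule are illustrative assumptions of ours, not the paper's exact recipe:

```python
import math

def matching_window(ipc, max_epoch=60, ipc_lo=1, ipc_hi=50, width=20):
    """Map images-per-class (IPC) to an expert-epoch window [t_start, t_end]:
    low IPC -> early (easy) trajectories, high IPC -> late (hard) trajectories.
    All constants are illustrative, not the paper's tuned values."""
    # Log-interpolate IPC between ipc_lo and ipc_hi to a fraction in [0, 1].
    frac = (math.log(min(max(ipc, ipc_lo), ipc_hi)) - math.log(ipc_lo)) / (
        math.log(ipc_hi) - math.log(ipc_lo))
    t_start = round(frac * (max_epoch - width))
    return t_start, t_start + width

matching_window(1)   # -> (0, 20): match early trajectories, easy patterns
matching_window(50)  # -> (40, 60): match late trajectories, hard patterns
```

The log scale reflects that IPC budgets are usually compared multiplicatively (1, 10, 50, ...) rather than additively.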
</div>
</div>
</div>
</div>
</section>
<!--/ Motivation and Findings. -->

<!-- Comparison. -->
<section class="section">
<div class="container">
<h2 class="title is-3">Comparison</h2>

<div class="content has-text-justified">
<p>
As can be observed in the figure, previous distillation methods work well only when IPC is extremely small.
Benefiting from our difficulty alignment strategy, our method is effective in all IPC settings.
Notably, we distill CIFAR-10 and CIFAR-100 to 1/5, and Tiny ImageNet to 1/10, of their original sizes without any performance loss on ConvNet, offering the first lossless dataset distillation method.
</p>
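For concreteness, those compression ratios translate into the following images-per-class (IPC) budgets, computed from the standard training-set sizes of each benchmark:

```python
# Images-per-class (IPC) implied by the quoted compression ratios.
datasets = {                 # name: (train images, classes, compression denominator)
    "CIFAR-10":      (50_000, 10, 5),    # distilled to 1/5 of original size
    "CIFAR-100":     (50_000, 100, 5),   # distilled to 1/5
    "Tiny ImageNet": (100_000, 200, 10), # distilled to 1/10
}
ipc = {name: n // cls // d for name, (n, cls, d) in datasets.items()}
# ipc == {"CIFAR-10": 1000, "CIFAR-100": 100, "Tiny ImageNet": 50}
```

These budgets (IPC 1000, 100, and 50) are far larger than the IPC 1-50 regime where earlier methods remain effective.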
<img src="./static/images/comparison.png" id="img_comparison" />

<h2 class="title is-3">Visualization</h2>
What do <b>easy</b> patterns and <b>hard</b> patterns look like?
</p>
<p>
Here we visualize the images synthesized by matching <b>early</b> and <b>late</b> trajectories, in which <b>easy</b> and <b>hard</b> patterns are embedded, respectively.

</p>
<p>
As can be observed, matching <b>early</b> trajectories blends the target object into the background and blurs its details, which helps DNNs learn to identify common (<b>easy</b>) samples from their basic patterns.
</p>
<p>
Conversely, matching <b>late</b> trajectories preserves more details of the target, which helps DNNs learn to identify outlier (<b>hard</b>) samples.
</p>
<p>

