v0.1.1: Added more in1k finetuned models.
dbolya committed Jun 13, 2023
1 parent f3087b7 commit 1e3024d
Showing 4 changed files with 25 additions and 12 deletions.
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -1,6 +1,7 @@
-### **[In Progress]** v0.1.1
+### **[2023.06.12]** v0.1.1
- Added the ability to specify multiple pretrained checkpoints per architecture (specify with `checkpoint=<ckpt_name>`).
- Added the ability to pass `strict=False` to a pretrained model so that you can use a different number of classes. **Note:** when changing the number of classes, the head layer will be reset.
+- Released all in1k finetuned models.

### **[2023.06.01]** v0.1.0
- Initial Release.
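
The `strict=False` note in the changelog above can be illustrated with a small, self-contained sketch (hypothetical layer names and shapes, not the actual Hiera state dict): checkpoint entries whose shapes no longer match the model are skipped, so the classifier head stays freshly initialized.

```python
# Sketch of non-strict checkpoint loading with hypothetical layer names and
# shapes (not real Hiera layers): entries whose shapes still match are copied
# over, while mismatched ones -- the classifier head -- are skipped, i.e. "reset".
pretrained = {"backbone.proj": (96, 96), "head.weight": (1000, 96)}  # in1k head
model      = {"backbone.proj": (96, 96), "head.weight": (10, 96)}    # 10 classes

loaded  = {k: shape for k, shape in pretrained.items() if model.get(k) == shape}
skipped = sorted(set(model) - set(loaded))

print(loaded)   # only the matching backbone entry transfers
print(skipped)  # the head entry is left at its fresh initialization
```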
12 changes: 7 additions & 5 deletions README.md
@@ -36,8 +36,10 @@ Several domain specific vision transformers have been introduced that employ thi
We show that a lot of this bulk is actually _unnecessary_. Instead of manually adding spatial biases through architectural changes, we opt to _teach_ the model these biases. By training with [MAE](https://arxiv.org/abs/2111.06377), we can simplify or remove _all_ of these bulky modules in existing transformers and _increase accuracy_ in the process. The result is Hiera, an extremely efficient and simple architecture that outperforms the state-of-the-art in several image and video recognition tasks.

## News
+- **[2023.06.12]** Added more in1k models and some video examples, see inference.ipynb (v0.1.1).
- **[2023.06.01]** Initial release.

See the [changelog](https://github.com/facebookresearch/hiera/tree/main/CHANGELOG.md) for more details.

## Installation

@@ -75,12 +77,12 @@ As of now, base finetuned models are available. The rest are coming soon.
### Image Models
| Model | Model Name | Pretrained Models<br>(IN-1K MAE) | Finetuned Models<br>(IN-1K Supervised) | IN-1K<br>Top-1 (%) | A100 fp16<br>Speed (im/s) |
|----------|-----------------------|----------------------------------|----------------------------------------|:------------------:|:-------------------------:|
-| Hiera-T | `hiera_tiny_224` | Coming Soon | Coming Soon | 82.8 | 2758 |
-| Hiera-S | `hiera_small_224` | Coming Soon | Coming Soon | 83.8 | 2211 |
+| Hiera-T | `hiera_tiny_224` | Coming Soon | [mae_in1k_ft_in1k](https://dl.fbaipublicfiles.com/hiera/hiera_tiny_224.pth) | 82.8 | 2758 |
+| Hiera-S | `hiera_small_224` | Coming Soon | [mae_in1k_ft_in1k](https://dl.fbaipublicfiles.com/hiera/hiera_small_224.pth) | 83.8 | 2211 |
| Hiera-B | `hiera_base_224` | Coming Soon | [mae_in1k_ft_in1k](https://dl.fbaipublicfiles.com/hiera/hiera_base_224.pth) | 84.5 | 1556 |
-| Hiera-B+ | `hiera_base_plus_224` | Coming Soon | Coming Soon | 85.2 | 1247 |
-| Hiera-L | `hiera_large_224` | Coming Soon | Coming Soon | 86.1 | 531 |
-| Hiera-H | `hiera_huge_224` | Coming Soon | Coming Soon | 86.9 | 274 |
+| Hiera-B+ | `hiera_base_plus_224` | Coming Soon | [mae_in1k_ft_in1k](https://dl.fbaipublicfiles.com/hiera/hiera_base_plus_224.pth) | 85.2 | 1247 |
+| Hiera-L | `hiera_large_224` | Coming Soon | [mae_in1k_ft_in1k](https://dl.fbaipublicfiles.com/hiera/hiera_large_224.pth) | 86.1 | 531 |
+| Hiera-H | `hiera_huge_224` | Coming Soon | [mae_in1k_ft_in1k](https://dl.fbaipublicfiles.com/hiera/hiera_huge_224.pth) | 86.9 | 274 |

Each model inputs a 224x224 image.
### Video Models
20 changes: 15 additions & 5 deletions hiera/hiera.py
@@ -436,12 +436,16 @@ def forward(

# Image models

-@pretrained_model(None)
+@pretrained_model({
+    "mae_in1k_ft_in1k": "https://dl.fbaipublicfiles.com/hiera/hiera_tiny_224.pth",
+}, default="mae_in1k_ft_in1k")
def hiera_tiny_224(**kwdargs):
    return Hiera(embed_dim=96, num_heads=1, stages=(1, 2, 7, 2), **kwdargs)


-@pretrained_model(None)
+@pretrained_model({
+    "mae_in1k_ft_in1k": "https://dl.fbaipublicfiles.com/hiera/hiera_small_224.pth",
+}, default="mae_in1k_ft_in1k")
def hiera_small_224(**kwdargs):
    return Hiera(embed_dim=96, num_heads=1, stages=(1, 2, 11, 2), **kwdargs)

@@ -453,17 +457,23 @@ def hiera_base_224(**kwdargs):
    return Hiera(embed_dim=96, num_heads=1, stages=(2, 3, 16, 3), **kwdargs)


-@pretrained_model(None)
+@pretrained_model({
+    "mae_in1k_ft_in1k": "https://dl.fbaipublicfiles.com/hiera/hiera_base_plus_224.pth",
+}, default="mae_in1k_ft_in1k")
def hiera_base_plus_224(**kwdargs):
    return Hiera(embed_dim=112, num_heads=2, stages=(2, 3, 16, 3), **kwdargs)


-@pretrained_model(None)
+@pretrained_model({
+    "mae_in1k_ft_in1k": "https://dl.fbaipublicfiles.com/hiera/hiera_large_224.pth",
+}, default="mae_in1k_ft_in1k")
def hiera_large_224(**kwdargs):
    return Hiera(embed_dim=144, num_heads=2, stages=(2, 6, 36, 4), **kwdargs)


-@pretrained_model(None)
+@pretrained_model({
+    "mae_in1k_ft_in1k": "https://dl.fbaipublicfiles.com/hiera/hiera_huge_224.pth",
+}, default="mae_in1k_ft_in1k")
def hiera_huge_224(**kwdargs):
    return Hiera(embed_dim=256, num_heads=4, stages=(2, 6, 36, 4), **kwdargs)
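
The `@pretrained_model` decorator used throughout this file is a natural place for a sketch of how the named checkpoints might be wired up. The following is a hypothetical, self-contained reimplementation showing only the name-to-URL plumbing (`ToyModel` and `toy_tiny_224` are stand-ins, not real repo code); the actual decorator in `hiera/hiera.py` also downloads the checkpoint and loads its weights.

```python
from functools import wraps

def pretrained_model(checkpoints, default=None):
    """Hypothetical sketch: attach a dict of named checkpoint URLs to a
    model constructor; passing None means no released weights yet."""
    def decorator(ctor):
        @wraps(ctor)
        def wrapper(pretrained=False, checkpoint=None, **kwdargs):
            model = ctor(**kwdargs)
            if pretrained:
                name = checkpoint or default
                if not checkpoints or name not in checkpoints:
                    raise RuntimeError(f"No checkpoint named {name!r}")
                # The real implementation would fetch and load weights from this URL.
                model.checkpoint_url = checkpoints[name]
            return model
        return wrapper
    return decorator

class ToyModel:  # stand-in for Hiera(...)
    def __init__(self, **kwdargs):
        self.config = kwdargs

@pretrained_model({
    "mae_in1k_ft_in1k": "https://dl.fbaipublicfiles.com/hiera/hiera_tiny_224.pth",
}, default="mae_in1k_ft_in1k")
def toy_tiny_224(**kwdargs):
    return ToyModel(embed_dim=96, num_heads=1, **kwdargs)

m = toy_tiny_224(pretrained=True)  # passing checkpoint=<ckpt_name> would pick another key
```

Under this reading, the changelog's `checkpoint=<ckpt_name>` option simply selects a key in the dict, and `@pretrained_model(None)` marks an architecture whose weights are not released yet.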

2 changes: 1 addition & 1 deletion setup.py
@@ -9,7 +9,7 @@

setup(
    name="hiera-transformer",
-    version="0.1.1.dev",
+    version="0.1.1",
    author="Chaitanya Ryali, Daniel Bolya",
    url="https://github.com/facebookresearch/hiera",
    description="A fast, powerful, and simple hierarchical vision transformer",
