v0.1.1: Added more in1k finetuned models.
dbolya committed Jun 13, 2023
1 parent f3087b7 commit 1e3024d
Showing 4 changed files with 25 additions and 12 deletions.
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -1,6 +1,7 @@
-### **[In Progress]** v0.1.1
+### **[2023.06.12]** v0.1.1
- Added the ability to specify multiple pretrained checkpoints per architecture (specify with `checkpoint=<ckpt_name>`).
- Added the ability to pass `strict=False` to a pretrained model so that you can use a different number of classes. **Note:** when changing the number of classes, the head layer will be reset.
+- Released all in1k finetuned models.

### **[2023.06.01]** v0.1.0
- Initial Release.
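
The `strict=False` note in the changelog above can be illustrated with a small, self-contained sketch (hypothetical layer names and shapes, not the actual Hiera state dict): checkpoint entries whose shapes no longer match the model are skipped, so the classifier head stays freshly initialized.

```python
# Sketch of non-strict checkpoint loading with hypothetical layer names and
# shapes (not real Hiera layers): entries whose shapes still match are copied
# over, while mismatched ones -- the classifier head -- are skipped, i.e. "reset".
pretrained = {"backbone.proj": (96, 96), "head.weight": (1000, 96)}  # in1k head
model      = {"backbone.proj": (96, 96), "head.weight": (10, 96)}    # 10 classes

loaded  = {k: shape for k, shape in pretrained.items() if model.get(k) == shape}
skipped = sorted(set(model) - set(loaded))

print(loaded)   # only the matching backbone entry transfers
print(skipped)  # the head entry is left at its fresh initialization
```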
12 changes: 7 additions & 5 deletions README.md
@@ -36,8 +36,10 @@ Several domain specific vision transformers have been introduced that employ thi
We show that a lot of this bulk is actually _unnecessary_. Instead of manually adding spatial biases through architectural changes, we opt to _teach_ the model these biases. By training with [MAE](https://arxiv.org/abs/2111.06377), we can simplify or remove _all_ of these bulky modules in existing transformers and _increase accuracy_ in the process. The result is Hiera, an extremely efficient and simple architecture that outperforms the state-of-the-art in several image and video recognition tasks.

## News
+- **[2023.06.12]** Added more in1k models and some video examples, see inference.ipynb (v0.1.1).
- **[2023.06.01]** Initial release.

See the [changelog](https://github.com/facebookresearch/hiera/tree/main/CHANGELOG.md) for more details.

## Installation

@@ -75,12 +77,12 @@ As of now, base finetuned models are available. The rest are coming soon.
### Image Models
| Model | Model Name | Pretrained Models<br>(IN-1K MAE) | Finetuned Models<br>(IN-1K Supervised) | IN-1K<br>Top-1 (%) | A100 fp16<br>Speed (im/s) |
|----------|-----------------------|----------------------------------|----------------------------------------|:------------------:|:-------------------------:|
-| Hiera-T | `hiera_tiny_224` | Coming Soon | Coming Soon | 82.8 | 2758 |
-| Hiera-S | `hiera_small_224` | Coming Soon | Coming Soon | 83.8 | 2211 |
+| Hiera-T | `hiera_tiny_224` | Coming Soon | [mae_in1k_ft_in1k](https://dl.fbaipublicfiles.com/hiera/hiera_tiny_224.pth) | 82.8 | 2758 |
+| Hiera-S | `hiera_small_224` | Coming Soon | [mae_in1k_ft_in1k](https://dl.fbaipublicfiles.com/hiera/hiera_small_224.pth) | 83.8 | 2211 |
| Hiera-B | `hiera_base_224` | Coming Soon | [mae_in1k_ft_in1k](https://dl.fbaipublicfiles.com/hiera/hiera_base_224.pth) | 84.5 | 1556 |
-| Hiera-B+ | `hiera_base_plus_224` | Coming Soon | Coming Soon | 85.2 | 1247 |
-| Hiera-L | `hiera_large_224` | Coming Soon | Coming Soon | 86.1 | 531 |
-| Hiera-H | `hiera_huge_224` | Coming Soon | Coming Soon | 86.9 | 274 |
+| Hiera-B+ | `hiera_base_plus_224` | Coming Soon | [mae_in1k_ft_in1k](https://dl.fbaipublicfiles.com/hiera/hiera_base_plus_224.pth) | 85.2 | 1247 |
+| Hiera-L | `hiera_large_224` | Coming Soon | [mae_in1k_ft_in1k](https://dl.fbaipublicfiles.com/hiera/hiera_large_224.pth) | 86.1 | 531 |
+| Hiera-H | `hiera_huge_224` | Coming Soon | [mae_in1k_ft_in1k](https://dl.fbaipublicfiles.com/hiera/hiera_huge_224.pth) | 86.9 | 274 |

Each model inputs a 224x224 image.
### Video Models
20 changes: 15 additions & 5 deletions hiera/hiera.py
@@ -436,12 +436,16 @@ def forward(

# Image models

-@pretrained_model(None)
+@pretrained_model({
+    "mae_in1k_ft_in1k": "https://dl.fbaipublicfiles.com/hiera/hiera_tiny_224.pth",
+}, default="mae_in1k_ft_in1k")
def hiera_tiny_224(**kwdargs):
    return Hiera(embed_dim=96, num_heads=1, stages=(1, 2, 7, 2), **kwdargs)


-@pretrained_model(None)
+@pretrained_model({
+    "mae_in1k_ft_in1k": "https://dl.fbaipublicfiles.com/hiera/hiera_small_224.pth",
+}, default="mae_in1k_ft_in1k")
def hiera_small_224(**kwdargs):
    return Hiera(embed_dim=96, num_heads=1, stages=(1, 2, 11, 2), **kwdargs)

@@ -453,17 +457,23 @@ def hiera_base_224(**kwdargs):
    return Hiera(embed_dim=96, num_heads=1, stages=(2, 3, 16, 3), **kwdargs)


-@pretrained_model(None)
+@pretrained_model({
+    "mae_in1k_ft_in1k": "https://dl.fbaipublicfiles.com/hiera/hiera_base_plus_224.pth",
+}, default="mae_in1k_ft_in1k")
def hiera_base_plus_224(**kwdargs):
    return Hiera(embed_dim=112, num_heads=2, stages=(2, 3, 16, 3), **kwdargs)


-@pretrained_model(None)
+@pretrained_model({
+    "mae_in1k_ft_in1k": "https://dl.fbaipublicfiles.com/hiera/hiera_large_224.pth",
+}, default="mae_in1k_ft_in1k")
def hiera_large_224(**kwdargs):
    return Hiera(embed_dim=144, num_heads=2, stages=(2, 6, 36, 4), **kwdargs)


-@pretrained_model(None)
+@pretrained_model({
+    "mae_in1k_ft_in1k": "https://dl.fbaipublicfiles.com/hiera/hiera_huge_224.pth",
+}, default="mae_in1k_ft_in1k")
def hiera_huge_224(**kwdargs):
    return Hiera(embed_dim=256, num_heads=4, stages=(2, 6, 36, 4), **kwdargs)
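
The `@pretrained_model` decorator used throughout this file is a natural place for a sketch of how the named checkpoints might be wired up. The following is a hypothetical, self-contained reimplementation showing only the name-to-URL plumbing (`ToyModel` and `toy_tiny_224` are stand-ins, not real repo code); the actual decorator in `hiera/hiera.py` also downloads the checkpoint and loads its weights.

```python
from functools import wraps

def pretrained_model(checkpoints, default=None):
    """Hypothetical sketch: attach a dict of named checkpoint URLs to a
    model constructor; passing None means no released weights yet."""
    def decorator(ctor):
        @wraps(ctor)
        def wrapper(pretrained=False, checkpoint=None, **kwdargs):
            model = ctor(**kwdargs)
            if pretrained:
                name = checkpoint or default
                if not checkpoints or name not in checkpoints:
                    raise RuntimeError(f"No checkpoint named {name!r}")
                # The real implementation would fetch and load weights from this URL.
                model.checkpoint_url = checkpoints[name]
            return model
        return wrapper
    return decorator

class ToyModel:  # stand-in for Hiera(...)
    def __init__(self, **kwdargs):
        self.config = kwdargs

@pretrained_model({
    "mae_in1k_ft_in1k": "https://dl.fbaipublicfiles.com/hiera/hiera_tiny_224.pth",
}, default="mae_in1k_ft_in1k")
def toy_tiny_224(**kwdargs):
    return ToyModel(embed_dim=96, num_heads=1, **kwdargs)

m = toy_tiny_224(pretrained=True)  # passing checkpoint=<ckpt_name> would pick another key
```

Under this reading, the changelog's `checkpoint=<ckpt_name>` option simply selects a key in the dict, and `@pretrained_model(None)` marks an architecture whose weights are not released yet.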

2 changes: 1 addition & 1 deletion setup.py
@@ -9,7 +9,7 @@

setup(
    name="hiera-transformer",
-    version="0.1.1.dev",
+    version="0.1.1",
    author="Chaitanya Ryali, Daniel Bolya",
    url="https://github.com/facebookresearch/hiera",
    description="A fast, powerful, and simple hierarchical vision transformer",
