Merge upstream #1

Open
wants to merge 115 commits into
base: main
115 commits
8aff17c
Update README.md
eltociear Jan 22, 2024
5c11560
Merge pull request #20 from eltociear/patch-1
ResearcherXman Jan 22, 2024
e2295d1
Update README.md
ResearcherXman Jan 22, 2024
7bd52fd
Update README.md
johndpope Jan 22, 2024
be9aea7
Fix the example in README.md
Czxck001 Jan 23, 2024
e800d4a
Merge pull request #29 from Czxck001/patch-1
ResearcherXman Jan 23, 2024
438acf1
add local demo
ResearcherXman Jan 23, 2024
b29d156
Merge pull request #24 from johndpope/patch-1
ResearcherXman Jan 23, 2024
97bf07a
Update README.md
ResearcherXman Jan 23, 2024
884bdc3
init
sdbds Jan 23, 2024
7fe26ec
add support local models
sdbds Jan 23, 2024
23c81a9
Update app.py
sdbds Jan 23, 2024
0ce3f80
Update app.py
sdbds Jan 23, 2024
5199c63
Update app.py
sdbds Jan 23, 2024
4c64b28
Replicate Init 2.0
zsxkib Jan 23, 2024
7a3d8d5
Merge pull request #23 from zsxkib/main
ResearcherXman Jan 23, 2024
e809b4c
Merge branch 'main' into main
ResearcherXman Jan 23, 2024
ac8b709
Merge pull request #39 from sdbds/main
ResearcherXman Jan 23, 2024
22ef5bf
fix bugs
ResearcherXman Jan 23, 2024
b23490b
Update README.md
ResearcherXman Jan 23, 2024
c120ecf
Update README.md
ResearcherXman Jan 23, 2024
6841b41
Update pipeline_stable_diffusion_xl_instantid.py
ResearcherXman Jan 24, 2024
fee9f5e
Update app.py
ResearcherXman Jan 24, 2024
abbb364
support for torch2.0, update license
ResearcherXman Jan 24, 2024
d227810
add ip_adapter_scale
Jan 24, 2024
90fe1fe
Merge pull request #55 from InstantID/ip_adapter_scale
ResearcherXman Jan 24, 2024
aac78ad
Add star history
ResearcherXman Jan 24, 2024
113f433
Add LCM-LoRA
ResearcherXman Jan 24, 2024
7aff17e
Fix types
ResearcherXman Jan 25, 2024
356ed59
Update README.md
ResearcherXman Jan 26, 2024
41b4750
to avoid confusion
ResearcherXman Jan 26, 2024
4b98bd0
add WebUI support
ResearcherXman Jan 28, 2024
5d8bf9b
Update README.md
ResearcherXman Jan 28, 2024
e23c0f4
Update README.md
ResearcherXman Jan 28, 2024
8d68df1
Update README.md
ResearcherXman Jan 28, 2024
230a4aa
readme
Jan 29, 2024
0a7a39a
readme
Jan 29, 2024
9217659
readme
Jan 29, 2024
2bb2340
readme
Jan 29, 2024
e386de7
readme
Jan 29, 2024
e6bdede
readme
Jan 29, 2024
7c378d2
readme
Jan 29, 2024
0168552
readme
Jan 29, 2024
4039fed
readme
Jan 29, 2024
3a93803
readme
Jan 29, 2024
11608c8
readme
Jan 29, 2024
aa90324
readme
Jan 29, 2024
3908e73
readme
Jan 29, 2024
735a77c
readme
Jan 29, 2024
62eb42a
readme
Jan 29, 2024
1b0b894
readme
Jan 29, 2024
2d63c05
readme
Jan 29, 2024
42c002e
readme
Jan 29, 2024
2b87903
readme
Jan 29, 2024
b10a1c2
readme
Jan 29, 2024
c5af2c4
readme
Jan 29, 2024
759801b
readme
Jan 29, 2024
7bce09c
readme
Jan 29, 2024
0e53c1d
readme
Jan 29, 2024
1b52234
readme
Jan 29, 2024
670e466
readme
Jan 29, 2024
636b612
Merge pull request #94 from InstantID/readme
ResearcherXman Jan 29, 2024
77f3f4c
readme
Jan 29, 2024
81279e9
Merge pull request #95 from InstantID/readme
ResearcherXman Jan 29, 2024
7a2074a
Update README.md
ResearcherXman Jan 29, 2024
2c8f29a
Update README.md
ResearcherXman Jan 29, 2024
f0fff6b
add OpenXLab demo
ResearcherXman Jan 29, 2024
80967dc
Update README.md
ResearcherXman Jan 30, 2024
d77c5ba
Update README.md
ResearcherXman Jan 31, 2024
2daf6c1
support lcm and multicontrol
ResearcherXman Feb 1, 2024
54424dc
fix typos
ResearcherXman Feb 1, 2024
b3204c7
update
ResearcherXman Feb 1, 2024
9669752
update
ResearcherXman Feb 1, 2024
98332df
Merge pull request #114 from InstantID/lcm_multicontrol
ResearcherXman Feb 1, 2024
e36ca46
Fix typos
ResearcherXman Feb 1, 2024
b17d19d
Update README.md
ResearcherXman Feb 1, 2024
dc9f959
fix typos
ResearcherXman Feb 1, 2024
002a7f0
support cpu offloading
ResearcherXman Feb 1, 2024
9951380
Add author
ResearcherXman Feb 2, 2024
1891688
Update README.md
ResearcherXman Feb 18, 2024
62b96b3
Added negative_prompt to pipeline
felixsanz Feb 24, 2024
8413732
Merge pull request #157 from felixsanz/patch-1
ResearcherXman Feb 24, 2024
3437464
fix typos
ResearcherXman Feb 26, 2024
0e0c3d0
fix typos
ResearcherXman Feb 26, 2024
7caac0e
update vae decode
ResearcherXman Feb 28, 2024
4edb63a
Update README.md
ResearcherXman Mar 9, 2024
d395cae
Update README.md
ResearcherXman Mar 19, 2024
e890c79
Update README.md
ResearcherXman Mar 28, 2024
4131e2f
Update README.md
ResearcherXman Mar 28, 2024
0590014
Update README.md
ResearcherXman Mar 28, 2024
228015e
Create pipeline_stable_diffusion_xl_instantid_img2img.py
apolinario Mar 28, 2024
a62bb36
Create FUNDING.yml
ResearcherXman Mar 30, 2024
418b8cf
Add files via upload
ResearcherXman Mar 30, 2024
af6b213
Update FUNDING.yml
ResearcherXman Mar 30, 2024
462f7a3
Update README.md
ResearcherXman Mar 30, 2024
a86c10d
Update README.md
ResearcherXman Mar 30, 2024
9d2762a
Merge pull request #202 from apolinario/patch-1
ResearcherXman Mar 31, 2024
e6b1f21
support diffuser==0.27.0
ResearcherXman Mar 31, 2024
3373895
Support diffusers CPU offloading
apolinario Apr 2, 2024
4727683
Merge pull request #205 from apolinario/patch-2
ResearcherXman Apr 2, 2024
b336a70
Update README.md
ResearcherXman Apr 4, 2024
c73d12d
fix: minor spelling error
AMohamedAakhil Apr 6, 2024
bcd7fe9
Merge pull request #207 from AMohamedAakhil/patch-1
ResearcherXman Apr 6, 2024
042a57c
Update README.md
ResearcherXman Apr 10, 2024
7e93ad3
Update README.md
ResearcherXman Apr 17, 2024
6a4136f
change img
Apr 21, 2024
ee541e7
fix readme
wangqixun Apr 28, 2024
04211a1
fix readme
wangqixun Apr 28, 2024
2844fcf
Update FUNDING.yml
ResearcherXman Jul 4, 2024
c177ae6
kolor demo
wangqixun Jul 18, 2024
3a4feb7
kolor demo
wangqixun Jul 18, 2024
25d88da
kolor demo
wangqixun Jul 18, 2024
79f0c01
kolor demo
wangqixun Jul 18, 2024
65d27a0
kolor demo
wangqixun Jul 18, 2024
2145b67
kolor demo
wangqixun Jul 18, 2024
2 changes: 2 additions & 0 deletions .github/FUNDING.yml
@@ -0,0 +1,2 @@
# These are supported funding model platforms
ko_fi: instantx
Binary file added .github/wechatpay.jpeg
8 changes: 8 additions & 0 deletions .gitignore
@@ -158,3 +158,11 @@ cython_debug/
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
huggingface/
checkpoints/
models/

# Cog
.cog

gradio_cached_examples
2 changes: 1 addition & 1 deletion LICENSE
@@ -198,4 +198,4 @@
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
limitations under the License.
159 changes: 141 additions & 18 deletions README.md
@@ -1,26 +1,45 @@
# InstantID
<a href='https://instantid.github.io/'><img src='https://img.shields.io/badge/Project-Page-green'></a>
<a href='https://arxiv.org/abs/2401.07519'><img src='https://img.shields.io/badge/Technique-Report-red'></a>
<div align="center">
<h1>InstantID: Zero-shot Identity-Preserving Generation in Seconds</h1>

[**Qixun Wang**](https://github.com/wangqixun)<sup>12</sup> · [**Xu Bai**](https://huggingface.co/baymin0220)<sup>12</sup> · [**Haofan Wang**](https://haofanwang.github.io/)<sup>12*</sup> · [**Zekui Qin**](https://github.com/ZekuiQin)<sup>12</sup> · [**Anthony Chen**](https://antonioo-c.github.io/)<sup>123</sup>

Huaxia Li<sup>2</sup> · Xu Tang<sup>2</sup> · Yao Hu<sup>2</sup>

<sup>1</sup>InstantX Team · <sup>2</sup>Xiaohongshu Inc · <sup>3</sup>Peking University

<sup>*</sup>corresponding authors

<a href='https://instantid.github.io/'><img src='https://img.shields.io/badge/Project-Page-green'></a>
<a href='https://arxiv.org/abs/2401.07519'><img src='https://img.shields.io/badge/Technique-Report-red'></a>
<a href='https://huggingface.co/papers/2401.07519'><img src='https://img.shields.io/static/v1?label=Paper&message=Huggingface&color=orange'></a>
<a href='https://huggingface.co/spaces/InstantX/InstantID'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue'></a>
[![GitHub](https://img.shields.io/github/stars/InstantID/InstantID?style=social)](https://github.com/InstantID/InstantID)

<a href='https://huggingface.co/spaces/InstantX/InstantID'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue'></a>
[![ModelScope](https://img.shields.io/badge/ModelScope-Studios-blue)](https://modelscope.cn/studios/instantx/InstantID/summary)
[![Open in OpenXLab](https://cdn-static.openxlab.org.cn/app-center/openxlab_app.svg)](https://openxlab.org.cn/apps/detail/InstantX/InstantID)

**InstantID : Zero-shot Identity-Preserving Generation in Seconds**
</div>

InstantID is a new state-of-the-art tuning-free method to achieve ID-Preserving generation with only a single image, supporting various downstream tasks.

<img src='assets/applications.png'>

## Release
- [2024/1/22] 🔥 We release the [pre-trained checkpoints](https://huggingface.co/InstantX/InstantID), [inference code](https://github.com/InstantID/InstantID/blob/main/infer.py) and [gradio demo](https://huggingface.co/spaces/InstantX/InstantID)!
- [2024/1/15] 🔥 We release the technical report.
- [2023/12/11] 🔥 We launch the project page.
- [2024/07/18] 🔥 We are training InstantID for [Kolors](https://huggingface.co/Kwai-Kolors/Kolors-diffusers). Training the weights requires significant computational power and is still in progress. Once training is complete, the model will be open-sourced. Results from the latest checkpoint are shown in [Kolors Version](#kolors-version).
- [2024/04/03] 🔥 We release our recent work [InstantStyle](https://github.com/InstantStyle/InstantStyle) for style transfer, compatible with InstantID!
- [2024/02/01] 🔥 We have supported LCM acceleration and Multi-ControlNets on our [Huggingface Spaces Demo](https://huggingface.co/spaces/InstantX/InstantID)! Our depth estimator is supported by [Depth-Anything](https://github.com/LiheYoung/Depth-Anything).
- [2024/01/31] 🔥 [OneDiff](https://github.com/siliconflow/onediff?tab=readme-ov-file#easy-to-use) now supports accelerated inference for InstantID, check [this](https://github.com/siliconflow/onediff/blob/main/benchmarks/instant_id.py) for details!
- [2024/01/23] 🔥 Our pipeline has been merged into [diffusers](https://github.com/huggingface/diffusers/blob/main/examples/community/pipeline_stable_diffusion_xl_instantid.py)!
- [2024/01/22] 🔥 We release the [pre-trained checkpoints](https://huggingface.co/InstantX/InstantID), [inference code](https://github.com/InstantID/InstantID/blob/main/infer.py) and [gradio demo](https://huggingface.co/spaces/InstantX/InstantID)!
- [2024/01/15] 🔥 We release the [technical report](https://arxiv.org/abs/2401.07519).
- [2023/12/11] 🔥 We launch the [project page](https://instantid.github.io/).

## Demos

### Stylized Synthesis

<p align="center">
<img src="assets/0.png">
<img src="assets/StylizedSynthesis.png">
</p>

### Comparison with Previous Works
@@ -43,6 +62,17 @@ Comparison with pre-trained character LoRAs. We don't need multiple images and s

Comparison with InsightFace Swapper (also known as ROOP or Refactor). However, in non-realistic style, our work is more flexible on the integration of face and background.

### Kolors Version

We have adapted InstantID for [Kolors](https://huggingface.co/Kwai-Kolors/Kolors-diffusers). Leveraging Kolors' robust text generation capabilities 👍👍👍, InstantID can be integrated with Kolors to simultaneously generate **ID** and **text**.


| demo | demo | demo |
|:-----:|:-----:|:-----:|
<img src="./assets/kolor/demo_1.jpg" >|<img src="./assets/kolor/demo_2.jpg" >|<img src="./assets/kolor/demo_3.jpg" >|



## Download

You can directly download the model from [Huggingface](https://huggingface.co/InstantX/InstantID).
@@ -55,13 +85,20 @@ hf_hub_download(repo_id="InstantX/InstantID", filename="ControlNetModel/diffusio
hf_hub_download(repo_id="InstantX/InstantID", filename="ip-adapter.bin", local_dir="./checkpoints")
```

Or run the following command to download all models:

```python
pip install -r gradio_demo/requirements.txt
python gradio_demo/download_models.py
```

If you cannot access Huggingface, you can use [hf-mirror](https://hf-mirror.com/) to download models.
```python
export HF_ENDPOINT=https://hf-mirror.com
huggingface-cli download --resume-download InstantX/InstantID --local-dir checkpoints
huggingface-cli download --resume-download InstantX/InstantID --local-dir checkpoints --local-dir-use-symlinks False
```

For face encoder, you need to manutally download via this [URL](https://github.com/deepinsight/insightface/issues/1896#issuecomment-1023867304) to `models/antelopev2` as the default link is invalid. Once you have prepared all models, the folder tree should be like:
For face encoder, you need to manually download via this [URL](https://github.com/deepinsight/insightface/issues/1896#issuecomment-1023867304) to `models/antelopev2` as the default link is invalid. Once you have prepared all models, the folder tree should be like:

```
.
@@ -74,6 +111,14 @@ For face encoder, you need to manutally download via this [URL](https://github.c

## Usage

If you want to reproduce results in the paper, please refer to the code in [infer_full.py](infer_full.py). If you want to compare the results with other methods, even without using depth-controlnet, it is recommended that you use this code.

If you are pursuing better results, it is recommended to follow [InstantID-Rome](https://github.com/instantX-research/InstantID-Rome).

The following code👇 comes from [infer.py](infer.py). If you want to quickly experience InstantID, please refer to the code in [infer.py](infer.py).



```python
# !pip install opencv-python transformers accelerate insightface
import diffusers
@@ -104,7 +149,8 @@ pipe = StableDiffusionXLInstantIDPipeline.from_pretrained(
base_model,
controlnet=controlnet,
torch_dtype=torch.float16
).cuda()
)
pipe.cuda()

# load adapter
pipe.load_ip_adapter_instantid(face_adapter)
@@ -114,11 +160,11 @@ Then, you can customized your own face images

```python
# load an image
image = load_image("./examples/yann-lecun_resize.jpg")
face_image = load_image("./examples/yann-lecun_resize.jpg")

# prepare face emb
face_info = app.get(cv2.cvtColor(np.array(face_image), cv2.COLOR_RGB2BGR))
face_info = sorted(face_info, key=lambda x:(x['bbox'][2]-x['bbox'][0])*x['bbox'][3]-x['bbox'][1])[-1] # only use the maximum face
face_info = sorted(face_info, key=lambda x:(x['bbox'][2]-x['bbox'][0])*(x['bbox'][3]-x['bbox'][1]))[-1] # only use the maximum face
face_emb = face_info['embedding']
face_kps = draw_kps(face_image, face_info['kps'])

@@ -127,27 +173,101 @@ prompt = "film noir style, ink sketch|vector, male man, highly detailed, sharp f
negative_prompt = "ugly, deformed, noisy, blurry, low contrast, realism, photorealistic, vibrant, colorful"

# generate image
pipe.set_ip_adapter_scale(0.8)
image = pipe(
prompt,
negative_prompt=negative_prompt,
image_embeds=face_emb,
image=face_kps,
controlnet_conditioning_scale=0.8,
ip_adapter_scale=0.8,
).images[0]
```
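Selecting the largest face depends on computing the bounding-box area as width times height. Without the parentheses around the height term, the sort key evaluates `width * y2 - y1` instead, which can rank a smaller face first. The sketch below illustrates the difference; `bbox_area`, `largest_face`, the `label` field and the box coordinates are hypothetical and chosen only for this illustration.

```python
def bbox_area(bbox):
    """Area of an (x1, y1, x2, y2) box: width times height."""
    x1, y1, x2, y2 = bbox[:4]
    return (x2 - x1) * (y2 - y1)


def largest_face(faces):
    """Return the detection with the largest bounding-box area."""
    if not faces:
        raise ValueError("no face detected")
    return max(faces, key=lambda face: bbox_area(face["bbox"]))


# Made-up detections; only the 'bbox' key mirrors the snippet above.
faces = [
    {"label": "large", "bbox": [0, 0, 10, 100]},    # area 10 * 100 = 1000
    {"label": "small", "bbox": [0, 200, 20, 230]},  # area 20 * 30 = 600
]

# Key without parentheses around the height term: width * y2 - y1.
unparenthesized = max(
    faces,
    key=lambda f: (f["bbox"][2] - f["bbox"][0]) * f["bbox"][3] - f["bbox"][1],
)

print(largest_face(faces)["label"])  # large
print(unparenthesized["label"])      # small
```

Raising an error on an empty detection list is a design choice of this sketch; indexing `[-1]` on an empty sorted list would fail with a less descriptive `IndexError`.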

To save VRAM, you can enable CPU offloading
```python
pipe.enable_model_cpu_offload()
pipe.enable_vae_tiling()
```

## Speed Up with LCM-LoRA

Our work is compatible with [LCM-LoRA](https://github.com/luosiallen/latent-consistency-model). First, download the model.

```python
from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="latent-consistency/lcm-lora-sdxl", filename="pytorch_lora_weights.safetensors", local_dir="./checkpoints")
```

To use it, you just need to load it and infer with a small num_inference_steps. Note that it is recommended to set guidance_scale within [0, 1].
```python
from diffusers import LCMScheduler

lcm_lora_path = "./checkpoints/pytorch_lora_weights.safetensors"

pipe.load_lora_weights(lcm_lora_path)
pipe.fuse_lora()
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)

num_inference_steps = 10
guidance_scale = 0
```
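The two values above still have to be passed to the pipeline call. A minimal sketch, assuming a hypothetical helper `lcm_call_kwargs` (not part of the repository) that checks the recommended `guidance_scale` range before building the keyword arguments:

```python
def lcm_call_kwargs(num_inference_steps=10, guidance_scale=0.0):
    """Build sampling arguments for an LCM-LoRA run.

    Hypothetical helper: the [0, 1] check encodes the recommendation
    stated above, not a hard limit of the scheduler.
    """
    if num_inference_steps < 1:
        raise ValueError("num_inference_steps must be at least 1")
    if not 0.0 <= guidance_scale <= 1.0:
        raise ValueError("guidance_scale is recommended to stay within [0, 1]")
    return {
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
    }


kwargs = lcm_call_kwargs()
print(kwargs)

# With the objects from the Usage section, the call would look like:
# image = pipe(prompt, image_embeds=face_emb, image=face_kps, **kwargs).images[0]
```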

## Start a local gradio demo <a href='https://github.com/gradio-app/gradio'><img src='https://img.shields.io/github/stars/gradio-app/gradio'></a>
Run the following command:

```python
python gradio_demo/app.py
```

or MultiControlNet version:
```python
python gradio_demo/app-multicontrolnet.py
```

## Usage Tips
- For higher similarity, increase the weight of controlnet_conditioning_scale (IdentityNet) and ip_adapter_scale (Adapter).
- For over-saturation, decrease the ip_adapter_scale. If that does not work, decrease controlnet_conditioning_scale.
- For higher text control ability, decrease ip_adapter_scale.
- For specific styles, choosing a corresponding base model makes a difference.
- Multi-person generation is not supported yet; only the largest face is used as the reference for facial landmarks.
- We provide a [style template](https://github.com/ahgsql/StyleSelectorXL/blob/main/sdxl_styles.json) for reference.
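The scale-related tips above can be read as a small set of adjustment rules. The sketch below encodes them under stated assumptions: the function name `adjust_scales`, the goal labels and the step size of 0.1 are all hypothetical, and the right step in practice depends on the base model and prompt.

```python
def adjust_scales(ip_adapter_scale, controlnet_conditioning_scale, goal, step=0.1):
    """Return (ip_adapter_scale, controlnet_conditioning_scale) nudged toward a goal."""
    ip, cn = ip_adapter_scale, controlnet_conditioning_scale
    if goal == "higher_similarity":
        # Increase both IdentityNet and Adapter weights.
        ip, cn = ip + step, cn + step
    elif goal in ("less_saturation", "more_text_control"):
        # First try lowering the Adapter weight only.
        ip -= step
    elif goal == "less_saturation_fallback":
        # If lowering the Adapter weight was not enough, lower IdentityNet too.
        cn -= step
    else:
        raise ValueError(f"unknown goal: {goal}")
    # Keep scales non-negative and free of floating-point noise.
    return round(max(ip, 0.0), 2), round(max(cn, 0.0), 2)


print(adjust_scales(0.8, 0.8, "higher_similarity"))  # (0.9, 0.9)
print(adjust_scales(0.8, 0.8, "less_saturation"))    # (0.7, 0.8)
```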

## Community Resources

### Replicate Demo
- [zsxkib/instant-id](https://replicate.com/zsxkib/instant-id)

### WebUI
- [Mikubill/sd-webui-controlnet](https://github.com/Mikubill/sd-webui-controlnet/discussions/2589)

### ComfyUI
- [cubiq/ComfyUI_InstantID](https://github.com/cubiq/ComfyUI_InstantID)
- [ZHO-ZHO-ZHO/ComfyUI-InstantID](https://github.com/ZHO-ZHO-ZHO/ComfyUI-InstantID)
- [huxiuhan/ComfyUI-InstantID](https://github.com/huxiuhan/ComfyUI-InstantID)

### Windows
- [sdbds/InstantID-for-windows](https://github.com/sdbds/InstantID-for-windows)

# Acknowledgements
## Acknowledgements
- InstantID is developed by InstantX Team, all copyright reserved.
- Our work is highly inspired by [IP-Adapter](https://github.com/tencent-ailab/IP-Adapter) and [ControlNet](https://github.com/lllyasviel/ControlNet). Thanks for their great works!
- Thanks to the HuggingFace team for their generous GPU support!
- Thanks [Yamer](https://civitai.com/user/Yamer) for developing [YamerMIX](https://civitai.com/models/84040?modelVersionId=196039), we use it as base model in our demo.
- Thanks [ZHO-ZHO-ZHO](https://github.com/ZHO-ZHO-ZHO), [huxiuhan](https://github.com/huxiuhan), [sdbds](https://github.com/sdbds), [zsxkib](https://replicate.com/zsxkib) for their generous contributions.
- Thanks to the [HuggingFace](https://github.com/huggingface) gradio team for their free GPU support!
- Thanks to the [ModelScope](https://github.com/modelscope/modelscope) team for their free GPU support!
- Thanks to the [OpenXLab](https://openxlab.org.cn/apps/detail/InstantX/InstantID) team for their free GPU support!
- Thanks to [SiliconFlow](https://github.com/siliconflow) for their OneDiff integration of InstantID!

## Disclaimer
This project is released under [Apache License](https://github.com/InstantID/InstantID?tab=Apache-2.0-1-ov-file#readme) and aims to positively impact the field of AI-driven image generation. Users are granted the freedom to create images using this tool, but they are obligated to comply with local laws and utilize it responsibly. The developers will not assume any responsibility for potential misuse by users.
The code of InstantID is released under [Apache License](https://github.com/InstantID/InstantID?tab=Apache-2.0-1-ov-file#readme) for both academic and commercial usage. **However, both manual-downloading and auto-downloading face models from insightface are for non-commercial research purposes only** according to their [license](https://github.com/deepinsight/insightface?tab=readme-ov-file#license). **Our released checkpoints are also for research purposes only**. Users are granted the freedom to create images using this tool, but they are obligated to comply with local laws and utilize it responsibly. The developers will not assume any responsibility for potential misuse by users.

## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=InstantID/InstantID&type=Date)](https://star-history.com/#InstantID/InstantID&Date)


## Sponsor Us
If you find this project useful, you can buy us a coffee via Github Sponsor! We support [Paypal](https://ko-fi.com/instantx) and [WeChat Pay](https://tinyurl.com/instantx-pay).

## Cite
If you find InstantID useful for your research and applications, please cite us using this BibTeX:
@@ -159,3 +279,6 @@ If you find InstantID useful for your research and applications, please cite us
journal={arXiv preprint arXiv:2401.07519},
year={2024}
}
```

For any question, please feel free to contact us via [email protected] or [email protected].
Binary file removed assets/.DS_Store
Binary file not shown.
Binary file removed assets/0.png
Binary file not shown.
Binary file added assets/StylizedSynthesis.png
Binary file added assets/kolor/demo_1.jpg
Binary file added assets/kolor/demo_2.jpg
Binary file added assets/kolor/demo_3.jpg
31 changes: 31 additions & 0 deletions cog.yaml
@@ -0,0 +1,31 @@
# Configuration for Cog ⚙️
# Reference: https://github.com/replicate/cog/blob/main/docs/yaml.md

build:
# set to true if your model requires a GPU
gpu: true
# cuda: "12.1"

# a list of ubuntu apt packages to install
system_packages:
- "libgl1-mesa-glx"
- "libglib2.0-0"

# python version in the form '3.11' or '3.11.4'
python_version: "3.11"

# a list of packages in the format <package-name>==<version>
python_packages:
- "opencv-python==4.9.0.80"
- "transformers==4.37.0"
- "accelerate==0.26.1"
- "insightface==0.7.3"
- "diffusers==0.25.1"
- "onnxruntime==1.16.3"

# commands run after the environment is setup
run:
- curl -o /usr/local/bin/pget -L "https://github.com/replicate/pget/releases/download/v0.6.0/pget_linux_x86_64" && chmod +x /usr/local/bin/pget

# predict.py defines how predictions are run on your model
predict: "cog/predict.py:Predictor"
60 changes: 60 additions & 0 deletions cog/README.md
@@ -0,0 +1,60 @@
# InstantID Cog Model

[![Replicate](https://replicate.com/zsxkib/instant-id/badge)](https://replicate.com/zsxkib/instant-id)

## Overview
This repository contains the implementation of [InstantID](https://github.com/InstantID/InstantID) as a [Cog](https://github.com/replicate/cog) model.

Using [Cog](https://github.com/replicate/cog) allows any user with a GPU to run the model locally with ease, without the hassle of downloading weights, installing libraries, or managing CUDA versions. Everything just works.

## Development
To push your own fork of InstantID to [Replicate](https://replicate.com), follow the [Model Pushing Guide](https://replicate.com/docs/guides/push-a-model).

## Basic Usage
To make predictions using the model, execute the following command from the root of this project:

```bash
cog predict \
-i image=@examples/sam_resize.png \
-i prompt="analog film photo of a man. faded film, desaturated, 35mm photo, grainy, vignette, vintage, Kodachrome, Lomography, stained, highly detailed, found footage, masterpiece, best quality" \
-i negative_prompt="nsfw" \
-i width=680 \
-i height=680 \
-i ip_adapter_scale=0.8 \
-i controlnet_conditioning_scale=0.8 \
-i num_inference_steps=30 \
-i guidance_scale=5
```

<table>
<tr>
<td>
<p align="center">Input</p>
<img src="https://replicate.delivery/pbxt/KGy0R72cMwriR9EnCLu6hgVkQNd60mY01mDZAQqcUic9rVw4/musk_resize.jpeg" alt="Sample Input Image" width="90%"/>
</td>
<td>
<p align="center">Output</p>
<img src="https://replicate.delivery/pbxt/oGOxXELcLcpaMBeIeffwdxKZAkuzwOzzoxKadjhV8YgQWk8IB/result.jpg" alt="Sample Output Image" width="100%"/>
</td>
</tr>
</table>

## Input Parameters

The following table provides details about each input parameter for the `predict` function:

| Parameter | Description | Default Value | Range |
| ------------------------------- | ---------------------------------- | -------------------------------------------------------------------------------------------------------------- | ----------- |
| `image` | Input image | A path to the input image file | Path string |
| `prompt` | Input prompt | "analog film photo of a man. faded film, desaturated, 35mm photo, grainy, vignette, vintage, Kodachrome, ... " | String |
| `negative_prompt` | Input Negative Prompt | (empty string) | String |
| `width` | Width of output image | 640 | 512 - 2048 |
| `height` | Height of output image | 640 | 512 - 2048 |
| `ip_adapter_scale` | Scale for IP adapter | 0.8 | 0.0 - 1.0 |
| `controlnet_conditioning_scale` | Scale for ControlNet conditioning | 0.8 | 0.0 - 1.0 |
| `num_inference_steps` | Number of denoising steps | 30 | 1 - 500 |
| `guidance_scale` | Scale for classifier-free guidance | 5 | 1 - 50 |

This table provides a quick reference to understand and modify the inputs for generating predictions using the model.
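The numeric ranges in the table can be checked before a prediction is submitted. A minimal sketch, assuming the bounds are inclusive; `INPUT_RANGES` and `out_of_range` are hypothetical names for this illustration and are not taken from `cog/predict.py`.

```python
# Inclusive bounds copied from the Range column of the table above.
INPUT_RANGES = {
    "width": (512, 2048),
    "height": (512, 2048),
    "ip_adapter_scale": (0.0, 1.0),
    "controlnet_conditioning_scale": (0.0, 1.0),
    "num_inference_steps": (1, 500),
    "guidance_scale": (1, 50),
}


def out_of_range(inputs):
    """Map each out-of-range parameter to (value, low, high)."""
    problems = {}
    for name, value in inputs.items():
        if name in INPUT_RANGES:
            low, high = INPUT_RANGES[name]
            if not low <= value <= high:
                problems[name] = (value, low, high)
    return problems


# Values from the `cog predict` example in Basic Usage.
example = {
    "width": 680,
    "height": 680,
    "ip_adapter_scale": 0.8,
    "controlnet_conditioning_scale": 0.8,
    "num_inference_steps": 30,
    "guidance_scale": 5,
}
print(out_of_range(example))                            # {}
print(out_of_range({"width": 256, "guidance_scale": 5}))  # {'width': (256, 512, 2048)}
```

Non-numeric inputs such as `image`, `prompt` and `negative_prompt` are ignored by this check.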