Florence2 in ComfyUI

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks. It interprets simple text prompts to perform tasks such as captioning, object detection, and segmentation. It was trained on the FLD-5B dataset, which contains 5.4 billion annotations across 126 million images, to master multi-task learning. The model's sequence-to-sequence architecture lets it perform well in both zero-shot and fine-tuned settings, making it a competitive vision foundation model.
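
To make the prompt-based approach concrete, here is a minimal sketch that calls Florence-2 through Hugging Face transformers directly. It is not the code used by the ComfyUI nodes. The task tokens are the ones listed on the Florence-2 model cards; the dictionary keys and the helper names `build_prompt` and `run_task` are hypothetical, chosen only for this illustration.

```python
# Task tokens as listed on the Florence-2 model cards (subset).
# The dictionary keys are hypothetical labels used only in this sketch.
TASK_PROMPTS = {
    "caption": "<CAPTION>",
    "detailed_caption": "<DETAILED_CAPTION>",
    "object_detection": "<OD>",
    "ocr": "<OCR>",
    "phrase_grounding": "<CAPTION_TO_PHRASE_GROUNDING>",
    "referring_segmentation": "<REFERRING_EXPRESSION_SEGMENTATION>",
}


def build_prompt(task, text_input=""):
    """Return the task token, followed by optional free text for grounding tasks."""
    return TASK_PROMPTS[task] + text_input


def run_task(image, task, text_input="", model_id="microsoft/Florence-2-base"):
    """Run one task on a PIL image and return the parsed result (assumed usage)."""
    # Heavy imports are done lazily so the helpers above work without them.
    import torch
    from transformers import AutoModelForCausalLM, AutoProcessor

    # Florence-2 ships custom modeling code, hence trust_remote_code=True.
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    prompt = build_prompt(task, text_input)
    inputs = processor(text=prompt, images=image, return_tensors="pt")
    with torch.no_grad():
        generated_ids = model.generate(
            input_ids=inputs["input_ids"],
            pixel_values=inputs["pixel_values"],
            max_new_tokens=1024,
            num_beams=3,
            do_sample=False,
        )
    # Special tokens are kept because the post-processor parses them.
    text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    return processor.post_process_generation(
        text, task=TASK_PROMPTS[task], image_size=(image.width, image.height)
    )
```

With a PIL image loaded as `image`, `run_task(image, "object_detection")` would request detection, while `run_task(image, "phrase_grounding", "a green car")` appends free text to the task token.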

Installation:

Clone this repository into the `ComfyUI/custom_nodes` folder. The only real dependency is a sufficiently recent version of transformers.
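
Since the only hard requirement is a recent enough transformers release, a quick check of the installed version can save a failed startup. The sketch below uses only the standard library. The helper names are hypothetical, and no minimum version is stated here, so the value passed to `is_at_least` is a placeholder to replace with whatever `requirements.txt` specifies.

```python
import re
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def parse_version(version):
    """Turn '4.41.2' into (4, 41, 2); parsing stops at the first non-numeric part."""
    numbers = []
    for part in version.split("."):
        match = re.match(r"\d+", part)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def is_at_least(version, minimum):
    """Compare two versions numerically, padding the shorter one with zeros."""
    a, b = parse_version(version), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    found = installed_version("transformers")
    # "0.0.0" is a placeholder, not the real requirement of this repository.
    placeholder_minimum = "0.0.0"
    if found is None:
        print("transformers is not installed")
    elif not is_at_least(found, placeholder_minimum):
        print(f"transformers {found} is older than {placeholder_minimum}")
    else:
        print(f"transformers {found} found")
```

Comparing numeric tuples rather than raw strings matters here: as strings, "4.9.0" would sort after "4.38.0".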

The following models are supported. They are downloaded automatically to ComfyUI/LLM:

https://huggingface.co/microsoft/Florence-2-base

https://huggingface.co/microsoft/Florence-2-base-ft

https://huggingface.co/microsoft/Florence-2-large

https://huggingface.co/microsoft/Florence-2-large-ft
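
For anyone who prefers to fetch a model ahead of time, the automatic download can be approximated with `huggingface_hub.snapshot_download`. This is a sketch under assumptions: the per-model subfolder name (the last segment of the repository ID) and the helper names are guesses for illustration, not taken from the node code.

```python
from pathlib import Path

# Repository IDs of the supported models listed above.
SUPPORTED_MODELS = [
    "microsoft/Florence-2-base",
    "microsoft/Florence-2-base-ft",
    "microsoft/Florence-2-large",
    "microsoft/Florence-2-large-ft",
]


def local_model_dir(repo_id, comfyui_root="ComfyUI"):
    """Return the assumed download folder for a supported model."""
    if repo_id not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {repo_id}")
    # Assumption: one subfolder per model, named after the repository.
    return Path(comfyui_root) / "LLM" / repo_id.split("/")[-1]


def ensure_model(repo_id, comfyui_root="ComfyUI"):
    """Download the model snapshot if its folder does not exist yet."""
    target = local_model_dir(repo_id, comfyui_root)
    if not target.exists():
        # Imported lazily so the path helper works without huggingface_hub.
        from huggingface_hub import snapshot_download

        snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target
```

Calling `ensure_model("microsoft/Florence-2-base")` from the directory that contains ComfyUI would place the files under ComfyUI/LLM/Florence-2-base, if the folder layout assumed above holds.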

the-ride-never-ends/ComfyUI-Florence2-DocVQA
