Releases: michaelfeil/infinity
0.0.5
What's Changed
- Docker image multi by @michaelfeil in #24
- Patch a missing event: inference latency at batch size 1 drops from 200 ms to 7 ms
Full Changelog: 0.0.4...0.0.5
0.0.4
What's Changed
PRs:
- Fastembed v2 by @michaelfeil in #21
Issues:
Closes #5 ONNX Support via https://github.com/qdrant/fastembed/
Closes #22 making pytorch an optional dependency
tl;dr:
- fastembed as backend besides ct2 or torch
- v1/models returns "backend"
- makes torch an optional dependency
- calculates "min" sleep time dynamically on startup -> slightly optimized.
- default model is now "BAAI/bge-small-en-v1.5"
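One common way to keep a heavy package such as torch optional is to probe for it before importing it. The sketch below is an assumption about how such a check could look, not Infinity's actual code: `BACKEND_PACKAGES`, `available_backends`, and `require_backend` are hypothetical names, and the mapping of `ct2` to the `ctranslate2` package is assumed.

```python
import importlib.util

# Hypothetical mapping from backend name to the package it needs (assumed, not from the release notes).
BACKEND_PACKAGES = {
    "torch": "torch",
    "ct2": "ctranslate2",
    "fastembed": "fastembed",
}


def available_backends(packages=BACKEND_PACKAGES):
    """Return the backend names whose package is installed, without importing it."""
    return [
        name
        for name, package in packages.items()
        if importlib.util.find_spec(package) is not None
    ]


def require_backend(name, packages=BACKEND_PACKAGES):
    """Return `name` if its package is installed, otherwise raise a helpful error."""
    if name not in packages:
        raise ValueError(f"unknown backend {name!r}")
    if importlib.util.find_spec(packages[name]) is None:
        raise ImportError(
            f"backend {name!r} needs the optional package {packages[name]!r}"
        )
    return name


print(available_backends())
```

Because `find_spec` only looks the package up, nothing heavy is imported until a backend is actually selected.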
Full Changelog: 0.0.3...0.0.4
0.0.3
What's Changed
- add Flash-Attention + optimum-BetterTransformers by @michaelfeil in #20
- Improve real-time / sleep strategy, async await for queues and result futures - reducing latency a bit by @michaelfeil in #12
- add better FIFO queueing strategy - your requests now have an upper bound on how long they queue by @michaelfeil in #19
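The combination of awaited queues, result futures, and FIFO ordering can be illustrated with a small asyncio sketch. This is a simplified illustration under stated assumptions, not Infinity's implementation: `FifoBatcher`, `QueuedRequest`, and `demo` are hypothetical names, and the "inference" step is a stand-in that returns the text length.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class QueuedRequest:
    text: str
    future: asyncio.Future  # resolved by the worker once the batch is processed


class FifoBatcher:
    """Hypothetical batcher: the oldest queued request is always served first."""

    def __init__(self, max_batch_size=4):
        self.max_batch_size = max_batch_size
        self.queue = asyncio.Queue()

    async def submit(self, text):
        # The caller awaits a future instead of polling for its result.
        future = asyncio.get_running_loop().create_future()
        await self.queue.put(QueuedRequest(text, future))
        return await future

    async def worker(self):
        while True:
            # Wait (without busy sleeping) for the oldest request.
            batch = [await self.queue.get()]
            # Fill the batch with whatever is already waiting, in arrival order.
            while len(batch) < self.max_batch_size and not self.queue.empty():
                batch.append(self.queue.get_nowait())
            for request in batch:
                # Stand-in for model inference.
                request.future.set_result(len(request.text))


async def demo():
    batcher = FifoBatcher(max_batch_size=2)  # created inside the running loop
    worker = asyncio.create_task(batcher.worker())
    results = await asyncio.gather(
        *(batcher.submit(text) for text in ["a", "bb", "ccc"])
    )
    worker.cancel()
    return list(results)


print(asyncio.run(demo()))
```

Since batches are always drawn from the front of the queue, a request waits for at most the batches formed from requests that arrived before it, which is what bounds its queueing time in this sketch.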
Docs:
- Docs: Update README.md by @michaelfeil in #8
- Update description. Update pyproject.toml by @michaelfeil in #9
- Refactor model dir by @michaelfeil in #10
- Update README.md by @michaelfeil in #14
- Update README.md by @michaelfeil in #15
Full Changelog: 0.0.2rc0...0.0.3
0.0.2
What's Changed
- Docs: Update README.md by @michaelfeil in #8
- Update description. Update pyproject.toml by @michaelfeil in #9
- Refactor model dir by @michaelfeil in #10
- Improve real-time / sleep strategy, async await for queues and result futures by @michaelfeil in #12
Full Changelog: 0.0.1...0.0.2rc0
0.0.1
Initial release of Infinity
0.0.1-dev3
What's Changed
- startup msg, log handling, import by @michaelfeil in #4
- update CI to release pypi by @michaelfeil in #7
Full Changelog: 0.0.1-dev2...0.0.1-dev3
0.0.1-dev2 - Speedups
- adds new dependency (orjson) for faster response serialization - 300%
- uses torch.inference_mode() and delayed moving to CPU - 10%
- adds uvicorn[standard] - slightly faster 2-5%?
- Updates readme
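As a rough illustration of the serialization change, the sketch below encodes a response with `orjson.dumps`, which returns `bytes` directly. It is a minimal example, not the project's code: `serialize` and the payload shape are hypothetical, and the fallback to the standard `json` module is added only so the snippet also runs where orjson is not installed.

```python
import json

try:
    import orjson  # optional third-party package
except ImportError:
    orjson = None


def serialize(payload):
    """Encode a JSON-compatible payload to bytes, preferring orjson when present."""
    if orjson is not None:
        return orjson.dumps(payload)
    return json.dumps(payload).encode("utf-8")


# Hypothetical response payload, for illustration only.
body = serialize({"model": "example-model", "data": [{"index": 0, "embedding": [0.1, 0.2]}]})
print(len(body))
```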
0.0.1-dev1
This is a release for testing the CI of Infinity.