Weaviate integration #1864

Closed
Vargha-Kh wants to merge 6 commits

Conversation

Vargha-Kh
Pull Request Description

Summary of Changes

  • Added Weaviate VectorDb Integration: Introduced a new Weaviate class under phi/vectordb/weaviate/weaviate.py, enabling seamless integration with Weaviate vector database.
  • Implemented Required Abstract Methods: Ensured compliance with the base VectorDb interface by implementing methods like doc_exists, name_exists, exists, insert, and search.
  • Enhanced Support for Vector, Keyword, and Hybrid Search: Provides different search strategies (vector, keyword, and hybrid) leveraging user-configurable embeddings and Weaviate’s query features.
  • Lazy-Loaded Weaviate Client: Uses a property (@property) for client to defer creation of the Weaviate client until first use, matching the pattern used in other vector DB integrations (e.g., Pinecone, Milvus).
  • Schema Creation and Management: Allows users to create, drop, and check for existing Weaviate classes (indexes) via the create() and drop() methods.
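
The lazy-loaded client pattern mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `LazyClientDb`, `client_factory`, and `fake_factory` are hypothetical names, and a stand-in factory replaces the real Weaviate client so the snippet runs without a server.

```python
from typing import Any, Callable, List, Optional


class LazyClientDb:
    """Hypothetical wrapper showing deferred client creation (not the PR's class)."""

    def __init__(self, client_factory: Callable[[], Any]) -> None:
        # Assumption: in a real integration this factory would build a Weaviate client.
        self._client_factory = client_factory
        self._client: Optional[Any] = None

    @property
    def client(self) -> Any:
        # Build the client on first access only, then reuse the same instance.
        if self._client is None:
            self._client = self._client_factory()
        return self._client


# Stand-in factory that records how many times it was called.
calls: List[str] = []


def fake_factory() -> object:
    calls.append("created")
    return object()


db = LazyClientDb(fake_factory)
```

Constructing `db` does not call the factory; the first read of `db.client` does, and later reads return the cached object. The benefit is that importing or configuring the class never opens a connection by itself.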

Related Issues

  • None. I needed Weaviate support as a vector database in phiData for my own use.

Motivation and Context

  • Centralized Vector Database Options: Extends phiData’s supported vector databases, offering a flexible choice for managing high-dimensional embeddings.
  • Ease of Use: Simplifies setup for developers who want to combine Weaviate with phiData’s agent system without manually creating or managing Weaviate integrations.
  • Search Variety: Supports vector-based semantic queries, direct keyword matching, and a rudimentary hybrid approach that merges results from both.
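
One possible reading of the "rudimentary hybrid approach" is to concatenate the vector hits and the keyword hits and drop duplicates. The sketch below assumes that each hit is a dict with an `id` key and that vector hits take priority; `merge_results` is a hypothetical helper, and the PR's actual merge logic may differ.

```python
from typing import Any, Dict, List


def merge_results(
    vector_hits: List[Dict[str, Any]],
    keyword_hits: List[Dict[str, Any]],
    limit: int = 5,
) -> List[Dict[str, Any]]:
    """Merge two result lists, keeping the first occurrence of each id."""
    merged: List[Dict[str, Any]] = []
    seen = set()
    # Vector hits come first, so they win when both searches return the same id.
    for hit in vector_hits + keyword_hits:
        doc_id = hit["id"]
        if doc_id in seen:
            continue
        seen.add(doc_id)
        merged.append(hit)
    return merged[:limit]


vector = [{"id": "a"}, {"id": "b"}]
keyword = [{"id": "b"}, {"id": "c"}]
merged_ids = [hit["id"] for hit in merge_results(vector, keyword)]
```

This ordering ignores relevance scores entirely, which is why a weighted combination of text and vector scores would be a natural follow-up.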

Environment or Dependencies

  • Weaviate Python Client: Requires weaviate-client>=4.4.0,<5.0 (or an appropriate version of Weaviate client v4).
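
For readers who want to verify the installed client against this range at runtime, here is a minimal stdlib-only sketch. `parse_release` and `is_supported` are hypothetical helpers; they ignore pre-release suffixes, so this is a simplification rather than a full PEP 440 comparison.

```python
import re
from typing import Tuple


def parse_release(version: str) -> Tuple[int, int, int]:
    """Extract the leading numeric release (major, minor, patch) from a version string."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    parts = [int(part) for part in match.group().split(".")]
    # Pad with zeros so "4.4" compares the same as "4.4.0".
    padded = (parts + [0, 0, 0])[:3]
    return (padded[0], padded[1], padded[2])


def is_supported(version: str) -> bool:
    """Return True if the release falls within >=4.4.0,<5.0."""
    return (4, 4, 0) <= parse_release(version) < (5, 0, 0)
```

The version string itself could come from `importlib.metadata.version("weaviate-client")` when the package is installed.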

Impact on AI/ML Components

  • Model Embedding Usage: This PR does not change existing AI models but provides an additional storage/search mechanism for AI-generated embeddings.

Type of Change

  • New feature (non-breaking change which adds functionality)

Checklist

  • My code follows Phidata's style guidelines and best practices.
  • I have performed a self-review of my code.
  • I have added docstrings and comments for complex logic.
  • My changes generate no new warnings or errors.
  • I have added cookbook examples for my new addition (if needed).
  • I have verified my changes in a clean environment.

Additional Notes

  • Deployment: A local deployment using the standard Weaviate container from the Docker Compose tutorial (https://weaviate.io/developers/weaviate/installation/docker-compose), served at http://localhost:8080, should work out of the box.
  • Customization: Users can extend or modify the schema or distance configurations in create() to match their Weaviate cluster setup.
  • Documentation: Future enhancements might add advanced hybrid search capabilities (e.g., text + vector weighting) and more robust filtering.
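
Before trying the integration against the local container from the Deployment note, it can help to confirm that the server is up. The sketch below uses only the standard library and assumes that Weaviate's readiness endpoint is `/v1/.well-known/ready`; `ready_url` and `is_ready` are hypothetical helper names, not part of this PR.

```python
import urllib.error
import urllib.request


def ready_url(base_url: str = "http://localhost:8080") -> str:
    """Build the readiness URL (the endpoint path is an assumption about Weaviate's REST API)."""
    return base_url.rstrip("/") + "/v1/.well-known/ready"


def is_ready(base_url: str = "http://localhost:8080", timeout: float = 2.0) -> bool:
    """Return True if the server answers the readiness probe with HTTP 200."""
    try:
        with urllib.request.urlopen(ready_url(base_url), timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response: treat as not ready.
        return False
```

Calling `is_ready()` after `docker compose up` gives a quick yes/no answer without installing the Weaviate client.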

@dirkbrnd dirkbrnd requested review from manthanguptaa and ysolanky and removed request for manthanguptaa January 24, 2025 08:36
@manthanguptaa
Contributor

Hey @Vargha-Kh! We recently rebranded to Agno, which means that this PR needs to be ported to Agno. Let me know if you have questions or need help. Sorry for the inconvenience caused.

@Vargha-Kh Vargha-Kh closed this by deleting the head repository Feb 4, 2025