Knowledge #1567
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,75 @@ | ||
--- | ||
title: Knowledge | ||
description: What is knowledge in CrewAI and how to use it. | ||
icon: book | ||
--- | ||
|
||
# Using Knowledge in CrewAI | ||
|
||
## Introduction | ||
|
||
The Knowledge class in CrewAI provides a powerful way to manage and query knowledge sources for your AI agents. This guide will show you how to implement knowledge management in your CrewAI projects. | ||
Additionally, there are dedicated tools for generating knowledge sources from strings, text files, PDFs, and spreadsheets. You can support other source types by extending the `KnowledgeSource` class. | |
|
||
## Basic Implementation | ||
|
||
Here's a simple example of how to use the Knowledge class: | ||
|
||
```python | ||
from crewai import Agent, Task, Crew, Process, LLM | ||
from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource | ||
|
||
# Create a knowledge source | ||
content = "The user's name is John. He is 30 years old and lives in San Francisco." | |
string_source = StringKnowledgeSource( | ||
content=content, metadata={"preference": "personal"} | ||
) | ||
|
||
|
||
llm = LLM(model="gpt-4o-mini", temperature=0) | ||
# Create an agent with the knowledge store | ||
agent = Agent( | ||
role="About User", | ||
goal="You know everything about the user.", | ||
backstory="""You are a master at understanding people and their preferences.""", | ||
verbose=True, | ||
allow_delegation=False, | ||
llm=llm, | ||
) | ||
task = Task( | ||
description="Answer the following questions about the user: {question}", | ||
expected_output="An answer to the question.", | ||
agent=agent, | ||
) | ||
|
||
crew = Crew( | ||
agents=[agent], | ||
tasks=[task], | ||
verbose=True, | ||
process=Process.sequential, | ||
knowledge={"sources": [string_source], "metadata": {"preference": "personal"}}, # Enable knowledge by adding the sources here. You can also add more sources to the sources list. | ||
) | ||
|
||
result = crew.kickoff(inputs={"question": "What city does John live in and how old is he?"}) | ||
``` | ||
|
||
|
||
## Embedder Configuration | ||
|
||
You can also configure the embedder for the knowledge store. This is useful if you want to use a different embedder for the knowledge store than the one used for the agents. | ||
|
||
```python | ||
... | ||
string_source = StringKnowledgeSource( | ||
content="The user's name is John. He is 30 years old and lives in San Francisco.", | |
metadata={"preference": "personal"} | ||
) | ||
crew = Crew( | ||
... | ||
knowledge={ | ||
"sources": [string_source], | ||
"metadata": {"preference": "personal"}, | ||
"embedder_config": {"provider": "openai", "config": {"model": "text-embedding-3-small"}}, | ||
}, | ||
) | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -39,6 +39,16 @@ Repository = "https://github.com/crewAIInc/crewAI" | |
[project.optional-dependencies] | ||
tools = ["crewai-tools>=0.14.0"] | ||
agentops = ["agentops>=0.3.0"] | ||
fastembed = ["fastembed>=0.4.1"] | ||
pdfplumber = [ | ||
"pdfplumber>=0.11.4", | ||
] | ||
pandas = [ | ||
Suggestion: I'm wondering if we need to keep "pandas" as an optional dependency. I took a look at the code, and it seems we're only using it to read Excel files and save them as CSVs. Maybe we could find a lighter library to handle that? Just a thought! If the library is still required, maybe we should go with "polars".
These are optional deps; maybe this can be a fast follow? |
||
"pandas>=2.2.3", | ||
] | ||
openpyxl = [ | ||
"openpyxl>=3.1.5", | ||
] | ||
mem0 = ["mem0ai>=0.1.29"] | ||
|
||
[tool.uv] | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,55 @@ | ||
from abc import ABC, abstractmethod | ||
from typing import List | ||
|
||
import numpy as np | ||
|
||
|
||
class BaseEmbedder(ABC): | ||
""" | ||
Abstract base class for text embedding models | ||
""" | ||
|
||
@abstractmethod | ||
def embed_chunks(self, chunks: List[str]) -> np.ndarray: | ||
""" | ||
Generate embeddings for a list of text chunks | ||
|
||
Args: | ||
chunks: List of text chunks to embed | ||
|
||
Returns: | ||
Array of embeddings | ||
""" | ||
pass | ||
|
||
@abstractmethod | ||
def embed_texts(self, texts: List[str]) -> np.ndarray: | ||
""" | ||
Generate embeddings for a list of texts | ||
|
||
Args: | ||
texts: List of texts to embed | ||
|
||
Returns: | ||
Array of embeddings | ||
""" | ||
pass | ||
|
||
@abstractmethod | ||
def embed_text(self, text: str) -> np.ndarray: | ||
""" | ||
Generate embedding for a single text | ||
|
||
Args: | ||
text: Text to embed | ||
|
||
Returns: | ||
Embedding array | ||
""" | ||
pass | ||
|
||
@property | ||
@abstractmethod | ||
def dimension(self) -> int: | ||
"""Get the dimension of the embeddings""" | ||
pass |
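For readers of this diff, here is a minimal sketch of what a concrete subclass of `BaseEmbedder` might look like. It is not part of the PR: `HashingEmbedder` is a hypothetical toy class that hashes tokens into a fixed-size bag-of-words vector, and the abstract base is copied inline (without docstrings) only so the snippet runs on its own. A real embedder would call an actual embedding model instead.

```python
import hashlib
from abc import ABC, abstractmethod
from typing import List

import numpy as np


class BaseEmbedder(ABC):
    """Inline copy of the abstract interface from the diff, so the sketch is self-contained."""

    @abstractmethod
    def embed_chunks(self, chunks: List[str]) -> np.ndarray: ...

    @abstractmethod
    def embed_texts(self, texts: List[str]) -> np.ndarray: ...

    @abstractmethod
    def embed_text(self, text: str) -> np.ndarray: ...

    @property
    @abstractmethod
    def dimension(self) -> int: ...


class HashingEmbedder(BaseEmbedder):
    """Hypothetical toy embedder: hashed bag-of-words, L2-normalised."""

    def __init__(self, dimension: int = 64) -> None:
        self._dimension = dimension

    def embed_text(self, text: str) -> np.ndarray:
        vector = np.zeros(self._dimension, dtype=np.float32)
        for token in text.lower().split():
            # Map each token to a stable bucket index via a hash digest.
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            index = int.from_bytes(digest[:4], "big") % self._dimension
            vector[index] += 1.0
        norm = np.linalg.norm(vector)
        # Leave the all-zero vector untouched for empty input.
        return vector / norm if norm > 0 else vector

    def embed_texts(self, texts: List[str]) -> np.ndarray:
        if not texts:
            return np.zeros((0, self._dimension), dtype=np.float32)
        return np.stack([self.embed_text(text) for text in texts])

    def embed_chunks(self, chunks: List[str]) -> np.ndarray:
        # In this sketch, chunks are treated exactly like any other texts.
        return self.embed_texts(chunks)

    @property
    def dimension(self) -> int:
        return self._dimension
```

In this sketch `embed_chunks` and `embed_texts` share one code path, which raises the question of whether the interface needs both methods; that is an observation about the sketch, not a claim about how the PR's implementations use them.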
Question: Do you think we need the `--all-extras` option in this case? It seems like we'll have to install all the optional dependencies to be able to run our tests. What do you think?
Yes, a bunch of optional deps were brought in, like `pdfplumber` for our `PdfKnowledgeSource`.
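One possible way to make the dependency on extras explicit, sketched here as an illustration rather than anything in this PR: check at runtime which optional modules are importable and fail with a clear message when one is missing. The helper names `missing_extras` and `require_extra` are hypothetical, the extra names are taken from the `pyproject.toml` diff, and the `crewai[...]` install hint assumes the package is published under that name.

```python
import importlib.util
from typing import Dict, List

# Extra name -> importable module name, mirroring the extras in the diff.
OPTIONAL_MODULES: Dict[str, str] = {
    "fastembed": "fastembed",
    "pdfplumber": "pdfplumber",
    "pandas": "pandas",
    "openpyxl": "openpyxl",
}


def missing_extras(modules: Dict[str, str] = OPTIONAL_MODULES) -> List[str]:
    """Return the names of extras whose module cannot be found, sorted."""
    return sorted(
        extra
        for extra, module in modules.items()
        if importlib.util.find_spec(module) is None
    )


def require_extra(extra: str, modules: Dict[str, str] = OPTIONAL_MODULES) -> None:
    """Raise ImportError with an install hint if the given extra is unavailable."""
    module = modules[extra]
    if importlib.util.find_spec(module) is None:
        raise ImportError(
            f"Optional dependency '{module}' is not installed. "
            f"Install it with: pip install 'crewai[{extra}]'"  # assumed package name
        )
```

A test suite could call `missing_extras()` once and skip the affected tests, which would avoid installing every extra just to run the unrelated ones.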