Inserting a multi-part document results in duplicate nodes and edges #485

Open
jin38324 opened this issue Dec 18, 2024 · 0 comments
@jin38324
Contributor

I want to convert a long document into a LightRAG knowledge base.
If I convert the entire document in one call, a single error causes the whole run to fail. To limit that risk, I split the document into multiple parts and call insert on each part separately.
In the resulting data, I found many duplicated nodes and edges.

After investigating, I believe the cause is in the relevant code below.
LightRAG merges nodes and edges with the same name when executing insert, but this only applies within a single execution of _process_single_content. If you insert multiple documents separately, as I did, the results are not merged across inserts.

_merge_nodes_then_upsert(k, v, knowledge_graph_inst, global_config)
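
To make the expected behavior concrete, here is a minimal sketch of merging that survives across separate inserts. It is not LightRAG's actual implementation: the plain dict standing in for the graph store, the `SEP` separator, the record fields, and the helper `insert_part` are all hypothetical names used only for illustration. The idea is that the merge step first looks up what is already stored under the same name, and only then upserts.

```python
from collections import defaultdict

SEP = "<SEP>"  # hypothetical separator between merged descriptions


def merge_nodes_then_upsert(name, new_records, graph):
    """Merge new records for `name` with whatever `graph` already stores.

    `graph` is a plain dict standing in for a persistent graph store
    (an assumption for this sketch, not LightRAG's storage API).
    """
    descriptions = []
    existing = graph.get(name)
    if existing is not None:
        # Start from the descriptions saved by earlier inserts.
        descriptions.extend(existing["description"].split(SEP))
    for record in new_records:
        # Skip descriptions that are already present.
        if record["description"] not in descriptions:
            descriptions.append(record["description"])
    graph[name] = {"entity_name": name, "description": SEP.join(descriptions)}
    return graph[name]


def insert_part(extracted, graph):
    """Insert one document part: group records by name, then merge each group."""
    grouped = defaultdict(list)
    for record in extracted:
        grouped[record["entity_name"]].append(record)
    for name, records in grouped.items():
        merge_nodes_then_upsert(name, records, graph)


# Two separate inserts that both mention the same entity.
graph = {}
insert_part([{"entity_name": "ALICE", "description": "An engineer."}], graph)
insert_part(
    [
        {"entity_name": "ALICE", "description": "Works on search."},
        {"entity_name": "BOB", "description": "A reviewer."},
    ],
    graph,
)
```

With this approach the second insert extends the existing `ALICE` node instead of creating a second one, which is the result I expected from inserting the document in parts.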
