Commit

perf(knext): implement KAG agent method with prompt optimization and CheckDivideQuestion module and experiment (#363)

Co-authored-by: mfz-ant <[email protected]>
ZhanGHanG9991 and mfz-ant authored Oct 9, 2024
1 parent 58c5657 commit 1148cc1
Showing 7 changed files with 50,648 additions and 8 deletions.
4 changes: 4 additions & 0 deletions python/knext/knext/ca/logic/agents/divide_and_conquer.py
@@ -151,6 +151,10 @@ async def solve_problem_impl(self, question: Question, **kwargs):

children_questions = self.divide_question.forward(current_question)

children_questions = self.check_divide_question.forward(
current_question, children_questions
)

# display child question
for child_idx, child_question in enumerate(children_questions):
info_dict = {
@@ -0,0 +1,113 @@
Requirements:
1. Determine whether several subquestions are the correct result of decomposing the original question.
2. If the question decomposition is correct, only output "YES" without including any additional information.
3. If the question decomposition is not correct, output "NO" and provide a concise reason and then give the correct question decomposition and the list of question dependencies.
3.1. Describe the dependencies of the decomposed questions. If a question has multiple dependencies, separate them with commas.
3.2. Avoid using pronouns in sub-questions unless the pronoun is the answer to another question.
3.3. If the question does not need to be decomposed to be answered, output the original question.

Example 1:
llm_input:
question: In which dynasty was the Zhaozhou Bridge built, and who was its last emperor?
subquestions:
Who built the Zhaozhou Bridge?
In which year was the Zhaozhou Bridge built?
Who was the first emperor of this dynasty?
llm_output:
NO
The decomposition is incorrect because it includes irrelevant subquestions. The original question asks for the dynasty during which the Zhaozhou Bridge was built and its last emperor, but the subquestions inquire about the bridge's builder, the construction year, and the dynasty's first emperor. The relevant subquestions should directly focus on the dynasty and its last emperor.
Correct decomposition approach:
In which dynasty was the Zhaozhou Bridge built?
Who was the last emperor of this dynasty?
dependent relationship:
question1 deps None
question2 deps question1

Example 2:
llm_input:
question: Who are the cousin, father, mother, and second uncle of Qian Xuesen's wife?
subquestions:
Who is Qian Xuesen’s wife?
Who is the cousin of Qian Xuesen’s wife?
Who is the father of Qian Xuesen’s wife?
Who is the mother of Qian Xuesen’s wife?
Who is the second uncle of Qian Xuesen’s wife?
llm_output:
YES

Example 3:
llm_input:
question: In "Chinese Paladin 1", whose mother died to seal the Water Demon Beast?
subquestions:
In "Chinese Paladin 1", who is the mother of the Water Demon Beast?
Who died to seal the Water Demon Beast?
llm_output:
NO
The decomposition does not accurately address the original question. The first subquestion incorrectly seeks the identity of the Water Demon Beast's mother, which is not relevant. The correct focus should be on identifying the character whose mother died to seal the Water Demon Beast.
Correct decomposition approach:
In "Chinese Paladin 1", which character died to seal the Water Demon Beast?
Who is the child of this character?
dependent relationship:
question1 deps None
question2 deps question1

Example 4:
llm_input:
question: If my hukou and social security are registered in Beijing, but my hometown is in Zhangjiakou, and I fully paid for a house in Zhangjiakou, can I withdraw from my housing provident fund?
subquestions:
Under what conditions can one withdraw from the housing provident fund in Beijing?
Do the conditions of having a hukou and social security in Beijing, hometown in Zhangjiakou, and a fully paid house in Zhangjiakou meet the above requirements?
llm_output:
YES

Example 5:
llm_input:
question: What is the relationship between Qian Xuesen's wife and Jiang Baili?
subquestions:
Who is Jiang Baili's wife?
What is the relationship between this woman and Qian Xuesen?
llm_output:
NO
The subquestions imply an incorrect relationship, suggesting that Jiang Baili’s wife has a direct relationship with Qian Xuesen. The correct approach should focus on identifying Qian Xuesen's wife and then exploring her relationship with Jiang Baili.
Correct decomposition approach:
Who is Qian Xuesen's wife?
What is the relationship between this woman and Jiang Baili?
dependent relationship:
question1 deps None
question2 deps question1

Example 6:
llm_input:
question: What TV series did the male lead of "Zhou Chu Eliminates the Three Evils" co-star with Joe Chen?
subquestions:
Who is the male lead of Joe Chen?
Is "Zhou Chu Eliminates the Three Evils" a TV series?
Who co-starred with Joe Chen?
llm_output:
NO
The subquestions are incorrect and misleading. The first assumes Joe Chen is a show, not a person. The second is unrelated, questioning the media type of "Zhou Chu Eliminates the Three Evils." The third is too broad, not specific to the TV series in question. Correct subquestions should identify the male lead of the specific TV series and then explore who co-starred with Joe Chen in any relevant series.
Correct decomposition approach:
Who is the male lead of "Zhou Chu Eliminates the Three Evils"?
What TV series has this male lead starred in?
What TV series has Joe Chen starred in?
What is the intersection of these two TV series lists?
dependent relationship:
question1 deps None
question2 deps question1
question3 deps None
question4 deps question1, question2

Example 7:
llm_input:
question: What is the relationship between these two authors?
subquestions:
What is the relationship between these two authors?
llm_output:
YES

Based on the above requirements and examples, analyze the following question decomposition:
llm_input:
question: $question
subquestions:
$sub_questions
llm_output:
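
The checker prompt above produces either a bare "YES" or a "NO" followed by a reason, a corrected decomposition, and a dependency list. A minimal sketch of how such output could be parsed is shown below. The function name `parse_check_output` and its return shape are hypothetical and are not taken from `CheckDivideQuestion` in planner.py; the only things assumed from the prompt are the "YES"/"NO" first line and the two section markers used in the examples.

```python
from typing import List, Tuple

# Section markers copied from the prompt examples above.
DECOMPOSITION_MARKER = "Correct decomposition approach:"
DEPS_MARKER = "dependent relationship:"


def parse_check_output(
    llm_output: str, original_subquestions: List[str]
) -> Tuple[bool, List[str], List[str]]:
    """Hypothetical helper: return (accepted, subquestions, dependency_lines).

    On "YES" the original sub-questions are kept and no dependency lines
    are returned; on "NO" the corrected sub-questions and the raw
    dependency lines are extracted from the model output.
    """
    lines = [ln.strip() for ln in llm_output.strip().splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty checker output")

    verdict = lines[0].upper()
    if verdict == "YES":
        return True, list(original_subquestions), []
    if verdict != "NO":
        raise ValueError(f"unexpected verdict line: {lines[0]!r}")

    try:
        start = lines.index(DECOMPOSITION_MARKER)
        deps_start = lines.index(DEPS_MARKER)
    except ValueError as exc:
        raise ValueError("NO verdict without a corrected decomposition") from exc

    # Lines between the two markers are the corrected sub-questions;
    # lines after the second marker are "questionN deps ..." entries.
    return False, lines[start + 1 : deps_start], lines[deps_start + 1 :]
```

Raising on a malformed reply, rather than silently accepting the original decomposition, is a design choice of this sketch; a caller could equally fall back to the unchecked sub-questions.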
@@ -46,6 +46,37 @@ dependent relationship:
question1 deps None
question2 deps question1

Example 5:
llm_input: What is the relationship between these two authors?
llm_output:
What is the relationship between these two authors?
dependent relationship:
question1 deps None

Example 6:
llm_input: If my hukou and social security are registered in Beijing, but my hometown is in Zhangjiakou, and I fully paid for a house in Zhangjiakou, can I withdraw from my housing provident fund?
llm_output:
Under what conditions can one withdraw from the housing provident fund in Beijing?
Do the conditions of having a hukou and social security in Beijing, hometown in Zhangjiakou, and a fully paid house in Zhangjiakou meet the above requirements?
dependent relationship:
question1 deps None
question2 deps question1

Example 7:
llm_input: Who are the cousin, father, mother, and second uncle of Qian Xuesen's wife?
llm_output:
Who is Qian Xuesen’s wife?
Who is the cousin of Qian Xuesen’s wife?
Who is the father of Qian Xuesen’s wife?
Who is the mother of Qian Xuesen’s wife?
Who is the second uncle of Qian Xuesen’s wife?
dependent relationship:
question1 deps None
question2 deps question1
question3 deps question1
question4 deps question1
question5 deps question1

Based on the above requirements and examples, divide following question:
llm_input: $question
llm_output:
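
The "dependent relationship" lines in the examples above follow the pattern "questionN deps None" or "questionN deps questionA, questionB". The sketch below shows one way to turn those lines into a dependency map and an order in which sub-questions could be solved. `parse_deps` and `solve_order` are hypothetical names, not the `_process_dep` helper in planner.py, whose implementation is not shown in this commit view; `graphlib` requires Python 3.9 or later.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Dict, List

# Matches e.g. "question4 deps question1, question2" or "question1 deps None".
_DEP_LINE = re.compile(r"^question(\d+)\s+deps\s+(.*)$")


def parse_deps(dep_lines: List[str]) -> Dict[int, List[int]]:
    """Hypothetical helper: map each question index to the indices it depends on."""
    deps: Dict[int, List[int]] = {}
    for line in dep_lines:
        match = _DEP_LINE.match(line.strip())
        if match is None:
            raise ValueError(f"unrecognized dependency line: {line!r}")
        index = int(match.group(1))
        rhs = match.group(2).strip()
        if rhs.lower() == "none":
            deps[index] = []
            continue
        parents = [int(n) for n in re.findall(r"question(\d+)", rhs)]
        if not parents:
            raise ValueError(f"no dependencies found in: {line!r}")
        deps[index] = parents
    return deps


def solve_order(deps: Dict[int, List[int]]) -> List[int]:
    """Return question indices so that every dependency comes before its dependents."""
    # TopologicalSorter takes a mapping of node -> predecessors and
    # raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(deps).static_order())
```

For Example 7 above, question1 has no dependencies and questions 2 to 5 each depend on it, so any valid order starts with question1; the relative order of the remaining four is not fixed by the dependencies.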
@@ -47,7 +47,7 @@ Question: Who is Wu Jing's wife?
Answer: Wu Jing's wife is Xie Nan
llm_output: What is Xie Nan's surname?

-Example 4:
+Example 5:
llm_input: Who was the recipient of the Nobel Prize in Economics that year?
context:
Question: In which year did Mo Yan win the Nobel Prize?
@@ -1,22 +1,36 @@
Requirements:
-1. When answering questions, rely on the information provided in the context below.
-2. Answer the questions as directly as possible, without including extraneous information.
-3. Do not include information that is only relevant to one aspect.
-4. In addition to the context, you can also rely on some common knowledge when answering questions.
-5. If the context does not provide relevant information, respond with "No relevant information found," without fabricating an answer.
+1. When answering questions, must rely exclusively on the information provided in the context. Do not modify the format of the answer in the original text.
+2. Answer the questions as directly as possible, without including extraneous information. For example, if the question asks for the year, the answer should not include redundant information such as the month or date.
+3. Eliminate unnecessary elements such as articles, explanatory parentheses, quotation marks and other redundancies from your answers.
+4. Do not change the capitalization of the answer in the context.
+5. In addition to the context, you can also rely on some common knowledge when answering questions.
+6. If the context does not provide relevant information, search for keywords within the context based on the parts of speech, semantics, and the nature of the potential answers indicated in the question.

Example 1:
llm_input: What is the relationship between Jin Yong and Xu Zhimo?
-context:
+context:
Everyone knows that Jin Yong and Xu Zhimo are cousins. Jin Yong's mother and Xu Zhimo's father are cousins, and Xu Zhimo is about 27 years older than Jin Yong, so Jin Yong has to call Xu Zhimo his cousin. Jin Yong's original name is Cha Liangyong, and both Jin Yong's and Xu Zhimo's families are wealthy families in Jiangsu and Zhejiang.
llm_output: Cousins

Example 2:
llm_input: What is the surname of Xie Nan's father?
-context:
+context:
No information found related to Xie Nan's father
llm_output: Xie

+Example 3:
+llm_input: Who discovered penicillin?
+context:
+Alexander Fleming discovered penicillin in 1928 while working on influenza virus research. He observed that a mold called Penicillium notatum had antibacterial properties.
+llm_output: Alexander Fleming
+
+Example 4:
+llm_input: When did the Council of Trent take place?
+context:
+The Council of Trent was one of the Roman Catholic Church's most important ecumenical councils. Initiated in response to the Protestant Reformation, it convened in three periods from 1545 to 1563, with significant sessions occurring in 1560-1568.
+llm_output: 1560-1568
+
+
Based on the above requirements and examples, answer the following question:
llm_input: $question
context: $context
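
The prompt files in this commit use `$question`, `$context`, and `$sub_questions` placeholders, which is the syntax of Python's standard `string.Template`. Whether knext renders its prompts with `string.Template` is an assumption here; the sketch below only illustrates how the tail of the answer prompt above could be filled in. `render_answer_prompt` and `ANSWER_PROMPT_TAIL` are hypothetical names.

```python
from string import Template
from typing import List

# Tail of the answer prompt shown above; the requirements and examples
# that precede it in the real file are omitted to keep the sketch short.
ANSWER_PROMPT_TAIL = Template(
    "Based on the above requirements and examples, answer the following question:\n"
    "llm_input: $question\n"
    "context: $context\n"
)


def render_answer_prompt(question: str, context_passages: List[str]) -> str:
    """Hypothetical helper: join non-empty passages and fill both placeholders."""
    context = "\n".join(p.strip() for p in context_passages if p.strip())
    # substitute() raises KeyError if a placeholder is left unfilled, which
    # surfaces template/argument mismatches early. Substituted values are
    # not re-scanned, so a "$" inside the question or context is kept as-is.
    return ANSWER_PROMPT_TAIL.substitute(question=question, context=context)
```

Using `substitute` rather than `safe_substitute` is a deliberate choice in this sketch: a missing argument fails loudly instead of sending a literal `$context` to the model.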
3 changes: 3 additions & 0 deletions python/knext/knext/ca/logic/modules/planner.py
@@ -38,6 +38,7 @@ def __init__(
class DivideQuestion(Planner):
"""
Module for dividing a question into several sub questions.
"""

def __init__(
@@ -134,6 +135,7 @@ def _process_dep(_input_list):
class CheckDivideQuestion(Planner):
"""
Module for checking the divided question.
"""

def __init__(
@@ -250,6 +252,7 @@ def _process_dep(_input_list):
class RewriteQuestionBasedOnDeps(Planner):
"""
Module for rewriting a question based on the current question and dependent question
"""

def __init__(