Conversation Summary Buffer Memory #203
base: main
Changes from 2 commits
@@ -1,18 +1,16 @@
import {
  assistant,
  createStream,
  getUserData,
  Message,
  user,
} from '@/utils/ai';
import { assistant, createStream, getUserData, user } from '@/utils/ai';
import { honcho } from '@/utils/honcho';
import { responsePrompt } from '@/utils/prompts/response';
import responsePrompt from '@/utils/prompts/response';
import summaryPrompt from '@/utils/prompts/summary';
import { NextRequest, NextResponse } from 'next/server';

export const runtime = 'nodejs';
export const maxDuration = 100;
export const dynamic = 'force-dynamic'; // always run dynamically

const MAX_CONTEXT_SIZE = 11;
const SUMMARY_SIZE = 5;

export async function POST(req: NextRequest) {
  const { message, conversationId, thought, honchoThought } = await req.json();
@@ -45,14 +43,131 @@ export async function POST(req: NextRequest) {

  const honchoHistory = Array.from(honchoIter.items);

  const summaryIter = await honcho.apps.users.sessions.metamessages.list(
    appId,
    userId,
    conversationId,
    {
      metamessage_type: 'summary',
    }
  );

  const summaryHistory = Array.from(summaryIter.items);
Review comment: For metamessage list functions you can specify the number of items to return and just index that value directly, or see if it is null. You can pass in a page size of 1 and then index the items. Also, I think we could parallelize the 3 metamessage list calls in a Promise.all.
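A rough sketch of that suggestion, for discussion only: the page_size and reverse options and the 'honcho' metamessage_type value below are assumptions about the Honcho SDK rather than confirmed parameters, so the real option names would need checking.

// Sketch: run the metamessage list calls concurrently and cap the summary
// query at one item so the latest summary can be indexed directly.
const [honchoIter, summaryIter] = await Promise.all([
  honcho.apps.users.sessions.metamessages.list(appId, userId, conversationId, {
    metamessage_type: 'honcho', // assumed type name for the Honcho-thought query
  }),
  honcho.apps.users.sessions.metamessages.list(appId, userId, conversationId, {
    metamessage_type: 'summary',
    page_size: 1, // assumed option: only the newest summary is needed
    reverse: true, // assumed option: newest first
  }),
]);

const honchoHistory = Array.from(honchoIter.items);
// With a page size of 1, the latest summary (or undefined) falls out directly.
const lastSummary = Array.from(summaryIter.items)[0]?.content;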
  // Get the last summary content
  const lastSummary = summaryHistory[summaryHistory.length - 1]?.content;

  // Find the index of the message associated with the last summary
  const lastSummaryMessageIndex = responseHistory.findIndex(
    (m) => m.id === summaryHistory[summaryHistory.length - 1]?.message_id
  );
  console.log('lastSummaryMessageIndex', lastSummaryMessageIndex);

  // Check if we've exceeded max context size since last summary
  const messagesSinceLastSummary =
    lastSummaryMessageIndex === -1
      ? responseHistory.length
      : responseHistory.length - lastSummaryMessageIndex;

  const needsSummary = messagesSinceLastSummary >= MAX_CONTEXT_SIZE;
  console.log('messagesSinceLastSummary', messagesSinceLastSummary);
  console.log('needsSummary', needsSummary);

  const lastMessageOfSummary = needsSummary
    ? responseHistory[responseHistory.length - MAX_CONTEXT_SIZE + SUMMARY_SIZE]
    : undefined;

  let newSummary: string | undefined;

  console.log('=== CONVERSATION STATUS ===');
  console.log('Total messages:', responseHistory.length);
  console.log('Messages since last summary:', messagesSinceLastSummary);
  console.log('Last summary message index:', lastSummaryMessageIndex);
  console.log('Last summary content:', lastSummary);
  console.log('Last message of summary:', lastMessageOfSummary?.content);
  console.log('Needs summary:', needsSummary);
  console.log('================================');
  if (needsSummary) {
    console.log('=== Starting Summary Generation ===');

    // Get the most recent MAX_CONTEXT_SIZE messages
    const recentMessages = responseHistory.slice(-MAX_CONTEXT_SIZE);
    console.log('Recent messages:', recentMessages);

    // Get the oldest SUMMARY_SIZE messages from those
    const messagesToSummarize = recentMessages.slice(0, SUMMARY_SIZE);
    console.log('Messages to summarize:', messagesToSummarize);

    // Format messages for summary prompt
    const formattedMessages = messagesToSummarize
      .map((msg) => {
        if (msg.is_user) {
          return `User: ${msg.content}`;
        }
        return `Assistant: ${msg.content}`;
      })
      .join('\n');
    console.log('Formatted messages:', formattedMessages);

    // Create summary prompt with existing summary if available
    const summaryMessages = [
      ...summaryPrompt,
      user`<new_messages>
${formattedMessages}
</new_messages>

<existing_summary>
${lastSummary || ''}
</existing_summary>`,
    ];
    console.log('Summary messages:', summaryMessages);

    // Get summary response
    console.log('Creating summary stream...');
    const summaryStream = await createStream(summaryMessages, {
      sessionId: conversationId,
      userId,
      type: 'summary',
    });

    if (!summaryStream) {
      console.error('Failed to get summary stream');
      throw new Error('Failed to get summary stream');
    }
Review comment on lines +127 to +135: use
    // Read the full response from the stream
    console.log('Reading stream...');
    const reader = summaryStream.body?.getReader();
    if (!reader) {
      console.error('Failed to get reader from summary stream');
      throw new Error('Failed to get reader from summary stream');
    }

    let fullResponse = '';
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      const chunk = new TextDecoder().decode(value);
      fullResponse += chunk;
    }
    console.log('Full response:', fullResponse);

    // Extract summary from response
    const summaryMatch = fullResponse.match(/<summary>([\s\S]*?)<\/summary/);
    newSummary = summaryMatch ? summaryMatch[1] : undefined;
    console.log('Extracted summary:', newSummary);

    console.log('=== Summary Generation Complete ===');
  }

  console.log('honchoHistory', honchoHistory);
  console.log('responseHistory', responseHistory);

  const getHonchoMessage = (id: string) =>
    honchoHistory.find((m) => m.message_id === id)?.content ||
    'No Honcho Message';

  const history = responseHistory.map((message, i) => {
  const history = responseHistory.map((message) => {
    if (message.is_user) {
Review comment: We should restrict the history query to only get a fixed number of messages. Currently with an …
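One possible shape for that restriction, as a sketch only: the messages.list call and its page_size / reverse options are assumptions about the Honcho SDK, not confirmed API.

// Sketch: cap the history query at a fixed window instead of loading the
// entire conversation. Option names here are assumed, not verified.
const responseIter = await honcho.apps.users.sessions.messages.list(
  appId,
  userId,
  conversationId,
  {
    page_size: MAX_CONTEXT_SIZE, // only the most recent window is needed
    reverse: true, // assumed: newest first
  }
);
// Restore chronological order before building the prompt.
const responseHistory = Array.from(responseIter.items).reverse();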
      return user`<honcho>${getHonchoMessage(message.id)}</honcho>
${message.content}`;
@@ -61,10 +176,12 @@ export async function POST(req: NextRequest) {
    }
  });

  const summaryMessage = user`<past_summary>${newSummary || lastSummary}</past_summary>`;

  const finalMessage = user`<honcho>${honchoThought}</honcho>
${message}`;

  const prompt = [...responsePrompt, ...history, finalMessage];
  const prompt = [...responsePrompt, summaryMessage, ...history, finalMessage];

  console.log('responsePrompt', prompt);
@@ -126,6 +243,23 @@ export async function POST(req: NextRequest) {
        content: response.text,
      }
    ),

    // Save summary metamessage if one was created
    ...(newSummary
      ? [
          honcho.apps.users.sessions.metamessages.create(
            appId,
            userId,
            conversationId,
            {
              message_id: lastMessageOfSummary!.id,
              metamessage_type: 'summary',
              content: newSummary,
              metadata: { type: 'assistant' },
            }
          ),
        ]
      : []),
    ]);
  }
);
Review comment: The thought chain is also going to run into the same problem of filling up its context window if it has to load the entire conversation. Can we use the same summary here or does it need to be a different summary?
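If the same summary turns out to be reusable, a hypothetical sketch of what that could look like — thoughtPrompt and the thought-message construction below are placeholders, not code from this PR:

// Hypothetical sketch: prepend the running summary to the thought prompt so
// the thought chain also avoids loading the full conversation.
const thoughtSummary = newSummary || lastSummary;
const thoughtMessages = [
  ...thoughtPrompt, // placeholder: whatever prompt the thought chain already uses
  ...(thoughtSummary
    ? [user`<past_summary>${thoughtSummary}</past_summary>`]
    : []),
  user`${message}`,
];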
@@ -0,0 +1,62 @@
import { user, assistant, Message } from '@/utils/ai';

const MAXIMUM_SUMMARY_SIZE: string = '6 sentences';

const summaryPrompt: Message[] = [
  user`You are an AI assistant tasked with creating or updating conversation history summaries. Your goal is to produce concise, information-dense summaries that capture key points while adhering to a specified size limit.

The size limit for the summary is:
<size_limit>
${MAXIMUM_SUMMARY_SIZE}
</size_limit>

For each summarization task, you will receive the following inputs:

1. New messages to be summarized:
<new_messages>
{NEW_MESSAGES}
</new_messages>

2. An existing summary (if available):
<existing_summary>
{EXISTING_SUMMARY}
</existing_summary>

Instructions:

1. Review the existing summary (if provided) and the new messages.

2. Analyze the conversation inside <analysis> tags:
a. Summarize the existing summary (if any)
b. List key points from new messages
c. Identify overlaps between existing summary and new messages, and highlight new information
d. Prioritize information based on importance and relevance
e. Plan how to express key points concisely
It's OK for this section to be quite long.

3. Create or update the summary based on your analysis:
- Ensure a coherent and chronological flow of information.
- Use concise language and avoid redundancy.
- Combine related points where possible to save space.
- Only mention participant names if crucial for context or decisions.
- Use clear abbreviations for common terms if needed to save space.

4. Check the summary length against the maximum output size. If it exceeds the limit, prioritize critical information and remove less essential details.

5. Present your final summary within <summary> tags. Do not include any explanations or meta-commentary outside these tags.

Example output structure:

<analysis>
[Your detailed analysis of the conversation, including steps a through e as outlined above]
</analysis>

<summary>
[Your concise, information-dense summary of the conversation, adhering to the size limit]
</summary>

Remember, your goal is to create a dense, informative summary that captures the key points of the conversation within the specified size constraint.`,
  assistant`Got it. I'm ready for any summarization tasks you have for me!`,
];

export default summaryPrompt;
Review comment: I think this should be restricted to a fixed limit of the last xyz number of messages. Does it make sense to keep it to the CONTEXT_SIZE? Can restrict the page size of the paginated request and then access only the items in that request.
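As a complement to bounding the query itself, a sketch of keeping the response prompt to the summary plus the last MAX_CONTEXT_SIZE messages. The assistant-branch return below is assumed from context, since that part of the diff is not shown in this PR view.

// Sketch: build the prompt from the running summary plus only the most recent
// MAX_CONTEXT_SIZE messages, rather than the whole responseHistory.
const recentHistory = responseHistory.slice(-MAX_CONTEXT_SIZE);
const history = recentHistory.map((message) => {
  if (message.is_user) {
    return user`<honcho>${getHonchoMessage(message.id)}</honcho>
${message.content}`;
  }
  return assistant`${message.content}`; // assumed assistant branch
});
const prompt = [...responsePrompt, summaryMessage, ...history, finalMessage];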