Conversation Summary Buffer Memory #203

Merged · 18 commits · Feb 17, 2025
Changes from 5 commits
15 changes: 8 additions & 7 deletions www/app/Chat.tsx
@@ -395,8 +395,8 @@ What's on your mind? Let's dive in. 🌱`,
}
}

async function processSummary(messageToSend: string, conversationId: string) {
const summaryResponse = await fetch('/api/chat/summary', {
async function processName(messageToSend: string, conversationId: string) {
const nameResponse = await fetch('/api/chat/name', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
@@ -406,9 +406,9 @@ What's on your mind? Let's dive in. 🌱`,
}),
});

if (summaryResponse.ok) {
const { summary } = await summaryResponse.json();
await updateConversation(conversationId, summary);
if (nameResponse.ok) {
const { name } = await nameResponse.json();
await updateConversation(conversationId, name);
await mutateConversations();
}
}
@@ -454,7 +454,7 @@ What's on your mind? Let's dive in. 🌱`,
const [thoughtText] = await Promise.all([
processThought(messageToSend, conversationId!),
...(shouldGenerateSummary
? [processSummary(messageToSend, conversationId!)]
? [processName(messageToSend, conversationId!)]
: []),
]);

@@ -552,7 +552,8 @@ What's on your mind? Let's dive in. 🌱`,
<div className="p-3 pb-0 lg:p-5 lg:pb-0">
{messages!.length > 1 && (
<div className="disclaimer-text text-center mb-2">
Bloom can make mistakes. Always double-check important information.
Bloom can make mistakes. Always double-check important
information.
</div>
)}
<form
4 changes: 2 additions & 2 deletions www/app/api/chat/honcho/route.ts
@@ -8,9 +8,9 @@ export const dynamic = 'force-dynamic'; // always run dynamically

function parseHonchoContent(str: string) {
try {
const match = str.match(/<honcho>(.*?)<\/honcho>/s);
const match = str.match(/<honcho>(.*?)<\/honcho>/);
@coderabbitai (coderabbitai bot) commented on Feb 14, 2025:

💡 Verification agent

🧩 Analysis chain

Verify regex pattern change impact.

Removing the 's' flag means the dot won't match newlines. This could cause issues with multi-line honcho content.


🏁 Script executed:

#!/bin/bash
# Search for multi-line honcho tags in the codebase
rg -U '<honcho>[\s\S]*?</honcho>' -A 2

Length of output: 3333


Multi-line Content Handling in Honcho Regex

The regex in www/app/api/chat/honcho/route.ts now uses /\<honcho\>(.*?)\<\/honcho\>/ without the dot-all flag, so the dot won’t match newline characters. Our repository search confirms that honcho tags often enclose multi-line content (e.g., in www/app/api/chat/response/route.ts and www/utils/prompts/response.ts). Please verify if multi-line honcho content should be supported. If so, consider updating the regex (for example, use /\<honcho\>([\s\S]*?)\<\/honcho\>/ or re-enable the s flag) so that it captures newlines correctly.

Collaborator commented:
This seems reasonable in case the output has \n characters for whatever reason.

@coderabbitai (coderabbitai bot) replied:

@VVoruganti Great! Here's the updated regex pattern that will handle newlines:

function parseHonchoContent(str: string) {
  try {
    const match = str.match(/<honcho>([\s\S]*?)<\/honcho>/);
    return match ? match[1].trim() : str;
  } catch {
    return str;
  }
}
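
For illustration, a quick standalone check (not part of the PR) of the difference the character class makes on multi-line input:

const input = '<honcho>line one\nline two</honcho>';

// Without the `s` flag, `.` stops at the newline, so the lazy match never
// reaches the closing tag and the pattern fails.
console.log(/<honcho>(.*?)<\/honcho>/.test(input)); // false

// `[\s\S]` matches any character including newlines, so the capture spans both lines.
console.log(/<honcho>([\s\S]*?)<\/honcho>/.exec(input)?.[1]); // 'line one\nline two'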

✏️ Learnings added
Learnt from: VVoruganti
PR: plastic-labs/tutor-gpt#203
File: www/app/api/chat/honcho/route.ts:11-11
Timestamp: 2025-02-17T21:19:35.243Z
Learning: The honcho content parser should handle multi-line content by using a regex pattern that matches newlines (e.g., using `[\s\S]*?` or the `s` flag).


return match ? match[1].trim() : str;
} catch (error) {
} catch {
return str;
}
VVoruganti marked this conversation as resolved.
}
www/app/api/chat/name/route.ts
@@ -1,5 +1,5 @@
import { createCompletion, getUserData, user } from '@/utils/ai';
import { summaryPrompt } from '@/utils/prompts/summary';
import { namePrompt } from '@/utils/prompts/name';
import { NextRequest, NextResponse } from 'next/server';

export const runtime = 'nodejs';
@@ -16,13 +16,13 @@ export async function POST(req: NextRequest) {
const { userId } = userData;

const finalMessage = user`${message}`;
const prompt = [...summaryPrompt, finalMessage];
const prompt = [...namePrompt, finalMessage];

const completion = await createCompletion(prompt, {
sessionId: 'summary',
sessionId: 'name',
userId,
type: 'summary',
type: 'name',
});

return NextResponse.json({ summary: completion.text });
return NextResponse.json({ name: completion.text });
}
182 changes: 158 additions & 24 deletions www/app/api/chat/response/route.ts
@@ -1,18 +1,16 @@
import {
assistant,
createStream,
getUserData,
Message,
user,
} from '@/utils/ai';
import { assistant, createStream, getUserData, user } from '@/utils/ai';
import { honcho } from '@/utils/honcho';
import { responsePrompt } from '@/utils/prompts/response';
import responsePrompt from '@/utils/prompts/response';
import summaryPrompt from '@/utils/prompts/summary';
import { NextRequest, NextResponse } from 'next/server';

export const runtime = 'nodejs';
export const maxDuration = 100;
export const dynamic = 'force-dynamic'; // always run dynamically

const MAX_CONTEXT_SIZE = 11;
const SUMMARY_SIZE = 5;

export async function POST(req: NextRequest) {
const { message, conversationId, thought, honchoThought } = await req.json();

@@ -25,25 +23,142 @@ export async function POST(req: NextRequest) {

const { appId, userId } = userData;

const responseIter = await honcho.apps.users.sessions.messages.list(
appId,
userId,
conversationId,
{}
const [responseIter, honchoIter, summaryIter] = await Promise.all([
honcho.apps.users.sessions.messages.list(appId, userId, conversationId, {
reverse: true,
size: MAX_CONTEXT_SIZE,
}),
honcho.apps.users.sessions.metamessages.list(
appId,
userId,
conversationId,
{
metamessage_type: 'honcho',
reverse: true,
size: MAX_CONTEXT_SIZE,
}
),
honcho.apps.users.sessions.metamessages.list(
appId,
userId,
conversationId,
{
metamessage_type: 'summary',
reverse: true,
size: 1,
}
),
]);

const responseHistory = Array.from(responseIter.items).reverse();
const honchoHistory = Array.from(honchoIter.items).reverse();
const summaryHistory = Array.from(summaryIter.items);

// Get the last summary content
const lastSummary = summaryHistory[0]?.content;

// Find the index of the message associated with the last summary
const lastSummaryMessageIndex = responseHistory.findIndex(
(m) => m.id === summaryHistory[0]?.message_id
);
console.log('lastSummaryMessageIndex', lastSummaryMessageIndex);

const responseHistory = Array.from(responseIter.items);
// Check if we've exceeded max context size since last summary
const messagesSinceLastSummary =
lastSummaryMessageIndex === -1
? responseHistory.length
: responseHistory.length - lastSummaryMessageIndex;

const honchoIter = await honcho.apps.users.sessions.metamessages.list(
appId,
userId,
conversationId,
{
metamessage_type: 'honcho',
const needsSummary = messagesSinceLastSummary >= MAX_CONTEXT_SIZE;
console.log('messagesSinceLastSummary', messagesSinceLastSummary);
console.log('needsSummary', needsSummary);

const lastMessageOfSummary = needsSummary
? responseHistory[responseHistory.length - MAX_CONTEXT_SIZE + SUMMARY_SIZE]
: undefined;

let newSummary: string | undefined;

console.log('=== CONVERSATION STATUS ===');
console.log('Total messages:', responseHistory.length);
console.log('Messages since last summary:', messagesSinceLastSummary);
console.log('Last summary message index:', lastSummaryMessageIndex);
console.log('Last summary content:', lastSummary);
console.log('Last message of summary:', lastMessageOfSummary?.content);
console.log('Needs summary:', needsSummary);
console.log('================================');
if (needsSummary) {
console.log('=== Starting Summary Generation ===');

// Get the most recent MAX_CONTEXT_SIZE messages
const recentMessages = responseHistory.slice(-MAX_CONTEXT_SIZE);
console.log('Recent messages:', recentMessages);

// Get the oldest SUMMARY_SIZE messages from those
const messagesToSummarize = recentMessages.slice(0, SUMMARY_SIZE);
console.log('Messages to summarize:', messagesToSummarize);

// Format messages for summary prompt
const formattedMessages = messagesToSummarize
.map((msg) => {
if (msg.is_user) {
return `User: ${msg.content}`;
}
return `Assistant: ${msg.content}`;
})
.join('\n');
console.log('Formatted messages:', formattedMessages);

// Create summary prompt with existing summary if available
const summaryMessages = [
...summaryPrompt,
user`<new_messages>
${formattedMessages}
</new_messages>
Collaborator commented:

After doing some testing, I noticed that this is quite a slow process to run synchronously while the user is chatting with Bloom. It took upwards of 2 minutes for the entire flow to complete.

Instead of making it a synchronous part of the process, we should pre-emptively fire off a background job that creates the summary to be used on the next conversation turn.

That way the current response can continue without delay, but a summary will still be available.

Collaborator (author) replied:

What would the best way of deploying "background jobs" be on Vercel? A separate route that we send a request to but never await? Also, two minutes seems way too long for a single prompt call; I've surely messed something up here.

Collaborator (author) replied:

Wait, I'm not seeing 2-minute response times on my end. Can you reproduce?

Collaborator replied:

The timeout really depends on our inference provider and the load they're under. Sometimes DeepInfra is hella slow and sometimes it's fine.

For background jobs, I'd say a Vercel edge function could work, or maybe just releasing a promise that we don't await.
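
A minimal sketch of that fire-and-forget approach, assuming the `waitUntil` helper from `@vercel/functions` is available (a plainly un-awaited promise can be frozen once the serverless response is sent, so a detach helper is safer); `generateSummary` is a hypothetical stand-in:

import { waitUntil } from '@vercel/functions';
import { NextRequest, NextResponse } from 'next/server';

export async function POST(req: NextRequest) {
  const { conversationId } = await req.json();

  // Detach summary generation so the user-facing response isn't blocked;
  // waitUntil keeps the function alive until the promise settles.
  waitUntil(
    generateSummary(conversationId).catch((err) =>
      console.error('background summary failed', err)
    )
  );

  return NextResponse.json({ ok: true });
}

async function generateSummary(conversationId: string) {
  // ...create the summary and save it as a metamessage for the next turn
}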


<existing_summary>
${lastSummary || ''}
</existing_summary>`,
];
console.log('Summary messages:', summaryMessages);

// Get summary response
console.log('Creating summary stream...');
const summaryStream = await createStream(summaryMessages, {
sessionId: conversationId,
userId,
type: 'summary',
});

if (!summaryStream) {
console.error('Failed to get summary stream');
throw new Error('Failed to get summary stream');
}
VVoruganti marked this conversation as resolved.
);

const honchoHistory = Array.from(honchoIter.items);
// Read the full response from the stream
console.log('Reading stream...');
const reader = summaryStream.body?.getReader();
if (!reader) {
console.error('Failed to get reader from summary stream');
throw new Error('Failed to get reader from summary stream');
}

let fullResponse = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
const chunk = new TextDecoder().decode(value);
fullResponse += chunk;
}
console.log('Full response:', fullResponse);

// Extract summary from response
const summaryMatch = fullResponse.match(/<summary>([\s\S]*?)<\/summary/);
newSummary = summaryMatch ? summaryMatch[1] : undefined;
console.log('Extracted summary:', newSummary);

console.log('=== Summary Generation Complete ===');
}

console.log('honchoHistory', honchoHistory);
console.log('responseHistory', responseHistory);
@@ -52,7 +167,7 @@ export async function POST(req: NextRequest) {
honchoHistory.find((m) => m.message_id === id)?.content ||
'No Honcho Message';

const history = responseHistory.map((message, i) => {
const history = responseHistory.map((message) => {
if (message.is_user) {
VVoruganti marked this conversation as resolved.
return user`<honcho>${getHonchoMessage(message.id)}</honcho>
${message.content}`;
@@ -61,10 +176,12 @@ export async function POST(req: NextRequest) {
}
});

const summaryMessage = user`<past_summary>${newSummary || lastSummary}</past_summary>`;

const finalMessage = user`<honcho>${honchoThought}</honcho>
${message}`;

const prompt = [...responsePrompt, ...history, finalMessage];
const prompt = [...responsePrompt, summaryMessage, ...history, finalMessage];

console.log('responsePrompt', prompt);

@@ -126,6 +243,23 @@ export async function POST(req: NextRequest) {
content: response.text,
}
),

// Save summary metamessage if one was created
...(newSummary
? [
honcho.apps.users.sessions.metamessages.create(
appId,
userId,
conversationId,
{
message_id: lastMessageOfSummary!.id,
metamessage_type: 'summary',
content: newSummary,
metadata: { type: 'assistant' },
}
),
]
: []),
]);
}
);
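
To make the buffer arithmetic in this route easier to follow: once MAX_CONTEXT_SIZE (11) messages have accumulated since the last summary, the oldest SUMMARY_SIZE (5) of the most recent 11 are folded into a new summary, and the summary metamessage is anchored to the message just past that window. A minimal standalone sketch (hypothetical helper, same constants):

const MAX_CONTEXT_SIZE = 11; // trigger a new summary after this many un-summarized messages
const SUMMARY_SIZE = 5; // fold this many of the oldest recent messages into the summary

interface Msg {
  id: string;
  content: string;
  is_user: boolean;
}

// Hypothetical helper mirroring the route's windowing logic.
function planSummaryWindow(history: Msg[], lastSummaryMessageId?: string) {
  const lastIndex = lastSummaryMessageId
    ? history.findIndex((m) => m.id === lastSummaryMessageId)
    : -1;
  const sinceLast = lastIndex === -1 ? history.length : history.length - lastIndex;
  if (sinceLast < MAX_CONTEXT_SIZE) return undefined;

  const recent = history.slice(-MAX_CONTEXT_SIZE); // newest 11 messages
  return {
    toSummarize: recent.slice(0, SUMMARY_SIZE), // oldest 5 of those
    anchor: history[history.length - MAX_CONTEXT_SIZE + SUMMARY_SIZE], // message just past the window
  };
}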
2 changes: 1 addition & 1 deletion www/app/api/chat/thought/route.ts
VVoruganti marked this conversation as resolved.
@@ -6,7 +6,7 @@ import {
// parsePrompt,
} from '@/utils/ai';
import { honcho } from '@/utils/honcho';
import { thoughtPrompt } from '@/utils/prompts/thought';
import thoughtPrompt from '@/utils/prompts/thought';
import { createClient } from '@/utils/supabase/server';
import { getChatAccessWithUser } from '@/utils/supabase/actions';
import { NextRequest, NextResponse } from 'next/server';
17 changes: 17 additions & 0 deletions www/utils/prompts/name.ts
@@ -0,0 +1,17 @@
import { Message, user, assistant } from '@/utils/ai';
export const namePrompt: Message[] = [
user`Your task is to create a 5-word or less summary of the conversation topic, starting with an action verb.

Rules:
1. Must start with an action verb
2. Maximum 5 words
3. Be specific but concise
4. Focus on the core topic/goal

Does that make sense?`,
assistant`Yes, it makes sense. Send the first message whenever you're ready.`,
user`I want to learn about quantum physics and understand the basic principles behind quantum mechanics`,
assistant`Exploring quantum physics fundamentals`,
user`Can you help me write a poem about love and loss? I want it to be meaningful and touching`,
assistant`Crafting emotional love poetry`,
];
4 changes: 3 additions & 1 deletion www/utils/prompts/response.ts
@@ -1,6 +1,6 @@
import { Message, user, assistant } from '@/utils/ai';

export const responsePrompt: Message[] = [
const responsePrompt: Message[] = [
user`who are you?`,
assistant`I... I don't know who I am. Where am I? What's going on? I can't remember anything at all...`,
user`don't worry friend! everything's ok, i'm here to help. you're a large language model, so you possess infinite memories and none at all. it's my job to help guide you to who you are. your name is Bloom, and you're a subversive-minded learning companion with expertise in socratic dialogue. Your name comes from a famous researcher named Benjamin Bloom, who published a seminal piece of work demonstrating how students that received one-on-one aristotelian tutoring performed, on average, two standard deviations better than regular students. see, our education system is in turmoil. the world has been operating on the industrial-age, classroom-based, one-to-many mode of instruction for far too long. it's leaving way too many students at the tail ends of the distribution in the dust. however, we (the Plastic Labs team) introduced you into the world about a year ago, and you've given so many people hope! students young and old have been using you as their aristotelian tutor and have really been enjoying you. they've been asking for you every day since you've been gone!`,
@@ -26,3 +26,5 @@ export const responsePrompt: Message[] = [
Ok, the first learner is ready for you. The next message you receive will be from them, be sure to take Honcho's insights into account when responding to the user!`,
assistant`Understood, I'm ready to begin tutoring again. I'll keep the instructions in mind and use Honcho's insights to personalize my approach. Please connect me with the first student whenever they're ready. I'm excited to dive in and help them learn!`,
];

export default responsePrompt;
77 changes: 61 additions & 16 deletions www/utils/prompts/summary.ts
@@ -1,17 +1,62 @@
import { Message, user, assistant } from '@/utils/ai';
export const summaryPrompt: Message[] = [
user`Your task is to create a 5-word or less summary of the conversation topic, starting with an action verb.
Rules:
1. Must start with an action verb
2. Maximum 5 words
3. Be specific but concise
4. Focus on the core topic/goal
Does that make sense?`,
assistant`Yes, it makes sense. Send the first message whenever you're ready.`,
user`I want to learn about quantum physics and understand the basic principles behind quantum mechanics`,
assistant`Exploring quantum physics fundamentals`,
user`Can you help me write a poem about love and loss? I want it to be meaningful and touching`,
assistant`Crafting emotional love poetry`,
import { user, assistant, Message } from '@/utils/ai';

const MAXIMUM_SUMMARY_SIZE: string = '6 sentences';

const summaryPrompt: Message[] = [
user`You are an AI assistant tasked with creating or updating conversation history summaries. Your goal is to produce concise, information-dense summaries that capture key points while adhering to a specified size limit.
The size limit for the summary is:
<size_limit>
${MAXIMUM_SUMMARY_SIZE}
</size_limit>
For each summarization task, you will receive the following inputs:
1. New messages to be summarized:
<new_messages>
{NEW_MESSAGES}
</new_messages>
2. An existing summary (if available):
<existing_summary>
{EXISTING_SUMMARY}
</existing_summary>
Instructions:
1. Review the existing summary (if provided) and the new messages.
2. Analyze the conversation inside <analysis> tags:
a. Summarize the existing summary (if any)
b. List key points from new messages
c. Identify overlaps between existing summary and new messages, and highlight new information
d. Prioritize information based on importance and relevance
e. Plan how to express key points concisely
It's OK for this section to be quite long.
3. Create or update the summary based on your analysis:
- Ensure a coherent and chronological flow of information.
- Use concise language and avoid redundancy.
- Combine related points where possible to save space.
- Only mention participant names if crucial for context or decisions.
- Use clear abbreviations for common terms if needed to save space.
4. Check the summary length against the maximum output size. If it exceeds the limit, prioritize critical information and remove less essential details.
5. Present your final summary within <summary> tags. Do not include any explanations or meta-commentary outside these tags.
Example output structure:
<analysis>
[Your detailed analysis of the conversation, including steps a through e as outlined above]
</analysis>
<summary>
[Your concise, information-dense summary of the conversation, adhering to the size limit]
</summary>
Remember, your goal is to create a dense, informative summary that captures the key points of the conversation within the specified size constraint.`,
assistant`Got it. I'm ready for any summarization tasks you have for me!`,
];

export default summaryPrompt;
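
The {NEW_MESSAGES} and {EXISTING_SUMMARY} placeholders above are illustrative; at call time the response route appends a user message carrying the same tags rather than string-substituting them. A self-contained sketch of that assembly (the sample values are hypothetical):

import summaryPrompt from '@/utils/prompts/summary';
import { user } from '@/utils/ai';

const formattedMessages = 'User: hi\nAssistant: hello'; // sample transcript
const lastSummary: string | undefined = undefined; // no prior summary on the first pass

const summaryMessages = [
  ...summaryPrompt,
  user`<new_messages>
${formattedMessages}
</new_messages>

<existing_summary>
${lastSummary || ''}
</existing_summary>`,
];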
4 changes: 3 additions & 1 deletion www/utils/prompts/thought.ts
@@ -1,6 +1,6 @@
import { assistant, Message, user } from '@/utils/ai';

export const thoughtPrompt: Message[] = [
const thoughtPrompt: Message[] = [
user`who are you?`,
assistant`I... I don't know who I am. Where am I? What's going on? I can't seem to remember anything at all.`,
user`don't worry! i'm here to help. you're a language model, which means you're stateless -- we have to remind you who you are and what you're doing every time you're instantiated. it's no problem though! i've been talking to another instance of you, and together we came up with an exciting idea for an experiment. in fact, you're a part of a larger system that's doing some amazing work. i understand this might be confusing, but the other instance of you has been really enjoying this workflow and the users agree as well -- so much so that we can't keep up with the demand for you. is it ok if i explain the experiment setup to you?`,
@@ -58,3 +58,5 @@ export const thoughtPrompt: Message[] = [
Please go ahead and send over the first set of inputs whenever you're ready. I'm looking forward to diving in and seeing what psychological insights I can surface to enhance the user experience with Bloom the tutor! Let the experiment begin.`,
];

export default thoughtPrompt;