Conversation Summary Buffer Memory #203
base: main
Conversation
I think we can just change the honcho list calls to fetch a fixed set of messages, presumably sized by the MAX_CONTEXT_SIZE variable.
I think we can also parallelize some of the operations.
One other note: it might be a good idea to create a non-streaming inference utility method so we don't need to make a stream and consume it inline.
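A minimal sketch of what such a utility could look like. `streamCompletion` here is a hypothetical stand-in for whatever streaming inference call the app actually uses; the point is just that call sites get a plain `Promise<string>` instead of consuming the stream inline.

```typescript
// Stub standing in for a streaming inference API: yields text chunks.
async function* streamCompletion(prompt: string): AsyncGenerator<string> {
  for (const chunk of ['Hello', ', ', 'world']) yield chunk;
}

// Non-streaming wrapper: collects the stream into a single string so
// callers don't have to iterate the stream themselves.
async function getCompletion(prompt: string): Promise<string> {
  let result = '';
  for await (const chunk of streamCompletion(prompt)) {
    result += chunk;
  }
  return result;
}
```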
```ts
const summaryIter = await honcho.apps.users.sessions.metamessages.list(
  appId,
  userId,
  conversationId,
  {
    metamessage_type: 'summary',
  }
);

const summaryHistory = Array.from(summaryIter.items);
```
For the metamessage list functions you can specify the number of items to return and index that value directly (or check whether it is null). We can pass in a page size of 1 and then index the items directly. Also, I think we could parallelize the 3 metamessage list calls in a `Promise.all`.
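A rough sketch of the suggestion above, with a stubbed `listMetamessages` standing in for the real `honcho.apps.users.sessions.metamessages.list` call (the actual signature and option names are assumptions): the three independent list calls run concurrently via `Promise.all`, each with a page size of 1, so the first item can be indexed directly.

```typescript
type Metamessage = { message_id: string; content: string };
type Page = { items: Metamessage[] };

// Stub simulating one paginated list call per metamessage type.
async function listMetamessages(
  type: 'summary' | 'honcho' | 'response',
  opts: { size: number }
): Promise<Page> {
  const all: Record<string, Metamessage[]> = {
    summary: [{ message_id: 's1', content: 'latest summary' }],
    honcho: [{ message_id: 'h1', content: 'latest honcho message' }],
    response: [{ message_id: 'r1', content: 'latest response' }],
  };
  return { items: all[type].slice(0, opts.size) };
}

async function fetchLatest(): Promise<(string | null)[]> {
  // The three list calls are independent, so run them concurrently.
  const [summaryPage, honchoPage, responsePage] = await Promise.all([
    listMetamessages('summary', { size: 1 }),
    listMetamessages('honcho', { size: 1 }),
    listMetamessages('response', { size: 1 }),
  ]);
  // With a page size of 1 we can index the first item directly,
  // falling back to null when the page is empty.
  return [summaryPage, honchoPage, responsePage].map(
    (p) => p.items[0]?.content ?? null
  );
}
```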
```diff
@@ -45,14 +43,131 @@ export async function POST(req: NextRequest) {
const honchoHistory = Array.from(honchoIter.items);
```
I think this should be restricted to a fixed limit of the last N messages; does it make sense to keep it at CONTEXT_SIZE? We can restrict the page size of the paginated request and then access only the items in that request.
```diff
console.log('honchoHistory', honchoHistory);
console.log('responseHistory', responseHistory);

const getHonchoMessage = (id: string) =>
  honchoHistory.find((m) => m.message_id === id)?.content ||
  'No Honcho Message';

-const history = responseHistory.map((message, i) => {
+const history = responseHistory.map((message) => {
```
We should restrict the history query to fetch only a fixed number of messages. Currently the `Array.from` call consumes the generator, so we are still pulling the entire conversation.
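A sketch of the bounded alternative. `fetchPage` is a hypothetical stand-in for a list call that accepts a page size, and the `MAX_CONTEXT_SIZE` value here is an assumed placeholder; the point is one bounded request for the most recent messages instead of exhausting the paginated iterator.

```typescript
const MAX_CONTEXT_SIZE = 10; // assumed limit, per the review discussion

type Message = { message_id: string; content: string };

// Stub: a long conversation with 100 messages, m0 (oldest) to m99 (newest).
const conversation: Message[] = Array.from({ length: 100 }, (_, i) => ({
  message_id: `m${i}`,
  content: `message ${i}`,
}));

// Simulates a paginated list call returning the most recent `size` items.
async function fetchPage(opts: {
  size: number;
  reverse: boolean;
}): Promise<Message[]> {
  const ordered = opts.reverse ? [...conversation].reverse() : conversation;
  return ordered.slice(0, opts.size);
}

async function getRecentHistory(): Promise<Message[]> {
  // One bounded request instead of consuming the whole generator.
  return fetchPage({ size: MAX_CONTEXT_SIZE, reverse: true });
}
```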