
zod-ai

This package provides functionality similar to Marvin's AI utilities, but based on Zod functions and built for the TypeScript ecosystem.

I am very pleased with the developer experience, and I hope you enjoy it as well! I find that it's all I need to build complex AI applications, without reaching for LangChain or other heavier frameworks.

Code Calling AI

You can create a function that calls an AI model by wrapping a Zod function. This simplifies instructing the LLM to respond in the proper format, as well as parsing the response into the proper type in your codebase.

import { makeAi } from 'zod-ai';
import { OpenAI } from 'openai';
import { z } from 'zod';

// Initialize your OpenAI client
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Initialize your ai wrapper function with the client and requested model
const ai = makeAi({
  client,
  model: "gpt-3.5-turbo-1106",
});

// Wrap a Zod function that has arguments, returns, and a description
const returnFamousActorsFromVibe = ai(
  z
    .function()
    .args(z.string())
    .returns(z.array(z.string()))
    .describe(
      "Return a list of famous actors that match the user provided vibe"
    )
);
// returnFamousActorsFromVibe has type: (vibe: string) => Promise<string[]>
const actors = await returnFamousActorsFromVibe("villains");
console.log(actors) // [ "Tom Hiddleston", "Heath Ledger", "Jack Nicholson", "Anthony Hopkins" ]
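
Because the arguments and return type are plain Zod schemas, you can use richer structured types as well. Here is a minimal sketch, assuming object arguments and enum returns work with ai() the same way they do with makeTool below (summarizeReview and its schema are illustrative, not part of the package):

const summarizeReview = ai(
  z
    .function()
    .args(z.object({ review: z.string(), maxWords: z.number() }))
    .returns(
      z.object({
        summary: z.string(),
        sentiment: z.enum(["positive", "negative", "mixed"]),
      })
    )
    .describe(
      "Summarize a product review in at most maxWords words and classify its sentiment"
    )
);

// Hypothetical usage - the response is parsed and validated against the return schema
const result = await summarizeReview({
  review: "Battery died in a day, but the screen is gorgeous.",
  maxWords: 10,
});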

Support for Other Models

In addition to OpenAI, Anyscale Endpoints and custom providers (like Ollama) are also supported.

Anyscale Endpoints have a special feature that constrains response generation to a specific JSON schema; OpenAI and Ollama only offer constraining the response to valid JSON.

To take advantage of Anyscale's schema feature, set the clientSupportsJsonSchema flag to true:

const anyscale = new OpenAI({
  baseURL: "https://api.endpoints.anyscale.com/v1",
  apiKey: config.ANYSCALE_ENDPOINTS_API_KEY,
});

const anyscaleAiFn = makeAi({
  clientSupportsJsonSchema: true,
  client: anyscale,
  model: "mistralai/Mistral-7B-Instruct-v0.1",
});

The Anyscale models that support this mode are mistralai/Mistral-7B-Instruct-v0.1 and mistralai/Mixtral-8x7B-Instruct-v0.1, per Anyscale's documentation.
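
Once constructed, anyscaleAiFn is used exactly like the OpenAI-backed wrapper above. A quick sketch (extractCityNames is illustrative, not part of the package):

// Same wrapping pattern as before, now with generation constrained
// by Anyscale's JSON schema mode
const extractCityNames = anyscaleAiFn(
  z
    .function()
    .args(z.string())
    .returns(z.array(z.string()))
    .describe("Extract every city name mentioned in the provided text")
);

const cities = await extractCityNames("I flew from Lisbon to Osaka via Doha.");
// cities: [ "Lisbon", "Osaka", "Doha" ]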

To use zod-ai with Ollama, pass in a custom chat function, like so:

import ollama from 'ollama';

const mistralAiFn = makeAi({
  chat: async (systemPrompt: string, userMessage: string) => {
    const response = await ollama.chat({
      model: "dolphin2.1-mistral",
      messages: [
        { role: "system", content: systemPrompt },
        { role: "user", content: userMessage },
      ],
    });

    return response.message.content;
  },
});

You can put any call to an LLM that you'd like inside that chat parameter; it just needs to take a system prompt and a user message and return the model's response text.
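
For example, here is a minimal sketch that targets any OpenAI-compatible HTTP server via plain fetch (the URL and model name are placeholders, not anything zod-ai provides):

const customAiFn = makeAi({
  chat: async (systemPrompt: string, userMessage: string) => {
    // POST to an OpenAI-compatible /v1/chat/completions endpoint
    const res = await fetch("http://localhost:8000/v1/chat/completions", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: "my-local-model",
        messages: [
          { role: "system", content: systemPrompt },
          { role: "user", content: userMessage },
        ],
      }),
    });

    const json = await res.json();
    // The chat function just needs to return the model's text
    return json.choices[0].message.content;
  },
});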

In my experience, gpt-3.5-turbo and Ollama latency is about the same, even taking into account the network round trip, which is not a factor for Ollama. I was expecting better latency from Ollama, but I didn't get it. And although Anyscale's documented latency claims are very good, I found it to be much slower in practice (a generation that gpt-3.5-turbo and Ollama finish in 1 second sometimes took Anyscale 5 seconds), though I'm not sure whether I'm doing something wrong there.

AI Calling Code

In the scenario above, your code calls an AI model. You can also use zod-ai to simplify interaction with OpenAI's function calling (tools) feature, which lets the model call your code.

import { makeTool, formatTools, handleToolCalls, isToolCallRequested } from 'zod-ai';
import { OpenAI } from 'openai';
import { z } from 'zod';

// Initialize your OpenAI client
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Create a tool
const getContactInfo = makeTool(
  // First argument is a Zod function schema
  z.function()
   .args(z.object({ firstName: z.string(), lastName: z.string() }))
   .returns(z.object({ email: z.string(), phone: z.string() }))
   .describe("Search the user's contact book for a contact with the provided first and last name"),

  // The function signature is validated by Zod/TS - you can use async or not, either is fine
  async ({ firstName, lastName }) => {
    const contact = await getContactFromDatabase(firstName, lastName)
    return { email: contact.email, phone: contact.phone }
  }
)

const tools = { getContactInfo }

// Now, use the provided `formatTools` to transform the tools into the format OpenAI expects
const nextChatResponse = await client.chat.completions.create({
  model: "gpt-3.5-turbo-1106",
  messages: [
    { role: "user", content: "Do I have a phone number for Alice Barkley" },
  ],
  tools: formatTools(tools),
});

// Use isToolCallRequested to check if the AI requested a tool call
const toolCallRequested = isToolCallRequested(nextChatResponse);

if (toolCallRequested) {
  const toolResponseMessages = await handleToolCalls(tools, nextChatResponse.choices[0].message.tool_calls!);

  // handleToolCalls response is fully ready to send back to OpenAI, with tool ID and role set properly
  // so you can just go ahead and:
  const finalChatResponse = await client.chat.completions.create({
    model: "gpt-3.5-turbo-1106",
    messages: [
      { role: "user", content: "Do I have a phone number for Alice Barkley" },
      ...toolResponseMessages // use it here!
    ],
    tools: formatTools(tools),
  });
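
  // The model's final answer, which should now incorporate the tool results,
  // is in the usual place:
  console.log(finalChatResponse.choices[0].message.content);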
}

Installation

zod and openai are peer dependencies and not bundled here.

Bun:

bun add zod openai zod-ai

NPM:

npm install --save zod openai zod-ai

Yarn:

yarn add zod openai zod-ai

Overriding the System Prompt

In the "Code Calling AI" usage, by default, zod-ai will construct a system prompt that uses the description, arguments, and return type of your zod function. That system prompt looks like this:

Your job is to generate an output for a function. The function is described as:
${description}

The user will provide input that matches the following schema:
${JSON.stringify(inputSchema, null, 2)}

You MUST respond in a JSON format. Your response must match the following JSONSchema definition:
${JSON.stringify(outputSchema, null, 2)}

If you want to change this system prompt for whatever reason, you can pass in another function to makeAi:

const ai = makeAi({
  client,
  model: "gpt-3.5-turbo-1106",
  systemPrompt: (description: string, inputSchema: string, outputSchema: string) => `new system prompt`
})

Note that your system prompt must instruct the LLM to respond in JSON or OpenAI will throw an error.
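
For instance, a terser prompt that still satisfies that requirement might look like this (the wording here is just an illustration, not the package default):

const terseAi = makeAi({
  client,
  model: "gpt-3.5-turbo-1106",
  systemPrompt: (description: string, inputSchema: string, outputSchema: string) =>
    `You implement this function: ${description}
The user's input matches this schema: ${inputSchema}
Respond ONLY with JSON matching this JSONSchema definition: ${outputSchema}`,
});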

Usage of Descriptions

Zod objects and JSON Schema both support descriptions. zod-ai will use these descriptions if you include them. For example, we could have written the above tool usage example as:

const getContactInfo = makeTool(
  z.function()
   .args(z.object({
     firstName: z.string().describe('The first name of the contact to search for'),
     lastName: z.string().describe('The last name of the contact to search for')
   }))
   .returns(z.object({ email: z.string(), phone: z.string() }))
   .describe("Search the user's contact book for a contact with the provided first and last name"),

  // The function signature is validated by Zod/TS - you can use async or not, either is fine
  async ({ firstName, lastName }) => {
    const contact = await getContactFromDatabase(firstName, lastName)
    return { email: contact.email, phone: contact.phone }
  }
)

While descriptions are helpful for top-level functions, most of the time the combination of the parameter name and the function description will be enough for the LLM to understand how to use each parameter. If an extra description would be helpful, you can add it; just note that it counts against your input tokens.
