Proposal for new Instructor API ahead of 1.0 #544
Replies: 11 comments 11 replies
-
from typing import Any, Self, Type, TypeVar

import openai
from pydantic import BaseModel

import instructor

T = TypeVar("T", bound=BaseModel)


class Instructor:
    client: Any

    def __init__(
        self,
        client,
        default_model: str | None = None,
        mode: instructor.Mode = instructor.Mode.TOOLS,
    ):
        self.client = client
        self.default_model = default_model
        self.mode = mode

    @classmethod
    def from_openai(
        cls,
        client: openai.OpenAI,
        default_model: str | None = None,
        mode: instructor.Mode = instructor.Mode.TOOLS,
    ):
        return cls(
            client=instructor.patch(client).chat.completions,
            default_model=default_model,
            mode=mode,
        )

    @property
    def chat(self) -> Self:
        return self

    @property
    def completions(self) -> Self:
        return self

    def create(self, response_model: Type[T], *args, **kwargs) -> T:
        # Only fall back to the default model when the caller hasn't passed one.
        if self.default_model is not None:
            kwargs.setdefault("model", self.default_model)
        return self.client.create(*args, response_model=response_model, **kwargs)


if __name__ == "__main__":

    class User(BaseModel):
        name: str
        age: int

    client = Instructor.from_openai(openai.OpenAI())

    user = client.create(
        response_model=User,
        messages=[{"role": "user", "content": "Jason is 10"}],
        temperature=0,
    )
    print(user)

    user = client.chat.completions.create(
        response_model=User,
        messages=[{"role": "user", "content": "Jason is 10"}],
        temperature=0,
    )
    print(user)
-
With this we can also do smarter things, like adding better rate limiting.
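For example, a minimal sketch of client-side rate limiting, assuming the Instructor class from the snippet above (the subclass name and the requests-per-minute knob are made up for illustration):

import threading
import time


class RateLimitedInstructor(Instructor):
    """Hypothetical subclass: spaces out create() calls to respect a requests-per-minute cap."""

    def __init__(self, *args, max_requests_per_minute: int = 60, **kwargs):
        super().__init__(*args, **kwargs)
        self._min_interval = 60.0 / max_requests_per_minute
        self._lock = threading.Lock()
        self._last_call = 0.0

    def create(self, response_model, *args, **kwargs):
        # Sleep just long enough to stay under the configured cap, then delegate.
        with self._lock:
            wait = self._min_interval - (time.monotonic() - self._last_call)
            if wait > 0:
                time.sleep(wait)
            self._last_call = time.monotonic()
        return super().create(response_model, *args, **kwargs)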
-
The SDK should strictly adhere to one of the major inference API vendors, e.g., OpenAI. Otherwise, this risks creating additional standards. For transparency, it would be nice to have a mapping from various API services to Instructor/OpenAI. This would minimize user confusion, especially during debugging.
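As a hypothetical illustration of the kind of mapping meant here (this function is not part of the proposal), here is a sketch of translating OpenAI-style kwargs into Anthropic's messages API, where the system prompt is a top-level kwarg and max_tokens is required:

def openai_to_anthropic_kwargs(kwargs: dict) -> dict:
    # Illustrative only: pull system messages out of the OpenAI-style messages list,
    # since Anthropic takes the system prompt as a top-level kwarg and requires max_tokens.
    messages = kwargs.get("messages", [])
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    return {
        "model": kwargs["model"],
        "max_tokens": kwargs.get("max_tokens", 1024),
        "system": "\n".join(system_parts) if system_parts else None,
        "messages": [m for m in messages if m["role"] != "system"],
    }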
-
I haven't used instructor yet, but I'm planning to use it next month, and I hope it'll be fully type safe if it isn't already. BTW, I'm glad to see instructor and how smartly it solves the issues related to prompts.
-
Love this direction. Since you brought up metrics, I suggest exporting metrics with the OpenTelemetry protocol, which is a standard widely adopted by observability vendors. Instructor users would then get out-of-the-box metrics for monitoring in most production environments.
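A rough sketch of what that could look like with the opentelemetry-sdk package (the meter name, counter name, and attributes are placeholders):

from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import ConsoleMetricExporter, PeriodicExportingMetricReader

# Wire up a meter provider; in production the console exporter would be
# swapped for an OTLP exporter pointed at your observability backend.
reader = PeriodicExportingMetricReader(ConsoleMetricExporter())
metrics.set_meter_provider(MeterProvider(metric_readers=[reader]))

meter = metrics.get_meter("instructor")
token_counter = meter.create_counter(
    "instructor.tokens", unit="tokens", description="Tokens consumed per completion"
)

# Something the wrapper could do after each completion:
token_counter.add(42, {"model": "gpt-4", "mode": "TOOLS"})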
-
Hmm... doesn't this kind of contradict the ethos you shared here: When the next change to the API comes about (say image inputs, or video inputs, or something else), won't this add extra maintenance cost?
-
Please add a way to use a list of types in response_model for the case of the LLM choosing a tool (in a more agentic workflow) - a replacement for https://github.com/jxnl/instructor/blob/main/examples/union/run.py - which creates a more complex schema.
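For example, something along these lines (a hypothetical sketch of the request, reusing the client from the proposal above; the Search and Calculator models are made up, and whether a bare Union would be accepted directly is exactly what's being asked for):

from typing import Union

from pydantic import BaseModel


class Search(BaseModel):
    query: str


class Calculator(BaseModel):
    expression: str


# The LLM picks which "tool" schema to fill based on the message.
action = client.create(
    response_model=Union[Search, Calculator],
    messages=[{"role": "user", "content": "What's 2 + 2?"}],
)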
-
Convenience suggestion: separate endpoints for events:

on(event, lambda event: print(event))

and endpoint(s) that return just unwrapped objects, like I did for the PHP port, e.g.:

on_update(lambda updated_response: print(updated_response))
on_iterable(lambda updated_iterable: print(updated_iterable.last()))

on_update - called on any change to response data
-
For consideration:

# just return the object
object = Instructor.with_client(Anthropic()).request(...params...).get()

# return stream / partial generator
partials = Instructor.with_client(Anthropic()).request(...params...).partials()
# or
stream = Instructor.with_client(Anthropic()).request(...params...).stream()

# return async promise
promise = Instructor.with_client(Anthropic()).request(...params...).async()

# return raw completion response
response = Instructor.with_client(Anthropic()).request(...params...).raw()

# return the tuple of object + raw completion object (as in your proposal)
response_tuple = Instructor.with_client(Anthropic()).request(...params...).get_with_raw()

# + convenience endpoint
# just return the object - for convenience, the same as ...request(...).get()
response = Instructor.with_client(Anthropic()).respond(...params...)

Also,

user = Instructor.request(...).get()
# or
user = Instructor.respond(...)
-
Hey @jxnl, curious: why not use litellm to help unify the format here?
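For context, litellm already exposes an OpenAI-compatible completion() call across providers; a minimal example follows (the model string is just an example and could point at any provider litellm supports):

from litellm import completion

# Same OpenAI-style messages format, regardless of the underlying provider.
response = completion(
    model="gpt-4",
    messages=[{"role": "user", "content": "Jason is 10"}],
)
print(response.choices[0].message.content)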
-
This looks great - when do you think you'll add this abstraction? I've been thinking of creating a similar abstraction for our project but would prefer to use yours :)
-
As the number of AI APIs from companies like Anthropic, Mistral, OpenAI, LiteLLM, and Bedrock continues to grow, it's becoming increasingly challenging to manage them effectively. To address this issue, we propose creating a unified API wrapper called instructor that simplifies the integration process and provides a consistent interface for developers. Take a look at this PR to see some more stuff: #546

The instructor library will offer the following features:

- Customizable properties: client.allowed_modes = [...] (can be set from the from_* factory methods)
- Simplified handling of Anthropic's system kwarg for context setting.
- Event-driven architecture using an observer pattern:
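A rough, hypothetical sketch of what such an observer hook could look like (the on()/emit() names and event strings are illustrative, not a settled API):

from collections import defaultdict
from typing import Any, Callable


class EventEmitter:
    """Illustrative observer: callers subscribe to named events and get called back."""

    def __init__(self):
        self._handlers: dict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def on(self, event: str, handler: Callable[[Any], None]) -> None:
        self._handlers[event].append(handler)

    def emit(self, event: str, payload: Any) -> None:
        for handler in self._handlers[event]:
            handler(payload)


emitter = EventEmitter()
emitter.on("completion:error", lambda exc: print("completion failed:", exc))

# Inside the wrapper, around each call, it could then emit e.g.:
# emitter.emit("completion:kwargs", kwargs)
# emitter.emit("completion:response", raw_response)
# emitter.emit("completion:error", exc)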
By implementing this unified API wrapper, developers will be able to seamlessly switch between different AI providers, reduce integration complexity, and maintain a consistent codebase. The instructor library will abstract away the differences between the various APIs, making it easier to experiment with and compare different AI models.

Additionally, the event-driven architecture will enable developers to easily monitor and respond to various events during interaction with the AI models, such as raw responses, keyword arguments, model information, errors, and completions.

Overall, this proposal aims to simplify the integration process, improve developer experience, and promote interoperability among the growing number of AI APIs.

I have no faith that APIs will be consistent in the future, and after 120k monthly downloads I'd like to think we've earned the right to create our own SDK. I'd like plenty of pushback, and I expect we'll keep the old patch method simply because it'll likely power this new SDK.
Before
After (Backwards Compat)
After (Backwards Compat)