Making tools

Gene Kogan edited this page Nov 17, 2024 · 5 revisions

Each tool is defined by three files:

  • config.yaml - defines the interface for the tool, including args, metadata, and handler.
  • handler.py - code for running the tool.
  • test.json - example args for testing.

An example of a tool is found under /eve/tools/example_tool.

ComfyUI and Replicate handlers are provided. Tools which use either of these do not need their own handler.py.

Defining the config

The top-level fields for a tool config include the following:

  • key : Unique key for the tool, referenced in tool calls. [required]
  • name : Display name of the tool. [required]
  • description : A short description of the tool which is presentable to users. [required]
  • tip : Additional description of the tool, given to the LLM at runtime to teach it how to use the tool expertly. Can be more verbose and technical. Generally not for user consumption.
  • output_type : The output type of the tool. Choices are [bool, str, int, float, image, video, audio, lora]. [required]
  • cost_estimate : A string formula representing the cost of a tool call in terms of its own parameters, e.g. 0.5 * n_frames. [required]
  • base_model : Choices are [sd15, sdxl, sd3, flux-dev, flux-schnell].
  • resolutions : An array of preset resolutions given as strings, e.g. [16-9_2048x1152, 3-2_1782x1024, 1-1_1024x1024, 2-3_1024x1782, 9-16_1152x2048]
  • status : One of inactive (completely unavailable), stage (available only on stage), or prod (available on stage and prod). Default is stage.
  • visible : Whether the tool is listed in the Tools UI. Default is true (visible).
  • allowlist : A list of user IDs allowed to access the endpoint. If not set, all users are allowed.
  • handler : Which handler to use to run the tool. Choices are [local, modal, comfyui, replicate, gcp]. Default is local, which means the tool needs its own handler.py.
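The cost_estimate field above is a string formula over the tool's own parameters. The page does not say how the formula is evaluated, so the following is only a minimal sketch of one safe way to do it, assuming the formula uses nothing beyond numbers, parameter names, and basic arithmetic. The helper name estimate_cost is hypothetical and not part of Eve.

```python
import ast
import operator

# Arithmetic operators permitted inside a cost formula (an assumption of this sketch).
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def estimate_cost(formula: str, args: dict) -> float:
    """Evaluate an arithmetic formula over numeric tool args (hypothetical helper)."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return args[node.id]  # raises KeyError if the arg is missing
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        # Anything else (function calls, attribute access, ...) is rejected.
        raise ValueError(f"unsupported expression: {ast.dump(node)}")

    return float(_eval(ast.parse(formula, mode="eval")))


print(estimate_cost("0.5 * n_frames", {"n_frames": 24}))  # 12.0
```

Walking the parsed expression instead of calling eval keeps a config file from running arbitrary code.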

Defining the parameters

The last top-level field, parameters, is a dictionary that defines all of the tool's parameters.

Parameters have the following sub-fields.

  • type : The data type of the parameter. Choices are [str, int, float, bool, array, object, image, video, audio, lora]. [required]
  • label : A human-readable label for the parameter, for display. [required]
  • description : A short description of the parameter's purpose or usage, presentable to users. [required]
  • tip : Optional additional or more verbose description of the parameter, intended for detailed guidance.
  • example : An example value for the parameter. You usually don't need this.
  • required : Whether the parameter must be explicitly set by a user / tool caller.
  • default : The default value assigned to the parameter if none is provided.
  • choices : Specifies a predefined set of acceptable values for the parameter (applies to str, int, float, and bool types).
  • choices_labels : Human-readable labels for the choices specified in choices. For UI display.
  • minimum : Minimum allowable value for numeric parameters (int or float).
  • maximum : Maximum allowable value for numeric parameters (int or float).
  • step : Increment between allowed values in the range from minimum to maximum.
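To illustrate how the required, default, choices, minimum, and maximum sub-fields could interact, here is a rough sketch of resolving one argument against its parameter spec. It is not Eve's actual validation code: the helper resolve_arg and the n_frames spec are made up for illustration, and step is not checked.

```python
def resolve_arg(name: str, spec: dict, args: dict):
    """Return the validated value for one parameter (hypothetical helper)."""
    if name in args:
        value = args[name]
    elif spec.get("required", False):
        # A required parameter must be set explicitly by the caller.
        raise ValueError(f"{name} is required")
    else:
        # Fall back to the default (None if the spec has no default).
        return spec.get("default")

    if "choices" in spec and value not in spec["choices"]:
        raise ValueError(f"{name} must be one of {spec['choices']}")
    if "minimum" in spec and value < spec["minimum"]:
        raise ValueError(f"{name} must be >= {spec['minimum']}")
    if "maximum" in spec and value > spec["maximum"]:
        raise ValueError(f"{name} must be <= {spec['maximum']}")
    return value


# Illustrative spec, written as the dict a YAML loader would produce.
N_FRAMES_SPEC = {
    "type": "int",
    "label": "Frames",
    "description": "Number of frames to generate",
    "default": 16,
    "minimum": 1,
    "maximum": 64,
}

print(resolve_arg("n_frames", N_FRAMES_SPEC, {}))                # 16
print(resolve_arg("n_frames", N_FRAMES_SPEC, {"n_frames": 24}))  # 24
```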

Preset tools

A tool may be defined as a preset of another tool. A preset generally narrows the API interface and uses its parent tool's handler code. An example of a preset is Style Transfer.

To make a preset, set the parent_tool field in the config to the parent tool's key, and override any parameter fields you want to redescribe or constrain.

For example, the following takes the parameters of the parent tool and overrides the label, description, tip, and required fields of the parent's prompt parameter.

parameters:
  prompt:
    label: Prompt
    description: Describe the style you want to use
    tip: |-
      This should be a description of the style you want to use!
    required: true
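One way to picture the override is as a per-parameter merge: fields listed in the preset replace the parent's, and everything else (such as type) is inherited. The sketch below assumes that behaviour. apply_preset and the parent's parameter values are hypothetical, and whether a preset may introduce parameters the parent lacks is not stated on this page.

```python
import copy


def apply_preset(parent_params: dict, overrides: dict) -> dict:
    """Overlay preset parameter fields on a copy of the parent's parameters."""
    merged = copy.deepcopy(parent_params)  # never mutate the parent's config
    for name, fields in overrides.items():
        # Assumption: an unknown parameter name simply creates a new entry.
        merged.setdefault(name, {}).update(fields)
    return merged


# Made-up parent parameters, as a YAML loader would return them.
parent_params = {
    "prompt": {
        "type": "str",
        "label": "Prompt",
        "description": "Describe the image",
        "required": False,
    },
}

# The preset's parameters block from the example above.
overrides = {
    "prompt": {
        "label": "Prompt",
        "description": "Describe the style you want to use",
        "tip": "This should be a description of the style you want to use!",
        "required": True,
    },
}

merged = apply_preset(parent_params, overrides)
print(merged["prompt"]["type"])      # str
print(merged["prompt"]["required"])  # True
```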

ComfyUI tools

For info on actually building ComfyUI workflows as tools, see here.

ComfyUI tools are those that run a single ComfyUI workflow. They are defined with handler: comfyui in the config and have several additional requirements:

  • They are bundled inside of workspaces in the workflows repo. Each workspace has a snapshot.json exported from ComfyUI and a downloads.json with pre-defined downloads needed to run the workspace.
  • Each tool also contains a workflow_api.json exported from ComfyUI in dev mode.

ComfyUI configs have the following additional fields:

  • comfyui_output_node_id : The output node of the main output for the workflow. [required]
  • comfyui_intermediate_outputs : An optional dictionary which defines additional outputs to save as "intermediate outputs", given a node id and a name, e.g. controlnet_signal: 323

Additionally, parameters in ComfyUI config files have a comfyui field which defines how to inject the arg into the workflow. For example:

  prompt:
    label: Prompt
    description: Describe an image
    type: str
    required: true
    comfyui: 
      node_id: 370
      field: inputs
      subfield: body

This instructs the tool to inject the prompt arg into the workflow json at workflow[370]["inputs"]["body"].
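The injection can be sketched in a few lines of Python. This is not Eve's handler code: inject_arg is a hypothetical helper, and the node contents are invented. Note that JSON object keys are always strings, so a workflow_api.json loaded with json.load is indexed by "370" rather than the integer 370.

```python
import copy


def inject_arg(workflow: dict, value, node_id, field: str, subfield: str) -> dict:
    """Return a copy of the workflow with value written at node_id/field/subfield."""
    patched = copy.deepcopy(workflow)  # leave the loaded template untouched
    patched[str(node_id)][field][subfield] = value
    return patched


# Stand-in for a workflow_api.json loaded with json.load (node contents are made up).
workflow = {"370": {"class_type": "ExampleTextNode", "inputs": {"body": ""}}}

patched = inject_arg(
    workflow, "a watercolor fox", node_id=370, field="inputs", subfield="body"
)
print(patched["370"]["inputs"]["body"])  # a watercolor fox
```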

The comfyui field also has two lesser-used sub-fields which are occasionally useful: remap and preprocessing.

  • preprocessing defines custom code to pre-process the arg before injecting it into the node. For example, preprocessing: csv will convert a string array to a comma-separated string, and preprocessing: folder will convert an array of files into a temp folder with the files in it.
  • remap allows you to do a secondary injection of the arg into other nodes, using a remap dictionary that substitutes a different value for it. For example:
  comfyui:
      node_id: 406
      field: inputs
      subfield: preprocessor
      remap:
      - node_id: 107
        field: inputs
        subfield: control_net_name
        map:
          AnyLineArtPreprocessor_aux: control_v11p_sd15_lineart.pth
          CannyEdgePreprocessor: control_v11p_sd15_canny.pth
          DensePosePreprocessor: control_v11p_sd15_openpose.pth
          DepthAnythingV2Preprocessor: control_v11f1p_sd15_depth.pth
          Scribble_XDoG_Preprocessor: control_v11p_sd15_scribble.pth
          none: controlnetQRPatternQR_v2Sd15.safetensors

This inserts the value into node 406 as usual, and also injects into workflow[107]["inputs"]["control_net_name"] the value that the map associates with it. For example, if the arg inserted into node 406 is CannyEdgePreprocessor, then control_v11p_sd15_canny.pth is inserted into node 107.
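As a rough sketch of that behaviour (not the actual handler), the primary injection and each remap entry can be applied in turn. inject_with_remap is a hypothetical helper, the map is shortened to two entries, and raising KeyError for a value missing from the map is an assumption of this sketch.

```python
import copy


def _set(workflow: dict, target: dict, value) -> None:
    """Write value at the node_id/field/subfield named by target."""
    workflow[str(target["node_id"])][target["field"]][target["subfield"]] = value


def inject_with_remap(workflow: dict, value, comfyui: dict) -> dict:
    """Inject value at the primary target, then the mapped value at each remap target."""
    patched = copy.deepcopy(workflow)
    _set(patched, comfyui, value)
    for target in comfyui.get("remap", []):
        # Look up the substitute for this arg; KeyError if the map has no entry.
        _set(patched, target, target["map"][value])
    return patched


# The comfyui block from the example above, with the map shortened.
comfyui = {
    "node_id": 406,
    "field": "inputs",
    "subfield": "preprocessor",
    "remap": [
        {
            "node_id": 107,
            "field": "inputs",
            "subfield": "control_net_name",
            "map": {
                "CannyEdgePreprocessor": "control_v11p_sd15_canny.pth",
                "none": "controlnetQRPatternQR_v2Sd15.safetensors",
            },
        }
    ],
}

# Made-up minimal workflow containing only the two nodes involved.
workflow = {
    "406": {"inputs": {"preprocessor": ""}},
    "107": {"inputs": {"control_net_name": ""}},
}

patched = inject_with_remap(workflow, "CannyEdgePreprocessor", comfyui)
print(patched["406"]["inputs"]["preprocessor"])      # CannyEdgePreprocessor
print(patched["107"]["inputs"]["control_net_name"])  # control_v11p_sd15_canny.pth
```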

Replicate tools

Replicate tools are those deployed on Replicate as Cog models.

To add a Replicate endpoint as a tool to Eve, set handler: replicate in the tool's config and add the following additional fields.

  • replicate_model : Reference to the Replicate model, e.g. edenartlab/sdxl-lora-trainer. [required]
  • version : Optional hash of the model version to use. If not provided, a deployment or the latest version is used.
  • output_handler : Post-processing handler that coerces Replicate outputs into the tool's result output. normal (where the output is a single media URL or an array of them) is fine for most cases. eden is used for Eden's older structs (including files and thumbnail arrays), and trainer is used specifically for sdxl-lora-trainer. If the output has a different shape, another output handler can easily be added.
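For intuition, the normal case amounts to accepting either a single URL or an array of URLs and treating both the same way. The sketch below assumes the result is wanted as a list. normalize_output is a hypothetical name, the URLs are placeholders, and Eve's real handler may return a different structure.

```python
def normalize_output(output) -> list:
    """Coerce a single media URL or an iterable of URLs into a list of URL strings."""
    if isinstance(output, str):
        return [output]
    return [str(item) for item in output]


single = normalize_output("https://example.com/result.png")
many = normalize_output(["https://example.com/a.png", "https://example.com/b.png"])

print(len(single))  # 1
print(len(many))    # 2
```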

GCP tools

GCP tools are those deployed on Google Cloud and are currently just used for Flux Lora Trainer.
