[DNM] Complex image type handling #1801

isaacbmiller · 2024-11-15T02:33:11Z

Inspired by #1763 and to close #1767.

The goal is to support images inside arbitrary pydantic objects and, eventually, to make it easy to add support for other modalities.
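As a rough illustration of what "images inside arbitrary objects" involves, here is a minimal sketch that walks a nested value and collects every image it contains. It is not code from this branch: `Image`, `Listing`, and `collect_images` are hypothetical names, and stdlib dataclasses stand in for pydantic models so the snippet has no dependencies.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, List


@dataclass
class Image:
    """Hypothetical stand-in for an image type; holds only a URL."""
    url: str


@dataclass
class Listing:
    """Hypothetical nested object with images at different depths."""
    title: str
    cover: Image
    gallery: List[Image]


def collect_images(value: Any) -> List[Image]:
    """Recursively gather every Image found inside nested containers."""
    if isinstance(value, Image):
        return [value]
    # Dataclass instances (not classes) are walked field by field.
    if is_dataclass(value) and not isinstance(value, type):
        return [
            img
            for f in fields(value)
            for img in collect_images(getattr(value, f.name))
        ]
    if isinstance(value, dict):
        return [img for v in value.values() for img in collect_images(v)]
    if isinstance(value, (list, tuple)):
        return [img for v in value for img in collect_images(v)]
    # Plain values (str, int, None, ...) contain no images.
    return []
```

A real implementation would presumably also need to remember where each image sat in the structure so the prompt can refer back to it; this sketch only shows the traversal.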

JehandadK · 2024-11-25T08:25:27Z

examples/vlm/mmmu.ipynb

    "class ColorSignature(dspy.Signature):\n",
    "    \"\"\"Output the color of the designated image.\"\"\"\n",
-    "    image_1: dspy.Image = dspy.InputField(desc=\"An image\")\n",
-    "    image_2: dspy.Image = dspy.InputField(desc=\"An image\")\n",
+    "    images: List[dspy.Image] = dspy.InputField(desc=\"An image\")\n",


Suggested change

-    "    images: List[dspy.Image] = dspy.InputField(desc=\"An image\")\n",
+    "    images: List[dspy.Image] = dspy.InputField(desc=\"A list of images\")\n",

JehandadK · 2024-11-25T08:46:34Z

Thanks for the PR, @isaacbmiller. I tried this branch with the signature below and am sharing the relevant logs here:

class CreateTitleAndDescription(dspy.Signature):
    """Taking in a list of images and a category tree to output title, description and attributes."""

    images: List[dspy.Image] = dspy.InputField(desc="list of images about the item")
    categories: str = dspy.InputField(desc="category tree of the item")
    title: str = dspy.OutputField(desc=title_desc)
    description: str = dspy.OutputField(desc=description_desc)

    {
      'role': 'user',
      'content': '[[ ## images ## ]]\n["https://static-sd.mercdn.net/photos/m34859626959_1.jpg", "https://static-sd.mercdn.net/photos/m34859626959_2.jpg", "https://static-sd.mercdn.net/photos/m34859626959_3.jpg"]\n\n[[ ## categories ## ]]\nゲーム・おもちゃ・グッズ > トレーディングカード\n\nRespond with the corresponding output fields, starting with the field `[[ ## title ## ]]`, then `[[ ## description ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`.'
    }
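In the log above, the `images` field arrives as a JSON list of URL strings inside a single text `content`. For comparison, here is a minimal sketch of the same field expanded into OpenAI-style multimodal content parts. This is an assumption about the desired shape, not output from this branch, and `to_content_parts` is a hypothetical helper.

```python
from typing import Any, Dict, List


def to_content_parts(field_name: str, urls: List[str]) -> List[Dict[str, Any]]:
    """Build a text header part followed by one image_url part per URL."""
    # The header mirrors the `[[ ## field ## ]]` marker seen in the log.
    parts: List[Dict[str, Any]] = [
        {"type": "text", "text": f"[[ ## {field_name} ## ]]"}
    ]
    for url in urls:
        parts.append({"type": "image_url", "image_url": {"url": url}})
    return parts


if __name__ == "__main__":
    example = to_content_parts(
        "images",
        [
            "https://static-sd.mercdn.net/photos/m34859626959_1.jpg",
            "https://static-sd.mercdn.net/photos/m34859626959_2.jpg",
        ],
    )
    for part in example:
        print(part)
```

With this shape, the user message's `content` would be a list of parts rather than one string, so each URL is passed as an image instead of as text.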

isaacbmiller added 5 commits November 11, 2024 12:24

Fix dataset download

cdf92e0

WIP complex image types

849284d

WIP fixing complex images

845f7e2

Refactor chat_adapter somewhat working

37f8c04

Ruff fixes

b8da48e

isaacbmiller marked this pull request as draft November 15, 2024 02:35

remove print and update notebooks

d31c63d

JehandadK reviewed Nov 25, 2024
