✨ Image input #389
- Add TimerObjectContainer class with methods to manage user objects
- Implement add_object, get_objects, and clear_objects methods
- Ensure objects are added, retrieved, and cleared based on time limits
Walkthrough
The recent update brings significant enhancements to the system's functionality. It introduces a new entity,
- Fix indentation and spacing issues
- Add missing import statement
- Update method signatures
Add AssistantMessage type for Slack replies in the receiver module.
- Fixed indentation in telegram sender init file for better readability.
Adds a utility function to resize images for OpenAI models based on the specified mode (low, high, auto). Images are resized to meet specific dimension requirements.
- Resizes images to 512x512 for 'low' mode
- Resizes images based on length and width limits for 'high' mode
- Automatically resizes images for 'auto' mode
https://platform.openai.com/docs/guides/vision
Actionable comments posted: 3
def resize_openai_image(
    image_bytes: bytes, mode: Literal["low", "high", "auto"] = "auto"
) -> bytes:
    """
    If mode is "low", scale to 512*512.
    If mode is "high", the short side should be less than 768 pixels and the long side less than 2,000 pixels; a larger image is scaled down proportionally to that size, keeping the aspect ratio.
    If mode is "auto": when the size is larger than 512 but smaller than 768, scale to 512; when the long side exceeds 2000 or the width exceeds 768, scale proportionally to a suitable size.
    https://platform.openai.com/docs/guides/vision
    :param image_bytes: binary data of the image
    :param mode: resize mode
    :return: binary data of the processed image
    """
    # Convert the bytes into an image object
    image = Image.open(BytesIO(image_bytes))
    # Get the image dimensions
    width, height = image.size
    # Size thresholds
    limit_small = 512
    limit_short = 768
    limit_long = 2000
    # Flag that decides whether the image needs to be resized
    resize_flag = False
    new_size = width, height
    if mode == "low":
        if max(width, height) > limit_small:
            new_size = limit_small, limit_small
            resize_flag = True
    elif mode == "high":
        if min(width, height) > limit_short or max(width, height) > limit_long:
            new_size = min(limit_short, width), min(limit_long, height)
            resize_flag = True
    elif mode == "auto":
        if limit_small < max(width, height) < limit_short:
            new_size = limit_small, limit_small
            resize_flag = True
        elif min(width, height) > limit_short or max(width, height) > limit_long:
            new_size = min(limit_short, width), min(limit_long, height)
            resize_flag = True
    if resize_flag:
        image.thumbnail(new_size, Image.Resampling.BICUBIC)
    bytes_io = BytesIO()
    image.save(bytes_io, format="PNG")
    bytes_return = bytes_io.getvalue()
    return bytes_return
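The docstring asks for proportional scaling with the short side capped at 768 pixels and the long side at 2,000 pixels, while the hunk applies the 768 limit to the width and the 2,000 limit to the height. A minimal, orientation-independent sketch of the size calculation might look like the following. The name `fit_within` and the constants are hypothetical and not part of this PR.

```python
from typing import Tuple

# Limits taken from the hunk above; the constant names are assumptions.
LIMIT_SHORT = 768
LIMIT_LONG = 2000


def fit_within(
    width: int,
    height: int,
    short_limit: int = LIMIT_SHORT,
    long_limit: int = LIMIT_LONG,
) -> Tuple[int, int]:
    """Scale (width, height) down, keeping the aspect ratio, so that the
    shorter side is <= short_limit and the longer side is <= long_limit.
    Images that already fit are returned unchanged."""
    short_side = min(width, height)
    long_side = max(width, height)
    # Never upscale: the factor is capped at 1.0.
    scale = min(1.0, short_limit / short_side, long_limit / long_side)
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(fit_within(4000, 1000))  # landscape
    print(fit_within(1000, 4000))  # portrait
```

The resulting tuple could then be handed to Pillow (for example `Image.resize`) in place of the `new_size` computed in the hunk; whether that fits the project's intent is for the maintainers to decide.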
Add validation for the mode parameter in resize_openai_image.
+ if mode not in ["low", "high", "auto"]:
+     raise ValueError("Invalid mode. Allowed values are 'low', 'high', 'auto'.")
Consider handling exceptions that may arise from Image.open and image.save to make this function more robust.
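As a rough illustration of the validation suggested above, the allowed values can be derived from the `Literal` type itself so the check and the annotation cannot drift apart. This is only a sketch: the alias `ImageDetail` and the helper `validate_mode` are hypothetical names, not code from this PR.

```python
from typing import Literal, get_args

# Hypothetical alias for the Literal used in the function signature.
ImageDetail = Literal["low", "high", "auto"]


def validate_mode(mode: str) -> str:
    """Return mode unchanged if it is one of the Literal's values,
    otherwise raise ValueError listing the allowed values."""
    allowed = get_args(ImageDetail)  # ("low", "high", "auto")
    if mode not in allowed:
        raise ValueError(f"Invalid mode {mode!r}. Allowed values are {allowed}.")
    return mode
```

The same early-exit style would apply to the decoding step: Pillow raises `UnidentifiedImageError` from `Image.open` when the bytes are not a readable image, so a caller could catch that and re-raise a clearer error.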
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
def resize_openai_image(
    image_bytes: bytes, mode: Literal["low", "high", "auto"] = "auto"
) -> bytes:
    """
    If mode is "low", scale to 512*512.
    If mode is "high", the short side should be less than 768 pixels and the long side less than 2,000 pixels; a larger image is scaled down proportionally to that size, keeping the aspect ratio.
    If mode is "auto": when the size is larger than 512 but smaller than 768, scale to 512; when the long side exceeds 2000 or the width exceeds 768, scale proportionally to a suitable size.
    https://platform.openai.com/docs/guides/vision
    :param image_bytes: binary data of the image
    :param mode: resize mode
    :return: binary data of the processed image
    """
    if mode not in ["low", "high", "auto"]:
        raise ValueError("Invalid mode. Allowed values are 'low', 'high', 'auto'.")
    # Convert the bytes into an image object
    image = Image.open(BytesIO(image_bytes))
    # Get the image dimensions
    width, height = image.size
    # Size thresholds
    limit_small = 512
    limit_short = 768
    limit_long = 2000
    # Flag that decides whether the image needs to be resized
    resize_flag = False
    new_size = width, height
    if mode == "low":
        if max(width, height) > limit_small:
            new_size = limit_small, limit_small
            resize_flag = True
    elif mode == "high":
        if min(width, height) > limit_short or max(width, height) > limit_long:
            new_size = min(limit_short, width), min(limit_long, height)
            resize_flag = True
    elif mode == "auto":
        if limit_small < max(width, height) < limit_short:
            new_size = limit_small, limit_small
            resize_flag = True
        elif min(width, height) > limit_short or max(width, height) > limit_long:
            new_size = min(limit_short, width), min(limit_long, height)
            resize_flag = True
    if resize_flag:
        image.thumbnail(new_size, Image.Resampling.BICUBIC)
    bytes_io = BytesIO()
    image.save(bytes_io, format="PNG")
    bytes_return = bytes_io.getvalue()
    return bytes_return
llmkira/openai/request.py
Outdated
@model_validator(mode="after")
def check_vision(self):
    if not self.model.startswith(("gpt-4-vision", "gpt-4-turbo", "claude-3")):
        logger.info(
            "Remove the image content part from the messages, because the model is not supported."
        )
        for message in self.messages:
            if isinstance(message, UserMessage) and isinstance(
                message.content, list
            ):
                message.content = [
                    content
                    for content in message.content
                    if content.type != "image_url"
                ]
    return self
Refactor the image content removal logic in the check_vision method for efficiency.
+ message.content = [content for content in message.content if not isinstance(content, ImageContent)]
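To make the suggested filter concrete, here is a small standalone sketch using plain dataclasses in place of the project's Pydantic models. `TextContent`, `supports_vision`, and `strip_images` are hypothetical names, and `ImageContent` here is only a stand-in for the class named in the suggestion.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class TextContent:  # hypothetical stand-in for a text content part
    text: str
    type: str = "text"


@dataclass
class ImageContent:  # stand-in for the class named in the suggestion
    url: str
    type: str = "image_url"


ContentPart = Union[TextContent, ImageContent]

# Model-name prefixes copied from the hunk above.
VISION_PREFIXES = ("gpt-4-vision", "gpt-4-turbo", "claude-3")


def supports_vision(model: str) -> bool:
    """str.startswith accepts a tuple and matches any of its prefixes."""
    return model.startswith(VISION_PREFIXES)


def strip_images(parts: List[ContentPart]) -> List[ContentPart]:
    """Keep every part that is not an image, filtering by class."""
    return [part for part in parts if not isinstance(part, ImageContent)]
```

Filtering by class and filtering by the `type` tag give the same result as long as every image part is an `ImageContent` with `type == "image_url"`; the isinstance form mainly avoids relying on a string constant.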
Committable suggestion
@model_validator(mode="after")
def check_vision(self):
    if not self.model.startswith(("gpt-4-vision", "gpt-4-turbo", "claude-3")):
        logger.info(
            "Remove the image content part from the messages, because the model is not supported."
        )
        for message in self.messages:
            if isinstance(message, UserMessage) and isinstance(
                message.content, list
            ):
                message.content = [content for content in message.content if not isinstance(content, ImageContent)]
    return self
llmkira/openai/cell.py
Outdated
def add_image(
    self,
    image_url: Union[str, bytes],
    detail: Literal["low", "high", "auto"] = "auto",
):
    self.content.append(ContentPart.create_image(url=image_url, detail=detail))
    return self
Optimize the add_image method in the UserMessage class to handle image resizing more efficiently.
+ # Optimize resizing by directly using the resized image URL
+ url = f"data:image/jpeg;base64,{base64.b64encode(image_url).decode('utf-8')}"
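Since `image_url` is typed as `Union[str, bytes]`, calling `base64.b64encode` on it unconditionally would raise a `TypeError` for string URLs. A hedged sketch of one way to branch on the input type is shown below. The helper `to_image_url` and its `mime_type` parameter are hypothetical; the `image/png` default is an assumption based on the resize helper in this PR saving PNG data.

```python
import base64
from typing import Union


def to_image_url(image: Union[str, bytes], mime_type: str = "image/png") -> str:
    """Return a URL usable as an image reference.

    Raw bytes are wrapped in a base64 data URL; strings are assumed to be
    URLs (http(s) or data URLs) already and are passed through untouched.
    """
    if isinstance(image, bytes):
        encoded = base64.b64encode(image).decode("utf-8")
        return f"data:{mime_type};base64,{encoded}"
    return image
```

With a helper like this, `add_image` could pass `to_image_url(image_url)` to `ContentPart.create_image`, assuming that factory does not already perform the same conversion internally.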
Committable suggestion
def add_image(
    self,
    image_url: Union[str, bytes],
    detail: Literal["low", "high", "auto"] = "auto",
):
    # Optimize resizing by directly using the resized image URL
    url = f"data:image/jpeg;base64,{base64.b64encode(image_url).decode('utf-8')}"
    self.content.append(ContentPart.create_image(url=url, detail=detail))
    return self