Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for uploading multiple images or provide detailed images to update the code #492

Open
Liao1 opened this issue Feb 18, 2025 · 2 comments

Comments

@Liao1
Copy link

Liao1 commented Feb 18, 2025

Hi team, appreciate your work which is awesome!

Is your feature request related to a problem? Please describe.
I ran it locally and played with it using Claude Sonnet 3.5 API, I found that the quality of generated code depends on the complexity of the provided image, for example, if I provide the screenshot of the whole page, the generated code can only show main sections of the page, but if I provide a screenshot which only contains the first section of the page, there is a higher degree of restoration. I tried to update the site using different prompts, it works, but it also need a lot of efforts to find and describe the different of generated page and original page.

Describe the solution you'd like
So I'm thinking whether it helps or not to allow us to provide an image of a specific section of the site and update it accordingly. The whole process will be:

  1. Provide the complete image and generate the basic structure of the code
  2. Update the code using images of different sections to achieve higher degree of restoration.
  3. Repeat step#2 until we get a satisfied version and update the page manually.

Describe alternatives you've considered
An alternative solution is to allow to provide multiple images to provide more information to the model, not sure which solution is better.

@Liao1
Copy link
Author

Liao1 commented Feb 18, 2025

Actually I tried to upload video but got errors, which is Exception in ASGI application, similar to some of the previous issues.

@abi
Copy link
Owner

abi commented Feb 18, 2025

Hi @Liao1 thanks for the suggestion. I have explored splitting a web page up in order to get better results. While it works okay, it's not as great as when a model fully generates the entire web page (it's more consistent with colors, fonts, etc.). The newer models like O1 and Gemini are getting much better at large web pages so in the near future, we'll switch to those models to solve this problem. Claude 4 is probably also coming soon and should be a lot better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants