When do you plan to release the automatic text-to-poster pipeline PosterGen? #7

Open
hitlxm opened this issue Oct 9, 2024 · 7 comments
@hitlxm
hitlxm commented Oct 9, 2024

As above.

@posterllava
Owner

Thanks for your attention! We will update the paper on arXiv before Nov. 2024, and possibly the tutorial code as well.

@hitlxm
Author

hitlxm commented Oct 11, 2024

> Thanks for your attention! We will update the paper on arXiv before Nov. 2024, and possibly the tutorial code as well.

Also, c+s -> p is not supported at the moment. Do you have any plans for it?

@posterllava
Owner

posterllava commented Oct 11, 2024

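Our current settings are aligned with CGL and PosterLayout, but our method is inherently compatible with c->s+p, c+s->p, and even none->c+s+p. This only requires masking the input data differently to suit your needs. By default we mask out the size and position in the input (they become the output) and keep the category (as input). Modify the procss_json function in data/qbposter/get_prompt.py as needed.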
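The masking idea described above can be illustrated with a minimal sketch. It is not the repository's procss_json function: the helper name `mask_elements`, the task strings, and the assumption that each box is stored as [left, top, width, height] are all hypothetical and chosen only for illustration.

```python
def mask_elements(elements, task="c->s+p"):
    """Return a masked copy of `elements` for the given task.

    Hypothetical helper, not part of the repository.
    Each element is assumed to look like
    {'label': str, 'box': [left, top, width, height]}.
    """
    masked = []
    for el in elements:
        left, top, width, height = el["box"]
        if task == "c->s+p":
            # Keep the category; the model predicts size and position.
            masked.append({"label": el["label"], "box": [None, None, None, None]})
        elif task == "c+s->p":
            # Keep category and size; the model predicts position only.
            masked.append({"label": el["label"], "box": [None, None, width, height]})
        elif task == "none->c+s+p":
            # Mask everything; the model predicts category, size and position.
            masked.append({"label": None, "box": [None, None, None, None]})
        else:
            raise ValueError(f"unknown task: {task}")
    return masked


elements = [
    {"label": "text", "box": [0.1, 0.2, 0.3, 0.05]},
    {"label": "logo", "box": [0.4, 0.0, 0.1, 0.1]},
]
print(mask_elements(elements, "c+s->p"))
```

Note that keeping the size while masking the position is only straightforward when the box is stored as [left, top, width, height]; with [left, top, right, bottom] the size is not a separate field.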

@hitlxm
Author

hitlxm commented Oct 12, 2024

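> Our current settings are aligned with CGL and PosterLayout, but our method is inherently compatible with c->s+p, c+s->p, and even none->c+s+p. This only requires masking the input data differently to suit your needs. By default we mask out the size and position in the input (they become the output) and keep the category (as input). Modify the procss_json function in data/qbposter/get_prompt.py as needed.

So is this what the prompt should look like?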
"value": "\nHello! Could you please help me to place 4 foreground elements over the background image of resolution [768, 768] to craft an aesthetically pleasing, harmonious, balanced, and visually appealing commercial poster?\nFinding semantic-meaningful objects or visual foci on the background image at first might help in designing, and you should avoid any unnecessary blocking of them. logo needs to be placed at the top of the background image. \nFor each layout, there are 3 additional user requirements and you are expected to generate a layout corresponding to them. Here is the user requirements: The layout should ensure that there is clear space around both text elements. text elements should be aligned vertically with consistent horizontal margins.\nPlease return the result by completing the following JSON file. Each element's location and size should be represented by a bounding box described as [left, top, width, height], and each number is a continuous digit from 0 to 1.\nHere is the initial JSON file: [{'label': 'text', 'box': [None,None,100,30]}, {'label': 'text', 'box': [None,None,100,30]}, {'label': 'text', 'box': [None,None,100,30]}, {'label': 'logo', 'box': [None,None,30,30]}]\n"

@hitlxm
Author

hitlxm commented Oct 12, 2024

The template uses [left, top, right, bottom]. After I switched it to [left, top, width, height], the output looks rather odd. It seems the output is still in the [left, top, right, bottom] format?

[left, top, right, bottom] + c -> s+p mode:
[Image: sg-11134249-7rdvc-lzpusqheht6v79_ori_vis_with_uc]

[left, top, width, height] + c+s -> p mode:
Prompt:
"value": "\nHello! Could you please help me to place 4 foreground elements over the background image of resolution [768, 768] to craft an aesthetically pleasing, harmonious, balanced, and visually appealing commercial poster?\nFinding semantic-meaningful objects or visual foci on the background image at first might help in designing, and you should avoid any unnecessary blocking of them. logo needs to be placed at the top of the background image. \nFor each layout, there are 3 additional user requirements and you are expected to generate a layout corresponding to them. Here is the user requirements: The layout should ensure that there is clear space around both text elements. text elements should be aligned vertically with consistent horizontal margins.\nPlease return the result by completing the following JSON file. Each element's location and size should be represented by a bounding box described as [left, top, width, height], and each number is a continuous digit from 0 to 1.\nHere is the initial JSON file: [{'label': 'text', 'box': [None, None, 0.13020833333333334, 0.0390625]}, {'label': 'text', 'box': [None, None, 0.13020833333333334, 0.0390625]}, {'label': 'text', 'box': [None, None, 0.13020833333333334, 0.0390625]}, {'label': 'logo', 'box': [None, None, 0.0390625, 0.0390625]}]\n"
[Image: SeaTalk_IMG_20241012_101524]
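For readers comparing the two box formats, here is a minimal sketch of the arithmetic involved. The helper names are hypothetical and not taken from the repository. It shows how pixel sizes on a 768 x 768 canvas map to the normalized values used in the second prompt, and how a box can be converted between [left, top, width, height] and [left, top, right, bottom]. Whether the model actually answers in [left, top, right, bottom] is the open question in this thread, not something the sketch confirms.

```python
def normalize_size(width_px, height_px, canvas_w, canvas_h):
    """Scale a pixel size to the 0..1 range used in the prompt."""
    return width_px / canvas_w, height_px / canvas_h


def ltwh_to_ltrb(box):
    """[left, top, width, height] -> [left, top, right, bottom]."""
    left, top, width, height = box
    return [left, top, left + width, top + height]


def ltrb_to_ltwh(box):
    """[left, top, right, bottom] -> [left, top, width, height]."""
    left, top, right, bottom = box
    return [left, top, right - left, bottom - top]


# A 100 x 30 px text element on a 768 x 768 background.
w, h = normalize_size(100, 30, 768, 768)
print(w, h)

# If an output is suspected to be [left, top, right, bottom],
# convert it before drawing it as [left, top, width, height].
print(ltrb_to_ltwh([0.25, 0.5, 0.5, 0.625]))
```

If a visualization script expects one format and the model emits the other, boxes will appear stretched or shifted, which is one possible (unconfirmed) explanation for the odd-looking output reported above.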

@KiedaTamashi

@hitlxm Hi, did you manage to solve this? Is the [left, top, width, height] mode usable?

@flt-ime

flt-ime commented Feb 21, 2025

> Thanks for your attention! We will update the paper on arXiv before Nov. 2024, and possibly the tutorial code as well.

Could you please let us know when you release the pre-print paper on arXiv? Thank you!
