Would it be possible to implement Image2SFX (https://huggingface.co/spaces/fffiloni/Image2SFX-comparison)? It would be especially useful to be able to compare different models, perhaps through a multiple-choice UX where you select the models you would like to use.
Thank you!
So, basically, all this does is use some Kosmos API to caption the image, and then feed that caption to one of the AudioGen models.
As such, this feels like a great opportunity for an extension that leverages one or more LLMs to create the caption, similar to my smartprocess extension for Auto1111.
Load the image, pick an LLM to do the captioning, feed the caption into one of the MusicGen models...
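To make the idea concrete, here is a rough sketch of how such an extension could wire the two stages together and run several audio models on the same caption for comparison. Everything in it is hypothetical: `image_to_sfx`, `SfxResult`, the stub captioner and the stub generators are placeholder names, not part of any existing extension or library, and a real version would swap the stubs for actual captioning and audio-generation model calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical interfaces: a captioner maps image bytes to text,
# an audio generator maps a text prompt to audio bytes.
Captioner = Callable[[bytes], str]
AudioGenerator = Callable[[str], bytes]


@dataclass
class SfxResult:
    model_name: str
    caption: str
    audio: bytes


def image_to_sfx(
    image: bytes,
    captioner: Captioner,
    generators: Dict[str, AudioGenerator],
    selected: List[str],
) -> List[SfxResult]:
    """Caption the image once, then run every selected audio model on that caption."""
    caption = captioner(image)
    results = []
    for name in selected:
        if name not in generators:
            raise KeyError(f"unknown audio model: {name}")
        results.append(SfxResult(name, caption, generators[name](caption)))
    return results


# Stubs standing in for real models (assumptions for illustration only).
def stub_captioner(image: bytes) -> str:
    return f"an image of {len(image)} bytes"


def make_stub_generator(tag: str) -> AudioGenerator:
    def generate(prompt: str) -> bytes:
        # A real generator would return encoded audio; this just echoes the prompt.
        return f"{tag}:{prompt}".encode("utf-8")
    return generate


GENERATORS = {
    "model-a": make_stub_generator("a"),
    "model-b": make_stub_generator("b"),
}

if __name__ == "__main__":
    for result in image_to_sfx(b"\x00" * 4, stub_captioner, GENERATORS, ["model-a", "model-b"]):
        print(result.model_name, result.caption, len(result.audio))
```

Captioning once and reusing the caption keeps the comparison fair, since every selected model gets the same prompt. The `selected` list is where a multiple-choice UI would plug in.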