We propose an image generation approach driven by speech emotion recognition. VQGAN and CLIP are used in tandem to generate an image from two inputs: a text prompt obtained from the speech, and the emotion recognized from the speech spectrogram. The emotion recognition model achieved an accuracy of about 75%.
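As a rough illustration of how the transcribed text and the recognized emotion might be merged into a single prompt for VQGAN+CLIP, here is a minimal sketch. The `EMOTION_STYLE` table, the emotion labels, and the `build_prompt` helper are all hypothetical and not taken from the repository; the actual project may combine the two signals differently.

```python
# Hypothetical mapping from a recognized emotion label to style words.
# These labels and phrases are illustrative assumptions, not the project's own.
EMOTION_STYLE = {
    "happy": "bright, warm colors",
    "sad": "muted blue tones",
    "angry": "harsh red contrasts",
    "neutral": "soft natural light",
}


def build_prompt(transcript: str, emotion: str) -> str:
    """Combine transcribed speech and an emotion label into one text prompt."""
    # Collapse stray whitespace and drop trailing sentence punctuation.
    text = " ".join(transcript.split()).rstrip(".!?")
    if not text:
        raise ValueError("transcript is empty")
    # Unknown emotions fall back to the plain transcript.
    style = EMOTION_STYLE.get(emotion.strip().lower())
    return f"{text}, {style}" if style else text


if __name__ == "__main__":
    print(build_prompt("A boat on the lake.", "Sad"))
```

The resulting string would then be passed as the text prompt to whichever VQGAN+CLIP implementation is in use; that step is omitted here because it depends on the specific model code.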
ImSourin/Art-generation-using-speech-emotions