Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds audio querying to MultimodalQ&A Example #1225

Merged
merged 20 commits into from
Dec 12, 2024

Conversation

mhbuehler
Copy link
Contributor

@mhbuehler mhbuehler commented Dec 4, 2024

Description

Adds the option to speak into the browser microphone or upload an audio file to the main query tab. The speech audio is POSTed as a base64 string to the MultimodalQnA gateway where it is transcribed to text by the ASR whisper service and then passed on to the LVM service.

Note: This PR has been updated so that the changes to the megaservice are included in this repo in multimodalqna.py, so a corresponding PR to GenAIComps is not needed.

Issues

This is a part of the MultimodalQnA Image & Audio Enhancements RFC

Type of change

  • New feature (non-breaking change which adds new functionality)

Dependencies

N/A

Tests

New tests were added to the example's README.md, test_compose_on_xeon.sh, and test_compose_on_gaudi.sh.

okhleif-IL and others added 5 commits December 2, 2024 15:40
Signed-off-by: okhleif-IL <[email protected]>

* validated, updated tests

Signed-off-by: okhleif-IL <[email protected]>

* added one more curl test for audio

Signed-off-by: okhleif-IL <[email protected]>

* fixed typo

Signed-off-by: okhleif-IL <[email protected]>

* reverted git clone command

Signed-off-by: okhleif-IL <[email protected]>

* added ASR test

Signed-off-by: okhleif-IL <[email protected]>

* fixed command with backslashes

Signed-off-by: okhleif-IL <[email protected]>

---------

Signed-off-by: Melanie Buehler <[email protected]>
Signed-off-by: okhleif-IL <[email protected]>
Signed-off-by: dmsuehir <[email protected]>
* MMQnA doc update correcting ASR and whisper image names

Signed-off-by: dmsuehir <[email protected]>

* Add image tags

Signed-off-by: dmsuehir <[email protected]>

---------

Signed-off-by: dmsuehir <[email protected]>
* Enabled audio query functionality in the MultimodalQnA UI

Signed-off-by: Melanie Buehler <[email protected]>
@ashahba ashahba added WIP r1.2 OPEA 1.2 RELEASE TAG labels Dec 4, 2024
@ashahba ashahba added this to the v1.2 milestone Dec 4, 2024
Copy link
Collaborator

@ashahba ashahba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but don't merge please unless the branch is changed to point to main again.

mhbuehler and others added 3 commits December 5, 2024 14:42
* Add services to tests and correct small text error

Signed-off-by: Melanie Buehler <[email protected]>

* Revert unintended changes

Signed-off-by: Melanie Buehler <[email protected]>

---------

Signed-off-by: Melanie Buehler <[email protected]>
Signed-off-by: Melanie Buehler <[email protected]>
Copy link
Collaborator

@ashahba ashahba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Holding off on this PR until opea-project/GenAIComps#974 is merged.

mhbuehler and others added 6 commits December 6, 2024 16:48
Signed-off-by: Melanie Buehler <[email protected]>
Update repo clones for MultimodalQnA E2E tests
* Moved gateway changes to multimodalqna.py

Signed-off-by: okhleif-IL <[email protected]>

* reverted port changes

Signed-off-by: okhleif-IL <[email protected]>

* addressed review comments

Signed-off-by: okhleif-IL <[email protected]>

* reverted print statement

Signed-off-by: okhleif-IL <[email protected]>

---------

Signed-off-by: okhleif-IL <[email protected]>
@mhbuehler mhbuehler changed the title Adds audio querying to MultimodalQ&A UI Adds audio querying to MultimodalQ&A Example Dec 10, 2024
okhleif-IL and others added 2 commits December 10, 2024 14:54
* Moved gateway changes to multimodalqna.py

Signed-off-by: okhleif-IL <[email protected]>

* reverted port changes

Signed-off-by: okhleif-IL <[email protected]>

* addressed review comments

Signed-off-by: okhleif-IL <[email protected]>

* reverted print statement

Signed-off-by: okhleif-IL <[email protected]>

* removed proxies

Signed-off-by: okhleif-IL <[email protected]>

---------

Signed-off-by: okhleif-IL <[email protected]>
@ashahba ashahba removed the WIP label Dec 10, 2024
Copy link
Collaborator

@ashahba ashahba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@lvliang-intel lvliang-intel merged commit c760cac into opea-project:main Dec 12, 2024
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
r1.2 OPEA 1.2 RELEASE TAG
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants