Adds audio querying to MultimodalQ&A Example #1225

mhbuehler · 2024-12-04T23:02:02Z

Description

Adds the option to speak into the browser microphone or upload an audio file to the main query tab. The speech audio is POSTed as a base64 string to the MultimodalQnA gateway where it is transcribed to text by the ASR whisper service and then passed on to the LVM service.

Note: This PR has been updated so that the changes to the megaservice are included in this repo in multimodalqna.py, so a corresponding PR to GenAIComps is not needed.

Issues

This is a part of the MultimodalQnA Image & Audio Enhancements RFC

Type of change

New feature (non-breaking change which adds new functionality)

Dependencies

N/A

Tests

New tests were added to the example's README.md, test_compose_on_xeon.sh, and test_compose_on_gaudi.sh.

Signed-off-by: okhleif-IL <[email protected]> * validated, updated tests Signed-off-by: okhleif-IL <[email protected]> * added one more curl test for audio Signed-off-by: okhleif-IL <[email protected]> * fixed typo Signed-off-by: okhleif-IL <[email protected]> * reverted git clone command Signed-off-by: okhleif-IL <[email protected]> * added ASR test Signed-off-by: okhleif-IL <[email protected]> * fixed command with backslashes Signed-off-by: okhleif-IL <[email protected]> --------- Signed-off-by: Melanie Buehler <[email protected]> Signed-off-by: okhleif-IL <[email protected]> Signed-off-by: dmsuehir <[email protected]>

* MMQnA doc update correcting ASR and whisper image names Signed-off-by: dmsuehir <[email protected]> * Add image tags Signed-off-by: dmsuehir <[email protected]> --------- Signed-off-by: dmsuehir <[email protected]>

* Enabled audio query functionality in the MultimodalQnA UI Signed-off-by: Melanie Buehler <[email protected]>

Signed-off-by: Melanie Buehler <[email protected]>

Temporarily redirect clones for tests

for more information, see https://pre-commit.ci

MultimodalQnA/tests/test_compose_on_xeon.sh

ashahba

LGTM but don't merge please unless the branch is changed to point to main again.

* Add services to tests and correct small text error Signed-off-by: Melanie Buehler <[email protected]> * Revert unintended changes Signed-off-by: Melanie Buehler <[email protected]> --------- Signed-off-by: Melanie Buehler <[email protected]>

Signed-off-by: Melanie Buehler <[email protected]>

ashahba

Holding off on this PR until opea-project/GenAIComps#974 is merged.

Fixed build.yaml inconsistency

Signed-off-by: Melanie Buehler <[email protected]>

Update repo clones for MultimodalQnA E2E tests

* Moved gateway changes to multimodalqna.py Signed-off-by: okhleif-IL <[email protected]> * reverted port changes Signed-off-by: okhleif-IL <[email protected]> * addressed review comments Signed-off-by: okhleif-IL <[email protected]> * reverted print statement Signed-off-by: okhleif-IL <[email protected]> --------- Signed-off-by: okhleif-IL <[email protected]>

* Moved gateway changes to multimodalqna.py Signed-off-by: okhleif-IL <[email protected]> * reverted port changes Signed-off-by: okhleif-IL <[email protected]> * addressed review comments Signed-off-by: okhleif-IL <[email protected]> * reverted print statement Signed-off-by: okhleif-IL <[email protected]> * removed proxies Signed-off-by: okhleif-IL <[email protected]> --------- Signed-off-by: okhleif-IL <[email protected]>

ashahba

LGTM!

okhleif-IL and others added 5 commits December 2, 2024 15:40

MMQnA doc update correcting ASR and whisper image names (#24)

22a5b15

* MMQnA doc update correcting ASR and whisper image names Signed-off-by: dmsuehir <[email protected]> * Add image tags Signed-off-by: dmsuehir <[email protected]> --------- Signed-off-by: dmsuehir <[email protected]>

Integrate audio query into UI (#22)

84ec278

* Enabled audio query functionality in the MultimodalQnA UI Signed-off-by: Melanie Buehler <[email protected]>

Temporarily redirect clones for tests

c9fe70e

Signed-off-by: Melanie Buehler <[email protected]>

Merge pull request #25 from mhbuehler/melanie/redirect_clones_for_tests

7f7236d

Temporarily redirect clones for tests

mhbuehler requested a review from lvliang-intel as a code owner December 4, 2024 23:02

mhbuehler and others added 2 commits December 4, 2024 15:02

Merge branch 'main' into mmqna-audio-query

56db11a

[pre-commit.ci] auto fixes from pre-commit.com hooks

f67146f

for more information, see https://pre-commit.ci

mhbuehler mentioned this pull request Dec 4, 2024

Adds audio querying to MultimodalQ&A gateway opea-project/GenAIComps#974

Closed

1 task

ashahba added WIP r1.2 OPEA 1.2 RELEASE TAG labels Dec 4, 2024

ashahba added this to the v1.2 milestone Dec 4, 2024

chensuyue reviewed Dec 5, 2024

View reviewed changes

MultimodalQnA/tests/test_compose_on_xeon.sh Outdated Show resolved Hide resolved

chensuyue reviewed Dec 5, 2024

View reviewed changes

MultimodalQnA/tests/test_compose_on_xeon.sh Outdated Show resolved Hide resolved

Merge branch 'main' into mmqna-audio-query

fdf5a08

ashahba approved these changes Dec 5, 2024

View reviewed changes

mhbuehler and others added 3 commits December 5, 2024 14:42

Fixed build.yaml inconsistency

30e33a6

Signed-off-by: Melanie Buehler <[email protected]>

Merge branch 'main' into mmqna-audio-query

9ba341b

ashahba requested changes Dec 6, 2024

View reviewed changes

mhbuehler and others added 6 commits December 6, 2024 16:48

Merge pull request #27 from mhbuehler/melanie/whisper_image_name

54c82ac

Fixed build.yaml inconsistency

Merge branch 'main' into mmqna-audio-query

bcabb36

Update repo clones for E2E tests

02b87b0

Signed-off-by: Melanie Buehler <[email protected]>

Merge pull request #30 from mhbuehler/melanie/revert_clones

c421e68

Update repo clones for MultimodalQnA E2E tests

Merge branch 'main' into mmqna-audio-query

674c975

mhbuehler changed the title ~~Adds audio querying to MultimodalQ&A UI~~ Adds audio querying to MultimodalQ&A Example Dec 10, 2024

okhleif-IL and others added 2 commits December 10, 2024 14:54

Merge branch 'main' into mmqna-audio-query

ba1fd52

ashahba removed the WIP label Dec 10, 2024

ashahba approved these changes Dec 10, 2024

View reviewed changes

Merge branch 'main' into mmqna-audio-query

55585ab

lvliang-intel approved these changes Dec 12, 2024

View reviewed changes

lvliang-intel merged commit c760cac into opea-project:main Dec 12, 2024
18 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adds audio querying to MultimodalQ&A Example #1225

Adds audio querying to MultimodalQ&A Example #1225

mhbuehler commented Dec 4, 2024 •

edited

Loading

ashahba left a comment

ashahba left a comment

ashahba left a comment

Adds audio querying to MultimodalQ&A Example #1225

Adds audio querying to MultimodalQ&A Example #1225

Conversation

mhbuehler commented Dec 4, 2024 • edited Loading

Description

Issues

Type of change

Dependencies

Tests

ashahba left a comment

Choose a reason for hiding this comment

ashahba left a comment

Choose a reason for hiding this comment

ashahba left a comment

Choose a reason for hiding this comment

mhbuehler commented Dec 4, 2024 •

edited

Loading