Docker image awsmediatools/livetranscribe - Update for VocabularyFilterName and VocabularyFilterMethod #46

AustinSnow · 2021-06-14T20:59:07Z

Hello AWS Labs,
Transcribe word filtering isn't working for this repository. It seems the awsmediatools/livetranscribe Docker image doesn't include the latest version of the transcribe-to-dynamo-withSDK.js script. The ECS log output shows the following:

...
LanguageCode: 'en-US',
MediaEncoding: 'pcm',
MediaSampleRateHertz: 16000,
RequestId: '307db0c6-c708-4988-8202-fbf1239f4ba3',
SessionId: 'da63e9ee-d547-49bd-b0cf-838fad481faa',
TranscriptResultStream: { [Symbol(Symbol.asyncIterator)]: [Function] },
VocabularyName: undefined
...

It should be like the following:
...
LanguageCode: 'en-US',
MediaEncoding: 'pcm',
MediaSampleRateHertz: 16000,
NumberOfChannels: undefined,
RequestId: 'c31323a8-c00a-455d-a217-0ae202bde502',
SessionId: 'b71d30db-d8d9-481d-a244-19eb2c6bd11b',
ShowSpeakerLabel: false,
TranscriptResultStream:
{ [Symbol(Symbol.asyncIterator)]: [AsyncGeneratorFunction: [Symbol.asyncIterator]] },
VocabularyFilterMethod: 'mask',
VocabularyFilterName: 'filter-words-en-US',
VocabularyName: undefined }
...
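For context, the two filter fields only appear in the response when they are part of the request input. A minimal sketch of how the input could be assembled is shown here; the helper name `buildTranscribeParams` and the environment variable names are hypothetical and not taken from the repository, and the actual script may wire this differently.

```javascript
// Hypothetical helper: builds the input object for StartStreamTranscriptionCommand
// (from @aws-sdk/client-transcribe-streaming). The environment variable names
// below are assumptions for illustration only.
function buildTranscribeParams(env, audioStream) {
  const params = {
    LanguageCode: env.LANGUAGE_CODE || 'en-US',
    MediaEncoding: 'pcm',
    MediaSampleRateHertz: 16000,
    AudioStream: audioStream,
  };
  // Only send the filter fields when a filter name is configured.
  if (env.VOCABULARY_FILTER_NAME) {
    params.VocabularyFilterName = env.VOCABULARY_FILTER_NAME;
    // Transcribe accepts 'remove', 'mask', or 'tag'; 'mask' matches the log above.
    params.VocabularyFilterMethod = env.VOCABULARY_FILTER_METHOD || 'mask';
  }
  return params;
}
```

The result would then be passed to `new StartStreamTranscriptionCommand(params)`. If the script inside the image never sets these two fields, the response will omit them, which matches the first log excerpt.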

Can the image be updated?

Thank You
Austin Snow


AustinSnow · 2021-06-17T11:27:47Z

Hello AWS Labs,
I was finally able to get the word filtering working by building an updated Docker image to replace yours (awsmediatools/livetranscribe:v1.1). You may "just" need to update your Docker image. You can find mine on Docker Hub as austinsnow/svuedecstranscribe:latest.

There is also an issue with transcribe-to-dynamo-withSDK.js: it doesn't run with the latest Node release. The following is the error:

/transcriber/node_modules/@aws-sdk/eventstream-handler-node/dist/cjs/EventStreamPayloadHandler.js:66
throw err;
^
[Error: EAGAIN: resource temporarily unavailable, read] {
errno: -11,
code: 'EAGAIN',
syscall: 'read'
}
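Until the root cause is found, one option is to have the script warn early when it starts on a Node version known to fail. This is only a sketch: the version list comes from my own testing (v12.22.1 and v14.17.1 failed, v12.14.1 worked), it is not exhaustive, and the function name is hypothetical.

```javascript
// Node versions observed in this thread to fail with EAGAIN.
// This list is an assumption based on limited testing, not an official list.
const KNOWN_BAD_NODE_VERSIONS = ['12.22.1', '14.17.1'];

// Hypothetical helper: returns true if the given version string
// (with or without a leading "v") is in the known-bad list.
function isKnownBadNode(version) {
  return KNOWN_BAD_NODE_VERSIONS.includes(version.replace(/^v/, ''));
}

if (isKnownBadNode(process.version)) {
  console.warn(
    `Node ${process.version} is known to fail with EAGAIN; node:12.14.1-alpine worked in testing.`
  );
}
```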

The Load Balancer health check port is also missing from your Dockerfile. Below is my Dockerfile.

# Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0

# Start from a pinned Node.js image on Alpine Linux.
# The transcribe-to-dynamo-withSDK.js script doesn't work with Node v12.22.1 or v14.17.1
FROM node:12.14.1-alpine

# create the application directory
RUN mkdir /transcriber
WORKDIR /transcriber

# Install Build Dependencies for the docker image. 
RUN apk add --no-cache --virtual .gyp \
        python3 \
        make \
        g++ \
        ffmpeg

# install application dependencies
RUN npm install aws-sdk aws-signature-v4 query-string sleep websocket bcrypt @aws-sdk/client-transcribe-streaming@gamma @aws-sdk/eventstream-marshaller @aws-sdk/util-utf8-node 

# copy the application files
COPY transcribe-to-dynamo-withSDK.js healthcheck.py run.sh ./

RUN ["chmod", "+x", "run.sh"]

# Expose the port for UDP
EXPOSE 7950/udp

# Expose the Load Balancer health check port
EXPOSE 8080/tcp

# Run this inside the docker container
# CMD ./ffmpeg -re -i video.mp4 -f mpegts udp://localhost:7950

# run it when the container starts -- requires environment vars
CMD sh run.sh

Thank You
Austin Snow

eggoynes · 2021-09-22T20:22:09Z

Hello Austin,
Thank you for finding this. We have this in our backlog.
