
easytopic 👨‍🏫

Description

easytopic is a versatile architecture for automatically performing topic segmentation on video lectures.

Video lectures are very popular nowadays. Following new teaching trends, people increasingly seek educational videos on the web for many different purposes: to learn something new, to review content for exams, or simply out of curiosity. Unfortunately, finding specific content in this type of video is not an easy task. Many video lectures are long and cover several topics, and not all of these topics are relevant to the user who found the video. As a result, users spend a lot of time hunting for the topic of interest amid content that is irrelevant to them. Temporal segmentation of video lectures into topics can solve this problem by allowing users to navigate non-linearly through all the topics of a video lecture. However, temporal video lecture segmentation is not an easy task and needs to be automated.

And that's where easytopic comes in. The architecture provides the entire processing pipeline, from feature extraction to topic timestamp detection. Because it relies only on features automatically extracted from the audio, the approach is versatile and can be applied to a large universe of video lectures, as it does not depend on the availability of other sources such as slide shows, textbooks, or manually generated subtitles.

This architecture is derived from my master's thesis; see the Publications section for additional information.

Understanding easytopic

easytopic is a software architecture that implements the entire processing pipeline to segment video lectures into topics. The approach implemented by easytopic uses only features from the audio track of video lectures to perform segmentation. This makes our approach very versatile: it can be applied to different types of video lectures without relying on resources such as subtitles, slides, or textbooks.

Our architecture is composed of several modules, each responsible for one stage of processing:

  • REST API: the architecture's entry point, where video lectures are submitted for processing
  • RabbitMQ: message broker responsible for managing the processing queues consumed by the worker modules
  • Audio Extractor: extracts the audio track from the input video
  • Voice Activity Detector: detects voiced regions and splits the audio into fully voiced chunks, minimizing silence
  • ASR: automatic speech recognition module that transcribes the audio chunks
  • Acoustic Feature Extractor: extracts low-level features (pitch, volume, pause rates, etc.) from the audio chunks
  • Flow Aggregator: aggregates the feature extraction results for use by the topic segmentation module
  • Topic Segmentation: segments the video lecture into topics based on the extracted features
  • PostgreSQL: database used to store processing metadata
  • MongoDB: database used to store binary files produced during processing

The communication schema between the modules is shown in the communication diagram below.

The processing pipeline of a video lecture in our architecture is shown in the workflow diagram below.
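
To make the communication pattern concrete, the sketch below shows how a processing worker could consume jobs from a RabbitMQ queue and forward its result to the next stage, using the pika Python client. The queue names, broker host, and message format here are illustrative assumptions, not the repository's actual configuration.

# Illustrative sketch of a queue-based worker (not the repository's actual code).
# Queue names, broker host, and message format are assumptions for demonstration.
import json
import pika

def process(job):
    # Placeholder for the module's real work (e.g., extracting the audio track).
    return {"job_id": job["job_id"], "status": "done"}

connection = pika.BlockingConnection(pika.ConnectionParameters(host="rabbitmq"))
channel = connection.channel()
channel.queue_declare(queue="audio_extraction", durable=True)
channel.queue_declare(queue="vad", durable=True)

def on_message(ch, method, properties, body):
    job = json.loads(body)
    result = process(job)
    # Forward the result to the next stage of the pipeline.
    ch.basic_publish(exchange="", routing_key="vad", body=json.dumps(result))
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="audio_extraction", on_message_callback=on_message)
channel.start_consuming()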

Requirements

All you need to run easytopic is Docker and Docker Compose. To install them, just follow the official installation guide for your OS.

Running the dockerized version

⚠️ This guide applies only to Unix-based systems. On Windows, some commands may differ.

First of all, we need to get some models used by our architecture.

The first is the automatic speech recognition model trained using the Kaldi toolkit.

To do this, open a terminal on your computer and execute the following commands:

sudo mkdir /media/kaldi_models
cd /media/kaldi_models
wget https://phon.ioc.ee/~tanela/tedlium_nnet_ms_sp_online.tgz
tar -zxvf tedlium_nnet_ms_sp_online.tgz

Next, we need to download the Word2Vec model used by our segmentation algorithm. To do this, follow these instructions:

sudo mkdir /media/word2vec
cd /media/word2vec

wget --save-cookies cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=0B7XkCwpI5KDYNlNUTTlSS21pQmM' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/Code: \1\n/p'

This will output a confirmation code. Then run the following command, making sure to replace YOURCODEID with the code you obtained:

wget --load-cookies cookies.txt 'https://docs.google.com/uc?export=download&confirm=YOURCODEID&id=0B7XkCwpI5KDYNlNUTTlSS21pQmM' -O GoogleNews-vectors-negative300.bin.gz

After these steps, all that remains is to extract the Word2Vec model:

gunzip GoogleNews-vectors-negative300.bin.gz
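
Optionally, you can sanity-check that the model was extracted correctly by loading it in Python. The snippet below assumes the gensim library is available (pip3 install gensim); it is only a verification step, not part of the pipeline, and loading the full model may take a few minutes and several gigabytes of RAM.

# Optional sanity check: load the extracted Word2Vec model with gensim.
# Assumes gensim is installed (pip3 install gensim); not part of easytopic itself.
from gensim.models import KeyedVectors

model = KeyedVectors.load_word2vec_format(
    "/media/word2vec/GoogleNews-vectors-negative300.bin", binary=True)
print(model.most_similar("lecture", topn=3))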

Now that you have downloaded the models required by the architecture, you need to clone or download the repository.

git clone https://github.com/eduardorochasoares/easytopic

If you opted for the download, unzip the file and change into the extracted folder in the terminal. Otherwise, just change into easytopic's folder:

cd easytopic

Lastly, to bring up the architecture, just run the command below. It launches all the containers that compose the architecture.

docker-compose up

Practical example

So that you can test the easytopic architecture, we provide a very simple example in which an HTTP client sends a video lecture for processing and waits for the processing to finish before printing the result to the terminal.

To run this example, make sure you have followed the instructions from the previous section.

First of all, we are going to bring up the architecture. Open a terminal on your computer and execute the following commands:

cd easytopic

docker-compose up

Once the architecture has been initialized and all containers are running, open a new Terminal tab and enter the "example" folder.

cd easytopic/example

Before running the segment.py script, we'll need to install some Python 3 libraries. You can install them simply by running:

pip3 install -r requirements.txt

Now, we can finally run our example locally:

python3 segment.py

Or, if the architecture is already running on a remote server, you can run:

python3 segment.py --server_ip SERVER_IP

If everything is okay, it should print something like this:

Sleep before checking again...
Sleep before checking again...
Sleep before checking again...
Sleep before checking again...
Sleep before checking again...
Sleep before checking again...
Job completed: {'topics': [0, 166.77000000000555]}
Sleep before checking again...
All jobs done:
[{'video': 'data/0NgCZKFEmGU.mp4', 'segmentation': {'topics': [0, 166.77000000000555]}}]

The topics object in the output is a list of timestamps corresponding to the starting time of each topic detected in the video lecture.
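
Assuming these timestamps are expressed in seconds (as the example output suggests), a small hypothetical helper like the one below can turn the result into human-readable topic boundaries:

# Hypothetical helper (not part of easytopic): prints topic start times in hh:mm:ss,
# assuming the returned timestamps are in seconds.
def format_timestamp(seconds):
    s = int(seconds)
    return f"{s // 3600:02d}:{(s % 3600) // 60:02d}:{s % 60:02d}"

result = {'video': 'data/0NgCZKFEmGU.mp4',
          'segmentation': {'topics': [0, 166.77000000000555]}}

for i, start in enumerate(result['segmentation']['topics'], start=1):
    print(f"Topic {i} starts at {format_timestamp(start)}")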

Publications
