Creates a transcript of any audio file. It uses a sliding window to split long audio into small chunks and a Hugging Face Transformers model to generate the transcript. As a further enhancement, it also captures the troublesome overlapping regions between chunks.
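A minimal sketch of the sliding-window chunking idea, for illustration only. The function names (`sliding_window_bounds`, `overlap_regions`) and the sample-based `window` and `overlap` parameters are assumptions, not taken from the project; the transcription model itself is left out.

```python
from typing import List, Tuple

Bounds = Tuple[int, int]  # (start_sample, end_sample), end exclusive


def sliding_window_bounds(total_samples: int, window: int, overlap: int) -> List[Bounds]:
    """Return chunk boundaries covering the audio with a fixed overlap.

    Hypothetical helper: window and overlap are given in samples.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    step = window - overlap
    bounds: List[Bounds] = []
    start = 0
    while start < total_samples:
        end = min(start + window, total_samples)
        bounds.append((start, end))
        if end == total_samples:
            break  # the last chunk already reaches the end of the audio
        start += step
    return bounds


def overlap_regions(bounds: List[Bounds]) -> List[Bounds]:
    """Return the regions shared by each pair of consecutive chunks."""
    return [
        (nxt[0], cur[1])
        for cur, nxt in zip(bounds, bounds[1:])
        if nxt[0] < cur[1]
    ]


if __name__ == "__main__":
    chunks = sliding_window_bounds(total_samples=10, window=4, overlap=1)
    print(chunks)
    print(overlap_regions(chunks))
```

Each chunk could then be passed to the transcription model, and the overlap regions are the spans where two chunk transcripts may disagree and need reconciling.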
Classifies 7 basic emotions (for both male and female speakers, so 14 labels in total) in an audio clip using a CNN-LSTM network. It also identifies the confusing type of emotion.
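A rough sketch of how the 14-label space and a "confusing" emotion could be derived from the network's output scores. Everything here is an assumption: the seven emotion names, the label ordering, and the reading of "confusing emotion" as the runner-up class are hypothetical and not confirmed by the project. The CNN-LSTM itself is not shown; the function only post-processes its 14 output scores.

```python
import math
from typing import List, Tuple

# Assumed emotion names and ordering (hypothetical, for illustration only).
EMOTIONS = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]
GENDERS = ["female", "male"]
LABELS = [f"{g}_{e}" for g in GENDERS for e in EMOTIONS]  # 2 x 7 = 14 labels


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predicted_and_confusing(logits: List[float]) -> Tuple[str, str]:
    """Return (top label, runner-up label) for one clip's 14 scores.

    The runner-up is treated here as the 'confusing' emotion; that
    interpretation is an assumption.
    """
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} scores, got {len(logits)}")
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    return LABELS[ranked[0]], LABELS[ranked[1]]


if __name__ == "__main__":
    scores = [0.0] * 14
    scores[3] = 5.0   # strongest score
    scores[10] = 4.0  # second strongest score
    print(predicted_and_confusing(scores))
```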
TimesFM (https://github.com/google-research/timesfm) is a pretrained time-series foundation model developed by Google Research for time-series forecasting. It is used here to predict the future price of a stock.