Skip to content

v0.1.0-alpha

Latest
Compare
Choose a tag to compare
@MagnusS0 MagnusS0 released this 25 Mar 20:46
· 11 commits to main since this release

Full Changelog: https://github.com/MagnusS0/HuginnHears/commits/v0.1.0-alpha

First release of Huginn Hears

Transcribe audio and summarize text, all locally on your machine using SOTA models.
Huginn Hears is optimized for Norwegian using Nb-Whisper for transcribing, but can also handle English.

Features

  • Audio Transcription: Utilizes the Nb-Whisper speech-to-text model within Faster-Whisper's CTranslate2 engine, offering high accuracy fine-tuned for Norwegian.
  • Text Summarization: Supports any large language model compatible with Llama.cpp for flexible, efficient summarization.
  • Prompt Compression: Features LLMLingua-2 for effective prompt compression, reducing summary generation time.

System Requirements

  • OS: Windows 10/11
    • Install from source for use on other systems.
  • Disk Space: 10GB minimum free space required for installation and models. (Without the models it's around 1GB)
  • RAM: 8 GB RAM is all you need for the lowest settings.
  • GPU: Not supported in the installed version; for improved performance, consider installing from source and use one of the many llama.cpp supported hardware acceleration backends to speed up inference.

Quick Start Guide

1. Download and Install:

  • Choose one of the install options for Huginn Hears and proceed with the download. (Simplest is the .msi installer)
  • Important: Huginn Hears requires the Microsoft Visual C++ Redistributable Package (MSVC) for proper operation. Ensure that MSVC is installed on your computer:
    • If you do not have MSVC installed, you can download it directly from Microsoft's official link.
    • Alternatively, select the version of Huginn Hears that includes MSVC (labelled as "incl-msvc") from the provided download options.

2. Launch Huginn Hears from the location you installed it.

3. Select an audio file from a meeting to transcribe.

  • During the first run, Huginn Hears will automatically download the necessary models (approx. 6-8GB) for transcription and summarization. This initial setup requires an internet connection, but once downloaded, Huginn Hears can operate entirely offline.

4. Choose to summarize the transcript directly or apply llmlingua-2 for prompt compression.

5. Receive a comprehensive summary of your meeting.

Disclaimer: Using this with just CPU is not going to be fast. For speed I recommend installing from source and use one of the hardware acceleration backends to speed up inference from llama.cpp.

Production Use

Currently, Huginn Hears excels as a proof of concept for individual users looking to enhance their productivity with local audio transcription and text summarization. The goal of this version was to make it run on a laptop.

However I've begun exploring with a server setup for model management, which will optimize performance for multiple users by eliminating the need for repeated model loading. This can also allow you to run the model on a machine with optimized hardware, while still allow for the security of locally running everything. This setup would allow teams to utilize a centralized, powerful server—running in their office—for the heavy lifting, while still maintaining the application's operation on their individual machines.

If you feel like you can contribute feel free to reach out.