mirror of https://github.com/albinhenriksson/ljudanteckning.git synced 2026-04-14 07:40:57 +00:00

Make all your media files text searchable.

Albin Albinus Henriksson 474fbd8f47 Update project description in README.md Clarified project name in the README.		2026-02-23 23:02:58 +01:00
.vscode	fix(cli): debugging	2026-02-16 11:31:01 +01:00
docs/screenshots	docs(readme): add screenshots	2026-01-09 18:34:23 +01:00
src/ljudanteckning	fix(cli): banner (new name)	2026-02-16 15:13:36 +01:00
tests	refactor!: rename transkriptor to ljudanteckning	2026-02-16 12:00:16 +01:00
.editorconfig	feat: bootstrap CLI, INI config, and packaging	2026-01-09 12:17:48 +01:00
.gitignore	fix: .gitignore	2026-02-16 12:44:13 +01:00
.pre-commit-config.yaml	feat: bootstrap CLI, INI config, and packaging	2026-01-09 12:17:48 +01:00
LICENSE	feat: bootstrap CLI, INI config, and packaging	2026-01-09 12:17:48 +01:00
ljudanteckning.example.ini	refactor!: rename transkriptor to ljudanteckning	2026-02-16 12:00:16 +01:00
pyproject.toml	refactor!: rename transkriptor to ljudanteckning	2026-02-16 12:00:16 +01:00
README.md	Update project description in README.md	2026-02-23 23:02:58 +01:00

ljudanteckning

Make your media library searchable: scan folders, split audio, transcribe in parallel across NVIDIA GPUs, then export subtitles and searchable text next to each original file.

Screenshots

Here’s what a real run looks like on a multi-GPU box: chunking with FFmpeg, parallel transcription workers, and live GPU telemetry. The screenshots still show "transkriptor", which was the project's work-in-progress name.

[Screenshot: Ljudanteckning transcribing chunks across multiple NVIDIA GPUs with live telemetry.]

More screenshots

Chunking stage

[Screenshot: Ljudanteckning chunking audio using FFmpeg.]

Finished outputs next to media

[Screenshot: Ljudanteckning finished run showing exported subtitles and transcript files next to media.]

Goals

Recursively discover media files (audio + video)
Extract audio and chunk into short WAV segments (FFmpeg)
Transcribe chunks in parallel across N GPUs (faster-whisper / CTranslate2)
Merge timestamps correctly and export:
- .srt
- .vtt
- .whisper.json
- .txt
Clean up temp files after success
Nice terminal UX:
- colored output + banner
- interactive TUI mode (later)
- batch mode (--nocli)
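
The "merge timestamps" goal can be illustrated with a minimal sketch. It assumes each chunk is transcribed on its own, so segment times start at zero within the chunk and have to be shifted by the chunk's offset in the original file before export. The names Segment, merge_chunks, srt_timestamp and to_srt are hypothetical and are not taken from the ljudanteckning source.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds
    text: str


def merge_chunks(chunks):
    """Shift chunk-local segment times onto the timeline of the full file.

    `chunks` is a list of (chunk_offset_seconds, [Segment, ...]) pairs.
    Chunks may arrive in any order because workers finish at different times.
    """
    merged = []
    for offset, segments in sorted(chunks, key=lambda chunk: chunk[0]):
        for seg in segments:
            merged.append(
                Segment(seg.start + offset, seg.end + offset, seg.text.strip())
            )
    return merged


def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Render merged segments as the text of an .srt file."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        times = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        blocks.append(f"{index}\n{times}\n{seg.text}\n")
    return "\n".join(blocks)
```

A .vtt export would follow the same pattern with a WEBVTT header and a dot instead of a comma before the milliseconds.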

Requirements

Ljudanteckning is designed for NVIDIA GPUs and performs best with CUDA. With multiple GPUs available, it can run several transcription workers in parallel.

Core requirements

Linux
Python 3.10+
FFmpeg (ffmpeg and ffprobe)

GPU requirements (recommended / expected)

One or more NVIDIA GPUs
NVIDIA proprietary driver installed (so nvidia-smi works)
CUDA runtime libraries compatible with your driver
(installing the CUDA toolkit/runtime is often the simplest way to get these)

Optional

NVML Python bindings for nicer GPU telemetry in the UI (pip install -e ".[nvml]")

Quick check

nvidia-smi

If that command works and your GPUs show up, the driver is installed correctly. GPU-accelerated transcription also needs the CUDA runtime libraries listed above.
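
If you'd rather script the check, here is a small sketch that looks for the external tools named in the requirements on PATH. The helper name missing_tools is hypothetical and not part of ljudanteckning. Finding nvidia-smi on PATH only shows that the utility is installed, not that the driver is healthy, so running nvidia-smi itself remains the real test.

```python
import shutil


def missing_tools(required=("ffmpeg", "ffprobe", "nvidia-smi")):
    """Return the names from `required` that are not found on PATH."""
    return [name for name in required if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("ffmpeg, ffprobe and nvidia-smi were all found on PATH.")
```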

Install (developer setup)

Arch Linux (fish)

# 1) System packages
sudo pacman -Syu
sudo pacman -S --needed git ffmpeg python python-pip nvidia-utils

# 2) GPU sanity check
# If this fails, fix your NVIDIA driver installation first.
nvidia-smi

# 3) Clone + virtualenv + editable install
git clone https://github.com/albinhenriksson/ljudanteckning.git
cd ljudanteckning

python -m venv .venv
source .venv/bin/activate.fish

python -m pip install -U pip
pip install -e ".[dev,nvml]"

ljudanteckning --help

Debian / Ubuntu (bash)

# 1) System packages
sudo apt-get update
sudo apt-get install -y git ffmpeg python3 python3-venv python3-pip

# 2) GPU sanity check
# If `nvidia-smi` is not found, install the NVIDIA driver/utils for your distro first.
nvidia-smi

# 3) Clone + virtualenv + editable install
git clone https://github.com/albinhenriksson/ljudanteckning.git
cd ljudanteckning

python3 -m venv .venv
source .venv/bin/activate

python -m pip install -U pip
pip install -e ".[dev,nvml]"

ljudanteckning --help

Fedora (bash)

# 1) System packages
sudo dnf install -y git ffmpeg python3 python3-pip

# 2) GPU sanity check
# If `nvidia-smi` is not found, install the NVIDIA driver/utils for your distro first.
nvidia-smi

# 3) Clone + virtualenv + editable install
git clone https://github.com/albinhenriksson/ljudanteckning.git
cd ljudanteckning

python3 -m venv .venv
source .venv/bin/activate

python -m pip install -U pip
pip install -e ".[dev,nvml]"

ljudanteckning --help

Quick start

Note: the pipeline is still being built, so some of the CLI flags shown below may not be implemented yet. Run ljudanteckning run --help to see what’s available in your installed version.

1) Create a local config (recommended)

cp ljudanteckning.example.ini ljudanteckning.ini

(Optional) sanity-check what Ljudanteckning will use:

ljudanteckning config

2) Transcribe a single file

ljudanteckning run --nocli "/mnt/media/Mr.Robot.S01E01.1080p.BluRay.mkv"

3) Transcribe an entire directory (recursive)

ljudanteckning run --nocli "/mnt/media"

4) Pick GPUs explicitly (multi-GPU)

Example with 4 GPUs:

ljudanteckning run --nocli --gpus "0,1,2,3" --jobs 4 "/mnt/media"

Example for an 11-GPU box:

ljudanteckning run --nocli --gpus "0,1,2,3,4,5,6,7,8,9,10" --jobs 11 "/mnt/media"
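
How --gpus and --jobs relate internally isn't described here. As an illustration only, the sketch below parses a comma-separated GPU list and spreads workers over the GPUs round-robin, which matches the examples above where the number of jobs equals the number of GPUs. The function names parse_gpus and assign_workers are hypothetical, and the real scheduler may work differently.

```python
def parse_gpus(spec):
    """Parse a comma-separated GPU list such as "0,1,2,3" into [0, 1, 2, 3]."""
    ids = [int(part) for part in spec.split(",") if part.strip()]
    if len(set(ids)) != len(ids):
        raise ValueError(f"duplicate GPU index in {spec!r}")
    return ids


def assign_workers(gpu_ids, jobs):
    """Map worker number -> GPU index, cycling through the GPUs in order."""
    if not gpu_ids:
        raise ValueError("no GPUs given")
    return {worker: gpu_ids[worker % len(gpu_ids)] for worker in range(jobs)}
```

With four GPUs and four jobs, each worker gets its own GPU; with more jobs than GPUs, this sketch would place several workers on the same device.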

5) Override model / language

ljudanteckning run --nocli --model large-v3 --language sv "/mnt/media"

Outputs

For each media file, Ljudanteckning writes transcript/subtitle files next to the original, e.g.:

Movie.mkv
Movie.srt
Movie.vtt
Movie.txt
Movie.whisper.json

Temporary chunks and per-chunk JSON are stored in a hidden work folder next to the file (by default: .ljudanteckning/) and are cleaned up according to output.cleanup in your INI.
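
The naming scheme above can be expressed with pathlib. This is a sketch of the layout as described in this section, not the project's actual code: output_paths and work_dir are hypothetical helpers, and the structure inside the work folder is not specified here.

```python
from pathlib import Path

# Output extensions listed in this README.
OUTPUT_SUFFIXES = (".srt", ".vtt", ".txt", ".whisper.json")


def output_paths(media):
    """Return the sibling output files for one media file."""
    media = Path(media)
    # Only the final extension is replaced, so "Show.S01E01.mkv"
    # becomes "Show.S01E01.srt" and so on.
    return [media.with_name(media.stem + suffix) for suffix in OUTPUT_SUFFIXES]


def work_dir(media, name=".ljudanteckning"):
    """Return the hidden work folder next to the media file (default name assumed from the text above)."""
    return Path(media).parent / name
```

For /mnt/media/Movie.mkv this yields Movie.srt, Movie.vtt, Movie.txt and Movie.whisper.json in /mnt/media, with the work folder at /mnt/media/.ljudanteckning.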
