whisper-vtt2srt

A Python library and CLI that converts WebVTT to SRT, turning messy AI transcripts into clean, usable subtitles.

It post-processes the output of OpenAI Whisper, YouTube auto-captions, and other AI transcription services, which makes it well suited to TTS pipelines, video dubbing, and dataset preparation.

License: MIT | Python 3.10+ | Code style: Black | PRs welcome


Why whisper-vtt2srt?

Unlike simple regex-based converters, this tool applies cleaning strategies designed for the chaotic output of modern AI transcription services such as OpenAI Whisper.
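For comparison, here is a minimal sketch of what a simple regex-based converter amounts to. It is not this library's code; `naive_vtt_to_srt` and its helper are hypothetical names used only for illustration. It rewrites the timestamps (SRT uses a comma before the milliseconds and always includes hours), numbers the cues, and leaves the cue text untouched.

```python
import re

# A WebVTT timing line: hours are optional, milliseconds follow a ".".
TIMING_RE = re.compile(
    r"(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+"
    r"(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})"
)


def _srt_time(hours, minutes, seconds, millis):
    # SRT always writes hours and separates milliseconds with a comma.
    return f"{int(hours or 0):02d}:{minutes}:{seconds},{millis}"


def naive_vtt_to_srt(vtt: str) -> str:
    """Hypothetical baseline: convert timestamps and number cues, nothing else."""
    entries = []
    for block in re.split(r"\n\s*\n", vtt.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        # Blocks without a timing line (WEBVTT header, NOTE, STYLE) are skipped.
        idx = next((i for i, ln in enumerate(lines) if TIMING_RE.match(ln)), None)
        if idx is None:
            continue
        groups = TIMING_RE.match(lines[idx]).groups()
        # Cue settings after the end time (e.g. align:start) are dropped here.
        timing = f"{_srt_time(*groups[:4])} --> {_srt_time(*groups[4:])}"
        text = "\n".join(lines[idx + 1:])
        entries.append(f"{len(entries) + 1}\n{timing}\n{text}")
    return "\n\n".join(entries) + "\n"
```

A converter like this produces valid SRT structure, but inline tags, bracketed sound descriptions, and duplicated text pass straight through to the output.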

Key Features 🚀

  • 🛡️ Stabilization Strategy: Intelligently detects and merges accumulating text blocks ("Karaoke Effect").
  • 🎵 Sound Description Removal: Automatically filters out [Music], [Applause], etc.
  • 🧹 Glitch Filtering: Removes imperceptible <50ms blocks.
  • ✨ Smart Normalization: Strips VTT metadata (align:start, <c>, <b>, <i>) and cleans whitespace.
  • ⚡ Zero Dependencies: Built with pure Python standard library.
  • 🔧 Configurable Strictness: Every cleaning step is optional.
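To make the cleaning steps above concrete, here is a rough sketch of how they could be combined. It assumes cues have already been parsed into start time, end time, and text. The `Cue` class, `normalize`, `clean`, and the `min_duration_ms` parameter are hypothetical names chosen for illustration; they are not the library's API, and the library's actual heuristics may differ.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    # Hypothetical cue record for this sketch (times in milliseconds).
    start_ms: int
    end_ms: int
    text: str


TAG_RE = re.compile(r"<[^>]+>")       # inline VTT tags such as <c>, <b>, <i>
SOUND_RE = re.compile(r"\[[^\]]*\]")  # bracketed descriptions such as [Music]


def normalize(text: str) -> str:
    """Strip inline tags and sound descriptions, then collapse whitespace."""
    text = TAG_RE.sub("", text)
    text = SOUND_RE.sub("", text)
    return " ".join(text.split())


def clean(cues: list[Cue], min_duration_ms: int = 50) -> list[Cue]:
    """Apply normalization, glitch filtering, and a simple stabilization pass."""
    result: list[Cue] = []
    for cue in cues:
        text = normalize(cue.text)
        if not text:
            continue  # nothing left once tags and sound descriptions are gone
        if cue.end_ms - cue.start_ms < min_duration_ms:
            continue  # glitch: too short to be perceived
        if result and text.startswith(result[-1].text):
            # Accumulating block: this cue repeats the previous text and adds
            # to it, so extend the previous cue instead of emitting a new one.
            previous = result[-1]
            result[-1] = Cue(previous.start_ms, cue.end_ms, text)
            continue
        result.append(Cue(cue.start_ms, cue.end_ms, text))
    return result
```

The prefix check is the simplest possible reading of the "Karaoke Effect": a cue whose text begins with the previous cue's text is treated as a continuation. Cue settings such as align:start sit on the timing line, not in the text, so a parser would drop them before this stage.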

Installation

pip install whisper-vtt2srt

Quick Start

CLI

# Convert a single file
whisper-vtt2srt video.vtt

# Convert a folder recursively
whisper-vtt2srt ./videos --recursive

Python

from whisper_vtt2srt import Pipeline

pipeline = Pipeline()

with open("video.vtt", "r", encoding="utf-8") as f:
    srt_content = pipeline.convert(f.read())
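
To process a whole folder from Python, the same call can be wrapped in a small loop. The sketch below is an illustration, not part of the library: `convert_tree` is a hypothetical helper that takes any function mapping VTT text to SRT text, and it makes no claim about how the CLI's --recursive option is implemented.

```python
from pathlib import Path
from typing import Callable


def convert_tree(root: Path, convert: Callable[[str], str]) -> list[Path]:
    """Convert every .vtt file under root, writing a .srt file next to each."""
    written: list[Path] = []
    for vtt_path in sorted(root.rglob("*.vtt")):
        srt_path = vtt_path.with_suffix(".srt")
        vtt_text = vtt_path.read_text(encoding="utf-8")
        srt_path.write_text(convert(vtt_text), encoding="utf-8")
        written.append(srt_path)
    return written
```

With the library installed, passing `Pipeline().convert` as the second argument, for example `convert_tree(Path("./videos"), Pipeline().convert)`, should apply the conversion shown above to each file.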