[GitHub Trending] microsoft/VibeVoice

Relevance: 7.1
Score breakdown: technical depth 7, novelty 8, actionability 7, community 6, strategic 7, personal 7

Open-source frontier voice AI from Microsoft is novel and relevant for AI agent interfaces.

AI/ML github.com
Open-Source Frontier Voice AI.
Summary

Microsoft has open-sourced VibeVoice, a family of frontier voice AI models. It includes VibeVoice-ASR, which performs single-pass speech-to-text on 60 minutes of audio with speaker diarization and timestamps, and VibeVoice-TTS, which generates 90 minutes of multi-speaker speech and was accepted as an ICLR 2026 Oral. The core innovations are continuous speech tokenizers that run at 7.5 Hz and a next-token diffusion framework that uses an LLM for context understanding. The ASR model is now integrated into Hugging Face Transformers, supports 50+ languages, and offers vLLM inference for faster processing.
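
To give a feel for why a 7.5 Hz tokenizer rate matters for hour-long audio, the arithmetic can be sketched as below. This is only a back-of-the-envelope illustration: it assumes one tokenizer frame per time step and ignores text tokens, speaker tags, and any other model-specific overhead. The function name `frames_for_minutes` and the 50 Hz comparison rate are hypothetical and chosen for illustration; they do not come from the VibeVoice repository.

```python
# Frame rate of the continuous speech tokenizers, as stated in the summary.
FRAME_RATE_HZ = 7.5


def frames_for_minutes(minutes: float, frame_rate_hz: float = FRAME_RATE_HZ) -> int:
    """Return the number of tokenizer frames that cover `minutes` of audio.

    Assumes exactly one frame per time step (an illustrative simplification).
    """
    return round(minutes * 60 * frame_rate_hz)


# Sequence lengths for the durations quoted in the summary.
asr_frames = frames_for_minutes(60)  # 60-minute ASR input
tts_frames = frames_for_minutes(90)  # 90-minute TTS output

# Hypothetical comparison: the same 90 minutes at an assumed 50 Hz frame rate.
baseline_frames = frames_for_minutes(90, frame_rate_hz=50.0)

print(f"60 min at 7.5 Hz: {asr_frames} frames")
print(f"90 min at 7.5 Hz: {tts_frames} frames")
print(f"90 min at 50 Hz (hypothetical): {baseline_frames} frames")
```

Under these assumptions, 60 minutes of audio corresponds to 27,000 frames and 90 minutes to 40,500 frames, which suggests how a low frame rate keeps long recordings within a single LLM context.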

Author

microsoft