ElevenLabs Unveils Scribe: A Game-Changer in Speech-to-Text Technology
TechCrunch • 10 months ago

Summary:

  • ElevenLabs launches Scribe, a new speech-to-text model.

  • More than 99 languages supported, 25 of them in the "excellent" accuracy tier (word error rate under 5%).

  • Scribe outperforms Google Gemini 2.0 Flash and Whisper Large V3 in tests.

  • Features include smart speaker diarization and word-level timestamps.

  • Pricing is $0.40 per hour of transcribed audio.

ElevenLabs Takes a Leap with Scribe

ElevenLabs, an AI startup that recently secured a $180 million funding round, has launched a stand-alone speech-to-text model called Scribe. Until now known mainly for audio generation, the company is moving into speech recognition, where it will compete with established providers such as Gladia, Speechmatics, AssemblyAI, Deepgram, and OpenAI’s Whisper.

Key Features of Scribe

Scribe supports more than 99 languages at launch. Of these, 25 fall into the "excellent" accuracy tier, meaning a word error rate below 5%. Languages in that tier include English (97% accuracy), French, German, Hindi, and Spanish.
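
For readers unfamiliar with the metric, word error rate (WER) is conventionally the word-level edit distance between a reference transcript and the model's output, divided by the number of words in the reference. The sketch below shows that standard calculation. It is not ElevenLabs' evaluation code; the function name and the lowercase, whitespace-based normalization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

If accuracy is read as one minus WER, which is an assumption about how the figure is reported, 97% accuracy for English would correspond to a WER of roughly 3%.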

The model outperformed Google Gemini 2.0 Flash and Whisper Large V3 on benchmarks including FLEURS and Common Voice.

Enhanced Speech Recognition

This is the first time ElevenLabs has released a dedicated speech recognition model. CEO Mati Staniszewski emphasized the company's commitment to improving the understanding of conversational speech. He stated, “Speech-to-text is often seen as a solved problem, but we believe there is room for improvement, especially for many languages.”

Advanced Features

Scribe's features include:

  • Smart speaker diarization to identify speakers
  • Word-level timestamps for precise subtitles
  • Auto-tagging for sound events, like audience laughter

Scribe currently works only on pre-recorded audio. ElevenLabs plans to release a low-latency, real-time version soon, which would make the model suitable for live transcription.

Competitive Pricing

Scribe costs $0.40 per hour of transcribed audio. That is competitive but may not be the lowest rate on the market: some competitors currently charge less, though with different feature sets.
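
As a rough guide to what that rate means in practice, the sketch below estimates cost from audio duration. It assumes usage is billed in proportion to duration and rounded to the cent; the article does not state the billing granularity, and the function name is illustrative.

```python
from decimal import Decimal

# Launch price quoted in the article, in US dollars per hour of audio.
PRICE_PER_HOUR = Decimal("0.40")


def estimate_cost(audio_seconds: int) -> Decimal:
    """Estimate transcription cost, assuming proportional billing."""
    hours = Decimal(audio_seconds) / Decimal(3600)
    # Round to whole cents (assumption; actual rounding rules are not stated).
    return (hours * PRICE_PER_HOUR).quantize(Decimal("0.01"))


if __name__ == "__main__":
    print(estimate_cost(90 * 60))       # a 90-minute recording
    print(estimate_cost(1000 * 3600))   # a 1,000-hour archive
```

Under these assumptions, a 1,000-hour archive would cost about $400 to transcribe.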
