
Appen Provides Private, High-Quality Audio Data to Hugging Face's Open ASR Leaderboard

Appen Ltd., May 6, 2026


New datasets improve benchmark integrity for a more complete picture of real-world speech recognition performance

KIRKLAND, Wash., May 06, 2026 (GLOBE NEWSWIRE) -- Appen Limited (ASX: APX), a leading provider of high-quality data for the AI lifecycle, today announced a collaboration with Hugging Face to bring private, high-quality audio datasets to the Open ASR Leaderboard, one of the most widely used benchmarks in the speech recognition community.

Since its launch in September 2023, the Open ASR Leaderboard has been visited more than 700,000 times, underscoring its central role for researchers and enterprises evaluating automatic speech recognition (ASR) models. The leaderboard ranks models by word error rate (WER), a measure of transcription accuracy where lower scores indicate better performance.

"The speech AI community has made huge strides in model performance, but the benchmarks used to measure that progress haven't kept pace," said Sergio Bruccoleri, vice president of Delivery at Appen. "Leaderboards only tell the full story when the underlying data reflects how speech technology is actually used. And that's exactly what this collaboration with Hugging Face is all about."

As the leaderboard has grown in prominence, so has the risk of "benchmaxxing," the practice of optimizing models specifically to score well on public test sets without achieving equivalent gains in real-world performance. To address this, Appen provides a suite of new, private English-language audio datasets that are incorporated into the leaderboard evaluation framework. Keeping these datasets private makes them significantly harder to game, which increases the trustworthiness of results across the board.

What Appen's Datasets Add

Appen's contribution covers both scripted and conversational speech across multiple accents, enabling the leaderboard to surface a more nuanced picture of model performance.
Specifically, the new private data supports metrics including:

These dimensions reflect a core finding from Appen's research: there is no single "catch-all" ASR model. Systems that excel on clean, American-accented audio may underperform on conversational speech or non-native speakers. These new metrics make those tradeoffs visible.

“Reliable AI evaluation starts with high-quality data and we’re excited to partner with Appe...
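
For readers unfamiliar with the word error rate (WER) metric mentioned above, the following is a minimal sketch of how it is commonly computed: the number of word substitutions, deletions, and insertions needed to turn the model's transcript into the reference, divided by the number of reference words. The function name, the lowercasing step, and the whitespace tokenization are illustrative assumptions; the release does not describe the leaderboard's actual text normalization or scoring code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: text is lowercased and split on whitespace. Real ASR
    evaluations usually apply richer normalization before scoring.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when a transcript contains many extra words, which is one reason scores are compared across models on the same test set rather than read as a percentage of words correct.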
