VietTTS is an open-source toolkit providing the community with a powerful Vietnamese TTS model, capable of natural voice synthesis and robust voice cloning. Designed for effective experimentation, ...
Abstract: We present a novel Automatic Speech Recognition (ASR) dataset for the Oromo language, a widely spoken language in Ethiopia and neighboring regions. The dataset was collected through a ...
Abstract: Robust automatic speech recognition (ASR) in packet loss and noisy environments remains a significant challenge. Large pretrained transformer models have made notable strides in improving ...
The Sagalee dataset is released under the CC BY-NC 4.0 International license; a summary of the license can be found here, and the full license can be found here. finetune_whisper.py is used to fine-tune ...