Society22

SONORA Belarusian Community Project Seeks Those Who Can Help 'Connect' Belarusian Voice to BigTech

In the "TTS in Belarusian" channel, an announcement appeared about the launch of SONORA. This is a Belarusian community project where participants are recording an open Belarusian dataset for TTS — "so that Belarusian sounds natural in modern AI services," writes the Telegram channel Dzik Pic.

Illustrative photo. Photo: freepik.com

What's the problem? As stated on the website, today there are almost no high-quality Belarusian voice datasets specifically recorded for training modern TTS models. At the same time, the Belarusian language has thousands of homographs: identical spelling but different meanings depending on the stress. If the model makes a mistake with stress, both the pronunciation and the meaning are broken.

The second problem is phonetic correctness: softness, 'ў', 'дз/дж', intonation, and rhythm:

«Without high-quality studio material, models repeat errors and sound less natural».

Yes, similar initiatives already exist today, for example, Donar.by or the BexTTS model. But Sonora will continue their path and «bring it to a studio level».

Therefore, enthusiasts have launched crowdfunding to «organize professional studio recording of Belarusian speech using specially selected texts».

Researchers, enthusiasts, startups, and educational initiatives will be able to use the dataset.

In addition, the team plans partnerships with Google, OpenAI, and ElevenLabs — «so that our dataset strengthens their solutions for Belarusians».

Specifically, SONORA is looking for warm intros / direct contacts at:

  • Google;
  • OpenAI;
  • ElevenLabs;
  • Speechify;
  • Meta.

«If you know anyone in these companies and can make an intro — please write to us privately, or write a request on our behalf yourself. The text of the request is here», — the creators ask.

You can listen to how the Belarusian language sounds in technologies today on the homepage.

Comments2

  • ???
    09.03.2026
    Чаму напрыклад гэтым не займаецца Каардынацыйная Рада? Ці гендарныя квоты і лгбт важней чым беларуская мова?
  • Чарговы веласіпед
    10.03.2026
    Так а ў чым розніца паміж дадзеным праектам і Donar.by? Donar.by ж спецыяльна пашыраў функцыянал Common Voice, каб былі датасэты не толькі на распазнаванне гаворкі, але і на агучку.

    Ну а наконт "амаль не існуе якасных беларускамоўных галасавых датасэтаў", то наогул хлусня, на адным толькі Common Voice на дадзены момант запісана і праверана 1800 гадзін ад больш як 8000 чалавек. Такія вялікія датасэты ў свабодным доступе мала для якой мовы існуюць.

Now reading

Dzmitry from Olshany criticized Beryoza stewed meat. He was given many explanations in response 9

Dzmitry from Olshany criticized Beryoza stewed meat. He was given many explanations in response

All news →
All news

Confrontation with the Pentagon intensified competition among leading artificial intelligence developers 5

Belarusian Woman Released on December 13 Didn't Know She Was Pregnant: She Found Out About the Child in Her Fifth Month. 7

Green in Skala Shopping Center in Minsk might close

How Donald Trump's Wealth Changed During a Year of Presidency 2

In Ukraine, a banker published a photo of a client who was undergoing verification «against the background of the Russian flag» and got into a scandal50

Criminal case opened against Katsiaryna Vadanosava for rehabilitation of Nazism 48

"Sometimes I say I work in the beauty industry": how a Belarusian woman does makeup for the deceased 1

Zelenskyy: Ukraine now has 'cards', and everyone understood this 4

Tsapkala says he discussed steps in the US if Trump's efforts to free Belarusian political prisoners fail 4

больш чытаных навін
больш лайканых навін

Dzmitry from Olshany criticized Beryoza stewed meat. He was given many explanations in response 9

Dzmitry from Olshany criticized Beryoza stewed meat. He was given many explanations in response

Main
All news →

Заўвага:

 

 

 

 

Закрыць Паведаміць