EdgeTX - Voice packs
This repository contains the files needed to generate the voice packages used in EdgeTX.
The currently supported languages are:
- Chinese Mandarin
- Chinese Taiwan Mandarin
- Chinese Hongkong Cantonese
- Chilean Spanish
- Czech
- Danish
- English
- French
- German
- Italian
- Japanese
- Portuguese
- Russian
- Spanish
- Swedish
- Ukrainian
The following languages are not yet supported:
- Dutch
- Hungarian
- Slovak
Directory structure
SOUNDS
This folder has the audio files already processed and separated by language.
To use them, copy the language folder (for example, en) into the SOUNDS folder of your SD card. With the folder added, go to the EdgeTX settings menu and select the matching voice language (e.g. English).
To use any audio on your switches, first copy the file you want to use to your language folder. You can then use this file in your Global Functions or Special Functions by selecting a switch for the function and choosing the Play track option.
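The copy step described above can be sketched as a small shell function. This is only an illustration: install_voice_pack is a hypothetical helper name, and the SD card mount point in the example is an assumption that depends on your system.

```shell
# Hypothetical helper: copy one language folder from this repository's
# SOUNDS directory into the SOUNDS directory on the radio's SD card.
# Usage: install_voice_pack <repo_dir> <sdcard_dir> <lang>
install_voice_pack() {
    repo_dir=$1
    sdcard_dir=$2
    lang=$3

    if [ ! -d "$repo_dir/SOUNDS/$lang" ]; then
        echo "No voice pack found for language '$lang'" >&2
        return 1
    fi

    # Create SOUNDS on the card if it does not exist yet, then copy the
    # whole language folder so it ends up at <sdcard_dir>/SOUNDS/<lang>.
    mkdir -p "$sdcard_dir/SOUNDS"
    cp -R "$repo_dir/SOUNDS/$lang" "$sdcard_dir/SOUNDS/"
}

# Example (the mount point is an assumption; adjust it to your system):
# install_voice_pack . "/media/$USER/SDCARD" en
```

Custom tracks can then be dropped into the same SOUNDS/<lang> folder on the card before assigning them with Play track.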
SCRIPTS
Inside the language folder there is a folder called SCRIPTS, which has audio files for commonly used Lua scripts. These audio files are generated with the same voice as the other audio files of their language pack. Each script has its own folder.
BETAFLIGHT
Audio files for Betaflight TX Lua Scripts. Copy the WAV files from SOUNDS/<lang>/SCRIPTS/BETAFLIGHT/ to SOUNDS/en/ to overwrite the original audio files of the script.
INAV
Audio files for iNav Lua Telemetry Flight Status. Copy the WAV files from SOUNDS/<lang>/SCRIPTS/INAV/ to SCRIPTS/TELEMETRY/iNav/<lang>/ to overwrite the original audio files of the script.
YAAPU
Audio files for Yaapu Telemetry Script and Widget. Copy the WAV files from SOUNDS/<lang>/SCRIPTS/YAAPU/ to SOUNDS/yaapu0/<lang>/ to overwrite the original audio files of the script.
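The three copy operations above follow the same pattern, so they can be sketched as one shell function. This is a minimal sketch, not part of the repository: install_script_sounds is a hypothetical name, the destination paths are taken from the descriptions above and are relative to the root of the SD card, and the lowercase .wav extension is an assumption.

```shell
# Hypothetical helper: copy the audio files of one Lua script to the
# location on the SD card where that script expects them.
# Usage: install_script_sounds <repo_dir> <sdcard_dir> <lang> <script>
install_script_sounds() {
    repo_dir=$1
    sdcard_dir=$2
    lang=$3
    script=$4

    src="$repo_dir/SOUNDS/$lang/SCRIPTS/$script"

    # Destination folders as listed in the sections above.
    case "$script" in
        BETAFLIGHT) dest="$sdcard_dir/SOUNDS/en" ;;
        INAV)       dest="$sdcard_dir/SCRIPTS/TELEMETRY/iNav/$lang" ;;
        YAAPU)      dest="$sdcard_dir/SOUNDS/yaapu0/$lang" ;;
        *) echo "Unknown script '$script'" >&2; return 1 ;;
    esac

    if [ ! -d "$src" ]; then
        echo "No audio files found in $src" >&2
        return 1
    fi

    mkdir -p "$dest"
    # Assumes the files use a lowercase .wav extension.
    cp "$src"/*.wav "$dest"/
}

# Example (the mount point is an assumption; adjust it to your system):
# install_script_sounds . "/media/$USER/SDCARD" fr YAAPU
```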
Voices
All of the voices used in the EdgeTX voice packs have been picked from the neural voices offered by Microsoft Azure text to speech service, in order to get as close as possible to human-like voices. If you want to see what voices are available, and try different phrases, check out the online demo generator. Using some recording software, you could even save your own phrases and use them in the voice packs.
Generating custom phrases
If you have an Azure Speech Services subscription (there is a free usage tier), phrases can be generated with curl or an HTTP client like Postman. After creating a text to speech resource in Azure, you can use it through REST calls (HTTP requests).
The request URL is:
https://<YOUR_RESOURCE_REGION>.tts.speech.microsoft.com/cognitiveservices/v1
You should add the following headers to your request:
Ocp-Apim-Subscription-Key: <YOUR_RESOURCE_KEY>
Content-Type: application/ssml+xml
X-Microsoft-OutputFormat: riff-8khz-16bit-mono-pcm
Note: EdgeTX supports .wav files up to 32 kHz, but in that range 8 kHz is the highest value supported by the conversion service. If you want the best possible audio quality, you can instead request a higher-quality format such as riff-48khz-16bit-mono-pcm and convert it to 32 kHz afterwards with another tool (e.g. ffmpeg -i input.wav -ar 32000 output.wav).
In the raw request body, place your SSML (change the voice name to suit your preference; the full list of available voices is in the Azure Speech documentation):
<speak version='1.0' xml:lang='en-US'>
<voice xml:lang='en-US' xml:gender='Female' name='en-US-MichelleNeural'>YOUR_PHRASE_HERE</voice>
</speak>
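Putting the URL, headers and body together, the request could be sent with curl roughly as follows. This is a sketch under assumptions: write_ssml and synthesize_phrase are hypothetical helper names, AZURE_REGION and AZURE_KEY are hypothetical environment variables holding your resource region and key, and the phrase is inserted as-is, so it must not contain XML special characters such as < or &.

```shell
# Write the SSML request body for one phrase to a file.
# Usage: write_ssml <voice_name> <phrase> <output_file>
write_ssml() {
    voice=$1
    phrase=$2
    out=$3
    cat > "$out" <<EOF
<speak version='1.0' xml:lang='en-US'>
<voice xml:lang='en-US' xml:gender='Female' name='$voice'>$phrase</voice>
</speak>
EOF
}

# Send an SSML file to the text to speech endpoint and save the WAV reply.
# Usage: synthesize_phrase <ssml_file> <wav_file>
# Reads AZURE_REGION and AZURE_KEY (hypothetical variable names).
synthesize_phrase() {
    if [ -z "${AZURE_REGION:-}" ] || [ -z "${AZURE_KEY:-}" ]; then
        echo "Set AZURE_REGION and AZURE_KEY first" >&2
        return 1
    fi
    # --fail makes curl return an error on HTTP 4xx/5xx responses
    # instead of saving the error page as if it were audio.
    curl --fail --silent --show-error \
        --request POST \
        --header "Ocp-Apim-Subscription-Key: $AZURE_KEY" \
        --header "Content-Type: application/ssml+xml" \
        --header "X-Microsoft-OutputFormat: riff-8khz-16bit-mono-pcm" \
        --data-binary "@$1" \
        --output "$2" \
        "https://$AZURE_REGION.tts.speech.microsoft.com/cognitiveservices/v1"
}

# Example (file names are placeholders):
# export AZURE_REGION=<YOUR_RESOURCE_REGION> AZURE_KEY=<YOUR_RESOURCE_KEY>
# write_ssml en-US-MichelleNeural "Landing gear down" phrase.xml
# synthesize_phrase phrase.xml geardn.wav
```

Keeping the SSML in a file and passing it with --data-binary avoids shell quoting problems with the single quotes in the XML attributes.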
How to build yourself
To generate the voice packages and do the release processing, you will need a Linux environment. Ubuntu 18.04 is recommended as it is an LTS release. Newer versions and other flavours of Linux will most likely also work, but are not supported.
You will also need to have the ffmpeg, spx and ffmpeg-normalize packages installed.
ffmpeg is used to clip any silence from the audio files. ffmpeg-normalize is used to normalise the audio files.
spx is the tool that generates the audio files using Microsoft Azure Text to Speech processing.
Installing SPX can be a little tricky, but the following steps should work:
# Register the Microsoft package repository
wget https://packages.microsoft.com/config/ubuntu/20.04/packages-microsoft-prod.deb -O packages-microsoft-prod.deb
sudo dpkg -i packages-microsoft-prod.deb
# Install the .NET 6 SDK, which the Speech CLI runs on
sudo apt-get update && \
sudo apt-get install -y apt-transport-https && \
sudo apt-get update && \
sudo apt-get install -y dotnet-sdk-6.0
# Install the Speech CLI (spx) as a global .NET tool
dotnet tool install --global Microsoft.CognitiveServices.Speech.CLI
After you have installed SPX, you will also need to create a Microsoft Azure account if you don't have one already. There are both free and paid options, but the free one is sufficient for this purpose - it is just rate limited. After you have done that, follow the quick start guide to configure the required region and subscription keys.
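As a rough illustration of that configuration step, the region and key can be stored with spx config, after which spx synthesize can produce a test phrase. The wrapper functions configure_spx and spx_phrase are hypothetical names used only for this sketch, and the values in the example are placeholders; the quick start guide remains the reference for the exact procedure.

```shell
# Store the Azure credentials in the Speech CLI configuration.
# Usage: configure_spx <subscription_key> <region>
configure_spx() {
    spx config @key --set "$1"
    spx config @region --set "$2"
}

# Generate one phrase as a WAV file with the chosen voice.
# Usage: spx_phrase <voice_name> <phrase> <wav_file>
spx_phrase() {
    spx synthesize --voice "$1" --text "$2" --audio output "$3"
}

# Example (placeholders, not real credentials):
# configure_spx <YOUR_RESOURCE_KEY> <YOUR_RESOURCE_REGION>
# spx_phrase en-US-MichelleNeural "Welcome to EdgeTX" hello.wav
```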
Alternatives
- Mike has created a Python script that can be used to generate the audio using Google's Text to Speech service - https://github.com/xsnoopy/edgetx-sdcard-sounds
- The OpenTX Speaker voice generator (Windows only) uses the built-in text to speech engine of Microsoft Windows, and can also be used to generate new audio - https://www.open-tx.org/2014/03/15/opentx-speaker
Contributing
See CONTRIBUTING.md