This page walks you through WSL's Respellings system, which lets you give the AI precise pronunciation instructions for a word.
Respellings let you format a word within the text body parameter in a way that tells the AI exactly what sound each syllable should make and which syllables should be stressed.
A respelling is composed of the following components:
- each individual syllable, spelled using consonants and vowels that represent specific sounds (see the Respellings Reference Chart for those mappings)
- a dash separating each individual syllable
- at least one capitalized syllable to indicate word stress
- unstressed syllables remain in lowercase
Every respelling must be wrapped in double colons: :: before the first syllable and :: after the last.
For example, "tuple" is respelled as ::TUH-puhl::. The adjective form of "live" (as in "a live broadcast") is respelled as ::LYV::.
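The formatting rules above can be sketched as a small Python helper. This is an illustration only, not part of the WSL API: the names build_respelling and looks_like_respelling are hypothetical, and the check assumes syllables contain only letters, which the Respellings Reference Chart may or may not require. It checks the shape of a respelling, not whether each syllable uses valid chart spellings.

```python
import re

# Assumed shape, based only on the rules listed above: dash-separated
# alphabetic syllables wrapped in double colons.
RESPELLING_RE = re.compile(r"^::[A-Za-z]+(?:-[A-Za-z]+)*::$")


def build_respelling(syllables, stressed):
    """Join syllables into a ::...:: respelling.

    syllables: list of syllable spellings, e.g. ["tuh", "puhl"]
    stressed:  set of zero-based indexes of the syllables to capitalize
    """
    if not syllables:
        raise ValueError("a respelling needs at least one syllable")
    if not stressed or not all(0 <= i < len(syllables) for i in stressed):
        raise ValueError("stressed must name at least one valid syllable index")
    parts = [
        s.upper() if i in stressed else s.lower()
        for i, s in enumerate(syllables)
    ]
    return "::" + "-".join(parts) + "::"


def looks_like_respelling(token):
    """Check the overall shape only; chart spellings are not verified."""
    if not RESPELLING_RE.match(token):
        return False
    syllables = token.strip(":").split("-")
    # Each syllable is fully upper- or lowercase, and at least one is stressed.
    uniform_case = all(s.isupper() or s.islower() for s in syllables)
    return uniform_case and any(s.isupper() for s in syllables)
```

With these definitions, build_respelling(["tuh", "puhl"], {0}) produces ::TUH-puhl::, matching the "tuple" example above.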
To make a call using a respelling, include the formatted respelling directly in the text body parameter of your request:
curl --location --request POST 'https://api.wellsaidlabs.com/v1/tts/stream' \
--header 'X-Api-Key: YOUR_API_KEY' \
--header 'Accept: audio/mpeg' \
--header 'Content-Type: application/json' \
--data-raw '{
"text": "Coming in ::LYV:: from Seattle!",
"speaker_id": "3"
}' > live.mp3
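For readers calling the API from code rather than the command line, the same request can be sketched in Python using only the standard library. The endpoint, headers, and body fields are copied from the curl example above; the function names build_tts_request and save_speech are hypothetical, and error handling is omitted.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.wellsaidlabs.com/v1/tts/stream"


def build_tts_request(text, speaker_id, api_key):
    """Build (but do not send) a POST request mirroring the curl call."""
    body = json.dumps({"text": text, "speaker_id": speaker_id}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "X-Api-Key": api_key,
            "Accept": "audio/mpeg",
            "Content-Type": "application/json",
        },
    )


def save_speech(request, path):
    """Send the request and write the returned audio bytes to a file."""
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())


# Example usage (requires a valid API key and network access):
# req = build_tts_request("Coming in ::LYV:: from Seattle!", "3", "YOUR_API_KEY")
# save_speech(req, "live.mp3")
```

The respelling travels inside the ordinary text string, so no extra parameter or escaping is needed beyond normal JSON encoding.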
Respellings Reference Chart
Vowels | Consonants | |||||||
---|---|---|---|---|---|---|---|---|