# Implementing SSML markup to control intonation and pauses

SSML (Speech Synthesis Markup Language) is an XML-based markup language for precise control of pronunciation, pauses, intonation, and speaking rate in text-to-speech (TTS). It is a W3C standard supported by Google TTS, Azure, Amazon Polly, and Yandex SpeechKit.

### Basic SSML tags

```xml
Добрый день.
Ваш заказ
Компания
Позвоните
<mstts:express-as style="customerservice">
Чем могу помочь?
</mstts:express-as>
```

### SSML generator for IVR

```python
from xml.etree.ElementTree import Element, SubElement, tostring


class SSMLBuilder:
    def __init__(self, lang: str = "ru-RU"):
        self.speak = Element("speak", {
            "version": "1.0",
            "xmlns": "http://www.w3.org/2001/10/synthesis",
            "xml:lang": lang,
        })

    def add_text(self, text: str) -> "SSMLBuilder":
        # Text after a child element belongs in that child's tail;
        # appending to speak.text would move it ahead of every child.
        if len(self.speak):
            last = self.speak[-1]
            last.tail = (last.tail or "") + text
        else:
            self.speak.text = (self.speak.text or "") + text
        return self

    def add_pause(self, ms: int) -> "SSMLBuilder":
        SubElement(self.speak, "break", {"time": f"{ms}ms"})
        return self

    def add_prosody(self, text: str, rate: str = "medium",
                    pitch: str = "medium") -> "SSMLBuilder":
        p = SubElement(self.speak, "prosody",
                       {"rate": rate, "pitch": pitch})
        p.text = text
        return self

    def build(self) -> str:
        return tostring(self.speak, encoding="unicode")


# Usage (ru-RU prompts: "Good afternoon!", "Your balance is five thousand rubles.")
ssml = (SSMLBuilder()
        .add_text("Добрый день! ")
        .add_pause(300)
        .add_prosody("Ваш баланс составляет пять тысяч рублей.", rate="slow")
        .build())
```

### SSML support by provider

| Tag | Google | Azure | AWS Polly | Yandex |
|-----|--------|-------|-----------|--------|
| `<break>` | ✓ | ✓ | ✓ | ✓ |
| `<prosody>` | ✓ | ✓ | ✓ | partially |
| `<say-as>` | ✓ | ✓ | ✓ | limited |
| `<phoneme>` | ✓ | ✓ | ✓ | - |
| `<emphasis>` | ✓ | ✓ | ✓ | - |
| `<lang>` | ✓ | ✓ | - | - |

Timeframe: SSML template development for IVR takes 2–3 days; an SSML generator for dynamic responses takes 3–5 days.
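
Because tag support differs between providers, one template may need adapting before it is sent to a given engine. The sketch below shows one possible way to do that: it transcribes the support table above into a lookup and replaces unsupported tags with their plain text. The names `SUPPORTED_TAGS` and `strip_unsupported` and the provider keys are hypothetical. Treating "partially" and "limited" as supported is a simplifying assumption, and any tag absent from the table is unwrapped as well.

```python
import xml.etree.ElementTree as ET

# Keep the SSML namespace as the default one when serializing parsed input.
ET.register_namespace("", "http://www.w3.org/2001/10/synthesis")

# Support matrix transcribed from the table above. Assumption: "partially"
# and "limited" count as supported; a real integration may need finer rules.
SUPPORTED_TAGS = {
    "google": {"break", "prosody", "say-as", "phoneme", "emphasis", "lang"},
    "azure": {"break", "prosody", "say-as", "phoneme", "emphasis", "lang"},
    "polly": {"break", "prosody", "say-as", "phoneme", "emphasis"},
    "yandex": {"break", "prosody", "say-as"},
}


def _local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix ElementTree puts on parsed tags."""
    return tag.rsplit("}", 1)[-1]


def _append_text(parent: ET.Element, index: int, text) -> None:
    """Append text just before position `index` among parent's children."""
    if not text:
        return
    if index == 0:
        parent.text = (parent.text or "") + text
    else:
        prev = parent[index - 1]
        prev.tail = (prev.tail or "") + text


def _unwrap_unsupported(parent: ET.Element, allowed: set) -> None:
    """Replace each disallowed child with its own text and children."""
    i = 0
    while i < len(parent):
        child = parent[i]
        if _local_name(child.tag) in allowed:
            _unwrap_unsupported(child, allowed)
            i += 1
            continue
        inner = list(child)
        _append_text(parent, i, child.text)
        parent.remove(child)
        for offset, node in enumerate(inner):
            parent.insert(i + offset, node)
        _append_text(parent, i + len(inner), child.tail)
        # i is not advanced, so spliced-in children are checked next.


def strip_unsupported(ssml: str, provider: str) -> str:
    """Return SSML in which tags the provider lacks are reduced to plain text."""
    root = ET.fromstring(ssml)
    _unwrap_unsupported(root, SUPPORTED_TAGS[provider])
    return ET.tostring(root, encoding="unicode")
```

Calling `strip_unsupported(template, "yandex")` on a template that uses `<emphasis>` keeps the emphasized words but drops the tag itself. Unwrapping rather than rejecting the template is a design choice here: the prompt is still spoken, only without the unsupported effect.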







