Speech-to-Text (STT)

STT is responsible for:

Converting caller audio into transcript text
Detecting end-of-turn pauses
Supporting language detection
Providing confidence signals

Transcript accuracy directly affects:

Intent identification
Question sequencing
Location capture
Summary quality

STT Configuration Elements

Depending on deployment, configurable elements may include:

Language model selection
End-of-turn sensitivity
Smart formatting options
Confidence thresholds
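
As an illustration only, the four elements above could be grouped into a single settings object that is checked before deployment. This is a minimal sketch: the class name, field names, and default values are hypothetical and are not taken from any specific STT provider or from this product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SttConfig:
    """Hypothetical container for the configurable elements listed above."""

    language_model: str = "general"      # placeholder model name (assumption)
    end_of_turn_silence_ms: int = 700    # pause length treated as end of turn (assumption)
    smart_formatting: bool = True        # format numbers, addresses, and similar data
    min_confidence: float = 0.6          # transcripts below this are treated as uncertain

    def problems(self) -> list[str]:
        """Return human-readable configuration problems (empty list if none)."""
        issues = []
        if not self.language_model:
            issues.append("language_model must not be empty")
        if self.end_of_turn_silence_ms <= 0:
            issues.append("end_of_turn_silence_ms must be positive")
        if not 0.0 <= self.min_confidence <= 1.0:
            issues.append("min_confidence must be between 0 and 1")
        return issues
```

Calling SttConfig(min_confidence=1.5).problems() returns one message about the confidence range, which makes an out-of-range value visible before it reaches a live call.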

Improper STT configuration can lead to:

Misidentified intent
Excessive clarification
Conversation instability
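
One plausible way these outcomes relate to a confidence threshold: a threshold set too low accepts uncertain transcripts (risking misidentified intent), while one set too high triggers clarification on most turns. The sketch below assumes confidence scores in the range 0 to 1 and uses hypothetical function names; it only measures how often a given threshold would trigger clarification on a sample of turns.

```python
def needs_clarification(confidence: float, threshold: float) -> bool:
    """True when a turn's confidence falls below the threshold."""
    return confidence < threshold


def clarification_rate(confidences: list[float], threshold: float) -> float:
    """Fraction of turns that would trigger a clarification prompt."""
    if not confidences:
        return 0.0
    flagged = sum(needs_clarification(c, threshold) for c in confidences)
    return flagged / len(confidences)
```

Running clarification_rate over the confidence values from a batch of test calls at several candidate thresholds gives a rough view of the trade-off before a value is chosen.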

How to Validate STT Performance

Place a controlled test call.
Speak structured data clearly (for example, an address and a phone number).
Review the transcript in Triage.
Confirm:
- Numbers are formatted correctly
- No words are missing
- No fragments are repeated
Compare the transcript against what was actually spoken.

Frequent transcript errors may indicate provider or tuning issues.
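
The comparison in the validation steps above can be partly automated once the transcript text has been copied out of Triage. The following is a minimal sketch, assuming the tester keeps a written script of what was spoken; the function names are hypothetical. It computes a word error rate (word-level edit distance divided by the number of reference words), lists missing words, and flags fragments repeated back to back. It normalizes punctuation and case, so number formatting still needs to be checked by eye.

```python
import re
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_error_rate(reference: str, transcript: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = normalize(reference), normalize(transcript)
    if not ref:
        return 0.0 if not hyp else 1.0
    prev = list(range(len(hyp) + 1))  # distances for an empty reference prefix
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from transcript
                            curr[j - 1] + 1,      # extra word in transcript
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def missing_words(reference: str, transcript: str) -> list[str]:
    """Words that occur more often in the reference than in the transcript."""
    diff = Counter(normalize(reference)) - Counter(normalize(transcript))
    return sorted(diff.elements())


def repeated_fragments(transcript: str, max_size: int = 3) -> list[str]:
    """Word sequences (1 to max_size words) that appear twice back to back."""
    words = normalize(transcript)
    found = []
    for size in range(1, max_size + 1):
        for i in range(len(words) - 2 * size + 1):
            if words[i:i + size] == words[i + size:i + 2 * size]:
                found.append(" ".join(words[i:i + size]))
    return found
```

For example, with the spoken script "My address is 42 Oak Street" and the transcript "my address is is 42 Oak", missing_words returns ["street"] and repeated_fragments returns ["is"]. Tracking the error rate across repeated test calls helps distinguish a one-off error from the frequent errors described above.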
