feat(soniox): support stt-rt-v5 with endpoint_sensitivity option #6126

Merged
tinalenguyen merged 3 commits into livekit:main from mihafabcic-soniox:feat/soniox-stt-v5
Jun 18, 2026
Conversation

@mihafabcic-soniox

Contributor

Updates the LiveKit Soniox plugin for the v5 model.

  • Add endpoint_sensitivity to STTOptions (float | None, range -1.0 to 1.0). Controls how quickly the model commits endpoints. Higher values finalize sooner. Only supported by v5; earlier models reject the field. Skipped on the wire when None so the server uses its default.
  • Default STT model is now stt-rt-v5.
  • Default max_endpoint_delay_ms raised from 500 (the API minimum) to 2000. The old default was too aggressive on phone-call audio: short pauses between word or digit groups would cause Soniox to finalize a segment too early, before the model had enough context. 2000 matches the Soniox API's own default.
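
A minimal sketch of the behavior described in the list above, not the plugin's actual code. `SketchSTTOptions` and `build_config` are hypothetical names, the payload keys are assumed to mirror the field names, and the client-side range check is an assumption (the description only states the valid range).

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class SketchSTTOptions:
    """Hypothetical stand-in for STTOptions; only the fields discussed above."""

    model: str = "stt-rt-v5"
    max_endpoint_delay_ms: int = 2000  # 500 is the API minimum per the description
    endpoint_sensitivity: float | None = None  # -1.0 to 1.0, v5 only

    def __post_init__(self) -> None:
        # Assumed client-side check; the real plugin may leave this to the server.
        s = self.endpoint_sensitivity
        if s is not None and not -1.0 <= s <= 1.0:
            raise ValueError("endpoint_sensitivity must be between -1.0 and 1.0")


def build_config(opts: SketchSTTOptions) -> dict[str, Any]:
    """Build a request payload, omitting endpoint_sensitivity when it is None."""
    config: dict[str, Any] = {
        "model": opts.model,
        "max_endpoint_delay_ms": opts.max_endpoint_delay_ms,
    }
    # Leaving the key out lets the server apply its own default, and avoids
    # sending a field that pre-v5 models reject.
    if opts.endpoint_sensitivity is not None:
        config["endpoint_sensitivity"] = opts.endpoint_sensitivity
    return config
```

With these assumptions, `build_config(SketchSTTOptions())` carries no `endpoint_sensitivity` key, while `build_config(SketchSTTOptions(endpoint_sensitivity=0.5))` includes it.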

2000ms is the Soniox API's own default and works well in practice. The
previous value of 500ms, the API minimum, is too aggressive and can
cause word recognition issues when the model finalizes tokens too early.
@mihafabcic-soniox requested a review from a team as a code owner on June 16, 2026 15:00

@devin-ai-integration (bot) left a comment

Contributor

Devin Review found 1 potential issue.

  enable_language_identification: bool = True

- max_endpoint_delay_ms: int = 500
+ max_endpoint_delay_ms: int = 2000

Contributor

🚩 Breaking default change: max_endpoint_delay_ms 500 → 2000

The default max_endpoint_delay_ms changed from 500 to 2000. This is a 4× increase in the maximum endpoint detection delay, meaning existing users who rely on the default will experience noticeably later speech finalization. While this appears intentional for the v5 model, it is a behavioral breaking change for any caller that constructs STTOptions() without explicitly setting this field. The livekit-plugins-inworld plugin references soniox/stt-rt-v4 in comments (livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/stt.py:55) — that plugin may also need updating if it depends on these defaults.
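
Callers who want the earlier behavior can set the field explicitly instead of relying on the default. A minimal sketch, assuming a reduced stand-in class (`DelayOptionsSketch` and `LEGACY_MAX_ENDPOINT_DELAY_MS` are hypothetical names, not part of the plugin):

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DelayOptionsSketch:
    """Hypothetical stand-in for STTOptions, reduced to the field under discussion."""

    max_endpoint_delay_ms: int = 2000  # new default from this PR


# Previous default, which the PR description says is also the API minimum.
LEGACY_MAX_ENDPOINT_DELAY_MS = 500

defaulted = DelayOptionsSketch()
# Pin the old value explicitly so a default change does not alter behavior.
pinned = replace(defaulted, max_endpoint_delay_ms=LEGACY_MAX_ENDPOINT_DELAY_MS)

# Ratio between the new and the old default.
ratio = defaulted.max_endpoint_delay_ms / pinned.max_endpoint_delay_ms
```

With the real plugin, the equivalent would presumably be passing `max_endpoint_delay_ms=500` when constructing `STTOptions`, since that field appears in the diff.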


@tinalenguyen left a comment

Member

lgtm, left a small comment

@@ -121,17 +119,25 @@ class STTOptions:
  enable_speaker_diarization: bool = False
  enable_language_identification: bool = True

- max_endpoint_delay_ms: int = 500
+ max_endpoint_delay_ms: int = 2000

Member

@mihafabcic-soniox could you provide more context on why you changed the default here?

Contributor Author

Some context is in the PR description. The change came from a real customer who hit transcription issues from endpoints firing too aggressively at 500ms (e.g. extra digits added to phone numbers). Bumping to 2000ms (the Soniox API's own default) resolved them.

@tinalenguyen left a comment

Member

thank you for the PR!

@tinalenguyen merged commit b681172 into livekit:main on Jun 18, 2026
19 checks passed
2 participants