pua_dialoginfo: fix early state lifetime and clean up dangling early branches #3915

Merged
bogdan-iancu merged 1 commit into OpenSIPS:master from NormB:fix/pua-dialoginfo-early-lifetime
Jun 11, 2026
@NormB NormB commented Jun 10, 2026

Member

Fixes #3802, as requested in #3802 (comment).

This is the patch from the Feb 13 comment, tested by @andrewyager in pre-prod/prod against 3.4, rebased onto current master with the following adjustments:

  • The failure-handler loop now iterates only the last set of active branches (t->first_branch .. t->nr_of_outgoings) instead of all MAX_BRANCHES, per @bogdan-iancu's review comment.
  • Adapted to the new branch bitmask API from f0223cc (BRANCH_BM_TST_IDX/BRANCH_BM_SET_IDX instead of raw long long shifts) and to TM_BRANCH().
  • The handler now also sets bitmask_failed for each branch it terminates, so a late-arriving negative reply on that branch cannot publish a duplicate "terminated".
  • The handler honors the DLG_PUB_A/DLG_PUB_B publishing flags, consistent with the rest of the callback.
  • No locking in the failure path: TMCB_ON_FAILURE already runs under the transaction's reply lock (via t_should_relay_response() -> run_failure_handlers()), so the bitmask test/set is safe and re-taking t->reply_mutex would deadlock.
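The adjustments above can be pictured with a small standalone model. This is only a sketch under stated assumptions: `sketch_txn`, `sketch_ctx`, `bm_test`, `bm_set`, `SKETCH_PUB_A`/`SKETCH_PUB_B` and both handler functions are hypothetical stand-ins, not the OpenSIPS structures, the `BRANCH_BM_*` macros or the `DLG_PUB_*` flags, and it assumes `nr_of_outgoings` is one past the last branch index in use.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; NOT the OpenSIPS definitions. */
#define SKETCH_PUB_A (1u << 0) /* publish for the caller side */
#define SKETCH_PUB_B (1u << 1) /* publish for the callee side */

struct sketch_txn {
    int first_branch;    /* first branch of the last set of active branches */
    int nr_of_outgoings; /* assumed: one past the last branch in use */
};

struct sketch_ctx {
    uint64_t bitmask_early;  /* branches that published "early" */
    uint64_t bitmask_failed; /* branches already published as "terminated" */
    unsigned int pub_flags;  /* which sides to publish for */
};

static int bm_test(uint64_t bm, int idx)
{
    assert(idx >= 0 && idx < 64);
    return (int)((bm >> idx) & 1u);
}

static void bm_set(uint64_t *bm, int idx)
{
    assert(idx >= 0 && idx < 64);
    *bm |= (uint64_t)1 << idx;
}

/* Counts the "terminated" publications that would be sent for one branch,
 * honoring the per-side publishing flags. */
static int publish_terminated(const struct sketch_ctx *c)
{
    int sent = 0;
    if (c->pub_flags & SKETCH_PUB_A) sent++;
    if (c->pub_flags & SKETCH_PUB_B) sent++;
    return sent;
}

/* Failure handler: walks only the last set of active branches. No locking
 * here, mirroring the note that the callback already runs under the
 * transaction's reply lock. */
static int on_failure(const struct sketch_txn *t, struct sketch_ctx *c)
{
    int sent = 0;
    for (int b = t->first_branch; b < t->nr_of_outgoings; b++) {
        if (!bm_test(c->bitmask_early, b) || bm_test(c->bitmask_failed, b))
            continue;
        bm_set(&c->bitmask_failed, b); /* block a later duplicate */
        sent += publish_terminated(c);
    }
    return sent;
}

/* Negative-reply path: publishes "terminated" at most once per branch. */
static int on_negative_reply(struct sketch_ctx *c, int branch)
{
    if (bm_test(c->bitmask_failed, branch))
        return 0;
    bm_set(&c->bitmask_failed, branch);
    return publish_terminated(c);
}
```

In this model, a branch that the failure handler has already terminated is marked in `bitmask_failed`, so a negative reply arriving afterwards on the same branch publishes nothing further.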

Summary of the fix

  1. On 1xx replies, use DEFAULT_CREATED_LIFETIME for the published state lifetime instead of fr_timer.time_out - get_ticks() — the fr_timer is updated only after TMCB_RESPONSE_IN runs, so a provisional reply arriving close to the timer expiration produced a near-zero lifetime and orphaned presence records.
  2. Hook on TMCB_ON_FAILURE and publish "terminated" for any branch that entered "early" state but never received a final negative reply, so the longer lifetime cannot leave a dangling "early" state when the UAS goes silent after ringing.
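The first point comes down to simple arithmetic, sketched below. The helper names are hypothetical, and the value 3600 for `SKETCH_CREATED_LIFETIME` is an assumption inferred from the `expires=3600` seen in the re-validation run, not copied from the module source.

```c
/* Assumed value, inferred from expires=3600 in the re-validation run. */
#define SKETCH_CREATED_LIFETIME 3600

/* Old behaviour (sketch): the lifetime is whatever remains of the
 * final-response timer when the 1xx callback runs. The timer is re-armed
 * only after that callback, so the remainder can be close to zero. */
static int old_early_lifetime(unsigned int timer_deadline, unsigned int now)
{
    return (int)(timer_deadline - now);
}

/* New behaviour (sketch): a fixed lifetime, independent of timer state. */
static int new_early_lifetime(void)
{
    return SKETCH_CREATED_LIFETIME;
}
```

With a 15-tick deadline and a 180 arriving at tick 8, `old_early_lifetime(15, 8)` gives 7, while `new_early_lifetime()` gives the fixed value regardless of when the reply arrives.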

Re-validation on current master

Re-ran the scenarios from the issue against this rebased patch (master @ 4ddb8f0, loopback, fr_timeout=10, fr_inv_timeout=15):

  • Late 180 (8s delay): early state published with expires=3600 (previously 7s remainder); 200 OK then publishes confirmed with the dialog lifetime — original bug fixed.
  • Silent UAS (180 then nothing): on fr_inv_timeout, "terminated" with expires=0 is published for both directions before the failure route runs, and pua deletes the record from its hash table — no dangling early state.
  • Successful call: no spurious "terminated" from the failure handler.

…branches

The expire for "trying"/"early" publications was computed from the
remainder of the branch fr_timer, which is updated only AFTER the
TMCB_RESPONSE_IN callback runs. A provisional reply arriving close to
the timer expiration produced a near-zero lifetime, so the published
state expired before the dialog module could update it, leaving orphan
presence records. Use DEFAULT_CREATED_LIFETIME instead.

As the longer lifetime may leave a dangling "early" state when a branch
never receives a final negative reply (e.g. UAS goes silent after
ringing), also hook on TMCB_ON_FAILURE and publish "terminated" for any
branch from the last set of active branches which entered early state
but never failed.

Closes OpenSIPS#3802
@bogdan-iancu bogdan-iancu merged commit 6755c14 into OpenSIPS:master Jun 11, 2026
168 of 176 checks passed
@bogdan-iancu

Member

Thank you @NormB

@bogdan-iancu bogdan-iancu self-assigned this Jun 11, 2026
@bogdan-iancu bogdan-iancu added this to the 4.0.0-rc1 milestone Jun 11, 2026
bogdan-iancu added a commit that referenced this pull request Jun 11, 2026
pua_dialoginfo: fix early state lifetime and clean up dangling early branches

(cherry picked from commit 6755c14)
bogdan-iancu added a commit that referenced this pull request Jun 11, 2026
pua_dialoginfo: fix early state lifetime and clean up dangling early branches

(cherry picked from commit 6755c14)
@andrewyager


Thank you!

Development

Successfully merging this pull request may close these issues.

[BUG] pua_dialoginfo: TM callback uses FR timer remainder as presence lifetime, causing orphan records

3 participants