The three StoppableQueueBlockingRunnable threads (MainLoop, NetworkSender, PortStorageWriter) block in interruptible_get on a queue.get() with no timeout. By design this get() is indefinitely blocking so the engine only advances when a real message or the RUNNABLE_STOP marker arrives, and the Scala side mirrors this. That design is intentional and should be preserved: the loop must not return to receive() on a quiet queue.
This is a defensive-hardening request, not a report of a reproducible failure. As the code stands today, stop() reliably sets and delivers the RUNNABLE_STOP marker through correctly-locked queues, so no shutdown hang has been observed or reproduced in CI. The goal is to decouple "stop was requested" from "the marker wakeup was delivered," so that a future change (a new stop path that forgets the marker, or a change to the queue's notify logic) cannot silently reintroduce a shutdown hang.
Proposed approach (data path and blocking semantics preserved):
- Add a threading.Event stop flag; stop() sets it in addition to enqueueing the marker. interruptible_get polls with a short timeout and treats queue.Empty as "loop and wait again" (continue), so it never returns control to receive() / the handling loop on a timeout. It only returns on a real item, or raises InterruptRunnable when the flag is set. This keeps the indefinite-blocking semantics for the data path intact while ensuring a stop request is honored within one poll interval even if the single marker wakeup were ever missed.
- Thread an optional timeout through Getable.get -> InternalQueue.get -> LinkedBlockingMultiQueue.get (the last via Condition.wait_for). Default timeout=None keeps the existing blocking behavior unchanged; only the stoppable threads opt into polling.
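
A minimal sketch of the two pieces above, to make the intended control flow concrete. It is not the project's actual code: SketchQueue is a hypothetical single-queue stand-in for LinkedBlockingMultiQueue, the poll_interval parameter and its 50 ms default are assumptions, and RUNNABLE_STOP and InterruptRunnable are redefined locally so the snippet runs on its own. Raising InterruptRunnable when the marker is dequeued is also an assumption about how the marker is handled today.

```python
import queue
import threading
from collections import deque
from typing import Any, Optional

RUNNABLE_STOP = object()  # local stand-in for the real stop marker


class InterruptRunnable(Exception):
    """Local stand-in for the exception that unwinds a stoppable thread."""


class SketchQueue:
    """Hypothetical stand-in for LinkedBlockingMultiQueue (single sub-queue)."""

    def __init__(self) -> None:
        self._items: deque = deque()
        self._not_empty = threading.Condition()

    def put(self, item: Any) -> None:
        with self._not_empty:
            self._items.append(item)
            self._not_empty.notify()

    def get(self, timeout: Optional[float] = None) -> Any:
        # timeout=None waits indefinitely, i.e. the existing behavior.
        with self._not_empty:
            ready = self._not_empty.wait_for(lambda: len(self._items) > 0, timeout)
            if not ready:
                # Only reachable when a caller opted into a finite timeout.
                raise queue.Empty
            return self._items.popleft()


def interruptible_get(
    q: SketchQueue, stop_event: threading.Event, poll_interval: float = 0.05
) -> Any:
    """Block until a real item arrives; raise InterruptRunnable on stop."""
    while True:
        # The flag is checked independently of the marker wakeup.
        if stop_event.is_set():
            raise InterruptRunnable
        try:
            item = q.get(timeout=poll_interval)
        except queue.Empty:
            # Quiet queue: wait again, never return to the caller.
            continue
        if item is RUNNABLE_STOP:
            raise InterruptRunnable
        return item
```

From the caller's point of view interruptible_get still blocks until there is something to handle; the timeout is visible only inside the helper. In this sketch the flag takes priority over items that are already queued, which may or may not be the desired drain behavior on shutdown.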
Open question: whether this hardening is worth the added surface area (a timeout on a queue that is intentionally infinite-blocking) given no failure has been reproduced. See discussion in #5326.