Don't wake a runtime-suspended dGPU to service NVPCF/GPS ACPI notifies #1181

Open
ElXreno wants to merge 1 commit into NVIDIA:main from ElXreno:fix/rtd3-no-wake-on-acpi-notify

ElXreno commented Jun 6, 2026

On RTD3 laptops the discrete GPU sits in D3cold while idle. On some machines the platform keeps delivering ACPI Notify() events (to the NVPCF device, and as GPS status changes) even while the GPU is suspended, for example around battery/AC transitions or when the SBIOS pushes a new thermal or power-limit hint.

The two handlers that service those notifies, rm_acpi_nvpcf_notify() and RmHandleGPSStatusChange(), both call os_ref_dynamic_power() unconditionally. That resumes the GPU purely to deliver an event that has no effect while it is powered down, after which the GPU re-suspends. On a fair number of laptops the next notify lands right away, so the GPU never settles in D3cold. Folks in #860 describe it cycling D0/D3cold every ~11 seconds on battery, and the current workaround is to patch the ACPI tables to strip the Notify(NPCF, 0xC0).

This change skips the resume when the GPU is already runtime-suspended (NV_DYNAMIC_POWER_STATE_IDLE_INDICATED), the same guard rm_pmu_perfmon_get_load() already uses a few functions away. Why it's safe:

  • the NVPCF event is only consumed by nvidia-powerd, which has nothing to do while the GPU is asleep
  • GPS/SBIOS data is re-read on the next StateLoad, so a skipped sync is recovered when the GPU next powers up
  • power-source (AC/battery) changes go through a separate path (rm_power_source_change_event) and are left untouched
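
To make the before/after behaviour concrete, here is a minimal standalone model of the guard. It is not the driver code: every `mock_` name, the refcount-based power model, and the struct layout are hypothetical stand-ins, and only the idea of checking for the idle-indicated state before taking a dynamic-power reference comes from the description above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's dynamic power states. Only the
 * "idle indicated" concept is taken from the PR text. */
typedef enum {
    MOCK_POWER_STATE_IN_USE,
    MOCK_POWER_STATE_IDLE_INDICATED   /* runtime-suspended (D3cold) */
} mock_power_state_t;

typedef struct {
    unsigned int refcount;          /* 0 means nobody holds the GPU awake */
    unsigned int resume_count;      /* number of D3cold -> D0 transitions */
    unsigned int events_delivered;  /* notifies actually serviced */
} mock_gpu_t;

static mock_power_state_t mock_power_state(const mock_gpu_t *gpu)
{
    return gpu->refcount == 0 ? MOCK_POWER_STATE_IDLE_INDICATED
                              : MOCK_POWER_STATE_IN_USE;
}

/* Models taking a dynamic-power reference: the first reference on a
 * suspended GPU forces a resume. */
static void mock_ref_dynamic_power(mock_gpu_t *gpu)
{
    if (gpu->refcount == 0)
        gpu->resume_count++;
    gpu->refcount++;
}

/* Models dropping the reference; at zero the GPU is idle again and can
 * re-suspend. */
static void mock_unref_dynamic_power(mock_gpu_t *gpu)
{
    gpu->refcount--;
}

/* Old behaviour: always take a reference, waking a sleeping GPU. */
static void mock_notify_unguarded(mock_gpu_t *gpu)
{
    mock_ref_dynamic_power(gpu);
    gpu->events_delivered++;
    mock_unref_dynamic_power(gpu);
}

/* New behaviour: bail out early if the GPU is already runtime-suspended.
 * Returns true only if the event was actually serviced. */
static bool mock_notify_guarded(mock_gpu_t *gpu)
{
    if (mock_power_state(gpu) == MOCK_POWER_STATE_IDLE_INDICATED)
        return false;

    mock_ref_dynamic_power(gpu);
    gpu->events_delivered++;
    mock_unref_dynamic_power(gpu);
    return true;
}
```

In this model, five notifies against a suspended GPU cause five resumes through `mock_notify_unguarded()` and none through `mock_notify_guarded()`, while a GPU that is already in use services the event the same way in both.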

I traced the wake on an RTX 4060 mobile (ASUS TUF, open module):

acpi_ev_notify_dispatch -> rm_acpi_nvpcf_notify -> os_ref_dynamic_power
  -> nv_indicate_not_idle -> ... -> acpi_pci_set_power_state (D3cold -> D0)

With the guard in place the handlers still run, but the GPU stays in D3cold. Verified on 595.45.04 and 610.43.02.
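
One way to check the runtime-PM state from userspace is to read the device's `power/runtime_status` attribute in sysfs, which reports `suspended` for a runtime-suspended device. The sketch below is only an illustration of that check, not necessarily how the result above was verified; the helper names are hypothetical, and the PCI address in the usage note is a placeholder that differs per machine.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Reads the first line of a sysfs attribute into buf and strips the
 * trailing newline. Returns false if the file cannot be opened or read. */
static bool read_runtime_status(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return false;

    bool ok = fgets(buf, (int)len, f) != NULL;
    fclose(f);

    if (ok)
        buf[strcspn(buf, "\n")] = '\0';
    return ok;
}

/* True only when the attribute could be read and says "suspended". */
static bool is_runtime_suspended(const char *path)
{
    char status[32];

    return read_runtime_status(path, status, sizeof status) &&
           strcmp(status, "suspended") == 0;
}
```

Calling `is_runtime_suspended("/sys/bus/pci/devices/0000:01:00.0/power/runtime_status")` in a loop while triggering AC/battery transitions would show whether the dGPU is being woken; substitute the address of the actual dGPU.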

…notifies

On RTD3 laptops the dGPU is runtime-suspended (D3cold) while idle. Some
platforms still deliver ACPI Notify() events for the NVPCF device and for
GPS status changes while the GPU is suspended (for example around battery/AC
transitions, or when the SBIOS pushes a thermal or power-limit hint).

rm_acpi_nvpcf_notify() and RmHandleGPSStatusChange() both call
os_ref_dynamic_power() unconditionally, resuming the GPU only to deliver an
event that is meaningless while it is powered down. The GPU then re-suspends,
and where the next notify arrives immediately it never settles in D3cold,
cycling D0/D3cold and draining the battery (see NVIDIA#860, where users work around
it by patching the ACPI tables to drop the Notify(NPCF, 0xC0)).

Skip the work when the GPU is already runtime-suspended
(NV_DYNAMIC_POWER_STATE_IDLE_INDICATED), the same guard that
rm_pmu_perfmon_get_load() already uses. The NVPCF event is only consumed
while the GPU is active, and GPS/SBIOS state is re-read on the next
StateLoad, so no state is lost.

CLAassistant commented Jun 6, 2026

CLA assistant check
All committers have signed the CLA.