perf(dpa4): opt so3grid (with pt_expt GridProduct wrapping fix) #5552

Merged
OutisLi merged 3 commits into deepmodeling:master from wanghan-iapcm:fix/dpa4-so3grid-pt-expt-wrap on Jun 18, 2026

wanghan-iapcm commented Jun 18, 2026

Collaborator

Based on #5517 (perf(dpa4): opt so3grid by @OutisLi) — this branch contains all of its commits plus one fix commit that addresses the CI failures on that PR.

Problem

#5517 introduces a new parameter-free GridProduct NativeOP in deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py for the so3grid optimization, but it has no serialize/deserialize and is not registered via register_dpmodel_mapping. The pt_expt backend auto-wraps every dpmodel NativeOP sub-component through _auto_wrap_native_op, which requires the op to be serializable (or registered) to build its dynamic torch wrapper. Otherwise it raises:

TypeError: Cannot auto-wrap GridProduct: it must implement serialize()/deserialize()
           or be explicitly registered via register_dpmodel_mapping().

This broke every Test Python shard that loads a DPA4 pt_expt model (e.g. source/tests/pt_expt/model/test_get_model_dpa4.py::TestGetModelDPA4::test_pair_exclude_types_from_descriptor) on #5517.

Fix

Add trivial serialize/deserialize to GridProduct (no state — mirrors the GridMLP @class/@version convention). _auto_wrap_native_op then passes its hasattr(value, "serialize") guard and returns wrapped_cls.deserialize(value.serialize()) cleanly.
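
A minimal sketch of the before/after behaviour, for illustration only. `auto_wrap` and the two toy classes are hypothetical stand-ins, not the code in deepmd/pt_expt/common.py or grid_net.py, and the `@version` value is an assumption. The sketch only mimics the `hasattr(value, "serialize")` guard and the serialize/deserialize round trip described above.

```python
# Hypothetical stand-ins; the real backend builds a torch wrapper class.


class ProductNoSerialize:
    """Parameter-free op as it looked before the fix (no serialize)."""

    def call(self, left, right):
        return left * right


class ProductWithSerialize(ProductNoSerialize):
    """Same op after the fix: a stateless serialize/deserialize pair."""

    def serialize(self) -> dict:
        # No weights to store; only a class tag and a version (value assumed).
        return {"@class": "ProductWithSerialize", "@version": 1}

    @classmethod
    def deserialize(cls, data: dict) -> "ProductWithSerialize":
        data = dict(data)  # do not mutate the caller's dict
        data.pop("@class", None)
        data.pop("@version", None)
        return cls(**data)  # nothing is left, so this is just cls()


def auto_wrap(value):
    """Mimic the guard: serializable ops round-trip, others are rejected."""
    if hasattr(value, "serialize"):
        # This sketch reuses the op's own class instead of a torch wrapper.
        return type(value).deserialize(value.serialize())
    raise TypeError(
        f"Cannot auto-wrap {type(value).__name__}: it must implement "
        "serialize()/deserialize() or be explicitly registered."
    )


wrapped = auto_wrap(ProductWithSerialize())
print(wrapped.call(2.0, 3.0))
```

Calling `auto_wrap(ProductNoSerialize())` in this sketch raises the `TypeError`, which is the shape of the CI failure on #5517.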

Notes

  • The sibling GridMLP (also new in #5517) already implements serialize/deserialize; only the parameter-free GridProduct was missing them.
  • Verified by tracing the _auto_wrap_native_op code path (deepmd/pt_expt/common.py:138-170); the actual pt_expt DPA4 test runs in CI here.
  • Once green, this can supersede #5517, or the single fix commit can be cherry-picked onto it.

Summary by CodeRabbit

Release Notes

  • Refactor

    • Restructured internal grid-net coefficient processing for improved efficiency.
    • Consolidated grid operation selection logic and refactored supporting utility functions.
  • Documentation

    • Clarified docstrings for grid-path configuration options.
  • Tests

    • Extended parity test coverage for grid operations.

OutisLi and others added 3 commits June 16, 2026 09:04
The parameter-free `GridProduct` NativeOP (added for the so3grid
optimization) has no `serialize`/`deserialize` and is not registered via
`register_dpmodel_mapping`. The pt_expt backend auto-wraps every dpmodel
NativeOP sub-component through `_auto_wrap_native_op`, which requires the
op to be serializable (or registered) to build its dynamic torch wrapper;
otherwise it raises:

    TypeError: Cannot auto-wrap GridProduct: it must implement
    serialize()/deserialize() or be explicitly registered via
    register_dpmodel_mapping().

This broke every `Test Python` shard that loads a DPA4 pt_expt model
(e.g. test_get_model_dpa4.py). Add trivial `serialize`/`deserialize`
(no state, mirroring the GridMLP @class/@version convention) so the op
auto-wraps cleanly.
@dosubot dosubot Bot added the bug label Jun 18, 2026
@wanghan-iapcm wanghan-iapcm requested a review from OutisLi June 18, 2026 04:00
coderabbitai Bot commented Jun 18, 2026

Contributor


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: d0767291-fe51-44c1-8041-53190779878d

📥 Commits

Reviewing files that changed from the base of the PR and between 53c5968 and f011367.

📒 Files selected for processing (5)
  • deepmd/dpmodel/descriptor/dpa4_nn/block.py
  • deepmd/dpmodel/descriptor/dpa4_nn/ffn.py
  • deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py
  • deepmd/pt/model/descriptor/sezm_nn/grid_net.py
  • source/tests/pt/model/test_dpa4_dpmodel_parity.py

📝 Walkthrough

Ports op_type='mlp' into the dpmodel S2 grid-net layer by introducing new GridProduct and GridMLP NativeOP classes. Refactors GridBranch and GridMLP in both the dpmodel and PT backends to accept coefficient-space operands with injected to_grid/from_grid callables. Updates BaseGridNet dispatch, generalizes projector shape inference, expands serialization coverage, wires EquivariantFFN, and extends parity tests accordingly.
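
The injected-projector contract can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the dpmodel implementation: `grid_product`, the `(n_coeff, channels)` layout, and the orthonormal `basis` matrix standing in for the S2 grid transform are all hypothetical.

```python
from typing import Callable

import numpy as np

Projector = Callable[[np.ndarray], np.ndarray]


def grid_product(left, right, to_grid: Projector, from_grid: Projector):
    """Project both operands to the grid, multiply pointwise, project back."""
    return from_grid(to_grid(left) * to_grid(right))


# Hypothetical projector pair: 3 coefficients, 6 grid points. A matrix with
# orthonormal columns stands in for the real coefficient <-> grid transform.
rng = np.random.default_rng(0)
basis, _ = np.linalg.qr(rng.normal(size=(6, 3)))  # shape (6, 3)


def to_grid(coeff):  # (3, channels) -> (6, channels)
    return basis @ coeff


def from_grid(grid):  # (6, channels) -> (3, channels)
    return basis.T @ grid


left = rng.normal(size=(3, 2))
right = rng.normal(size=(3, 2))
out = grid_product(left, right, to_grid, from_grid)  # shape (3, 2)

# With identity callables the op reduces to a plain elementwise product,
# which lets a parity test isolate the op from the projector.
identity = lambda x: x
plain = grid_product(left, right, identity, identity)
```

Because the op receives the projectors as arguments, the same op code can be exercised with the real transform or with identity callables.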

Changes

Port GridMLP and refactor grid-op projector contract

| Layer / File(s) | Summary |
| --- | --- |
| New GridProduct and GridMLP NativeOP classes (dpmodel): deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py | Adds parameter-free GridProduct and trainable GridMLP NativeOP classes with call(), serialize()/deserialize(), and PT state-dict–keyed _load_variables(); updates module docstrings and TYPE_CHECKING-guarded Callable import to reflect the expanded porting scope. |
| GridBranch.call injected-projector refactor (dpmodel): deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py | Refactors GridBranch.call to the new contract: accepts left/right coefficient operands plus to_grid/from_grid callables; performs channel mixing at coefficient resolution, quadratic product in grid space, softmax routing, and projects back to coefficient space. |
| BaseGridNet construction, dispatch, and projector generalization (dpmodel): deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py | Updates BaseGridNet to instantiate GridMLP/GridBranch/GridProduct by op_type; simplifies call() to take coeff_out directly from _apply_grid_op; delegates _apply_grid_op to grid_op with injected callables; generalizes _to_grid/_from_grid to infer channel width from tensor shape; extends S2GridNet.serialize()/deserialize() to include grid_op.* variables for mlp and branch. |
| _project_frames helper and GridMLP/GridBranch coefficient-space refactor (PT): deepmd/pt/model/descriptor/sezm_nn/grid_net.py | Introduces _project_frames for per-frame ChannelLinear application; extends GridMLP and GridBranch constructors with n_frames; replaces grid-based forward signatures with coefficient-resolution implementations accepting left, right, scalar_pair, to_grid, from_grid. |
| BaseGridNet wiring and _to_grid/_from_grid generalization (PT): deepmd/pt/model/descriptor/sezm_nn/grid_net.py | Wires n_frames into GridMLP/GridBranch constructors; updates BaseGridNet.forward to call self.grid_op directly with injected callables; eliminates _apply_grid_op; generalizes _to_grid/_from_grid to infer channel width via reshape(..., -1). |
| EquivariantFFN op_type selection and docstring updates: deepmd/dpmodel/descriptor/dpa4_nn/ffn.py, deepmd/dpmodel/descriptor/dpa4_nn/block.py | Consolidates grid_op selection to a single nested conditional (branch > mlp > glu); removes stale "not-ported" NotImplementedError comments and docstring wording from ffn.py and block.py. |
| Parity test expansion for mlp op_type, GridMLP, and GridBranch: source/tests/pt/model/test_dpa4_dpmodel_parity.py | Refactors _build_grid_nets with a grid_op_params mapping; adds op_type='mlp' parametrization; updates GridBranch test to pass n_frames=1; adds GridBranch and GridMLP forward-parity checks with identity callables and serialization roundtrip assertions; removes NotImplementedError guard expectations. |
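
The "infer channel width" generalization listed for `_to_grid`/`_from_grid` can be illustrated with a small NumPy sketch. The axis layout `(n_node, n_coeff, ...)`, the `proj` matrix, and this standalone `to_grid` function are assumptions made for the example; the actual methods may be organized differently.

```python
import numpy as np


def to_grid(coeff: np.ndarray, proj: np.ndarray) -> np.ndarray:
    """Project (n_node, n_coeff, ...) coefficients onto n_grid points.

    Trailing axes are flattened into one channel axis whose width is read
    from the input instead of from a stored attribute.
    """
    n_node, n_coeff = coeff.shape[:2]
    flat = coeff.reshape(n_node, n_coeff, -1)
    # proj has shape (n_grid, n_coeff); contract over the coefficient axis.
    return np.einsum("gc,ncw->ngw", proj, flat)


proj = np.ones((5, 3))  # hypothetical projector: 3 coefficients -> 5 grid points
single = to_grid(np.ones((2, 3, 4)), proj)     # width 4
framed = to_grid(np.ones((2, 3, 2, 4)), proj)  # 2 frames x 4 channels -> width 8
print(single.shape, framed.shape)
```

Reading the width from the tensor means one projector can serve operands of different channel counts, for example inputs that carry an extra per-frame axis.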

Sequence Diagram(s)

sequenceDiagram
  participant EquivariantFFN
  participant S2GridNet
  participant BaseGridNet
  participant GridOp as GridProduct/GridMLP/GridBranch
  participant _to_grid
  participant _from_grid

  EquivariantFFN->>S2GridNet: forward(left, right, scalar_pair)
  S2GridNet->>BaseGridNet: call(left, right, scalar_pair)
  BaseGridNet->>BaseGridNet: _apply_grid_op(left, right, scalar_pair)
  BaseGridNet->>GridOp: call(left, right, scalar_pair, to_grid=_to_grid, from_grid=_from_grid)
  GridOp->>_to_grid: project coefficients to grid space
  _to_grid-->>GridOp: grid tensor (channel width inferred from shape)
  GridOp->>GridOp: quadratic product / MLP / branch routing in grid space
  GridOp->>_from_grid: project grid back to coefficient space
  _from_grid-->>GridOp: coeff_out
  GridOp-->>BaseGridNet: coeff_out
  BaseGridNet-->>S2GridNet: coeff_out
  S2GridNet-->>EquivariantFFN: output

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • deepmodeling/deepmd-kit#5515: Introduced the initial DPA4 dpmodel S2 grid-net/FFN implementation in dpa4_nn/grid_net.py and ffn.py, which this PR directly extends by porting GridMLP and refactoring the GridBranch/projector contract.
  • deepmodeling/deepmd-kit#5522: Modified BaseGridNet's _to_grid/_from_grid coefficient↔grid projection path in the same file, immediately adjacent to the generalized shape-inference changes introduced here.

Suggested labels

enhancement, Python, Core

Suggested reviewers

  • njzjz
  • OutisLi
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title mentions 'opt so3grid' and references a 'pt_expt GridProduct wrapping fix', which directly aligns with the main changes: adding GridProduct/GridMLP implementations and fixing pt_expt wrappability. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


codecov Bot commented Jun 18, 2026


Codecov Report

❌ Patch coverage is 96.26168% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.24%. Comparing base (84f8541) to head (f011367).
⚠️ Report is 15 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py | 94.93% | 4 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5552      +/-   ##
==========================================
+ Coverage   82.21%   82.24%   +0.02%     
==========================================
  Files         892      894       +2     
  Lines      101532   102084     +552     
  Branches     4240     4276      +36     
==========================================
+ Hits        83475    83955     +480     
- Misses      16753    16828      +75     
+ Partials     1304     1301       -3     
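
The figures in the diff can be cross-checked with a few lines of Python. The round-down to two decimals is an assumption inferred from the displayed percentages, not something the report states.

```python
import math

# Figures copied from the coverage diff above.
base = dict(hits=83475, misses=16753, partials=1304, lines=101532)
head = dict(hits=83955, misses=16828, partials=1301, lines=102084)


def coverage_pct(report: dict) -> float:
    """Hits over lines as a percentage, rounded down to two decimals."""
    return math.floor(report["hits"] / report["lines"] * 10_000) / 100


for report in (base, head):
    # Every line is counted exactly once as a hit, a miss, or a partial.
    assert report["hits"] + report["misses"] + report["partials"] == report["lines"]

print(coverage_pct(base), coverage_pct(head))
```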


@OutisLi OutisLi added this pull request to the merge queue Jun 18, 2026
Merged via the queue into deepmodeling:master with commit d0a1959 Jun 18, 2026
73 checks passed

2 participants