fix(finetune): warn on descriptor config mismatches#5549
Conversation
Add shared descriptor configuration comparison helpers for fine-tuning so mismatched settings such as nlayer are reported before selective state dict loading. Reuse the warning path for --use-pretrain-script overwrites and cover PyTorch, Paddle, and pt_expt backends. Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughTwo new public functions— ChangesDescriptor Mismatch Warning During Fine-Tuning
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@deepmd/utils/finetune.py`:
- Around line 59-61: The code currently uses None as both a placeholder for
missing keys and for actual None values in configs, causing ambiguity in the
differences list tuples. Create a dedicated sentinel object at the module level
(e.g., _MISSING = object()) and replace all instances where None is used to
represent a missing key with this sentinel instead. This includes the tuple
constructions at lines 59-61 where pretrained_config[key] is None and key is not
in pretrained_config, as well as the similar patterns at lines 69-71 and
104-110, ensuring that actual None config values are distinguished from
genuinely missing keys.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 5d4d3bb8-70ab-4199-93f3-0d086f7f7c73
📒 Files selected for processing (7)
deepmd/pd/train/training.pydeepmd/pd/utils/finetune.pydeepmd/pt/train/training.pydeepmd/pt/utils/finetune.pydeepmd/pt_expt/train/training.pydeepmd/utils/finetune.pysource/tests/common/test_finetune_utils.py
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #5549 +/- ##
==========================================
- Coverage 82.23% 82.20% -0.03%
==========================================
Files 894 898 +4
Lines 102002 103679 +1677
Branches 4276 4435 +159
==========================================
+ Hits 83877 85228 +1351
- Misses 16823 17057 +234
- Partials 1302 1394 +92 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Apply import ordering updates required by pre-commit.ci on PR deepmodeling#5549. Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
Use a sentinel for missing descriptor keys so actual None values are reported correctly in finetune configuration mismatch warnings. Add regression coverage for None versus missing values. Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
| try: | ||
| input_descriptor_cmp = _normalize_descriptor_for_compare(input_descriptor) | ||
| pretrained_descriptor_cmp = _normalize_descriptor_for_compare( | ||
| pretrained_descriptor | ||
| ) | ||
| except Exception: | ||
| # Some in-flight or legacy descriptor schemas may not be normalizable with | ||
| # the minimal synthetic config above. In that case, still compare the raw | ||
| # descriptors so users get a best-effort warning. | ||
| pass |
There was a problem hiding this comment.
Asymmetric fallback: both _cmp vars are pre-set to the raw descriptors (L80-81), but if the first _normalize_descriptor_for_compare succeeds (L83) and the second raises (L84-86), input_descriptor_cmp is left normalized while pretrained_descriptor_cmp stays raw. The diff then compares normalized-vs-raw and reports every implicit default as a spurious difference — contradicting the comment’s stated raw-vs-raw fallback. Normalize each side in its own try/except (or reset both to raw in the except) so the fallback is symmetric.
There was a problem hiding this comment.
Fixed in d741eb4. The normalization fallback now resets both descriptors to their raw values if either normalization call fails, so we never compare normalized-vs-raw descriptors. I also added a regression test where one side normalizes and the other raises, and the test checks that an injected implicit default is not reported.
— OpenClaw 2026.6.8 (844f405), model: custom-chat-jinzhezeng-group/gpt-5.5
| "will only use compatible descriptor parameters from the pretrained model; " | ||
| "other parameters keep their current initialization:\n" |
There was a problem hiding this comment.
"other parameters keep their current initialization" is only accurate for the pt_expt backend, where the selective-load does new_state[key] = target_state[key] for absent keys. For pt and pd, collect_single_finetune_params indexes _origin_state_dict[new_key] for descriptor keys (pd train/training.py ~L571-578) / forces use_random_initialization=False then strict-loads (pt train/training.py ~L834-849, L879) — so a genuinely mismatched descriptor raises KeyError / fails the strict load right after this warning, rather than keeping current init. Consider softening the shared message, or making pt/pd actually fall back, since the same text is emitted on all three backends.
There was a problem hiding this comment.
Fixed in d741eb4 by softening the shared warning text. It now says compatible descriptor parameters can be reused, while incompatible parameters may be reinitialized, skipped, or rejected by backend-specific loading. That should avoid implying pt/pd keep current initialization in cases where their loaders fail instead.
— OpenClaw 2026.6.8 (844f405), model: custom-chat-jinzhezeng-group/gpt-5.5
| "descriptor": deepcopy(dict(descriptor)), | ||
| "fitting_net": {"neuron": [240, 240, 240]}, | ||
| "type_map": ["H", "O"], |
There was a problem hiding this comment.
The synthetic config hardcodes a 2-type type_map (["H", "O"]). For descriptors whose sel/exclude_types length tracks the number of atom types (e.g. se_e2_a with a per-type sel list), a legitimate finetune to a different type set (without --use-pretrain-script) will have sel/type-count genuinely differ from the pretrained descriptor — and normalize() leaves a longer sel unchanged under this 2-type stub rather than erroring, so the warning fires on those intentional, expected differences. Issue #4848 (dpa3, scalar sel) is unaffected, but per-type-sel descriptors will get noisy warnings. Worth documenting this as a known limitation at minimum.
There was a problem hiding this comment.
Addressed in d741eb4. I replaced the fixed two-type stub with _infer_synthetic_type_count, which derives a safer synthetic type_map size from per-type descriptor fields such as sel, sel_a, sel_r, and exclude_types, and added a regression test for that inference. The helper docstring also now documents that this remains best-effort when intentional type-map changes affect type-count-dependent fields.
— OpenClaw 2026.6.8 (844f405), model: custom-chat-jinzhezeng-group/gpt-5.5
| monkeypatch.setattr( | ||
| finetune, | ||
| "_normalize_descriptor_for_compare", |
There was a problem hiding this comment.
All three tests monkeypatch.setattr(finetune, "_normalize_descriptor_for_compare", <stub>), so the real normalize()-based default-suppression — the whole point of the feature ("implicit defaults must not warn") — and the except Exception fallback in _descriptor_config_differences are never exercised by any test. A test driving the genuine normalize() path would also have caught that the example key here is nlayer, whereas a real dpa3 descriptor uses nlayers (which normalize() rejects). Consider one test without the monkeypatch.
There was a problem hiding this comment.
Fixed in d741eb4. I added an un-mocked test that exercises the real normalize() path and verifies that default-only descriptor differences are suppressed. The fallback test now specifically exercises the asymmetric-failure case, and the DPA3 examples use the real repflow.nlayers spelling.
— OpenClaw 2026.6.8 (844f405), model: custom-chat-jinzhezeng-group/gpt-5.5
Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
njzjz
left a comment
There was a problem hiding this comment.
LGTM — the descriptor-diff logic ignores implicit defaults via normalization with a raw-vs-raw fallback, and the warn calls are guarded by "descriptor" in ... on both sides. Well covered by the new unit tests.
— Opus 4.8
Summary
--use-pretrain-scriptoverwrites descriptor settings frominput.jsonwith pretrained-model settings.repflow.nlayer/nlayer.This continues the work from the closed Copilot PR #4925, whose source branch has been removed.
Verification
uvx ruff check deepmd/utils/finetune.py deepmd/pt/utils/finetune.py deepmd/pd/utils/finetune.py deepmd/pt/train/training.py deepmd/pd/train/training.py deepmd/pt_expt/train/training.py source/tests/common/test_finetune_utils.pyPYTHONPATH="$PWD" uvx pytest -q source/tests/common/test_finetune_utils.pypython3 -m compileall -q deepmd/utils/finetune.py deepmd/pt/utils/finetune.py deepmd/pd/utils/finetune.py deepmd/pt/train/training.py deepmd/pd/train/training.py deepmd/pt_expt/train/training.py source/tests/common/test_finetune_utils.pyFixes #4848
Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
Summary by CodeRabbit
New Features
trainablefield, reporting the relevant nested differences (with normalization when available).Tests
Nonevs missing descriptor fields.