fix(finetune): warn on descriptor config mismatches#5549

Open
njzjz-bot wants to merge 4 commits into deepmodeling:master from njzjz-bothub:fix/issue-4848

njzjz-bot (Contributor) commented Jun 17, 2026

Summary

  • Add shared descriptor-configuration comparison helpers for fine-tuning warnings.
  • Warn when --use-pretrain-script overwrites descriptor settings from input.json with pretrained-model settings.
  • Warn before selective fine-tune state-dict loading when descriptor settings differ, including nested settings such as repflow.nlayer/nlayer.
  • Reuse the warning logic across PyTorch, Paddle, and pt_expt paths, with normalization to avoid default-value-only false positives.

This continues the work from the closed Copilot PR #4925, whose source branch has been removed.

Verification

  • uvx ruff check deepmd/utils/finetune.py deepmd/pt/utils/finetune.py deepmd/pd/utils/finetune.py deepmd/pt/train/training.py deepmd/pd/train/training.py deepmd/pt_expt/train/training.py source/tests/common/test_finetune_utils.py
  • PYTHONPATH="$PWD" uvx pytest -q source/tests/common/test_finetune_utils.py
  • python3 -m compileall -q deepmd/utils/finetune.py deepmd/pt/utils/finetune.py deepmd/pd/utils/finetune.py deepmd/pt/train/training.py deepmd/pd/train/training.py deepmd/pt_expt/train/training.py source/tests/common/test_finetune_utils.py

Fixes #4848

Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)

Summary by CodeRabbit

  • New Features

    • Added fine-tuning descriptor mismatch warnings that alert you when descriptor configurations differ between the input setup and the selected pretrained branch, including during resume and selective per-branch weight loading.
    • Warnings suppress differences that come only from implicit defaults and ignore the trainable field, reporting the relevant nested differences (with normalization when available).
  • Tests

    • Expanded pytest coverage to verify warning details for nested mismatches, ensure warnings are not emitted for default-only differences, and confirm clear logging for None vs missing descriptor fields.

Add shared descriptor configuration comparison helpers for fine-tuning so mismatched settings such as nlayer are reported before selective state dict loading. Reuse the warning path for --use-pretrain-script overwrites and cover PyTorch, Paddle, and pt_expt backends.

Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)

coderabbitai Bot (Contributor) commented Jun 17, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 9753d848-2015-41ec-8aac-7b4e616bc642

📥 Commits

Reviewing files that changed from the base of the PR and between 36d5bb9 and d741eb4.

📒 Files selected for processing (2)
  • deepmd/utils/finetune.py
  • source/tests/common/test_finetune_utils.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • deepmd/utils/finetune.py

Walkthrough

Two new public functions—warn_descriptor_config_differences and warn_configuration_mismatch_during_finetune—are added to deepmd/utils/finetune.py. They compute recursive descriptor config diffs (with normalization and ignored keys) and emit log warnings. These are wired into the get_finetune_rule_single helpers in deepmd/pt/utils/finetune.py and deepmd/pd/utils/finetune.py, and into the finetune resume paths in deepmd/pt/train/training.py, deepmd/pd/train/training.py, and deepmd/pt_expt/train/training.py.
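
To make the walkthrough concrete, here is a minimal sketch of a recursive descriptor diff that reports nested mismatches by dotted path and skips ignored keys. It is not the code from this PR: the names `IGNORED_KEYS`, `MISSING`, and `iter_config_differences` are hypothetical, and the sample descriptor values are illustrative only.

```python
from collections.abc import Iterator, Mapping
from typing import Any

# Hypothetical names for illustration; not the PR's actual constants.
IGNORED_KEYS = frozenset({"trainable"})
MISSING = object()  # marks a key that is absent on one side


def iter_config_differences(
    current: Mapping[str, Any],
    pretrained: Mapping[str, Any],
    prefix: str = "",
) -> Iterator[tuple[str, Any, Any]]:
    """Yield (dotted_path, current_value, pretrained_value) per mismatch."""
    for key in sorted(set(current) | set(pretrained)):
        if key in IGNORED_KEYS:
            continue
        path = f"{prefix}.{key}" if prefix else key
        cur = current.get(key, MISSING)
        pre = pretrained.get(key, MISSING)
        if isinstance(cur, Mapping) and isinstance(pre, Mapping):
            # Recurse so nested settings are reported with their full path.
            yield from iter_config_differences(cur, pre, path)
        elif cur != pre:
            yield path, cur, pre


# Illustrative descriptors: only the nested layer count really differs.
current = {"type": "dpa3", "trainable": True, "repflow": {"nlayers": 6, "e_dim": 64}}
pretrained = {"type": "dpa3", "trainable": False, "repflow": {"nlayers": 3, "e_dim": 64}}
diffs = list(iter_config_differences(current, pretrained))
print(diffs)
```

In this sketch the `trainable` difference is skipped and the only reported entry is the nested `repflow.nlayers` mismatch.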

Changes

Descriptor Mismatch Warning During Fine-Tuning

| Layer / File(s) | Summary |
|---|---|
| Core diff and warning utilities (deepmd/utils/finetune.py, source/tests/common/test_finetune_utils.py) | Adds logger, constants, _infer_synthetic_type_count, _normalize_descriptor_for_compare, recursive diff computation (_iter_descriptor_config_differences, _descriptor_config_differences), diff formatter (_format_config_value, _format_descriptor_differences), and two public functions: warn_descriptor_config_differences and warn_configuration_mismatch_during_finetune. Tests verify nested-mismatch reporting, suppression of default-value-only differences, graceful fallback when normalization fails, and distinction between None values and missing fields. |
| Per-backend finetune util wiring (deepmd/pt/utils/finetune.py, deepmd/pd/utils/finetune.py) | Imports warn_descriptor_config_differences and calls it inside get_finetune_rule_single when change_model_params is enabled and both the current and pretrained configs include a descriptor. |
| Training loop integration (deepmd/pt/train/training.py, deepmd/pd/train/training.py, deepmd/pt_expt/train/training.py) | Imports warn_configuration_mismatch_during_finetune; refactors pretrained_model_params local variable assignment; calls the warning function per model branch when both input and pretrained model params contain a descriptor during finetune checkpoint loading. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • wanghan-iapcm
  • njzjz
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 36.36% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix(finetune): warn on descriptor config mismatches' accurately summarizes the main change: adding configuration mismatch warnings for fine-tuning descriptor settings. |
| Linked Issues check | ✅ Passed | The PR fully addresses the objective of #4848 by implementing descriptor configuration comparison logic, warning mechanisms, and validation across all three backends to alert users when descriptor parameters differ. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to the PR's primary objective of warning on descriptor configuration mismatches during fine-tuning, with no unrelated modifications detected. |


coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@deepmd/utils/finetune.py`:
- Around line 59-61: The code currently uses None as both a placeholder for
missing keys and for actual None values in configs, causing ambiguity in the
differences list tuples. Create a dedicated sentinel object at the module level
(e.g., _MISSING = object()) and replace all instances where None is used to
represent a missing key with this sentinel instead. This includes the tuple
constructions at lines 59-61 where pretrained_config[key] is None and key is not
in pretrained_config, as well as the similar patterns at lines 69-71 and
104-110, ensuring that actual None config values are distinguished from
genuinely missing keys.
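
A minimal sketch of the sentinel pattern this comment asks for, so that an explicit None and an absent key are reported differently. The helper names `format_config_value` and `describe_key` and the message wording are hypothetical, not taken from deepmd/utils/finetune.py.

```python
from typing import Any, Optional

# Module-level sentinel: a unique object that no real config value can be.
_MISSING = object()


def format_config_value(value: Any) -> str:
    """Render a value for a warning line, keeping None and 'missing' distinct."""
    return "<missing>" if value is _MISSING else repr(value)


def describe_key(current: dict, pretrained: dict, key: str) -> Optional[str]:
    """Return a one-line description of a mismatch, or None if the sides agree."""
    cur = current.get(key, _MISSING)
    pre = pretrained.get(key, _MISSING)
    if cur is pre or cur == pre:
        return None
    return (
        f"{key}: input={format_config_value(cur)}, "
        f"pretrained={format_config_value(pre)}"
    )


# An explicit None on one side versus no key at all on the other.
line = describe_key({"exclude_types": None}, {}, "exclude_types")
print(line)
```

With `dict.get(key)` alone, both sides in this example would read as None and the difference would be silently dropped; the sentinel default keeps it visible.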

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 5d4d3bb8-70ab-4199-93f3-0d086f7f7c73

📥 Commits

Reviewing files that changed from the base of the PR and between 4a552e3 and 04104dc.

📒 Files selected for processing (7)
  • deepmd/pd/train/training.py
  • deepmd/pd/utils/finetune.py
  • deepmd/pt/train/training.py
  • deepmd/pt/utils/finetune.py
  • deepmd/pt_expt/train/training.py
  • deepmd/utils/finetune.py
  • source/tests/common/test_finetune_utils.py

Comment thread deepmd/utils/finetune.py Outdated

codecov Bot commented Jun 17, 2026

Codecov Report

❌ Patch coverage is 99.03846% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 82.20%. Comparing base (4a552e3) to head (d741eb4).
⚠️ Report is 12 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| deepmd/utils/finetune.py | 98.71% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5549      +/-   ##
==========================================
- Coverage   82.23%   82.20%   -0.03%     
==========================================
  Files         894      898       +4     
  Lines      102002   103679    +1677     
  Branches     4276     4435     +159     
==========================================
+ Hits        83877    85228    +1351     
- Misses      16823    17057     +234     
- Partials     1302     1394      +92     


Apply import ordering updates required by pre-commit.ci on PR deepmodeling#5549.

Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
njzjz requested review from iProzd and wanghan-iapcm on June 18, 2026 16:41
Use a sentinel for missing descriptor keys so actual None values are reported correctly in finetune configuration mismatch warnings. Add regression coverage for None versus missing values.

Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
Comment thread deepmd/utils/finetune.py Outdated
Comment on lines +82 to +91
try:
    input_descriptor_cmp = _normalize_descriptor_for_compare(input_descriptor)
    pretrained_descriptor_cmp = _normalize_descriptor_for_compare(
        pretrained_descriptor
    )
except Exception:
    # Some in-flight or legacy descriptor schemas may not be normalizable with
    # the minimal synthetic config above. In that case, still compare the raw
    # descriptors so users get a best-effort warning.
    pass

Collaborator

Asymmetric fallback: both _cmp vars are pre-set to the raw descriptors (L80-81), but if the first _normalize_descriptor_for_compare succeeds (L83) and the second raises (L84-86), input_descriptor_cmp is left normalized while pretrained_descriptor_cmp stays raw. The diff then compares normalized-vs-raw and reports every implicit default as a spurious difference — contradicting the comment’s stated raw-vs-raw fallback. Normalize each side in its own try/except (or reset both to raw in the except) so the fallback is symmetric.

Contributor Author

Fixed in d741eb4. The normalization fallback now resets both descriptors to their raw values if either normalization call fails, so we never compare normalized-vs-raw descriptors. I also added a regression test where one side normalizes and the other raises, and the test checks that an injected implicit default is not reported.

— OpenClaw 2026.6.8 (844f405), model: custom-chat-jinzhezeng-group/gpt-5.5
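
For readers following this thread, a minimal sketch of a symmetric fallback: both descriptors come back normalized, or both come back raw. The function `normalize_pair_or_raw` and the stub normalizer are hypothetical and only illustrate the shape of the fix described above, not the code in d741eb4.

```python
from typing import Any, Callable

Descriptor = dict[str, Any]


def normalize_pair_or_raw(
    input_descriptor: Descriptor,
    pretrained_descriptor: Descriptor,
    normalize: Callable[[Descriptor], Descriptor],
) -> tuple[Descriptor, Descriptor]:
    """Return both descriptors normalized, or both raw if either call fails."""
    try:
        # Bind the result only after both calls succeed, so a failure on the
        # second descriptor cannot leave the first one normalized.
        normalized = (normalize(input_descriptor), normalize(pretrained_descriptor))
    except Exception:
        return input_descriptor, pretrained_descriptor
    return normalized


def stub_normalize(descriptor: Descriptor) -> Descriptor:
    """Stand-in normalizer: injects one default and rejects 'legacy' configs."""
    if descriptor.get("legacy"):
        raise ValueError("cannot normalize")
    return {"activation": "tanh", **descriptor}


# The first side normalizes, the second raises: both must come back raw.
raw_in, raw_pre = normalize_pair_or_raw(
    {"rcut": 6.0}, {"rcut": 6.0, "legacy": True}, stub_normalize
)
print(raw_in, raw_pre)
```

Because the injected `activation` default never reaches only one side, a later diff cannot report it as a spurious difference.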

Comment thread deepmd/utils/finetune.py Outdated
Comment on lines +159 to +160
"will only use compatible descriptor parameters from the pretrained model; "
"other parameters keep their current initialization:\n"

Collaborator

"other parameters keep their current initialization" is only accurate for the pt_expt backend, where the selective-load does new_state[key] = target_state[key] for absent keys. For pt and pd, collect_single_finetune_params indexes _origin_state_dict[new_key] for descriptor keys (pd train/training.py ~L571-578) / forces use_random_initialization=False then strict-loads (pt train/training.py ~L834-849, L879) — so a genuinely mismatched descriptor raises KeyError / fails the strict load right after this warning, rather than keeping current init. Consider softening the shared message, or making pt/pd actually fall back, since the same text is emitted on all three backends.

Contributor Author

Fixed in d741eb4 by softening the shared warning text. It now says compatible descriptor parameters can be reused, while incompatible parameters may be reinitialized, skipped, or rejected by backend-specific loading. That should avoid implying pt/pd keep current initialization in cases where their loaders fail instead.

— OpenClaw 2026.6.8 (844f405), model: custom-chat-jinzhezeng-group/gpt-5.5

Comment thread deepmd/utils/finetune.py Outdated
Comment on lines +31 to +33
"descriptor": deepcopy(dict(descriptor)),
"fitting_net": {"neuron": [240, 240, 240]},
"type_map": ["H", "O"],

Collaborator

The synthetic config hardcodes a 2-type type_map (["H", "O"]). For descriptors whose sel/exclude_types length tracks the number of atom types (e.g. se_e2_a with a per-type sel list), a legitimate finetune to a different type set (without --use-pretrain-script) will have sel/type-count genuinely differ from the pretrained descriptor — and normalize() leaves a longer sel unchanged under this 2-type stub rather than erroring, so the warning fires on those intentional, expected differences. Issue #4848 (dpa3, scalar sel) is unaffected, but per-type-sel descriptors will get noisy warnings. Worth documenting this as a known limitation at minimum.

Contributor Author

Addressed in d741eb4. I replaced the fixed two-type stub with _infer_synthetic_type_count, which derives a safer synthetic type_map size from per-type descriptor fields such as sel, sel_a, sel_r, and exclude_types, and added a regression test for that inference. The helper docstring also now documents that this remains best-effort when intentional type-map changes affect type-count-dependent fields.

— OpenClaw 2026.6.8 (844f405), model: custom-chat-jinzhezeng-group/gpt-5.5
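
One way such an inference could work is sketched below. This is an assumption-laden illustration, not the PR's `_infer_synthetic_type_count`: it assumes that list-valued `sel`, `sel_a`, and `sel_r` have one entry per atom type, that `exclude_types` holds pairs of type indices, and that a two-type stub is the fallback. The names `infer_type_count` and `synthetic_type_map` are hypothetical.

```python
from typing import Any

# Assumed per-type list fields; a scalar `sel` carries no type-count information.
PER_TYPE_LIST_KEYS = ("sel", "sel_a", "sel_r")


def infer_type_count(descriptor: dict[str, Any], default: int = 2) -> int:
    """Guess how many atom types a synthetic type_map needs (best effort)."""
    count = default
    for key in PER_TYPE_LIST_KEYS:
        value = descriptor.get(key)
        if isinstance(value, (list, tuple)):
            count = max(count, len(value))
    for pair in descriptor.get("exclude_types") or []:
        if pair:
            # Type indices are zero-based, so the largest index needs index + 1 types.
            count = max(count, max(pair) + 1)
    return count


def synthetic_type_map(descriptor: dict[str, Any]) -> list[str]:
    """Build placeholder type names sized to the inferred type count."""
    return [f"T{i}" for i in range(infer_type_count(descriptor))]


print(synthetic_type_map({"sel": [46, 92, 4]}))
```

As the reply notes, this kind of inference stays best-effort: if the fine-tuning run intentionally changes the type map, per-type fields still differ legitimately and can trigger a warning.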

Comment on lines +10 to +12
monkeypatch.setattr(
    finetune,
    "_normalize_descriptor_for_compare",

wanghan-iapcm (Collaborator) commented Jun 21, 2026

All three tests monkeypatch.setattr(finetune, "_normalize_descriptor_for_compare", <stub>), so the real normalize()-based default-suppression — the whole point of the feature ("implicit defaults must not warn") — and the except Exception fallback in _descriptor_config_differences are never exercised by any test. A test driving the genuine normalize() path would also have caught that the example key here is nlayer, whereas a real dpa3 descriptor uses nlayers (which normalize() rejects). Consider one test without the monkeypatch.

Contributor Author

Fixed in d741eb4. I added an un-mocked test that exercises the real normalize() path and verifies that default-only descriptor differences are suppressed. The fallback test now specifically exercises the asymmetric-failure case, and the DPA3 examples use the real repflow.nlayers spelling.

— OpenClaw 2026.6.8 (844f405), model: custom-chat-jinzhezeng-group/gpt-5.5

Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)

njzjz (Member) left a comment

LGTM — the descriptor-diff logic ignores implicit defaults via normalization with a raw-vs-raw fallback, and the warn calls are guarded by "descriptor" in ... on both sides. Well covered by the new unit tests.

— Opus 4.8

Development

Successfully merging this pull request may close these issues.

[BUG] Changing nlayer lead no error report while fine-tuning

3 participants