feat(persistent): in-place solver updates (Solver framework + HiGHS/Gurobi/Xpress/Mosek)#718
feat(persistent): in-place solver updates (Solver framework + HiGHS/Gurobi/Xpress/Mosek)#718FabianHofmann wants to merge 41 commits into
Conversation
Tracks per-Constraint coefficient mutation via a single boolean slot, flipped in coeffs/vars/lhs setters. Pure-constant rhs writes now short-circuit and leave coeffs/vars buffers untouched (by identity), so rhs-only updates don't trigger expensive coefficient recompare on the persistent-solver fast path.
Pure-Python snapshot primitives for the persistent-solver Phase 1. Deep-copies value-side fields (var_lb/ub, con_rhs/sign, obj_linear), holds vlabels/clabels by reference, stores canonical CSR (indptr, indices) per constraint container. No Solver import.
Pure-function diff for the persistent-solver Phase 1. Detects structural, coord, sparsity, quadratic-objective, value-only var/con, and objective-linear/sense changes. Supports same_model fast path via _coef_dirty and cross-model full re-scan. Includes a focused test suite covering capture, mutation paths, deep-copy invariant, and the same_model toggle.
- supports_persistent_update class flag (default False) - snapshot/_rebuilds/_in_place_updates/_last_rebuild_reason fields - snapshot capture at end of direct _build, _clear_coef_dirty helper - apply_update stub raising UnsupportedUpdate - solve(model, assign) dispatcher with diff-or-rebuild path - update(model, apply=True) primitive returning ModelDiff - threading.Lock around diff+apply+resnapshot - __getstate__/__setstate__ drop native handle and snapshot
…date support Skip diff computation entirely when supports_persistent_update is False on apply, per plan: 'dispatcher checks flag before calling — if False, skips diffing entirely and goes to rebuild.'
Replace xarray-based snapshot and CSR pattern compare with per-row canonicalised numpy buffers; new ContainerVarUpdate / ContainerRowUpdate payloads. Gurobi/HiGHS apply_update rewritten around batched setAttr / changeColsBounds / changeColsCost / changeColsIntegrality; coefficient writes touch only changed cells. Cross-model diff now ~matches same-model cost for bound/rhs/coef-value sweeps.
compute_diff/Solver.solve/Solver.update grow an ignore_dims kwarg. None (default) keeps the current no-coord-check behaviour; any iterable opts into per-container coord-equality on every dim not in the set, supporting rolling-horizon workflows where e.g. the snapshot dim is expected to drift.
…_rebuild - Solver.from_name now accepts model=None; the first solve(m, ...) builds. - compute_diff folded into ModelDiff.from_snapshot classmethod; new ModelDiff.from_models diffs two linopy models directly. - Solver.solve grows disallow_rebuild=True, which raises RebuildRequiredError instead of falling back to a rebuild.
…m_models - Add `track_updates` flag (default False) to Solver; skip ModelSnapshot capture when disabled. Raise UpdatesDisabledError on solve(model)/update() if a built solver was constructed without tracking. - Rewrite ModelDiff.from_models to build directly from two models without capturing snapshots; share helpers with from_snapshot. - Update persistent tests to opt into track_updates=True; add coverage for the disabled path.
Cross-instance resolves now diff via from_models against the previously built model, with no snapshot. Same-instance mutation still raises UpdatesDisabledError. Snapshot recapture is skipped in this mode. Add cross-instance solve/update tests for the no-snapshot path.
Collapse _diff_objective QUAD_OBJ branches; cache n_coef_updates; short-circuit _canonicalize_rows when rows already sorted; tighten buffer extraction. Introduce VarKind enum used across snapshot/diff and HiGHS/Gurobi apply_update; reuse linopy.constants sign tokens. Move _clear_coef_dirty into ModelSnapshot.capture.
Source con buffers from Constraint.to_matrix_with_rhs, replacing the dense (n_rows, max_n_term) arrays with CSR (indptr, indices, data). Sign dtype adopts 'U1' across the persistent layer and apply_update in HiGHS/Gurobi consumes CSR-slice payloads instead of -1 masks. Deletes _canonicalize_rows and the _INT64_MAX sentinel.
Replace per-container ContainerVarUpdate/ContainerRowUpdate dicts with flat arrays (var_bounds_*, var_type_*, con_coef_* COO, con_rhs_*, con_sign_*) plus VarSlice/ConSlice per-container offsets for diagnostics. Add con_rhs_as_bounds() for ranged-row solvers. Backend apply_update bodies collapse to flat-array calls; remove duplicated label->position resolution.
Implement in-place model updates for Xpress (chgbounds/chgrhs/chgmcoef/ chgrowtype/chgobj/chgobjsense/chgcoltype) and Mosek (chgvarbound/ chgconbound/putaijlist/putclist/putvartypelist/putobjsense). Mosek rejects constraint sign change to trigger rebuild. Consolidate gurobi/highs apply_update tests into a single parametrized file that also covers xpress and mosek.
for more information, see https://pre-commit.ci
|
@FBumann here we go, if you want to take a first look. docs to come |
Ill have a look ltoday |
* hold solver lock through _run_direct so two threads calling solve(model) on the same Solver no longer race on the native handle (HiGHS returned 0.0 from the second concurrent solve). * narrow Optional ndarrays in persistent.diff.push_var / push_con and in HiGHS/Gurobi/Xpress/Mosek apply_update objective paths. * widen Constraint.rhs setter to ExpressionLike | VariableLike | ConstantLike to match the as_expression call in the body. * widen Constraints.__getitem__(str) return type to Constraint (the dominant case) so tests can set .rhs/.coeffs/.sign without ignores. * add docs for in-place solver updates.
|
@FabianHofmann Sorry, I wont be able to review this today. |
take your time, there is no hurry. I'll do some integration tests anyway |
|
Are there any conflicts with #717? Are we sure we want to merge and publish this before we go v1? Just making sure... |
|
Benched PR #718 ModelDiff vs hand-rolled
Diff compute + Manually calling Suggestion: expose an opt-in on the persistent API so users with presolve-heavy LPs can drop the warm basis without bypassing the encapsulation. Either:
Zero structural rebuilds observed across all 7 re-solves in both ModelDiff routes ( Caveats:
|
|
Took a first cut at A6 (incremental snapshot advance) as a follow-up branch off this PR's head, so it's ready to rebase + open once #718 lands. Branch: What
Bench (M1, median of 7, isolated snapshot-refresh step)
Speedup scales as Scope notes
Tests
Happy to open this as a real PR against |
|
@FabianHofmann happy to fold the A6 commit ( Two ways:
Either works. Let me know which you prefer. |
coroa
left a comment
There was a problem hiding this comment.
Ok, first set of comments. I still need to go over the solvers implementation.
Everything looks sane in general, i'd still try to make a couple of improvements:
persistent/diff.pycan be cleaned up a bit. Nothing major, mostly moving stuff around. Maybe a preference to not use a COO constraint coeff diff.persistent/snapshot.pyandconstraints.py: I'd like to make ModelSnapshots (or more explicitly their con and maybe var buffers) share the explicit underlying numpy objects with the Constraint objects in the case of CSRConstraints to save memory. coef_dirty is then not necessary anymore (for the CSR ones), because you can directly test whether it is still the same object. Arguments are memory and cpu efficiency.
ModelDiff.from_snapshot/from_models return RebuildReason on rebuild (NONE dropped); diff walkers moved onto _DiffBuilder with context in __init__, single _cat helper. Snapshot buffers share constraint arrays (identity fast path); CSRConstraint.sanitize_zeros copy-on-write. Use isinstance(val, ConstantLike) in Variable._validate_update.
|
@coroa thanks, addressed in 11b02e6:
Two pushbacks:
Solvers part of your review still pending from your side — no rush. |
|
@coroa thanks, I addressed most of the points. The refactoring was really needed (already pointed out by @FBumann). COO still looks like the right choice. Note AI written
Two pushbacks:
|
# Conflicts: # doc/release_notes.rst
could move into a property on the model diff maybe instead of changing the coefficient. rewriting the constraint would be faster. unsure whether this would lead to having to change the constraint mapping. |
Variable.fix()/unfix() set both bounds atomically via update() instead of sequential deprecated setters (tripped new lower<=upper cross-validation). Fail fast on quadratic lhs in Constraint.update; type-narrowing fixes for mypy.
ModelDiff stores per-container _CoefDelta (changed rows referencing the CSR buffers); con_coef_rows/cols/vals materialize on first access via cached property. Expansion is now vectorized; backends guard on n_coef_updates. Follows coroa's suggestion on PR 718.
…(A2+A6) _DiffBuilder records target buffers/coords; ModelDiff.snapshot replaces the O(nnz) re-capture after in-place updates. ModelSnapshot.capture no longer mutates the model: the _coef_dirty clear moves to the solver, coupled to snapshot adoption (build + successful apply, never on apply=False).
Base Solver orchestrates the diff sections and validates up front (sign support; Mosek semi-continuous now fails before any native mutation); backends implement _apply_* hooks. Binary [0,1] re-clamp lifted to base with Gurobi no-op (VType 'B' implies bounds natively). self.sense now set uniformly; HiGHS vtype map cached; Xpress/Mosek list-conversion helpers.
…y (A5) update(model, apply=False) computes the diff without the solver lock (immutable snapshot buffers, same_model=False since _coef_dirty cannot be trusted concurrently). solve keeps the coarse lock: apply->run must be atomic and native handles are not thread-safe. Tests pin the non-blocking preview and the preview/apply asymmetry for raw .values mutations.
coroa
left a comment
There was a problem hiding this comment.
Ok, i also went over solvers. Not many comments.
I am only thinking about whether we can easily test whether removing and re-creating changed constraints might be faster for individual solvers? At least after some threshold of what was changed.
| if diff.con_rhs_indices.size: | ||
| self._apply_con_rhs(ctx, diff) | ||
| if diff.con_sign_indices.size: | ||
| self._apply_con_signs(ctx, diff.con_sign_indices, diff.con_sign_values) | ||
| if diff.n_coef_updates: | ||
| self._apply_con_coefs( | ||
| ctx, diff.con_coef_rows, diff.con_coef_cols, diff.con_coef_vals | ||
| ) |
There was a problem hiding this comment.
i know from discussions with @brynpickering that he did experiments (with Gurobi, i think) where deleting and re-creating constraint rows was often faster than updating coefs. Can we test this?
There was a problem hiding this comment.
Worth measuring, and the code is set up well to test it. The coef path is a scalar loop today: HiGHS changeCoeff and Gurobi chgCoeff are called per element in _apply_con_coefs, so this is exactly the case where the delete-and-re-add-row approach would batch better. Xpress (chgmcoef) and Mosek (putaijlist) already submit the whole COO list in one call, so there's less to win there.
Concrete shape: the _coef_dirty mask already identifies the changed rows and the COO arrays are grouped by row, so we know the exact row set to swap. Add a threshold: if the share of dirty coefs in a row (or the dirty-row count) exceeds a cutoff, delConstrs/deleteRows those rows and re-add them from the new CSR instead of per-element coefficient changes; below the cutoff keep the current path.
One caveat to fold into the measurement: deleting rows invalidates the saved basis for them. For Gurobi that's mostly free, but for HiGHS row deletion is one of the things that forces presolve, so it interacts with the basis-reuse regression in follow-up #1. We'd want to compare net solve time, not just the modify-call cost.
I'd suggest wiring both paths plus a threshold sweep into the same rolling-horizon PyPSA-Eur benchmark that produced the follow-up findings, and tracking it as a follow-up rather than blocking this PR: the scalar loop is correct, just potentially suboptimal past some change fraction.
🤖 AI-edited (Claude) reply, reviewed by @MaykThewessen.
There was a problem hiding this comment.
this is valid. though I defer this to a follow up. the deletion of constraints interacts with the state of the solvers. for example for highs this will lead to a hard pre-solve which for some cases we want to avoid. we could introduced some more fine-grained control later
|
I would say we should get this working and release, and try to work on performance tweaks from there, with the soon to be added benchmark tools |
Replace _update_locked(apply=...) with _compute_diff + _apply_locked, dropping the dead apply=False branches. Read supports_* off the instance and give Gurobi's apply context a _GurobiApplyCtx namedtuple.
ModelDiff.from_models now captures a snapshot of model_a and defers to from_snapshot(same_model=False), removing the duplicated baseline-extraction and diff loops.
The global _structural_reason precheck (via vlabels/clabels, the concatenation of per-container active_labels) already pins each container's active_labels and shape before diff_var/diff_con run, so the per-container shape/active_labels mismatch branches were unreachable duplication that silently degraded to a rebuild. Removed from diff_var, diff_con, and diff_objective.
Adds coverage for QUAD_OBJ, variable-type change, sign-only mutation, top-level STRUCTURAL_LABELS (vlabels/clabels), indices SPARSITY, binary/ integer/semi-continuous capture, and ModelDiff inspect/repr helpers. persistent/diff.py 81->96%, snapshot.py 87->99%.
Changes proposed in this Pull Request
Adds an opt-in persistent-update framework so a built solver can be re-solved against a mutated
Modelwithout a full rebuild.Core
linopy.persistent:ModelSnapshot,ModelDiff,StructuralKey,VarKind,RebuildReason.ModelDiffstores changes in flat-native arrays (bounds / var-types / RHS / signs / COO coefs / linear objective / sense) plus per-containerVarSlice/ConSliceviews.ModelDiff.from_snapshot(snap, model)andModelDiff.from_models(m1, m2)— snapshot-based and snapshot-free diffs._coef_dirtyflag on constraints with RHS-setter short-circuit so RHS-only edits skip the coefficient re-walk.Solver orchestration
Solvergainstrack_updates, lazy-build (firstsolve(model, …)builds),apply_update,update,disallow_rebuild, and structured rebuild reasons. Backends without persistent-update support short-circuit to rebuild.apply_update:changeColsBounds/changeColsIntegrality/changeRowBounds/changeCoeff/changeColsCost/changeObjectiveSense. Sign change → rebuild.setAttr(LB/UB/VType/RHS/Sense/Obj),chgCoeff,ModelSense. In-place sign change.chgbounds/chgcoltype/chgrhs/chgrowtype/chgmcoef/chgobj/chgobjsense. In-place sign change.chgvarbound/chgconbound/putvartypelist/putaijlist/putclist/putobjsense. Sign change → rebuild.Tests
test_persistent_snapshot_diff.pycovering allModelDiffsemantics.test_persistent_apply_update.pyrunning 9 cases × 4 backends (skipped per backend when license/installation is unavailable).test_persistent_solver_extras.py.Follow-ups (tracked)
Findings from a symmetric rolling-horizon benchmark on a PyPSA-Eur network (426k vars / 1.28M cons, 168h window, 24h step, HiGHS):
Basis policy afterTo be done by the user, add to docsapply_update: the in-place path reuses the stale basis unconditionally. With HiGHS a valid basis skips presolve, so a 24h-shifted window hot-starts dual simplex on the full unpresolved LP — 126–219s vs 9.4s afterclearSolver()(13–23x regression). Expose areset_solver-style option onSolver.solveand/or auto-clear the basis when the diff touches a large fraction of rows.Solver.solve(model=...)path:Model.solverunssanitize_zeros/sanitize_infinitiesbefore building (drops ~11% of rows on the benchmark network); the bareSolver.solvepath never sanitizes, so persistent users submit a larger problem. Can't be bolted on unconditionally — sanitize-induced mask changes between solves can themselves trigger rebuilds (STRUCTURAL_LABELS/SPARSITY).Checklist
doc.doc/release_notes.rstof the upcoming release is included.