refactor(scorer): score batches over typed change details#207
Closed
behinddwalls wants to merge 2 commits into
## Summary

### Why?

The change provider already produces rich per-URI facts (author, changed files, line counts), but its value types lived in the extension layer and the data was thrown away: validate fetched ChangeInfo only to log a file count, and ChangeRecord stored an opaque Metadata JSON string that was never written. Nothing downstream could read typed change facts.

### What?

- Move the change value types into entities: entity.User, entity.ChangedFile (now with LinesModified), entity.ChangeDetails (the facts), and entity.ChangeInfo (URI -> Details), with aggregation helpers. The changeprovider extension and GitHub impl now produce these.
- Replace ChangeRecord.Metadata (opaque string) with typed Details (ChangeDetails); the change table's metadata JSON column becomes details.
- Add ChangeStore.UpdateDetails, a version-guarded conditional write that follows the optimistic-locking contract (arithmetic in the controller).
- validate now persists each fetched ChangeInfo onto the request's change records (per-URI, idempotent; ErrVersionMismatch is a benign no-op).

This is the producer half: typed details now exist and are persisted. The score controller consumes them in a follow-up.

## Test Plan

- ✅ `make build`, `make test`, `make lint`, `make check-mocks/gazelle/tidy`
- ✅ `make integration-test` (storage contract suite round-trips Details and covers UpdateDetails create/update/version-mismatch)
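The version-guarded write and the "benign no-op" handling described above can be sketched roughly as follows. This is a minimal in-memory illustration, not the code in this PR: the names `ChangeDetails`, `ChangeRecord`, `UpdateDetails`, and `ErrVersionMismatch` come from the description, while the struct fields, the method signature, the `memStore` type, the `persistDetails` helper, and the choice to treat a missing record as version 0 are all assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrVersionMismatch is named in the description; this definition is assumed.
var ErrVersionMismatch = errors.New("change record version mismatch")

// ChangeDetails holds typed change facts. The fields are hypothetical.
type ChangeDetails struct {
	Author        string
	FilesChanged  int
	LinesModified int
}

// ChangeRecord pairs a URI with its version and typed details (fields assumed).
type ChangeRecord struct {
	URI     string
	Version int64
	Details ChangeDetails
}

// memStore is a hypothetical in-memory stand-in for the real ChangeStore.
type memStore struct {
	mu      sync.Mutex
	records map[string]ChangeRecord
}

func newMemStore() *memStore {
	return &memStore{records: make(map[string]ChangeRecord)}
}

// UpdateDetails writes only when the stored version equals oldVersion.
// The store performs no arithmetic: the caller supplies newVersion.
// Assumption: a missing record counts as version 0, so the first write creates it.
func (s *memStore) UpdateDetails(uri string, d ChangeDetails, oldVersion, newVersion int64) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	rec := s.records[uri] // zero value (Version 0) when the URI is absent
	if rec.Version != oldVersion {
		return ErrVersionMismatch
	}
	s.records[uri] = ChangeRecord{URI: uri, Version: newVersion, Details: d}
	return nil
}

// persistDetails plays the controller's role: it computes the next version and
// treats a version mismatch as a no-op, since another writer already advanced
// the record.
func persistDetails(s *memStore, uri string, d ChangeDetails, seenVersion int64) error {
	err := s.UpdateDetails(uri, d, seenVersion, seenVersion+1)
	if errors.Is(err, ErrVersionMismatch) {
		return nil
	}
	return err
}

func main() {
	s := newMemStore()
	d := ChangeDetails{Author: "alice", FilesChanged: 2, LinesModified: 40}
	fmt.Println(persistDetails(s, "change://1", d, 0)) // first write creates the record
	fmt.Println(persistDetails(s, "change://1", d, 0)) // stale version: swallowed as a no-op
	fmt.Println(s.records["change://1"].Version)
}
```

Keeping the version arithmetic in the caller leaves the store a plain compare-and-set, which is one way to read "arithmetic in the controller"; the real contract may differ.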
## Summary

### Why?

The scorer took entity.Change (just URIs), so it could not score on real change size; the example heuristic counted URIs as a placeholder. With typed change details now persisted on change records, the scorer can score a batch on its actual lines/files changed.

### What?

- Add entity.BatchChanges, the normalized, batch-level view of all changes in a batch (BatchID, Queue, []ChangeInfo), with aggregation helpers.
- Scorer.Score now takes entity.BatchChanges; the heuristic ValueFunc and the composite scorer operate over it.
- The score controller resolves each request's change records, flattens their details into BatchChanges, and scores the batch once, replacing the per-request multiplicative product over len(URIs).
- Example wiring buckets by total lines changed.

Consumes the typed details persisted by the change-details change.

## Test Plan

- ✅ `make build`, `make test`, `make lint`, `make check-mocks/gazelle/tidy`
- ✅ `make integration-test`, `make e2e-test` (start -> validate enrich -> score normalizes the batch and scores on real change size)
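To make the flatten-then-score flow above concrete, here is a small sketch of a batch-level view with aggregation helpers and a heuristic that buckets by total lines changed. The type names `BatchChanges`, `ChangeInfo`, `ChangeDetails`, and `ChangedFile` and the field `LinesModified` come from the description; every other field name, the helpers `TotalLines`, `TotalFiles`, and `flatten`, the function `scoreByLines`, and the bucket thresholds and score values are hypothetical.

```go
package main

import "fmt"

// ChangedFile is one file touched by a change (Path is an assumed field).
type ChangedFile struct {
	Path          string
	LinesModified int
}

// ChangeDetails holds the facts about one change (fields assumed).
type ChangeDetails struct {
	Author string
	Files  []ChangedFile
}

// ChangeInfo associates a change URI with its details.
type ChangeInfo struct {
	URI     string
	Details ChangeDetails
}

// BatchChanges is the batch-level view: BatchID, Queue, and all ChangeInfo.
type BatchChanges struct {
	BatchID string
	Queue   string
	Changes []ChangeInfo
}

// TotalFiles counts changed files across every change in the batch.
func (b BatchChanges) TotalFiles() int {
	n := 0
	for _, c := range b.Changes {
		n += len(c.Details.Files)
	}
	return n
}

// TotalLines sums LinesModified across every file in the batch.
func (b BatchChanges) TotalLines() int {
	n := 0
	for _, c := range b.Changes {
		for _, f := range c.Details.Files {
			n += f.LinesModified
		}
	}
	return n
}

// flatten merges the per-request change lists into a single batch view,
// so the batch is scored once rather than once per request.
func flatten(batchID, queue string, perRequest [][]ChangeInfo) BatchChanges {
	b := BatchChanges{BatchID: batchID, Queue: queue}
	for _, changes := range perRequest {
		b.Changes = append(b.Changes, changes...)
	}
	return b
}

// scoreByLines buckets the batch by total lines changed.
// Thresholds and score values are illustrative only.
func scoreByLines(b BatchChanges) float64 {
	switch n := b.TotalLines(); {
	case n <= 50:
		return 1.0
	case n <= 500:
		return 0.5
	default:
		return 0.25
	}
}

func main() {
	perRequest := [][]ChangeInfo{
		{{URI: "change://a", Details: ChangeDetails{Files: []ChangedFile{
			{Path: "x.go", LinesModified: 30},
			{Path: "y.go", LinesModified: 10},
		}}}},
		{{URI: "change://b", Details: ChangeDetails{Files: []ChangedFile{
			{Path: "z.go", LinesModified: 100},
		}}}},
	}
	b := flatten("batch-1", "main", perRequest)
	fmt.Println(b.TotalFiles(), b.TotalLines(), scoreByLines(b))
}
```

Scoring the flattened batch once means the result depends on the batch's aggregate size rather than on how its changes happen to be split across requests.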
This was referenced Jun 5, 2026