[codex] Fix media union matching#3462
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
📝 WalkthroughWalkthroughThis PR extends union and type matching logic to discriminate media values by their ChangesMedia Kind Discrimination in Unions
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 6/8 reviews remaining, refill in 12 minutes and 18 seconds.Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@baml_language/crates/bex_engine/src/conversion.rs`:
- Around line 550-556: The current object_media_kind path treats any instance
whose `_data` field is Adt(Media(_)) as Ty::Media and thus may misclassify
ordinary user classes before Ty::Class matching; update the logic in
object_media_kind (and the companion branch around media_kind_matches/Ty::Media)
to only consider the `_data` ADT-as-media heuristic when the object's runtime
class is one of the stdlib media wrapper classes (e.g., check the class
identity/name/ID against the known stdlib media wrappers) and otherwise fall
back to normal Ty::Class handling; after changing the guard add a regression
test demonstrating a user-defined class (e.g., `class Holder { _data pdf }`)
does not get treated as the `pdf` media variant.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 5d3b6d6a-cd3b-4474-9845-ed74eb408b74
📒 Files selected for processing (2)
baml_language/crates/bex_engine/src/conversion.rsbaml_language/crates/bex_engine/tests/media_roundtrip.rs
| if let Some(actual_media_kind) = object_media_kind(obj) { | ||
| if let Some(member) = members.iter().find(|m| { | ||
| matches!(m, Ty::Media(kind, _) if media_kind_matches(actual_media_kind, *kind)) | ||
| }) { | ||
| return Some(member); | ||
| } | ||
| } |
There was a problem hiding this comment.
Limit _data-based media detection to actual media classes.
These helpers currently treat any instance whose _data field contains Adt(Media(_)) as Ty::Media. That changes union discrimination for ordinary user classes before Ty::Class matching runs, e.g. class Holder { _data pdf } against Holder | pdf can now resolve to pdf. Please gate this path on the instance/class being one of the stdlib media wrappers, then add a regression for a non-media class with _data.
Possible guard
fn external_media_kind(value: &BexExternalValue) -> Option<MediaKind> {
match value {
BexExternalValue::Adt(BexExternalAdt::Media(media)) => Some(media.kind),
- BexExternalValue::Instance { fields, .. } => {
+ BexExternalValue::Instance { class_name, fields }
+ if is_media_class_name(class_name) =>
+ {
fields.get("_data").and_then(external_media_kind)
}
BexExternalValue::Union { value, .. } => external_media_kind(value),
_ => None,
}
}
fn instance_media_kind(instance: &bex_vm_types::Instance) -> Option<MediaKind> {
let class_obj = unsafe { instance.class.get() };
let Object::Class(class) = class_obj else {
return None;
};
+ if !is_media_type_name(&class.name) {
+ return None;
+ }
class
.fields
.iter()
.zip(instance.fields.iter())Also applies to: 624-662
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@baml_language/crates/bex_engine/src/conversion.rs` around lines 550 - 556,
The current object_media_kind path treats any instance whose `_data` field is
Adt(Media(_)) as Ty::Media and thus may misclassify ordinary user classes before
Ty::Class matching; update the logic in object_media_kind (and the companion
branch around media_kind_matches/Ty::Media) to only consider the `_data`
ADT-as-media heuristic when the object's runtime class is one of the stdlib
media wrapper classes (e.g., check the class identity/name/ID against the known
stdlib media wrappers) and otherwise fall back to normal Ty::Class handling;
after changing the guard add a regression test demonstrating a user-defined
class (e.g., `class Holder { _data pdf }`) does not get treated as the `pdf`
media variant.
Merging this PR will improve performance by 44.04%
|
| Mode | Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|---|
| ⚡ | WallTime | vm_array_push_50k |
9.4 ms | 8 ms | +18.26% |
| ⚡ | WallTime | vm_array_iter_10k |
5.8 ms | 4.6 ms | +24.53% |
| ⚡ | WallTime | vm_closure_call_50k |
10.1 ms | 9 ms | +12.49% |
| ⚡ | WallTime | vm_field_access_50k |
8.4 ms | 7.5 ms | +11.71% |
| ⚡ | WallTime | vm_nested_loop |
6.1 ms | 5.2 ms | +17.49% |
| ⚡ | WallTime | vm_mixed_ops |
20.8 ms | 16.9 ms | +23.54% |
| ⚡ | WallTime | vm_string_concat_5k |
54.1 ms | 45.3 ms | +19.44% |
| ⚡ | WallTime | vm_fib_20 |
5.2 ms | 4.5 ms | +14.65% |
| ⚡ | WallTime | engine_init_cost |
2.9 ms | 2 ms | +44.04% |
| ⚡ | WallTime | vm_call_chain_100_x_5k |
37.9 ms | 32.7 ms | +16% |
Comparing aaron/media-fix (9a358bf) with canary (953cd2f)
Binary size checks passed✅ 7 passed
Generated by |
Summary
Fix runtime union discrimination for BAML media values so media outputs can match media union arms such as
image | string.Root Cause
Media values were converted to external
Adt(Media(_))values, but union matching did not inspect the underlyingMediaValue.kind. That made the matcher report a genericmediavalue as not matchingimageorstring.Changes
MediaValue.kindagainstTy::Mediaunion members._data.pdf | stringreturns.Validation
cargo fmt --checkcargo test -p bex_engine mediaNote
Medium Risk
Changes union member selection logic for
Ty::Mediaduring VM↔external conversions, which can affect runtime type wrapping and downstream consumers of union metadata. Scope is focused and covered by new unit and integration tests.Overview
Fixes union discrimination so BAML
mediavalues correctly matchTy::Mediaarms (e.g.image | string) by inspecting the underlyingMediaValue.kindrather than treating media as an opaque ADT.Adds helpers to extract media kind from both external values (
Adt(Media(_)), union wrappers, and class-shaped instances with_data) and VM objects (includingRustDataand instances), and extends tests to cover raw media union wrapping plus an engine-levelpdf | stringreturn selecting thepdfarm.Reviewed by Cursor Bugbot for commit 9a358bf. Bugbot is set up for automated code reviews on this repo. Configure here.
Summary by CodeRabbit
New Features
Tests