VariantGet RFC #58

Merged
AdamGS merged 3 commits into develop from adamg/update-variant-canonical on Jun 8, 2026
@AdamGS AdamGS commented May 5, 2026

RFC for the VariantGet expression, with lessons learned and thoughts gathered while working on vortex-data/vortex#7494.

Signed-off-by: Adam Gutglick <adam@spiraldb.com>
@AdamGS AdamGS force-pushed the adamg/update-variant-canonical branch from 09a29d1 to 26f1e73 Compare May 5, 2026 11:43
@AdamGS AdamGS marked this pull request as ready for review May 5, 2026 11:44
Comment thread rfcs/0058-variant-get-expr.md
@robert3005

I think I am being a bit dumb when reading this. What would be the advantage of skipping shredded values in the canonical array?

AdamGS commented May 5, 2026

That's the current behavior, and it makes the canonical array a really weird thing: basically a pass-through to a bunch of other things, with delicate rules around it to make sure everything is pushed down.
Moving the shredded child into it makes it more useful and lets us generalize the behavior better. We could theoretically have a JSON core_storage child with some shredded fields, making the behavior bigger than any specific encoding.
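
To make the idea concrete, here is a rough Python sketch of a canonical variant that holds both children. It is only an illustration under stated assumptions, not the Vortex API: `CanonicalVariant`, `field_at`, and the representation (JSON strings for `core_storage`, a dict of per-field columns for `shredded`, `None` meaning "not shredded for this row") are all hypothetical.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class CanonicalVariant:
    """Hypothetical stand-in for a canonical variant array (not Vortex's type)."""

    # One JSON document per row, holding whatever was not shredded out.
    core_storage: List[str]
    # Field name -> typed column; None marks a row that was not shredded.
    shredded: Dict[str, List[Optional[Any]]] = field(default_factory=dict)

    def field_at(self, row: int, name: str) -> Any:
        # Prefer the typed shredded column when it has a value for this row.
        column = self.shredded.get(name)
        if column is not None and column[row] is not None:
            return column[row]
        # Otherwise fall back to decoding the core storage for that row.
        return json.loads(self.core_storage[row]).get(name)


arr = CanonicalVariant(
    core_storage=['{"b": "x"}', '{"a": "not-an-int", "b": "y"}'],
    shredded={"a": [1, None]},  # "a" is shredded as an integer where possible
)
```

Here `arr.field_at(0, "a")` is served from the shredded column, while `arr.field_at(1, "a")` falls back to the core storage because that row's value did not fit the shredded type. A real implementation would need a separate validity mask rather than overloading `None`.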

Signed-off-by: Adam Gutglick <adam@spiraldb.com>
Signed-off-by: Adam Gutglick <adam@spiraldb.com>
AdamGS added a commit to vortex-data/vortex that referenced this pull request May 15, 2026
## Summary

This PR includes two big changes as Variant moves closer to readiness.

1. Potentially holding the `shredded` child of a variant array in the
canonical VariantArray
2. A `VariantGet` expression that can extract data out of variant
arrays, either in a typed way or as a more opaque `Variant`.
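
As a rough illustration of the second point, the sketch below shows the difference between typed and opaque extraction on plain Python values. It is an assumption-laden toy, not the Vortex expression: the `variant_get` function, its `path` and `as_type` parameters, and the choice to yield `None` on a missing path or a type mismatch are all hypothetical.

```python
from typing import Any, List, Optional, Sequence


def variant_get(
    rows: Sequence[Any],
    path: Sequence[str],
    as_type: Optional[type] = None,
) -> List[Any]:
    """Extract `path` from each row; hypothetical semantics, see lead-in."""
    out: List[Any] = []
    for value in rows:
        # Walk the path one key at a time; a missing key yields None.
        for key in path:
            if isinstance(value, dict) and key in value:
                value = value[key]
            else:
                value = None
                break
        # Typed mode: keep only values of the requested type (assumed behavior).
        if as_type is not None and not isinstance(value, as_type):
            value = None
        out.append(value)
    return out


rows = [
    {"user": {"id": 7, "tags": ["a"]}},
    {"user": {"id": "seven"}},
    {"other": 1},
]
opaque_ids = variant_get(rows, ["user", "id"])      # values of any type
typed_ids = variant_get(rows, ["user", "id"], int)  # integers only
```

Without `as_type` the result stays heterogeneous, which is the "more opaque `Variant`" case; with it, every surviving value has one type and could back a typed column.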

For reviewers, some relevant context might be:
1. The [VariantGet](vortex-data/rfcs#58) RFC:
this RFC takes some lessons I've learned working on this into account
and reflects my updated view of this problem.
2. The original [Variant
type](https://vortex-data.github.io/rfcs/rfc/0015.html) RFC

I think the Parquet spec is also a pretty good read and a very heavy
influence on this work -
[`Shredding`](https://parquet.apache.org/docs/file-format/types/variantshredding/)
and the [`Variant
type`](https://parquet.apache.org/docs/file-format/types/variantencoding/).

---------

Signed-off-by: "Adam Gutglick" <adam@spiraldb.com>
Signed-off-by: Adam Gutglick <adam@spiraldb.com>
@AdamGS AdamGS merged commit c31aa70 into develop Jun 8, 2026
4 checks passed
@AdamGS AdamGS deleted the adamg/update-variant-canonical branch June 8, 2026 11:40
@AdamGS AdamGS temporarily deployed to github-pages June 8, 2026 11:41 — with GitHub Actions Inactive