Design: interactive grid for the operator result pane #5395
Replies: 9 comments
-
Similar to my comments in #5394, please include diagrams.
-
Thanks. @mengw15, please chime in with your comments.
-
Thanks for putting this together - a few comments / questions:
-
Thanks @mengw15. I agree that full-result backend filtering/search/sort raises concerns, especially for large outputs. Iceberg predicate pushdown can help in some cases, but because operator results are written in arrival order, pruning may be weak. To make the feature safer and easier to review, I would like to split the work into phases.
To answer the points above:
Does this split sound reasonable? If so, I can narrow the implementation to the interactive result grid first and leave backend full-result filtering/search/sort for a separate design and follow-up PR. I am also open to other suggestions. Thanks.
-
@tanishqgandhi1908 If the details are hard to discuss online, we can have an offline discussion and report the results here.
-
I agree that an offline meeting would be helpful. My main concern is that I am not fully convinced these parts need to be changed. For example, why do we need to replace the current result table with ag-grid? Also, the floating result panel is an intentional design choice, so I am not sure why we should refactor it into a static/bottom-docked panel. I also believe Texera's frontend currently depends on NG-ZORRO, so I am not sure whether we want to introduce a second table dependency. cc @aglinxinyuan
For the fourth point, my opinion is that Texera is a data-analysis workflow platform. If users want to see the result of selection/filter/sort as part of the analysis, they should add the corresponding operator to the workflow, especially when the cost is comparable. Result-panel filtering/sorting can be useful for temporary inspection, but I would prefer to keep the current design. This is just my current opinion.
-
Oh, got it. Sure, let's connect offline and discuss.
-
It's very obvious that the result panel should be a floating panel. It's OK to introduce a new table library if ng-zorro cannot provide the features we need.
-
Design conversation for #5394.
Making sort, filter, and row search work on the full dataset
The frontend only ever holds a small slice of the data, whatever pages the user has scrolled through. If sort and filter were evaluated only on the rows currently in browser memory, the user would silently get wrong results on any non-trivial dataset. To make them meaningful, the filter / sort / row-search criteria need to be evaluated on the backend, where the full dataset lives.
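To make the failure mode concrete, here is a minimal sketch with made-up data (nine rows, page size 3; none of these names come from the Texera codebase). It compares sorting only the rows the browser has loaded against sorting on the full dataset and then paginating:

```typescript
// Hypothetical data: nine rows in arrival order, page size 3.
const fullDataset: number[] = [42, 7, 19, 3, 88, 1, 56, 23, 10];
const pageSize = 3;

// The browser has only fetched the first page so far.
const loadedRows: number[] = fullDataset.slice(0, pageSize);

const ascending = (a: number, b: number): number => a - b;

// Client-side sort: only sees the loaded slice.
const clientTop3: number[] = [...loadedRows].sort(ascending).slice(0, pageSize);

// Backend sort: evaluated over every row, then paginated.
const backendTop3: number[] = [...fullDataset].sort(ascending).slice(0, pageSize);

console.log(clientTop3);  // [7, 19, 42]
console.log(backendTop3); // [1, 3, 7]
```

The client-side result looks plausible but misses the two smallest rows, which is the "silently wrong" case described above.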
Operator results are already stored as Iceberg / Parquet files. Iceberg has two relevant capabilities for this:
The proposal is to surface these capabilities by extending the existing WebSocket pagination protocol with optional filter / sort / row-search fields, and adding methods to the storage abstraction that execute them through Iceberg:
ResultPaginationRequest gains optional filters, sorts, and rowSearch fields. Requests without these fields take the same code path as today.
VirtualDocument gains getRangeWithQuery and countWithQuery methods, defaulted to safe fallbacks so non-Iceberg document types continue to work unchanged.
IcebergPredicateBuilder translates the wire-format ColumnFilter objects into Iceberg Expressions, with type-aware value parsing per column type so we don't silently mis-coerce strings into numbers.
IcebergDocument implements both new methods. Operators Iceberg supports natively (eq, ne, lt, le, gt, ge, startsWith, isNull, isNotNull, in) are pushed down. contains and endsWith aren't pushdown-capable, so they're evaluated in memory over the iterator returned by the scan. rowSearch compiles to a multi-column contains and runs as a residual.
Sort is the one exception. Iceberg has no ORDER BY pushdown, so a sort is necessarily executed in JVM memory over the filtered iterator. To prevent that from OOM-ing the backend on large filtered sets, sort is capped at a configurable row threshold (storage.result.sort.max-rows, default 100k). When the matched count exceeds the cap, rows are returned in scan order with a sortSkipped flag in the response, and the frontend shows a banner explaining how to narrow the filter to enable sorting.
Architectural notes
OperatorPaginationResultService is populated on response, so revisiting a page is a zero-WS round-trip.
columnOffset / columnLimit / columnSearch are kept on ResultPaginationRequest with their defaults; the new frontend simply stops setting them because column virtualization makes the column pager obsolete. New fields are skipped when empty so the no-query path is byte-identical to today's payload.
Reference implementation
The hackathon prototype (#5099) has all of this working end-to-end and is available for reference.
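Separately from the prototype, the pushdown / residual split and the sort cap described above can be sketched roughly as follows. This is an illustration only, not code from #5099: the operator names, ColumnFilter, and sortSkipped come from the proposal, while the field names (column, op, value) and the helpers splitFilters and sortWithCap are assumptions made for this sketch. A real backend would work over a scan iterator and a count rather than an in-memory array.

```typescript
// Assumed wire shape; only the operator names are taken from the proposal.
type FilterOp =
  | "eq" | "ne" | "lt" | "le" | "gt" | "ge"
  | "startsWith" | "isNull" | "isNotNull" | "in"
  | "contains" | "endsWith";

interface ColumnFilter {
  column: string;
  op: FilterOp;
  value?: string;
}

// Operators the proposal lists as natively supported by Iceberg.
const PUSHDOWN_OPS: ReadonlySet<FilterOp> = new Set<FilterOp>([
  "eq", "ne", "lt", "le", "gt", "ge", "startsWith", "isNull", "isNotNull", "in",
]);

// Split filters into those handed to the scan and those evaluated in memory afterwards.
function splitFilters(
  filters: ColumnFilter[],
): { pushdown: ColumnFilter[]; residual: ColumnFilter[] } {
  const pushdown: ColumnFilter[] = [];
  const residual: ColumnFilter[] = [];
  for (const f of filters) {
    (PUSHDOWN_OPS.has(f.op) ? pushdown : residual).push(f);
  }
  return { pushdown, residual };
}

// Sort only when the matched rows fit under the cap; otherwise keep scan order and set the flag.
function sortWithCap<T>(
  rows: T[],
  compare: (a: T, b: T) => number,
  maxRows: number,
): { rows: T[]; sortSkipped: boolean } {
  if (rows.length > maxRows) {
    return { rows, sortSkipped: true };
  }
  return { rows: [...rows].sort(compare), sortSkipped: false };
}

const { pushdown, residual } = splitFilters([
  { column: "age", op: "ge", value: "30" },
  { column: "name", op: "contains", value: "an" },
]);
const underCap = sortWithCap([3, 1, 2], (a, b) => a - b, 100000);
const overCap = sortWithCap([3, 1, 2], (a, b) => a - b, 2);

console.log(pushdown.map(f => f.op), residual.map(f => f.op));
console.log(underCap, overCap);
```

Here the ge filter would go to the scan and the contains filter would run as a residual; underCap comes back sorted, while overCap keeps scan order with sortSkipped set so the frontend can show its banner.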
Happy to discuss more on this!!