HIVE-28911: Improve SEARCH expansion to exploit <> operator #6503

Merged: zabetak merged 13 commits into apache:master from rubenada:HIVE-28911 on Jun 11, 2026

@rubenada rubenada commented May 21, 2026

What changes were proposed in this pull request?

Improve SEARCH expansion to exploit the <> operator.
The SEARCH operator can represent many types of range predicates, including the inequality operator (<>).
For example, d_dom <> 10 and d_dom <> 20 can be represented as SEARCH($9, Sarg[(-∞..10), (10..20), (20..+∞)]).
Currently, SEARCH expansion generates the expression OR(<($9, 10), >($9, 20), AND(>($9, 10), <($9, 20))). With the proposed change, we get the original (and simpler) AND(<>($9, 10), <>($9, 20)).
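
To make the idea concrete, here is a minimal, self-contained sketch of the detection step. It is not the code from this patch and does not use Calcite's `Sarg` or `RexNode` classes. `OpenRange`, `excludedPoints`, `expandRanges`, and `expand` are hypothetical names introduced for illustration, expressions are rendered as plain strings, and all ranges are assumed to be open, sorted integer intervals. The operand order of the fallback OR may differ from what the real expansion prints.

```java
import java.util.ArrayList;
import java.util.List;

public class SearchExpansionSketch {

    /** Open interval (lo..hi); a null bound means unbounded on that side. (Hypothetical type.) */
    static final class OpenRange {
        final Integer lo;
        final Integer hi;
        OpenRange(Integer lo, Integer hi) { this.lo = lo; this.hi = hi; }
    }

    /**
     * If the sorted open ranges cover everything except a finite set of points,
     * returns those points; otherwise returns null.
     */
    static List<Integer> excludedPoints(List<OpenRange> ranges) {
        if (ranges.size() < 2) return null;
        if (ranges.get(0).lo != null) return null;                  // must start at -infinity
        if (ranges.get(ranges.size() - 1).hi != null) return null;  // must end at +infinity
        List<Integer> points = new ArrayList<>();
        for (int i = 0; i + 1 < ranges.size(); i++) {
            Integer hi = ranges.get(i).hi;
            Integer nextLo = ranges.get(i + 1).lo;
            // Adjacent ranges must touch at exactly one (excluded) point.
            if (hi == null || !hi.equals(nextLo)) return null;
            points.add(hi);
        }
        return points;
    }

    /** Range-by-range expansion: one comparison (or AND of two) per range, OR-ed together. */
    static String expandRanges(String ref, List<OpenRange> ranges) {
        List<String> terms = new ArrayList<>();
        for (OpenRange r : ranges) {
            if (r.lo == null && r.hi == null) {
                terms.add("true");
            } else if (r.lo == null) {
                terms.add("<(" + ref + ", " + r.hi + ")");
            } else if (r.hi == null) {
                terms.add(">(" + ref + ", " + r.lo + ")");
            } else {
                terms.add("AND(>(" + ref + ", " + r.lo + "), <(" + ref + ", " + r.hi + "))");
            }
        }
        return terms.size() == 1 ? terms.get(0) : "OR(" + String.join(", ", terms) + ")";
    }

    /** Prefers AND of <> terms when the ranges are the complement of a point set. */
    static String expand(String ref, List<OpenRange> ranges) {
        List<Integer> points = excludedPoints(ranges);
        if (points == null) return expandRanges(ref, ranges);
        List<String> terms = new ArrayList<>();
        for (Integer p : points) terms.add("<>(" + ref + ", " + p + ")");
        return terms.size() == 1 ? terms.get(0) : "AND(" + String.join(", ", terms) + ")";
    }

    public static void main(String[] args) {
        // Sarg[(-inf..10), (10..20), (20..+inf)] from the description above.
        List<OpenRange> sarg = List.of(
            new OpenRange(null, 10), new OpenRange(10, 20), new OpenRange(20, null));
        System.out.println(expandRanges("$9", sarg));
        System.out.println(expand("$9", sarg)); // AND(<>($9, 10), <>($9, 20))
    }
}
```

The check only fires when the first range is unbounded below, the last is unbounded above, and each pair of neighbours meets at a single point. Any other shape falls back to the range-by-range form.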

Why are the changes needed?

Exploit the inequality operator when expanding ranges to generate simpler and slightly more efficient expressions.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Unit test added. A few test plans were adjusted to reflect this change.

…d of 'ref <> value1 AND ref <> value2' since the latter can break statistic propagation on partitioned tables (such as Iceberg).

During Conjunctive Normal Form (CNF) expansion, nested inequalities inside 'OR' clauses flatten into structures that Hive's SearchArgument (Sarg) builder and Iceberg's partition-pruning layer cannot natively translate. This may cause the compiler to abandon filter pushdown at the TableScan phase, resetting column statistics from PARTIAL to NONE.

@zabetak zabetak left a comment

The changes LGTM. One point worth clarifying is what to do in FilterSelectivityEstimator and whether it's worth applying changes there.

@zabetak zabetak left a comment

Tentative approval, assuming that we close the discussion around the FilterSelectivityEstimator changes.

rubenada commented Jun 8, 2026

Thanks for the review, @zabetak.
As agreed, I have reverted the changes in FilterSelectivityEstimator and created ticket 29652 to address potential improvements to range selectivity estimation in the absence of histograms.

@zabetak zabetak merged commit 45bdea4 into apache:master Jun 11, 2026
4 checks passed