Skip to content

[SPARK-57199][SQL] Extract the aggregate out-of-memory error into a QueryExecutionErrors factory#56256

Closed
gengliangwang wants to merge 1 commit into
apache:masterfrom
gengliangwang:spark-agg-oom-factory
Closed

[SPARK-57199][SQL] Extract the aggregate out-of-memory error into a QueryExecutionErrors factory#56256
gengliangwang wants to merge 1 commit into
apache:masterfrom
gengliangwang:spark-agg-oom-factory

Conversation

@gengliangwang
Copy link
Copy Markdown
Member

What changes were proposed in this pull request?

The aggregate out-of-memory error (AGGREGATE_OUT_OF_MEMORY) is constructed inline in two places:

  • HashAggregateExec, whose whole-stage codegen emits throw new <SparkOutOfMemoryError>("AGGREGATE_OUT_OF_MEMORY", new java.util.HashMap()); into every generated aggregate class.
  • TungstenAggregationIterator (the interpreted fallback), which throws the same new SparkOutOfMemoryError(...) and needs a // scalastyle:off throwerror suppression.

This PR adds a QueryExecutionErrors.aggregateOutOfMemoryError() factory (next to the existing cannotAcquireMemory* OOM factories) and routes both call sites through it. In the codegen path the emitted Java becomes throw QueryExecutionErrors.aggregateOutOfMemoryError();.

Why are the changes needed?

Sub-task of SPARK-56908 (reduce generated Java size in whole-stage codegen). Dumping the whole-stage codegen of the TPC-DS queries shows the inline throw new org.apache.spark.memory.SparkOutOfMemoryError("AGGREGATE_OUT_OF_MEMORY", new java.util.HashMap()); line 445 times across 142 of 150 generated classes -- it is the single most-repeated throw in the corpus. Replacing it with a factory call shrinks each generated aggregate class and moves the error-class string and the empty message-parameter map out of every generated class's constant pool into one compiled method. It also consolidates the error construction shared with the interpreted path and removes the throwerror scalastyle suppression there.

Does this PR introduce any user-facing change?

No. The same AGGREGATE_OUT_OF_MEMORY error with the same (empty) message parameters is thrown; only where it is constructed changes.

How was this patch tested?

This is a behavior-preserving refactor, covered by the existing aggregate suites (e.g. DataFrameAggregateSuite, 163 tests, pass). The change was additionally verified by re-dumping the TPC-DS whole-stage codegen: all 445 inline throws are now QueryExecutionErrors.aggregateOutOfMemoryError() calls, and every generated subtree still compiles (the Janino default imports already make QueryExecutionErrors available unqualified, as used by other generated error calls such as divideByZeroError). This mirrors the sibling DateTimeExpressionUtils codegen extractions, which likewise relied on existing expression-suite coverage.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.8)

…ueryExecutionErrors factory

The AGGREGATE_OUT_OF_MEMORY error is constructed inline in HashAggregateExec
codegen (emitted into every generated aggregate class) and in
TungstenAggregationIterator. Add a QueryExecutionErrors.aggregateOutOfMemoryError()
factory and route both call sites through it.

Across the TPC-DS queries the inline
`throw new SparkOutOfMemoryError("AGGREGATE_OUT_OF_MEMORY", new java.util.HashMap());`
appears 445 times (the most-repeated throw in the generated code); the factory
call shrinks each generated class and moves the error-class string + empty
message map into one compiled method. Behavior is unchanged.

Generated-by: Claude Code (Opus 4.8)
@gengliangwang gengliangwang requested a review from LuciferYang June 2, 2026 20:48
gengliangwang added a commit that referenced this pull request Jun 3, 2026
…ueryExecutionErrors factory

### What changes were proposed in this pull request?

The aggregate out-of-memory error (`AGGREGATE_OUT_OF_MEMORY`) is constructed inline in two places:

- `HashAggregateExec`, whose whole-stage codegen emits `throw new <SparkOutOfMemoryError>("AGGREGATE_OUT_OF_MEMORY", new java.util.HashMap());` into every generated aggregate class.
- `TungstenAggregationIterator` (the interpreted fallback), which throws the same `new SparkOutOfMemoryError(...)` and needs a `// scalastyle:off throwerror` suppression.

This PR adds a `QueryExecutionErrors.aggregateOutOfMemoryError()` factory (next to the existing `cannotAcquireMemory*` OOM factories) and routes both call sites through it. In the codegen path the emitted Java becomes `throw QueryExecutionErrors.aggregateOutOfMemoryError();`.

### Why are the changes needed?

Sub-task of SPARK-56908 (reduce generated Java size in whole-stage codegen). Dumping the whole-stage codegen of the TPC-DS queries shows the inline `throw new org.apache.spark.memory.SparkOutOfMemoryError("AGGREGATE_OUT_OF_MEMORY", new java.util.HashMap());` line **445 times** across 142 of 150 generated classes -- it is the single most-repeated `throw` in the corpus. Replacing it with a factory call shrinks each generated aggregate class and moves the error-class string and the empty message-parameter map out of every generated class's constant pool into one compiled method. It also consolidates the error construction shared with the interpreted path and removes the `throwerror` scalastyle suppression there.

### Does this PR introduce _any_ user-facing change?

No. The same `AGGREGATE_OUT_OF_MEMORY` error with the same (empty) message parameters is thrown; only where it is constructed changes.

### How was this patch tested?

This is a behavior-preserving refactor, covered by the existing aggregate suites (e.g. `DataFrameAggregateSuite`, 163 tests, pass). The change was additionally verified by re-dumping the TPC-DS whole-stage codegen: all 445 inline throws are now `QueryExecutionErrors.aggregateOutOfMemoryError()` calls, and every generated subtree still compiles (the Janino default imports already make `QueryExecutionErrors` available unqualified, as used by other generated error calls such as `divideByZeroError`). This mirrors the sibling `DateTimeExpressionUtils` codegen extractions, which likewise relied on existing expression-suite coverage.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.8)

Closes #56256 from gengliangwang/spark-agg-oom-factory.

Authored-by: Gengliang Wang <gengliang@apache.org>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
(cherry picked from commit 4f70b60)
Signed-off-by: Gengliang Wang <gengliang@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants