[AURON #2238] Add support for auron.never.convert.reason in Iceberg scan scenarios #2237

Open
guixiaowen wants to merge 1 commit into apache:master from guixiaowen:foriceberg_addnever_convert_reason

Conversation

@guixiaowen
Contributor

@guixiaowen guixiaowen commented May 6, 2026

Which issue does this PR close?

Closes #2238

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

How was this patch tested?

@guixiaowen guixiaowen changed the title from "test" to "[AURON #2238] Add support for auron.never.convert.reason in Iceberg scan scenarios" May 6, 2026
@guixiaowen guixiaowen force-pushed the foriceberg_addnever_convert_reason branch from 844591e to 7f35ead May 6, 2026 16:48
@slfan1989 slfan1989 requested a review from Copilot May 6, 2026 23:09
Contributor

Copilot AI left a comment

Pull request overview

This PR aims to ensure Iceberg fallback scenarios populate the Spark plan tag auron.never.convert.reason, making fallback reasons visible (e.g., in Spark UI) similarly to other scan conversions.

Changes:

  • Updated AuronConvertProvider to make isEnabled depend on the current SparkPlan, and adjusted Iceberg/Hudi/Paimon providers accordingly.
  • Added exception-based tagging in AuronConverters.convertSparkPlan to set auron.never.convert.reason when conversion is rejected via assertions/exceptions.
  • Added Iceberg integration tests asserting auron.never.convert.reason is present for disabled Iceberg scan and unsupported metadata-column scenarios.
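
The control flow described in the bullets above can be sketched without any Spark dependency. This is a minimal illustration, not the PR's code: `PlanNode`, `ConvertProvider`, `ScanProvider`, and `convertPlan` are hypothetical stand-ins for `SparkPlan`, `AuronConvertProvider`, the Iceberg provider, and `AuronConverters.convertSparkPlan`. It assumes the provider signals a rejection with `assert`, and that the caller catches the failure and records the message under the `auron.never.convert.reason` key.

```scala
import scala.collection.mutable
import scala.util.control.NonFatal

// Hypothetical stand-in for a SparkPlan node that can carry string tags.
final case class PlanNode(
    name: String,
    tags: mutable.Map[String, String] = mutable.Map.empty)

// Hypothetical provider interface mirroring the plan-aware isEnabled(exec) shape.
trait ConvertProvider {
  def isEnabled(plan: PlanNode): Boolean
  def convert(plan: PlanNode): PlanNode
}

object NeverConvertSketch {
  val ReasonTag = "auron.never.convert.reason"

  // Toy provider: only "BatchScan" nodes are candidates, and a flag gates conversion.
  final class ScanProvider(scanEnabled: Boolean) extends ConvertProvider {
    def isEnabled(plan: PlanNode): Boolean = plan.name match {
      case "BatchScan" =>
        // Throws java.lang.AssertionError("assertion failed: <message>") when the flag is off.
        assert(scanEnabled, "Conversion disabled: auron.enable.iceberg.scan=false.")
        true
      case _ => false
    }
    def convert(plan: PlanNode): PlanNode = plan.copy(name = "Native" + plan.name)
  }

  // Try each provider; on a rejection, keep the original node and tag it with the reason.
  def convertPlan(plan: PlanNode, providers: Seq[ConvertProvider]): PlanNode =
    try {
      providers.find(_.isEnabled(plan)).map(_.convert(plan)).getOrElse(plan)
    } catch {
      case NonFatal(e) =>
        val raw = Option(e.getMessage).getOrElse(e.toString)
        plan.tags(ReasonTag) = raw.stripPrefix("assertion failed: ")
        plan
    }
}
```

One design point to keep in mind with this pattern: in Scala 2, `Predef.assert` is annotated `@elidable`, so a build that elides assertions would drop both the check and the recorded reason. An explicit exception type or a returned reason value avoids that dependency on compiler flags.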

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| thirdparty/auron-paimon/src/main/scala/org/apache/spark/sql/hive/auron/paimon/PaimonConvertProvider.scala | Updates provider enablement to be plan-type aware. |
| thirdparty/auron-iceberg/src/test/scala/org/apache/auron/iceberg/AuronIcebergIntegrationSuite.scala | Adds integration tests validating auron.never.convert.reason for Iceberg fallback cases. |
| thirdparty/auron-iceberg/src/main/scala/org/apache/spark/sql/auron/iceberg/IcebergScanSupport.scala | Converts multiple “return None” fallbacks into assert-based failures with messages intended for tagging. |
| thirdparty/auron-iceberg/src/main/scala/org/apache/spark/sql/auron/iceberg/IcebergConvertProvider.scala | Updates isEnabled signature and uses assertions to drive fallback reason tagging. |
| thirdparty/auron-iceberg/pom.xml | Adds scala-library dependency (provided scope). |
| thirdparty/auron-hudi/src/main/scala/org/apache/spark/sql/auron/hudi/HudiConvertProvider.scala | Updates provider enablement to be plan-type aware. |
| spark-extension/src/main/scala/org/apache/spark/sql/auron/AuronConvertProvider.scala | Changes isEnabled API to accept SparkPlan. |
| spark-extension/src/main/scala/org/apache/spark/sql/auron/AuronConverters.scala | Uses isEnabled(exec) and adds try/catch-based never-convert reason tagging in generic conversion path. |


Comment on lines +31 to +40
override def isEnabled(exec: SparkPlan): Boolean = {
  exec match {
    case _: BatchScanExec =>
      val enabled = SparkAuronConfiguration.ENABLE_ICEBERG_SCAN.get()
      assert(enabled, "Conversion disabled: auron.enable.iceberg.scan=false.")
      assert(
        sparkCompatible,
        s"Supported Spark versions: 3.4 to 4.0 (Iceberg ${icebergVersionOrUnknown}).")
      return false
      enabled
    case _ => false
Comment on lines 46 to 56
def plan(exec: BatchScanExec): Option[IcebergScanPlan] = {
  val scan = exec.scan
  val scanClassName = scan.getClass.getName
  // Only handle Iceberg scans; other sources must stay on Spark's path.
  if (!scanClassName.startsWith("org.apache.iceberg.spark.source.")) {
    return None
  }
  assert(scanClassName.startsWith("org.apache.iceberg.spark.source."), "Not iceberg scans.")

  // Changelog scan carries row-level changes; not supported by native COW-only path.
  if (scanClassName == "org.apache.iceberg.spark.source.SparkChangelogScan") {
    return None
  }
  assert(
    !(scanClassName == "org.apache.iceberg.spark.source.SparkChangelogScan"),
    "Not iceberg cow table.")

Comment on lines +61 to +63
assert(
  !(unsupportedMetadataColumns.nonEmpty),
  "Has per-row materialization (for example _pos).")
Comment on lines +242 to +260
try {
extConvertProviders.find(h => h.isEnabled(exec) && h.isSupported(exec)) match {
case Some(provider) => tryConvert(exec, provider.convert)
case None =>
Shims.get.convertMoreSparkPlan(exec) match {
case Some(exec) =>
exec.setTagValue(convertibleTag, true)
exec.setTagValue(convertStrategyTag, AlwaysConvert)
exec
} else {
addNeverConvertReasonTag(exec)
}
}
case None =>
if (Shims.get.isNative(exec)) { // for QueryStageInput and CustomShuffleReader
exec.setTagValue(convertibleTag, true)
exec.setTagValue(convertStrategyTag, AlwaysConvert)
exec
} else {
addNeverConvertReasonTag(exec)
}
}
}
Comment on lines +263 to +268
exec.setTagValue(convertToNonNativeTag, true)
exec.setTagValue(convertibleTag, false)
exec.setTagValue(convertStrategyTag, NeverConvert)
exec.setTagValue(
  neverConvertReasonTag,
  s"${e.getMessage.replaceFirst("^assertion failed: ?", "")}")
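
The hunk above derives the tag value directly from `e.getMessage`. Since `Throwable.getMessage` may return `null` for exceptions constructed without a message, a null-safe variant may be worth considering. The following is a sketch only; `ReasonText.from` is a hypothetical helper, not part of the PR, and the fallback to the exception's class name is an assumption about what would be useful in the tag.

```scala
object ReasonText {
  // Hypothetical helper: derive a tag value from any Throwable.
  // Strips the prefix that Scala's assert adds, and falls back to the
  // exception's simple class name when there is no usable message.
  def from(e: Throwable): String =
    Option(e.getMessage)
      .map(_.replaceFirst("^assertion failed: ?", ""))
      .filter(_.nonEmpty)
      .getOrElse(e.getClass.getSimpleName)
}
```

With this shape, the call site would pass `ReasonText.from(e)` as the tag value instead of interpolating `e.getMessage`.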
Comment on lines +61 to +63
assert(
  !(unsupportedMetadataColumns.nonEmpty),
  "Has per-row materialization (for example _pos).")
Comment on lines +43 to +46
val neverConvertReasonTag: TreeNodeTag[String] = TreeNodeTag("auron.never.convert.reason")
assert(collectFirst(df.queryExecution.executedPlan) { case batchScanExec: BatchScanExec =>
  batchScanExec.getTagValue(neverConvertReasonTag)
}.get.get.equals("Conversion disabled: auron.enable.iceberg.scan=false."))
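
The test above unwraps two `Option` layers with `.get.get`, which throws `NoSuchElementException` rather than failing the assertion if either the scan node or the tag is missing. A possible alternative, sketched here with a hypothetical `TagLookup.hasReason` helper and plain `Option` values standing in for the result of `collectFirst`, is to flatten and use `contains`:

```scala
object TagLookup {
  // `found` stands in for the Option[Option[String]] produced by
  // collectFirst(plan) { case scan: BatchScanExec => scan.getTagValue(tag) }.
  // Returns false (instead of throwing) when the node or the tag is absent.
  def hasReason(found: Option[Option[String]], expected: String): Boolean =
    found.flatten.contains(expected)
}
```

Both `Option.flatten` and `Option.contains` are standard library methods, so the same expression could be written inline in the test's `assert`.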