[FLUSS][Spark] Add set_cluster_configs and reset_cluster_configs procedures#3204

Open
XuQianJin-Stars wants to merge 3 commits into apache:main from XuQianJin-Stars:spark-set-reset-cluster-configs

Conversation

@XuQianJin-Stars
Contributor

Purpose

Linked issue: close #3203

This PR adds set_cluster_configs and reset_cluster_configs procedures to the Spark connector, aligning it with the Flink connector which already provides a complete set of cluster configuration management procedures (get_cluster_configs, set_cluster_configs, reset_cluster_configs).

Currently, Spark users can only read cluster configurations via get_cluster_configs but cannot dynamically modify or reset them through SQL. This change closes that gap.

Brief change log

  • Added SetClusterConfigsProcedure to dynamically set cluster configuration values via CALL sys.set_cluster_configs(config_pairs => ARRAY('key1', 'value1', 'key2', 'value2')).
  • Added ResetClusterConfigsProcedure to reset cluster configurations to their default values via CALL sys.reset_cluster_configs(config_keys => ARRAY('key1', 'key2')).
  • Registered both new procedures in SparkProcedures.
  • Both procedures reuse the existing Admin.alterClusterConfigs() API with AlterConfigOpType.SET and AlterConfigOpType.DELETE, respectively (see the sketch after this list).
  • Updated the Spark procedures documentation with syntax, parameters, return types, and examples for both new procedures.
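
Both procedures funnel into the same Admin call; the following is a minimal sketch of that shared pattern. It is illustrative only: the AlterConfig constructor, its package, and the future-returning signature are assumptions — the PR text itself only names Admin.alterClusterConfigs() and AlterConfigOpType.SET/DELETE.

```scala
import scala.collection.JavaConverters._

// Sketch only: AlterConfig's shape and the .get() on a returned future are
// assumptions; Admin.alterClusterConfigs and the SET/DELETE op types are the
// only pieces named in this PR.
def setClusterConfigs(admin: Admin, configPairs: Seq[String]): Unit = {
  require(configPairs.nonEmpty, "config_pairs must not be empty")
  require(
    configPairs.size % 2 == 0,
    "config_pairs must contain an even number of elements (key/value pairs)")
  // Pair up consecutive elements as (key, value) and turn each into a SET op.
  val ops = configPairs
    .grouped(2)
    .map { case Seq(key, value) => new AlterConfig(key, value, AlterConfigOpType.SET) }
    .toList
  admin.alterClusterConfigs(ops.asJava).get()
}

def resetClusterConfigs(admin: Admin, configKeys: Seq[String]): Unit = {
  require(configKeys.nonEmpty, "config_keys must not be empty")
  // A DELETE op per key resets that config back to its default value.
  val ops = configKeys
    .map(key => new AlterConfig(key, null, AlterConfigOpType.DELETE))
    .toList
  admin.alterClusterConfigs(ops.asJava).get()
}
```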

Tests

  • SetClusterConfigsProcedureTest:
    • set_cluster_configs: set a single configuration — sets a config and verifies it via get_cluster_configs
    • set_cluster_configs: set multiple configurations — sets multiple configs in one call
    • set_cluster_configs: empty config_pairs should fail — validates error on empty input
    • set_cluster_configs: odd number of config_pairs should fail — validates error on malformed input (sketched after this list)
  • ResetClusterConfigsProcedureTest:
    • reset_cluster_configs: set and then reset a configuration — sets a config, resets it, and verifies it's no longer DYNAMIC
    • reset_cluster_configs: reset multiple configurations — resets multiple configs in one call
    • reset_cluster_configs: empty config_keys should fail — validates error on empty input
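
For flavor, a hedged sketch of one of the negative-path tests above — the suite base class, the sql(...) helper, and the exact exception type are assumptions, since they are not shown in this excerpt:

```scala
// Hypothetical test sketch; harness names (test, sql, intercept) follow a
// ScalaTest-style suite and are assumptions, not the PR's actual code.
test("set_cluster_configs: odd number of config_pairs should fail") {
  val e = intercept[Exception] {
    // Three elements cannot form key/value pairs, so validation should reject this.
    sql("CALL sys.set_cluster_configs(config_pairs => array('key1', 'value1', 'key2'))")
      .collect()
  }
  assert(e.getMessage.contains("config_pairs"))
}
```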

API and Format

No. This change only adds new Spark SQL procedures. No existing API or storage format is affected.

Documentation

Yes. Updated website/docs/engine-spark/procedures.md with documentation for both set_cluster_configs and reset_cluster_configs procedures, including syntax, parameters, return types, examples, and notes.

@XuQianJin-Stars changed the title from "[FLUSS][Spark] Add set_cluster_configs and reset_cluster_configs procedures for Spark connector" to "[FLUSS][Spark] Add set_cluster_configs and reset_cluster_configs procedures" on Apr 25, 2026
…igsProcedureTest

- Replace DATALAKE_FORMAT=paimon with LOG_REPLICA_MIN_IN_SYNC_REPLICAS_NUMBER=2
  to avoid triggering Paimon LakeCatalog creation (which requires warehouse config).
- Loosen the post-reset assertion in ResetClusterConfigsProcedureTest: describeClusterConfigs
  only returns entries present in initial/dynamic configs, so 0 rows after reset is valid
  when the key was not in the initial configs (see the sketch below).
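
In practice the loosened assertion amounts to something like the sketch below; the column ordinals and the DYNAMIC source label are assumptions inferred from the test descriptions, and some.config.key is a placeholder:

```scala
// Hypothetical: after reset, the key is either absent from the output
// entirely (it was never in the initial configs) or present without a
// DYNAMIC source. Column positions are assumed.
val rows = sql("CALL sys.get_cluster_configs()")
  .collect()
  .filter(_.getString(0) == "some.config.key")
assert(rows.isEmpty || rows.forall(_.getString(2) != "DYNAMIC"))
```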
@luoyuxia requested a review from Copilot April 28, 2026 03:56

Copilot AI left a comment


Pull request overview

Note: Copilot was unable to run its full agentic suite in this review.

Adds Spark SQL procedures to modify and reset Fluss cluster configurations at runtime, bringing the Spark connector to parity with the existing Flink procedures.

Changes:

  • Add set_cluster_configs procedure to set dynamic cluster config key/value pairs.
  • Add reset_cluster_configs procedure to delete (reset) dynamic cluster configs back to defaults.
  • Register new procedures in Spark and document usage/examples.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.

Summary per file:

| File | Description |
| --- | --- |
| website/docs/engine-spark/procedures.md | Documents the new Spark procedures, including syntax, parameters, and examples. |
| fluss-spark/fluss-spark-common/src/main/scala/org/apache/fluss/spark/procedure/SetClusterConfigsProcedure.scala | Implements sys.set_cluster_configs procedure and its output schema. |
| fluss-spark/fluss-spark-common/src/main/scala/org/apache/fluss/spark/procedure/ResetClusterConfigsProcedure.scala | Implements sys.reset_cluster_configs procedure and its output schema. |
| fluss-spark/fluss-spark-common/src/main/scala/org/apache/fluss/spark/SparkProcedures.scala | Registers the two new procedures in the Spark connector. |
| fluss-spark/fluss-spark-ut/src/test/scala/org/apache/fluss/spark/procedure/SetClusterConfigsProcedureTest.scala | Adds unit tests for setting configs and validating input. |
| fluss-spark/fluss-spark-ut/src/test/scala/org/apache/fluss/spark/procedure/ResetClusterConfigsProcedureTest.scala | Adds unit tests for resetting configs and validating input. |


```scala
val configPairs = if (args.numFields > 0 && !args.isNullAt(0)) {
  val pairsArray = args.getArray(0)
  (0 until pairsArray.numElements())
    .map(i => pairsArray.getUTF8String(i).toString)
```

Copilot AI Apr 28, 2026

pairsArray.getUTF8String(i) will throw if any element in config_pairs is NULL. Since Spark SQL arrays can contain nulls, this can lead to a confusing runtime failure (Spark internal error) rather than a clear validation error. Consider checking pairsArray.isNullAt(i) and throwing an IllegalArgumentException with a user-facing message that identifies the null element (and ideally its position/key).

Suggested change:

```diff
-    .map(i => pairsArray.getUTF8String(i).toString)
+    .map { i =>
+      if (pairsArray.isNullAt(i)) {
+        throw new IllegalArgumentException(
+          s"config_pairs contains a null element at position $i. " +
+            "Please specify valid configuration key/value pairs.")
+      }
+      pairsArray.getUTF8String(i).toString
+    }
```
```scala
val configKeys = if (args.numFields > 0 && !args.isNullAt(0)) {
  val keysArray = args.getArray(0)
  (0 until keysArray.numElements())
    .map(i => keysArray.getUTF8String(i).toString)
```

Copilot AI Apr 28, 2026

keysArray.getUTF8String(i) will throw if config_keys contains NULL. This should be validated explicitly so callers get a deterministic IllegalArgumentException (e.g., “config_keys must not contain nulls”) instead of a Spark internal error.

Suggested change:

```diff
-    .map(i => keysArray.getUTF8String(i).toString)
+    .map { i =>
+      if (keysArray.isNullAt(i)) {
+        throw new IllegalArgumentException("config_keys must not contain nulls")
+      }
+      keysArray.getUTF8String(i).toString
+    }
```
Comment on lines +27 to +28:

```scala
import scala.collection.JavaConverters._
```

Copilot AI Apr 28, 2026

This import is unused in the file (no Java/Scala collection conversions are performed). Removing it avoids warnings and keeps the new file minimal.

Suggested change:

```diff
-import scala.collection.JavaConverters._
```
Comment on lines +27 to +28:

```scala
import scala.collection.JavaConverters._
```

Copilot AI Apr 28, 2026

This import is unused in the file. Please remove it to avoid warnings and keep the new file clean.

Suggested change:

```diff
-import scala.collection.JavaConverters._
```
**Syntax:**

```sql
CALL [catalog_name.]sys.set_cluster_configs(config_pairs => ARRAY('key1', 'value1', 'key2', 'value2'))
```

Copilot AI Apr 28, 2026

The syntax examples in this doc section omit the trailing semicolon, while the surrounding examples (e.g., CALL sys.get_cluster_configs();) include semicolons. For consistency and copy/paste friendliness, consider adding ; to the end of the syntax lines.

**Syntax:**

```sql
CALL [catalog_name.]sys.reset_cluster_configs(config_keys => ARRAY('key1', 'key2'))
```

Copilot AI Apr 28, 2026

Same consistency issue as set_cluster_configs: consider adding a trailing semicolon in the syntax block to match other procedure examples in this page.

Suggested change:

```diff
-CALL [catalog_name.]sys.reset_cluster_configs(config_keys => ARRAY('key1', 'key2'))
+CALL [catalog_name.]sys.reset_cluster_configs(config_keys => ARRAY('key1', 'key2'));
```
Comment on lines +60 to +68:

```scala
override def call(args: InternalRow): Array[InternalRow] = {
  val configPairs = if (args.numFields > 0 && !args.isNullAt(0)) {
    val pairsArray = args.getArray(0)
    (0 until pairsArray.numElements())
      .map(i => pairsArray.getUTF8String(i).toString)
      .toArray
  } else {
    Array.empty[String]
  }
```

Copilot AI Apr 28, 2026

The new parsing logic should be exercised with arrays containing NULL elements (e.g., array('k1', null) or array(null, 'v1')) to ensure you return a clear validation error rather than a Spark internal exception. Adding one or two negative tests around NULL elements would lock in the intended behavior once validation is added.
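
A hedged sketch of such a negative test, in the style of the existing suite (harness names and the surfaced exception type are assumptions):

```scala
// Hypothetical negative test locking in null-element validation; test, sql,
// and intercept are assumed harness helpers, not the PR's actual code.
test("set_cluster_configs: null element in config_pairs should fail") {
  val e = intercept[Exception] {
    // A NULL value should produce a clear validation error, not a Spark internal one.
    sql("CALL sys.set_cluster_configs(config_pairs => array('key1', null))").collect()
  }
  assert(e.getMessage.toLowerCase.contains("null"))
}
```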

Development

Successfully merging this pull request may close these issues.

[Spark] Add set_cluster_configs and reset_cluster_configs procedures for Spark connector
