
[SPARK-50858][PYTHON] Add configuration to hide Python UDF stack trace #49535

Closed
Conversation

@wengh (Contributor) commented Jan 17, 2025:

What changes were proposed in this pull request?

This PR adds a new configuration, `spark.sql.execution.pyspark.udf.hideTraceback.enabled`. When it is enabled, only the exception class and message are included when handling an exception from a Python UDF; the stack trace is omitted. The configuration is turned off by default.
This PR also adds a new optional parameter `hide_traceback` for `handle_udf_exception` to override the configuration.
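
As a rough sketch of the described behavior (this is not the actual `python/pyspark/util.py` implementation; `format_udf_error` is a hypothetical helper name):

```py
import traceback


def format_udf_error(exc: BaseException, hide_traceback: bool) -> str:
    """Hypothetical helper mirroring the behavior described above."""
    if hide_traceback:
        # Only the exception class and message are kept.
        return "{}.{}: {}".format(type(exc).__module__, type(exc).__name__, exc)
    # Otherwise the full traceback is included, as before this change.
    return "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))


try:
    raise RuntimeError("[XXX] My runtime error.")
except RuntimeError as e:
    print(format_udf_error(e, hide_traceback=True))   # one line: class and message
    print(format_udf_error(e, hide_traceback=False))  # full "Traceback (most recent call last): ..."
```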

Suggested review order:

  1. `python/pyspark/util.py`: logic changes
  2. `python/pyspark/tests/test_util.py`: unit tests
  3. other files: adding new configuration

Why are the changes needed?

This allows library-provided UDFs to show only the relevant error message, without an unnecessary stack trace.

Does this PR introduce any user-facing change?

If the configuration is turned off (the default), there is no user-facing change.
When it is turned on, the stack trace is not included in the error message when handling an exception from a Python UDF.
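
For example, a user could opt in per session (the conf key comes from this PR; the snippet assumes an active `SparkSession` bound to `spark`):

```py
# Hide Python UDF stack traces for the current session; the default is false.
spark.conf.set("spark.sql.execution.pyspark.udf.hideTraceback.enabled", "true")
```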

Example that illustrates the difference (assumes a running SparkSession bound to `spark`, e.g. in the PySpark shell):

```py
from pyspark.errors.exceptions.base import PySparkRuntimeError
from pyspark.sql.types import IntegerType, StructField, StructType
from pyspark.sql.udtf import AnalyzeArgument, AnalyzeResult
from pyspark.sql.functions import udtf


@udtf()
class PythonUDTF:
    @staticmethod
    def analyze(x: AnalyzeArgument) -> AnalyzeResult:
        raise PySparkRuntimeError("[XXX] My PySpark runtime error.")

    def eval(self, x: int):
        yield (x,)


spark.udtf.register("my_udtf", PythonUDTF)
spark.sql("select * from my_udtf(1)").show()
```

With the configuration turned off, the last line gives:

```
...
pyspark.errors.exceptions.captured.AnalysisException: [TABLE_VALUED_FUNCTION_FAILED_TO_ANALYZE_IN_PYTHON] Failed to analyze the Python user defined table function: Traceback (most recent call last):
  File "<stdin>", line 7, in analyze
pyspark.errors.exceptions.base.PySparkRuntimeError: [XXX] My PySpark runtime error. SQLSTATE: 38000; line 1 pos 14
```

With the configuration turned on, the last line gives:

```
...
pyspark.errors.exceptions.captured.AnalysisException: [TABLE_VALUED_FUNCTION_FAILED_TO_ANALYZE_IN_PYTHON] Failed to analyze the Python user defined table function: pyspark.errors.exceptions.base.PySparkRuntimeError: [XXX] My PySpark runtime error. SQLSTATE: 38000; line 1 pos 14
```

How was this patch tested?

Added unit tests in `python/pyspark/tests/test_util.py`, covering the two cases with the configuration turned on and off respectively.

Was this patch authored or co-authored using generative AI tooling?

No

@wengh force-pushed the spark-50858-hide-udf-stack-trace branch from c46ecf5 to 174fdea on January 17, 2025 00:17
@HyukjinKwon (Member) commented:

I think we should show the UDF exception. Otherwise users won't know what's going on and why their job fails.

@allisonwang-db (Contributor) commented Jan 17, 2025:

@HyukjinKwon I agree, and this PR is just to make it configurable (with the default value set to false - show stack trace by default). There are many user-friendly errors on the Python side, but they are often buried in a long Python-side stack trace. This change is intended to optionally hide these stack traces to improve the user experience.

@wengh marked this pull request as ready for review on January 17, 2025 18:40
@wengh (Contributor, Author) commented Jan 21, 2025:

@allisonwang-db @ueshin Could you review this PR, which adds a configuration to hide the Python stack trace from `analyze_udtf`?

```scala
@@ -122,6 +122,7 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
  protected val authSocketTimeout = conf.get(PYTHON_AUTH_SOCKET_TIMEOUT)
  private val reuseWorker = conf.get(PYTHON_WORKER_REUSE)
  protected val faultHandlerEnabled: Boolean = conf.get(PYTHON_WORKER_FAULTHANLDER_ENABLED)
  protected val hideTraceback: Boolean = false
```
A contributor commented:

Can we use `conf.get(PYSPARK_HIDE_TRACEBACK)` here so that we don't need to override it in every subclass?

@wengh (Contributor, Author) replied Jan 21, 2025:

The config is defined in `org.apache.spark.sql.internal.SQLConf`, which seems to be inaccessible from here. For reference, `PYSPARK_SIMPLIFIED_TRACEBACK` is also defined in `SQLConf`, so `BasePythonRunner` subclasses have to override it.

Is there an advantage to putting it in `SQLConf` rather than, e.g., `org.apache.spark.internal.config.Python`?

A member replied:

A conf in `SQLConf` is a session-based conf that can also be set at runtime, whereas any conf in the core module or `StaticSQLConf` is a cluster-wide conf that can't be changed while the cluster is running.
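
A minimal illustration of that distinction, assuming an active session named `spark` (the core conf used here, `spark.python.worker.reuse`, is just an example unrelated to this PR):

```py
# Session-scoped SQLConf entries can be changed on a running SparkSession,
# for example via the SQL SET command:
spark.sql("SET spark.sql.execution.pyspark.udf.hideTraceback.enabled=true")

# Cluster-wide confs (core module / StaticSQLConf) are fixed at startup and
# would instead be passed when launching the application, e.g.:
#   spark-submit --conf spark.python.worker.reuse=false app.py
```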

"hiding the stack trace.")
.version("4.0.0")
.booleanConf
.createWithDefault(false)
A member commented:

Another way is to create this conf as an int and show the max depth of the stack trace, but I don't feel strongly.

@wengh (Contributor, Author) replied Jan 22, 2025:

Is there a use case where we only want to show the last k frames of the stack? I'm under the impression that we want to show the full stack trace for most exceptions, and completely hide the stack trace for specific library exceptions when the message is sufficient to identify the cause.
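
(For reference, if a max-depth variant were ever wanted, it could be approximated with the standard `traceback` module; this sketch is illustrative only and not part of this PR:)

```py
import traceback


def format_last_frames(exc: BaseException, max_depth: int) -> str:
    # A negative limit keeps only the last abs(limit) stack entries.
    return "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__, limit=-max_depth)
    )
```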

@wengh requested a review from allisonwang-db on January 23, 2025 01:22
@wengh force-pushed the spark-50858-hide-udf-stack-trace branch from c412da6 to c4f40b0 on January 23, 2025 18:03
@ueshin (Member) left a comment:

Could you show an example of the error messages, with this enabled and disabled, in the PR description?

@wengh (Contributor, Author) commented Jan 23, 2025:

> Could you show an example of the error messages, with this enabled and disabled, in the PR description?

@ueshin

It's in the collapsible section "Example that illustrates the difference"

@ueshin (Member) commented Jan 23, 2025:

> It's in the collapsible section

ah, I didn't notice that. thanks!

@wengh requested a review from ueshin on January 24, 2025 17:02
@allisonwang-db (Contributor) commented:

Thanks! Merging to master and branch-4.0.

allisonwang-db pushed a commit that referenced this pull request Jan 28, 2025
Closes #49535 from wengh/spark-50858-hide-udf-stack-trace.

Authored-by: Haoyu Weng <[email protected]>
Signed-off-by: Allison Wang <[email protected]>
(cherry picked from commit d259132)
Signed-off-by: Allison Wang <[email protected]>
@dongjoon-hyun (Member) commented:

Hi, @allisonwang-db, @ueshin, @HyukjinKwon. Although this passed CI, it seems that this causes Python linter failures on both the master and branch-4.0 branches. Could you check them, please?

```
starting mypy annotations test...
annotations failed mypy checks:
python/pyspark/ml/classification.py:3578: error: Function is missing a return type annotation  [no-untyped-def]
python/pyspark/ml/classification.py:3598: error: Incompatible return value type (got "tuple[int, Any]", expected "CM")  [return-value]
python/pyspark/ml/classification.py:3616: error: Argument "models" to "OneVsRestModel" has incompatible type "list[None]"; expected "list[ClassificationModel]"  [arg-type]
python/pyspark/ml/connect/readwrite.py:198: error: Incompatible types in assignment (expression has type "OneVsRestWriter", variable has type "Write")  [assignment]
python/pyspark/ml/connect/readwrite.py:199: error: "Write" has no attribute "session"  [attr-defined]
python/pyspark/ml/connect/readwrite.py:200: error: "Write" has no attribute "save"  [attr-defined]
python/pyspark/ml/connect/readwrite.py:209: error: Incompatible types in assignment (expression has type "OneVsRestModelWriter", variable has type "Write")  [assignment]
python/pyspark/ml/connect/readwrite.py:210: error: "Write" has no attribute "session"  [attr-defined]
python/pyspark/ml/connect/readwrite.py:211: error: "Write" has no attribute "save"  [attr-defined]
Found 9 errors in 2 files (checked 1083 source files)
Error: Process completed with exit code 1.
```

@allisonwang-db (Contributor) commented Jan 28, 2025:

@dongjoon-hyun Thanks for letting us know. cc @wengh could you take a look?

@dongjoon-hyun (Member) commented:

Oh, never mind. @HyukjinKwon's revert seems to fix it in some way in both branches already.

@dongjoon-hyun (Member) commented:

Thank you for the reply anyway, @allisonwang-db .

@HyukjinKwon (Member) commented:

Sorry, it was my bad 😢
