
[SPARK-50959][ML][PYTHON] Swallow the exception of JavaWrapper.__del__ #49615

Closed

Conversation

@wbo4958 (Contributor) commented Jan 23, 2025

What changes were proposed in this pull request?

``` python
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.linalg import Vectors
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").getOrCreate()

dataset = spark.createDataFrame(
    [(Vectors.dense([0.0]), 0.0),
     (Vectors.dense([0.4]), 1.0),
     (Vectors.dense([0.5]), 0.0),
     (Vectors.dense([0.6]), 1.0),
     (Vectors.dense([1.0]), 1.0)] * 10,
    ["features", "label"])
lr = LogisticRegression()
model = lr.fit(dataset)
```

The above code raises an exception at the end. Even if I remove the `@try_remote_del` decorator from https://github.com/apache/spark/blob/master/python/pyspark/ml/wrapper.py#L58, the issue is still there.

``` console
Exception ignored in: <function JavaWrapper.__del__ at 0x70b8caf2f920>
Traceback (most recent call last):
  File "/home/xxx/work.d/spark/spark-master/python/pyspark/ml/util.py", line 254, in wrapped
  File "/home/xxx/work.d/spark/spark-master/python/pyspark/ml/wrapper.py", line 60, in __del__
ImportError: sys.meta_path is None, Python is likely shutting down
Exception ignored in: <function JavaWrapper.__del__ at 0x70b8caf2f920>
Traceback (most recent call last):
  File "/home/xxx/work.d/spark/spark-master/python/pyspark/ml/util.py", line 254, in wrapped
  File "/home/xxx/work.d/spark/spark-master/python/pyspark/ml/wrapper.py", line 60, in __del__
ImportError: sys.meta_path is None, Python is likely shutting down
```

It looks like the `__del__` method of `JavaWrapper` is called after Python has already started shutting down, at which point the import machinery and related modules are no longer functional.
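
Below is a minimal sketch of the kind of guard this change applies, assuming the wrapper only needs to release its wrapped Java object on destruction: the `__del__` body is wrapped in a broad try/except so that errors raised during interpreter shutdown are swallowed instead of being printed as "Exception ignored" noise. The class and the `_detach_java_obj` helper are simplified, hypothetical stand-ins, not the actual PySpark implementation.

``` python
class JavaWrapper:
    """Simplified stand-in for pyspark.ml.wrapper.JavaWrapper."""

    def __init__(self, java_obj=None):
        self._java_obj = java_obj

    def __del__(self):
        # During interpreter shutdown, anything this method does (imports,
        # calls into the JVM gateway) can fail with errors such as
        # "ImportError: sys.meta_path is None". Swallow them: the caller
        # cannot do anything useful with a failure at this point.
        try:
            if self._java_obj is not None:
                self._detach_java_obj(self._java_obj)
        except Exception:
            pass

    def _detach_java_obj(self, java_obj):
        # Hypothetical placeholder for the real cleanup, which detaches the
        # wrapped Java object from the active SparkContext's gateway.
        pass
```

Swallowing the error in `__del__` is reasonable here because CPython never propagates exceptions out of a destructor anyway; it only prints them, which is exactly the noise shown above.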

Why are the changes needed?

Fixes the bug and improves the user experience.

Does this PR introduce any user-facing change?

No

How was this patch tested?

CI passes

Was this patch authored or co-authored using generative AI tooling?

No

@wbo4958 (Contributor, Author) commented Jan 23, 2025

Hi @HyukjinKwon, @zhengruifeng, @WeichenXu123, please help review it. Thanks a lot.

@HyukjinKwon HyukjinKwon changed the title [SPARK-50959][ML][PYTHON] swallow the exception of JavaWrapper.__del__ [SPARK-50959][ML][PYTHON] Swallow the exception of JavaWrapper.__del__ Jan 23, 2025
zhengruifeng pushed a commit that referenced this pull request Jan 23, 2025
Closes #49615 from wbo4958/del.bug.

Authored-by: Bobby Wang <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
(cherry picked from commit 74cf4d8)
Signed-off-by: Ruifeng Zheng <[email protected]>
@zhengruifeng (Contributor)
merged to master/4.0
