`std_score` could give `nan` instead of `ValueError` if `standard_deviation = 0`? #272

jagerber48 · 2024-11-17T05:37:30Z

Right now the std_score function raises a ValueError if it is called on an AffineScalarFunc with zero standard deviation. Should it return nan, possibly with a warning instead?

uncertainties/uncertainties/core.py

Line 591 in 969324d

except ZeroDivisionError:

The text was updated successfully, but these errors were encountered:

newville · 2024-11-17T20:18:48Z

@jagerber48 I don't have a strong opinion.

I probably lean towards: NaNs are more painful as their effect is not noticed where it originates but in later code where it is used.

It might be reasonable for this function to drop try/except and just be

    return (value - self._nominal_value) / self.std_dev

I think aZeroDivisionError raised from this would be clearer than a ValueError, which would seem to imply that the value passed in for value was in error, say, value was a string or None, or a dict or something. FWIW, that would raise a ValueError, not explicitly caught here.

jagerber48 · 2024-11-17T20:40:37Z

@newville thanks for the feedback. I'm thinking about a case where there is an array of UFloat or something. There could be a chance that 1/1000 has a zero std_dev. In those cases it might be preferable to return nan instead of raising an exception (thus halting processing the 999/1000 "good" values. I know you said you don't have strong opinions, but does thinking about this case change your opinion at all?

I also don't know how prevalently this function is used. I've never used it. I just came across this detail as I was working on the linear combination refactor PR and posted here to track it.

newville · 2024-11-17T21:06:35Z

@jagerber48 If value or self._nominal_value is an numpy array, and self.std_dev is a scaler 0, the result will be an ndarray of np.inf. If self.std_dev is an array and one element is 0, that element in the result with be np.inf. That seems OK, or at least "normal numpy behavior".

I would say that IMHO:
1. Inf is slightly better than NaN for divide-by-zero.
2. raising ZeroDivisionError is slightly better than returning Inf. But Numpy disagrees, and following them makes sense.
3. raising ValueError is kind of confusing, given the existence of ZeroDivisionError.

But also: not a strong preference. And I also never use this function ;).

jagerber48 · 2024-11-17T21:12:12Z

Taking your comments into review:

I agree inf is more appropriate than nan.
Yes, personally I'd tend to go the numpy direction and set inf rather than raise ZeroDivisionError
Yes, I also agree that it makes sense to raise the ZeroDivisionError instead of converting it to a ValueError.

I'll probably leave this discussion here for a while since there are higher priorities. Curious to hear others' thoughts in the meantime.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`std_score` could give `nan` instead of `ValueError` if `standard_deviation = 0`? #272

`std_score` could give `nan` instead of `ValueError` if `standard_deviation = 0`? #272

jagerber48 commented Nov 17, 2024

newville commented Nov 17, 2024

jagerber48 commented Nov 17, 2024

newville commented Nov 17, 2024

jagerber48 commented Nov 17, 2024

std_score could give nan instead of ValueError if standard_deviation = 0? #272

std_score could give nan instead of ValueError if standard_deviation = 0? #272

Comments

jagerber48 commented Nov 17, 2024

newville commented Nov 17, 2024

jagerber48 commented Nov 17, 2024

newville commented Nov 17, 2024

jagerber48 commented Nov 17, 2024

`std_score` could give `nan` instead of `ValueError` if `standard_deviation = 0`? #272

`std_score` could give `nan` instead of `ValueError` if `standard_deviation = 0`? #272