Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unit('0') prints NULL factor argument and NULL unit argument #240

Open
schlunma opened this issue May 6, 2022 · 7 comments
Open

Unit('0') prints NULL factor argument and NULL unit argument #240

schlunma opened this issue May 6, 2022 · 7 comments
Labels
community New: Issue Highlight a new community raised "generic" issue Type: Bug

Comments

@schlunma
Copy link

schlunma commented May 6, 2022

🐛 Bug Report

While reading lots of large netCDF files with iris, I noticed that my entire stderr is cluttered with the following prints:

ut_scale(): NULL factor argument                                                                                                                                                                                                                                             
ut_are_convertible(): NULL unit argument

These come from the use of cf_units.Unit('0') for variables with the unit '0'.

How to Reproduce

Steps to reproduce the behaviour:

Using

from cf_units import Unit
Unit('0')

correctly raises an error, but also prints the messages mentioned above:

ut_scale(): NULL factor argument
ut_are_convertible(): NULL unit argument
Traceback (most recent call last):
  File "/home/b/b309141/tmp/iris_warning.py", line 3, in <module>
    Unit('0')
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/cf_units/__init__.py", line 839, in __init__
    raise value_error from None
ValueError: [UT_SUCCESS] Failed to parse unit "0"

These warnings cannot be suppressed using python -W ignore or by using warnings.filterwarnings.

Expected Behaviour

These warnings should not be printed or be issued using the warnings module so they can be handled properly.

Environment

  • OS & Version: Red Hat Enterprise Linux 8.4
  • cf-units Version: 3.0.1
@schlunma schlunma added New: Issue Highlight a new community raised "generic" issue Type: Bug labels May 6, 2022
@bouweandela
Copy link
Member

bouweandela commented May 9, 2022

After looking around in the code a bit, I found that you can use the undocumented function suppress_errors to stop the warnings from being printed:

from cf_units import Unit, suppress_errors
with suppress_errors():
  Unit('0')

@rcomer
Copy link
Member

rcomer commented May 9, 2022

Looks like there are a few undocumented yet public functions in there (e.g. is_vertical, is_time). Maybe we just need to add a new section to the docs?

@bouweandela
Copy link
Member

The current default behaviour of printing to stderr if something goes wrong seems odd for a library, because there is no standard way in which users of the library can do something with that information. I agree with @schlunma that it would be better to use the warnigns module for that (or a logger).

@pp-mo
Copy link
Member

pp-mo commented Oct 30, 2024

Looks like there are a few undocumented yet public functions in there (e.g. is_vertical, is_time). Maybe we just need to add a new section to the docs?

#453 FTW, I think.

@pp-mo
Copy link
Member

pp-mo commented Oct 31, 2024

Hi @bouweandela @schlunma
Sorry for the usual slow response !

I've been looking into this and discovered some things (details to follow).
So, I have some opinions to share, but I'm currently unsure what to do about it.
Which sadly means, I don't think we're ready to fix this in time for a v3.3 release.

Is that going to be a problem, or can we proceed to v3.3. without a fix for this ?

@schlunma
Copy link
Author

Thanks for taking care of this @pp-mo! No, not a problem at all. We are happy with out current solution (that uses suppress_errors)!

@pp-mo
Copy link
Member

pp-mo commented Oct 31, 2024

I've been looking into this and discovered some things

I've inspected the main module, the Cython code and the published C interface spec, in particular the section on error handling.
And I just played around a bit.

So definitely there are other things you can do which result in the same sort of arbitrary error messages printed directly to the console.
For example :

>>> u = Unit("m").root(2)
productRoot(): It's meaningless to take the 2nd root of "m"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/net/home/h05/itpp/git/cf_units/cf_units/__init__.py", line 1329, in root
    raise value_error from None
ValueError: [UT_MEANINGLESS] Failed to take the root of Unit('m')
>>> 

-- here, the line beginning "productRoot()" is a printed error message.

If the existing code logic is all correct, then presumably :

  • any library function which can error will return a result value which can indicate a problem
  • .. and those functions all also call ut_set_status
  • .. and any call to those functions in the Cython code also invokes raise_error to produce a Python Exception.
    ( N.B. most functions, but not all, call _raise_error via one of wrap_system, wrap_converter or wrap_unit, which check for specific types of return value )

( Note: there definitely are functions which do not have identifiable error return values, such as ut_encode_time.
Presumably, we can trust that these never call ut_set_status, and thus won't emit any error messages
)

So ... assuming all that is correct in the existing code, then any operation which prints an error message should also already be calling _raise_errorto raise a Python Exception.

A better general solution ?

So, I think a better way of handling this would be to permanently set the library error handler to a C function which records the error message somewhere global, and then include the error message text within the raised exception.
( I think you can see from the above example, that this may in some cases add useful interpretation )

Does that sound like an acceptable solution ?

Some odd notes + thoughts :

  • I think that a handler-type solution is required to log the messages, since AFAICT there is no other way of capturing additional arguments passed to the error handler.
  • we may have to handle arbitrary-length output strings, which is awkward, but perhaps calling ut_handle_error_message twice, first with a NUL stream, will be possible. The approach taken in the format wrapper unfortunately doesn't work here, I think, as the handler function interface is less sophisticated.
  • i guess in this case we will have to make suppress_errors a do-nothing. I don't like to remove it, since it is a public function (though it's another one not in the published docs, as previously noted !)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community New: Issue Highlight a new community raised "generic" issue Type: Bug
Projects
Status: No status
Status: Backlog
Development

No branches or pull requests

4 participants