unexpected keyboardinterrupt messages #426

Closed
euhruska opened this issue Feb 18, 2020 · 9 comments

@euhruska

euhruska commented Feb 18, 2020

My job prints KeyboardInterrupt messages after it runs as expected and finishes once the walltime runs out. I didn't send a keyboard interrupt; the job quits automatically when the walltime is exhausted. The error is purely cosmetic and doesn't affect me otherwise.

+ re.session.login1.eh22.018310.0003 (json)
+ pilot.0000 (profiles)
+ pilot.0000 (logfiles)
session lifetime: 718.7s                                                      ok
All components terminated
Traceback (most recent call last):
  File "/ccs/home/eh22/.conda/envs/vamp11/lib/python3.7/site-packages/radical/entk/appman/appmanager.py", line 431, in run
    self.terminate()
  File "/ccs/home/eh22/.conda/envs/vamp11/lib/python3.7/site-packages/radical/entk/appman/appmanager.py", line 458, in terminate
    self._prof.prof('term_start', uid=self._uid)
  File "/ccs/home/eh22/.conda/envs/vamp11/lib/python3.7/site-packages/radical/utils/profile.py", line 276, in prof
    self._handle.write(data)
KeyboardInterrupt
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "extasy.py", line 275, in <module>
    appman.run()
  File "/ccs/home/eh22/.conda/envs/vamp11/lib/python3.7/site-packages/radical/entk/appman/appmanager.py", line 439, in run
    raise KeyboardInterrupt
KeyboardInterrupt

radical-stack

  python               : 3.7.6
  pythonpath           : /sw/summit/xalt/1.2.0/site:/sw/summit/xalt/1.2.0/libexec
  virtualenv           : vamp11

  radical.analytics    : 0.90.7-v0.72.0-38-g14b9581@devel
  radical.entk         : 1.0.1-v1.0.1-2-g92797c9@devel
  radical.pilot        : 1.1.1-v1.1.1-2-g8385a7d@devel
  radical.saga         : 1.1.0-v1.1@devel
  radical.utils        : 1.1.1-v1.1.1-2-gfe6c424@devel
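
The doubled traceback above ("During handling of the above exception, another exception occurred") is what Python prints when a new exception is raised while an earlier one is still being handled. The sketch below reproduces only that shape. `terminate` and `run` are hypothetical stand-ins, not the actual radical.entk code, and the assumption that `run()` re-raises from inside an exception handler is inferred from the traceback alone.

```python
import traceback


def terminate():
    # Hypothetical stand-in for the interrupted profiler write in the report.
    raise KeyboardInterrupt


def run():
    try:
        terminate()
    except KeyboardInterrupt:
        # Raising a fresh exception inside the handler chains it implicitly
        # to the one being handled (stored in __context__).
        raise KeyboardInterrupt


def chained_report():
    """Run the stand-in and return the exception plus its formatted traceback."""
    try:
        run()
    except KeyboardInterrupt as exc:
        text = "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        )
        return exc, text


if __name__ == "__main__":
    _, report = chained_report()
    print(report)
```

Under this assumption, the second KeyboardInterrupt in the log would not indicate a second interrupt, only the re-raise from within the handler.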
@lee212
Contributor

lee212 commented Feb 18, 2020

Not able to download the zip file, is the link broken?

@euhruska
Author

@andre-merzky
Member

TODO AM: test in 3.6
TODO MT: document

@lee212
Contributor

lee212 commented Apr 1, 2020

@andre-merzky, were you able to run a test?
@mturilli, any update on the documentation?

@andre-merzky
Member

bjobs reports the pilot job state as EXIT, and RS treats EXIT as an error state. LSF documentation describes EXIT as:

                EXIT
                         The job has terminated with a non-zero status -
                         it may have been aborted due to an error in its
                         execution, or killed by its owner or the LSF
                         administrator.

                         For example, exit code 131 means that the job
                         exceeded a configured resource usage limit and
                         LSF killed the job.

so I think RS is doing the right thing here - but alas it can't distinguish between job errors and timeouts just based on the final state.

Thanks for bringing this up. This needs fixing in the SAGA layer, where we need to check the exit code to make that distinction. @euhruska: if this is not limiting your runs, I would like to give it low priority. Let me know if it becomes a problem, and I'll raise the priority!
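
As a rough illustration of the exit-code check described above, a minimal sketch might look like the following. This is not the radical.saga implementation: `classify_lsf_job`, `FinalState`, and `LIMIT_EXIT_CODES` are hypothetical names, and the only exit code used is 131, taken from the LSF text quoted above. The code that a given site reports for a walltime kill may differ and would need to be confirmed there.

```python
from enum import Enum


class FinalState(Enum):
    DONE = "done"
    FAILED = "failed"
    LIMIT_EXCEEDED = "limit_exceeded"


# Assumption: exit codes treated as "killed for exceeding a resource limit".
# 131 is the example from the quoted LSF text; extend per site as needed.
LIMIT_EXIT_CODES = frozenset({131})


def classify_lsf_job(state, exit_code, limit_codes=LIMIT_EXIT_CODES):
    """Map a final LSF job state and exit code to a finer-grained state."""
    if state == "DONE":
        return FinalState.DONE
    if state == "EXIT":
        # EXIT alone cannot separate errors from limit kills,
        # so the exit code decides.
        if exit_code in limit_codes:
            return FinalState.LIMIT_EXCEEDED
        return FinalState.FAILED
    raise ValueError(f"not a final LSF state: {state!r}")


if __name__ == "__main__":
    print(classify_lsf_job("EXIT", 131))
    print(classify_lsf_job("EXIT", 1))
```

Keeping the limit codes in a parameter rather than hard-coding them leaves room for site-specific values without changing the classification logic.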

@lee212
Contributor

lee212 commented Jul 29, 2020

@andre-merzky, just a ping: has the fix been applied to SAGA? I think the documentation has been updated, so once the SAGA fix is finished I can close this ticket.

@lee212
Contributor

lee212 commented Jul 29, 2020

@andre-merzky
Member

No, I don't think that has been addressed in SAGA. radical-cybertools/radical.saga#777 is linked to keep track of progress there, but since that ticket is open in RS, feel free to close this one.

@lee212
Contributor

lee212 commented Aug 19, 2020

I can close this for now, but I'm happy to revisit it anytime if the problem persists.

lee212 closed this as completed Aug 19, 2020