Below is the error message from a run of main_lightning.py:
```
Failure # 1 (occurred at 2021-05-23_21-45-03)
Traceback (most recent call last):
  File "C:\Users\addalin\.conda\envs\lidar\lib\site-packages\ray\tune\trial_runner.py", line 880, in _process_trial_save
    results = self.trial_executor.fetch_result(trial)
  File "C:\Users\addalin\.conda\envs\lidar\lib\site-packages\ray\tune\ray_trial_executor.py", line 686, in fetch_result
    result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
  File "C:\Users\addalin\.conda\envs\lidar\lib\site-packages\ray\_private\client_mode_hook.py", line 47, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\addalin\.conda\envs\lidar\lib\site-packages\ray\worker.py", line 1481, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(OSError): ray::ImplicitFunc.save() (pid=22632, ip=132.68.58.209)
  File "python\ray\_raylet.pyx", line 505, in ray._raylet.execute_task
  File "python\ray\_raylet.pyx", line 449, in ray._raylet.execute_task.function_executor
  File "C:\Users\addalin\.conda\envs\lidar\lib\site-packages\ray\_private\function_manager.py", line 556, in actor_method_executor
    return method(__ray_actor, *args, **kwargs)
  File "C:\Users\addalin\.conda\envs\lidar\lib\site-packages\ray\tune\function_runner.py", line 434, in save
    checkpoint_path = TrainableUtil.process_checkpoint(
  File "C:\Users\addalin\.conda\envs\lidar\lib\site-packages\ray\tune\utils\trainable.py", line 46, in process_checkpoint
    with open(checkpoint_path + ".tune_metadata", "wb") as f:
OSError: [Errno 22] Invalid argument: 'C:\Users\addalin\Dropbox\Lidar\lidar_learning\results\main_2021-05-23_19-35-00\main_5831d016_3_bsize=32,dfilter=None,dnorm=False,fc_size=[32],hsizes=[4, 4, 4, 4],lr=0.001,ltype=MAELoss,source=signal_p,use_bg=F_2021-05-23_21-28-18\checkpoint_epoch=3-step=703\.tune_metadata'
```
This is odd, since the failure occurred in the last epoch. It has also happened in other experiments.
Running resume with 'ERRORED_ONLY' fixes it.
But why does it happen in the first place?
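One hypothesis worth ruling out (not confirmed anywhere in this thread) is that the checkpoint path is simply too long for the classic Windows `MAX_PATH` limit of 260 characters, or contains a character Windows forbids in file names. The sketch below checks both for the path copied from the `OSError` message; the `diagnose` helper and its constants are hypothetical names introduced only for this illustration.

```python
from pathlib import PureWindowsPath

# Classic Win32 limit (includes the terminating NUL); an assumption about the
# machine, since long-path support can be enabled on newer Windows versions.
WINDOWS_MAX_PATH = 260
# Characters Windows does not allow inside a file or directory name.
FORBIDDEN_CHARS = set('<>:"/\\|?*')

# Path copied from the OSError message in the traceback.
failing_path = (
    r"C:\Users\addalin\Dropbox\Lidar\lidar_learning\results"
    r"\main_2021-05-23_19-35-00"
    r"\main_5831d016_3_bsize=32,dfilter=None,dnorm=False,fc_size=[32],"
    r"hsizes=[4, 4, 4, 4],lr=0.001,ltype=MAELoss,source=signal_p,use_bg=F"
    r"_2021-05-23_21-28-18"
    r"\checkpoint_epoch=3-step=703"
    r"\.tune_metadata"
)


def diagnose(path):
    """Report the path length and any forbidden characters (hypothetical helper)."""
    # Skip the drive anchor ("C:\") so its colon is not flagged.
    components = PureWindowsPath(path).parts[1:]
    bad = sorted({ch for part in components for ch in part if ch in FORBIDDEN_CHARS})
    return {
        "length": len(path),
        "exceeds_max_path": len(path) >= WINDOWS_MAX_PATH,
        "forbidden_chars": bad,
    }


report = diagnose(failing_path)
print(report)
```

If `exceeds_max_path` is true, that would be consistent with the failure showing up only for trials with long hyperparameter-derived directory names, but it does not by itself explain why resuming or rebooting helps, so other causes (for example, Dropbox holding the directory while syncing) may still be involved.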
A similar error keeps showing up across runs.
Usually, running the resume option with 'ERRORED_ONLY' fixes it.
However, this time it didn't help, and only restarting the computer solved it.
This occurred in the last experiment in 'main_2021-07-27_18-22-37'; the trial name starts with 'main_798e6_00015_15....'
See the error file below: error.txt
Is this error related to the tune module, or to the Windows file system?
We should also check whether there is any relation between #27, #28, and this issue.
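If the Windows file system does turn out to be involved (for instance through path length, which is an unverified guess), one possible mitigation is to give trials shorter directory names instead of ones that spell out every hyperparameter. A minimal sketch, assuming the Ray Tune version in use accepts a `trial_dirname_creator` callable in `tune.run` and that trial objects expose `trainable_name` and `trial_id`; the `short_trial_dirname` function is a hypothetical name, and a stand-in object is used here so the snippet runs without Ray.

```python
from types import SimpleNamespace


def short_trial_dirname(trial):
    """Build a compact trial directory name from the trainable name and trial id."""
    return f"{trial.trainable_name}_{trial.trial_id}"


# Stand-in for a Tune trial object, using the ids from the traceback.
fake_trial = SimpleNamespace(trainable_name="main", trial_id="5831d016")
dirname = short_trial_dirname(fake_trial)
print(dirname)  # main_5831d016

# Assumed usage with Ray Tune (not executed here):
#   tune.run(main, config=..., trial_dirname_creator=short_trial_dirname)
```

The hyperparameters are still recorded in each trial's `params.json`, so shortening the directory name should not lose information, though this has not been tested against this project.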