
CUDA out of memory / cannot disable CUDA #61

Closed
bertsky opened this issue Jun 19, 2020 · 3 comments


bertsky commented Jun 19, 2020

On a CUDA-enabled system with more than 3GB of GPU memory currently free, I get this from dewarp:

INFO OcrdAnybaseocrDewarper - INPUT FILE 105_02_abbr
CustomDatasetDataLoader
dataset [AlignedDataset] was created
lib/python3.6/site-packages/torchvision/transforms/transforms.py:188: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
  "please use transforms.Resize instead.")
pix2pixHD/models/pix2pixHD_model.py:128: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  input_label = Variable(input_label, volatile=infer)
Traceback (most recent call last):
  File "bin/ocrd-anybaseocr-dewarp", line 8, in <module>
    sys.exit(ocrd_anybaseocr_dewarp())
  File "lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "lib/python3.6/site-packages/ocrd_anybaseocr/cli/cli.py", line 32, in ocrd_anybaseocr_dewarp
    return ocrd_cli_wrap_processor(OcrdAnybaseocrDewarper, *args, **kwargs)
  File "lib/python3.6/site-packages/ocrd/decorators.py", line 82, in ocrd_cli_wrap_processor
    run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
  File "lib/python3.6/site-packages/ocrd/processor/base.py", line 60, in run_processor
    processor.process()
  File "lib/python3.6/site-packages/ocrd_anybaseocr/cli/ocrd_anybaseocr_dewarp.py", line 130, in process
    self._process_segment(model, dataset, page, page_xywh, page_id, input_file, orig_img_size, n)
  File "lib/python3.6/site-packages/ocrd_anybaseocr/cli/ocrd_anybaseocr_dewarp.py", line 164, in _process_segment
    generated = model.inference(data['label'], data['inst'], data['image'])
  File "pix2pixHD/models/pix2pixHD_model.py", line 216, in inference
    fake_image = self.netG.forward(input_concat)
  File "pix2pixHD/models/networks.py", line 211, in forward
    return self.model(input)             
  File "lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "pix2pixHD/models/networks.py", line 252, in forward
    out = x + self.conv_block(x)
  File "lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "lib/python3.6/site-packages/torch/nn/modules/padding.py", line 163, in forward
    return F.pad(input, self.padding, 'reflect')
  File "lib/python3.6/site-packages/torch/nn/functional.py", line 2865, in pad
    return torch._C._nn.reflection_pad2d(input, pad)
RuntimeError: CUDA out of memory. Tried to allocate 18.00 MiB (GPU 0; 3.93 GiB total capacity; 2.37 GiB already allocated; 18.94 MiB free; 35.58 MiB cached)

Frankly, this does not make any sense to me.

However, I thought, at least I should be able to disable GPU computation. The only parameter that can influence PyTorch setup in dewarp is gpu_id, which would need to be set to 'cpu'. But the tool JSON requires this to be a number!

    raise Exception("Invalid parameters %s" % report.errors)
Exception: Invalid parameters ["[gpu_id] 'cpu' is not of type 'number'"]
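A device-selection shim could reconcile the numeric schema with a CPU option. The following sketch (a hypothetical helper, not part of ocrd_anybaseocr or pix2pixHD) keeps gpu_id numeric and treats any negative value as a CPU request, falling back to CPU when CUDA is unusable:

```python
def resolve_device(gpu_id, cuda_available):
    """Map the numeric gpu_id tool parameter to a device string.

    Assumed convention (not upstream behaviour): a negative gpu_id
    (e.g. -1) requests CPU; a non-negative one selects that CUDA
    device, but only if CUDA is actually available, so the tool
    degrades gracefully instead of crashing.
    """
    if gpu_id < 0 or not cuda_available:
        return "cpu"
    return "cuda:%d" % gpu_id
```

With torch installed, `torch.device(resolve_device(parameter['gpu_id'], torch.cuda.is_available()))` could then be passed to `.to(...)` wherever the code currently calls `.cuda()` unconditionally.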

bertsky commented Jun 20, 2020

Like so often (with this module), the problem runs deeper.

Even if you:

  1. allow -1 to represent non-GPU/CUDA and pass that as an empty list to pix2pixHD, it will still try to initialize CUDA, because its TestOptions().parse() gets called before gpu_ids is set;
  2. translate the param into its respective sys.argv for pix2pix (i.e. '--gpu_ids' and str(parameter['gpu_ids'])), the inference code in pix2pix will still try to use .cuda() everywhere.

Thus, IMO there's no way to run the dewarper without GPU, or with a CUDA-enabled GPU with "only" 4GB RAM.
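The second point could in principle be fixed mechanically: every unconditional .cuda() call in the inference path would need to become a device-aware .to(device). A minimal sketch of such a replacement helper (hypothetical, not upstream pix2pixHD API):

```python
def to_device(obj, device):
    """Device-agnostic stand-in for the unconditional .cuda() calls
    in pix2pixHD's inference path (hypothetical helper, not upstream
    code).

    Works for anything exposing a .to(device) method, as torch
    tensors and modules do; with device="cpu" the model would then
    run without CUDA at all.
    """
    return obj.to(device)
```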


kba commented Jun 22, 2020

Thanks for trying it out and detailing how it fails. I will refactor the tool to at least properly integrate the pix2pixHD repo as a submodule, install it with the tool, and take a look at the parameter handling.

Thus, IMO there's no way to run the dewarper without GPU, or with a CUDA-enabled GPU with "only" 4GB RAM.

I have no access to a GPU at all, so I cannot test (unless I get the CPU variant working), but at least these glaring shortcomings can be fixed.


kba commented Mar 20, 2022

fixed by #89

@kba kba closed this as completed Mar 20, 2022