
RESOURCE_EXHAUSTED: XLA:TPU compile permanent #60

Open
infocodiste opened this issue Mar 12, 2024 · 0 comments
Hi, I'm using a v3-8 TPU in GCP, and while loading the model I get the error below:

The above exception was the direct cause of the following exception:

```
Traceback (most recent call last):
  File "/home/deep_c/workspace/LWM/lwm/vision_chat.py", line 254, in
    run(main)
  File "/home/deep_c/miniconda3/envs/large_vision_model/lib/python3.10/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/home/deep_c/miniconda3/envs/large_vision_model/lib/python3.10/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/home/deep_c/workspace/LWM/lwm/vision_chat.py", line 250, in main
    output = sampler(prompts, FLAGS.max_n_frames)[0]
  File "/home/deep_c/workspace/LWM/lwm/vision_chat.py", line 230, in __call__
    output, self.sharded_rng = self._forward_generate(
jaxlib.xla_extension.XlaRuntimeError: RESOURCE_EXHAUSTED: XLA:TPU compile permanent error. Ran out of memory in memory space hbm. Used 21.95G of 15.48G hbm.
Exceeded hbm capacity by 6.47G.

Total hbm usage >= 22.47G:
  reserved      530.00M
  program        21.95G
  arguments          0B
```

How to fix this? 
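For context, the arithmetic in the error message can be decoded as follows. This is a minimal sketch of the reported accounting only; the 15.48G figure is the per-core HBM that XLA reports for this TPU, and the variable names here are illustrative, not part of any LWM or JAX API. The deficit suggests the compiled program alone needs roughly 1.4× the available per-core HBM, which is why common remedies for this class of error are sharding the model across more devices, reducing batch size or sequence/frame count, or loading weights in lower precision.

```python
# Memory accounting reconstructed from the XLA error message.
# All values in GiB unless noted; taken verbatim from the report above.
hbm_capacity = 15.48       # per-core HBM capacity reported by XLA
reserved = 530 / 1024      # 530.00M reserved by the runtime
program = 21.95            # compiled program footprint
arguments = 0.0            # argument buffers (0B here)

total = reserved + program + arguments
deficit = program - hbm_capacity

print(f"total hbm usage >= {total:.2f}G")   # matches the reported 22.47G
print(f"exceeded capacity by {deficit:.2f}G")  # matches the reported 6.47G
```

The key point the breakdown makes: the overrun comes from the `program` line, not from `arguments`, so shrinking inputs alone will not help as much as sharding or lower-precision weights.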