
Problem running jetson-containers run $(autotag ollama) command on Jetson AGX Orin 32GB #814

Closed
kalustian opened this issue Feb 2, 2025 · 33 comments

Comments

@kalustian

kalustian commented Feb 2, 2025

I used to follow the instructions below to run the jetson-containers run $(autotag ollama) command:

  1. git clone https://github.com/dusty-nv/jetson-containers
  2. bash jetson-containers/install.sh

Recently I decided to flash my AGX Orin again and reinstall ollama with the above commands. This time, when using jetson-containers run $(autotag ollama), I got:

Starting ollama server

tail: cannot open '/data/logs/ollama.log' for reading: No such file or directory/bin/bash: line 1: /data/logs/ollama.log: No such file or directory

tail: no files remaining

OLLAMA_MODELS /data/models/ollama/models
OLLAMA_LOGS /data/logs/ollama.log

ollama server is now started, and you can run commands here like 'ollama run llama3'

When trying to run any command like "ollama ls" or "ollama run llama3.2", etc., I got:

Error: could not connect to ollama app, is it running?

I never had this issue before... any idea what the issue is?

NOTE: I am on JetPack 6.2.

@makoit

makoit commented Feb 2, 2025

I have the same issue on my Jetson Orin Nano when running:

jetson-containers run --name ollama dustynv/ollama:main-r36.4.0

or

jetson-containers run --name ollama dustynv/ollama:r36.4.0

The container starts, but it seems the ollama service itself is not starting correctly. If I attach a shell and run ollama serve inside the container, the service starts as a blocking process and I'm able to run a model from another shell.

@kalustian
Author

kalustian commented Feb 2, 2025

Same here, but it also looks like ollama models such as llama3.2 are running on the CPU and not the GPU.

@makoit

makoit commented Feb 2, 2025

@kalustian

Same here, but it also looks like ollama models such as llama3.2 are running on the CPU and not the GPU.

I also noticed that model inference is really slow. Can you reopen this issue?

@dusty-nv dusty-nv reopened this Feb 2, 2025
@dusty-nv
Owner

dusty-nv commented Feb 2, 2025

Sorry @kalustian @makoit, my guess is this is related to the build system changes in ollama from this week; I was trying to fix this in 51e5449 and e1292b4.

My guess is OLLAMA_LOGS needs to be changed, or something is wrong with the mounts. We had tried falling back to their native installer, but that was not working. Admittedly, ollama has persistently been a challenge to maintain, which is why I was initially resistant to bringing it on; meanwhile, llama.cpp is rock solid 🤷

@tokk-nv has been looking at these patches and can take another look this week

@dusty-nv
Owner

dusty-nv commented Feb 2, 2025

EDIT: try overriding the OLLAMA_LOGS environment variable in your docker run or jetson-containers run command (e.g. -e OLLAMA_LOGS=/tmp)

@PhilWheat

One thing I saw is that jetson-containers really wants to be cloned to the root of your SSD. Once I figured that out, things went more smoothly. (Not sure if this is your issue, but I just sorted it out on my new Nano, so I thought I'd throw it out there.)

@albertzsigovits

Seeing the same on Orin Nano Super:

jetson-containers run --name ollama $(autotag ollama) -e OLLAMA_LOGS=/tmp

L4T_VERSION=36.4.3   JETPACK_VERSION=6.2   CUDA_VERSION=12.6
dustynv/ollama:main-r36.4.0
...
xauth: unable to rename authority file /tmp/.docker.xauth, use /tmp/.docker.xauth-n
chmod: changing permissions of '/tmp/.docker.xauth': Operation not permitted
...
Starting ollama server

/bin/bash: line 1: /data/logs/ollama.log: No such file or directory
tail: cannot open '/data/logs/ollama.log' for reading: No such file or directory
tail: no files remaining

OLLAMA_MODELS /data/models/ollama/models
OLLAMA_LOGS   /data/logs/ollama.log
...

@johnnynunez
Collaborator

It is not a problem with jetson-containers; Ollama is broken on all platforms.
Waiting for a new release...

@makoit

makoit commented Feb 5, 2025

I tried out different things on my Nvidia Jetson Orin Nano (8GB):

1. Using jetson-containers cli:

  • jetson-containers run --name ollama $(autotag ollama) and jetson-containers run -d --name ollama dustynv/ollama:main-r36.4.0

-> the container starts, but inside the container ollama list gives this message: Error: could not connect to ollama app, is it running?

  • jetson-containers run --name ollama $(autotag ollama) -e OLLAMA_LOGS=/tmp

-> the container does not start and the following error occurs:

Namespace(packages=['ollama'], prefer=['local', 'registry', 'build'], disable=[''], user='dustynv', output='/tmp/autotag', quiet=False, verbose=False)
-- L4T_VERSION=36.4.0  JETPACK_VERSION=6.1  CUDA_VERSION=12.6
-- Finding compatible container image for ['ollama']
dustynv/ollama:main-r36.4.0
V4L2_DEVICES: 
### DISPLAY environmental variable is already set: ":1"
localuser:root being added to access control list
+ docker run --runtime nvidia -it --rm --network host --shm-size=8g --volume /tmp/argus_socket:/tmp/argus_socket --volume /etc/enctune.conf:/etc/enctune.conf --volume /etc/nv_tegra_release:/etc/nv_tegra_release --volume /tmp/nv_jetson_model:/tmp/nv_jetson_model --volume /var/run/dbus:/var/run/dbus --volume /var/run/avahi-daemon/socket:/var/run/avahi-daemon/socket --volume /var/run/docker.sock:/var/run/docker.sock --volume /home/johbaum8/DEV/data:/data -v /etc/localtime:/etc/localtime:ro -v /etc/timezone:/etc/timezone:ro --device /dev/snd -e PULSE_SERVER=unix:/run/user/1000/pulse/native -v /run/user/1000/pulse:/run/user/1000/pulse --device /dev/bus/usb -e DISPLAY=:1 -v /tmp/.X11-unix/:/tmp/.X11-unix -v /tmp/.docker.xauth:/tmp/.docker.xauth -e XAUTHORITY=/tmp/.docker.xauth --device /dev/i2c-0 --device /dev/i2c-1 --device /dev/i2c-2 --device /dev/i2c-4 --device /dev/i2c-5 --device /dev/i2c-7 --device /dev/i2c-9 -v /run/jtop.sock:/run/jtop.sock --name ollama dustynv/ollama:main-r36.4.0 -e OLLAMA_LOGS=/tmp
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "-e": executable file not found in $PATH: unknown.
  • jetson-containers run -d --name ollama $(autotag ollama) bash -c "ollama serve" and jetson-containers run -d --name ollama dustynv/ollama:main-r36.4.0 bash -c "ollama serve"

-> this is working and starts the container and ollama

  • jetson-containers run --name ollama $(autotag ollama) bash -c "ollama serve & sleep 5; ollama run llama3.2:1b"

-> this is working and starts the container, ollama and runs a model
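The "executable file not found" failure in the second attempt is an argument-ordering issue rather than an image bug: docker run (and therefore jetson-containers run) treats everything after the image name as the container's command, so the trailing -e OLLAMA_LOGS=/tmp was handed to the container and Docker tried to exec "-e". A sketch of the rule (the image tag is from the log above; WRONG/RIGHT are illustrative variable names):

```shell
IMAGE="dustynv/ollama:main-r36.4.0"

# Wrong: the flag trails the image name, so it becomes the container command
WRONG="docker run --runtime nvidia $IMAGE -e OLLAMA_LOGS=/tmp"

# Right: all docker options come before the image name
RIGHT="docker run --runtime nvidia -e OLLAMA_LOGS=/tmp $IMAGE"

# Same rule for jetson-containers: put flags before the autotag expansion, e.g.
#   jetson-containers run -e OLLAMA_LOGS=/tmp --name ollama $(autotag ollama)
echo "$RIGHT"
```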

2. Running the container with docker:

  • docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama dustynv/ollama:main-r36.4.0

-> this is not working: the container starts and then crashes

  • docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama dustynv/ollama:main-r36.4.0 bash -c "ollama serve"

-> this is working, starts the container and the ollama service

  • docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama dustynv/ollama:main-r36.4.0 bash -c "ollama serve & sleep 5; ollama run llama3.2:1b"

-> this is working, starts the container and ollama service and runs a model

3. Using docker compose:
  ollama:
    image: dustynv/ollama:r36.4.0
    ports:
      - 11434:11434
    volumes:
      - ./ollama/ollama:/root/.ollama
    container_name: ollama
    tty: true
    entrypoint: ["/bin/bash", "-c"]
    command: ["( /bin/ollama serve & sleep 5; /bin/ollama run llama3.2:1B )"]
    environment:
      - OLLAMA_KEEP_ALIVE=24h
      - OLLAMA_HOST=0.0.0.0
    networks:
      - app-network
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

-> this is working: it starts the container and ollama, and runs the model

Summary: if you use a bash command when running the container, or a service in compose, you are able to run ollama (and optionally a model directly). But I'm not sure if this is best practice or only a workaround.

Comment: it would be nice if we could run the container and ollama started automatically as a service, and we could optionally pass model names to run, as well as embedding model names to pull automatically.
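An entrypoint along those lines could be sketched as follows: start the server, poll until the API answers, then pull the requested models. This is only an illustration; wait_for and MODEL_NAMES are made-up names, while ollama serve / ollama pull are the real CLI commands.

```shell
# wait_for N CMD...: retry CMD up to N times, one second apart; fails after N.
wait_for() {
  tries=$1; shift
  until "$@"; do
    tries=$((tries - 1))
    [ "$tries" -gt 0 ] || return 1
    sleep 1
  done
}

# Intended container usage (not executed here, since it needs a live server):
#   ollama serve &
#   wait_for 30 curl -sf http://localhost:11434/
#   for m in ${MODEL_NAMES:-}; do ollama pull "$m"; done
#   wait   # keep the server in the foreground so the container stays up
```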

@feiticeir0

feiticeir0 commented Feb 5, 2025

I'm having the same issue (Jetson Orin NX 16GB), but it just started yesterday.

The problem I was able to solve was creating the logs directory inside the data directory that the docker command mounts in my home:

mkdir data/logs

After that, everything ran smoothly.
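This workaround lines up with the earlier error: the container's /data is a mount of the host-side data/ directory, so /data/logs/ollama.log can only be opened if the host directory exists first. A one-line sketch (run from wherever your data/ mount lives; the path is an assumption based on the mounts shown above):

```shell
# /data/logs/ollama.log inside the container maps to data/logs/ on the host;
# create it ahead of time so the server's log redirection can succeed.
mkdir -p data/logs
```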

@johnnynunez
Collaborator


Can you try with the latest commit on master?

@kalustian
Author

Can you try with last commit in master?

This commit did not solve the issue I reported initially, at least on my AGX with JP6.2. I did a fresh install with git clone https://github.com/dusty-nv/jetson-containers. Here is a screenshot:

Image

@feiticeir0

feiticeir0 commented Feb 5, 2025

This commit did not solve the issue as I reported initially, well at list on my AGX JP6.2. I fresh installed git clone https://github.com/dusty-nv/jetson-containers. Here is a screenshot:
Image

If you create the logs directory inside the data directory that's in your home, does it still not work?
That was the problem I was having.

@kalustian
Author

kalustian commented Feb 5, 2025

If you create the logs directory inside the data directory that's on your home, it doesn't work ? That was the problem I was having .

Well... it works, in a way... but let me ask a question: when running inference with any model, did you see GPU utilization (e.g. watching jtop while doing inference)? For me, when inferencing, all the workload goes to the CPU and not the GPU.

@johnnynunez
Collaborator

well...it in a way works...but, let me ask ask a question...when running an inference using any model...did you see GPU utilization ? like looking into jtop at the same time doing inference. For me, when inferencing, all workload is going into the CPU and not GPU.

Did you do:

CUDA_VERSION=12.6 jetson-containers build ollama
jetson-containers run (the tag that the build generates)

@feiticeir0

well...it in a way works...but, let me ask ask a question...when running an inference using any model...did you see GPU utilization ? like looking into jtop at the same time doing inference. For me, when inferencing, all workload is going into the CPU and not GPU.

Yes, it works.

If it's not working, you probably need to install nvidia-container-toolkit on the Jetson itself (not in the container).
Here's a picture of a model running in Ollama with the GPU being used:

Image

@kalustian
Author

kalustian commented Feb 5, 2025

@johnnynunez - The logs directory issue has been fixed; no need to create the "logs" folder manually. But it's still not working for me, as inference is loaded onto the CPU. Let me reflash my AGX again with JP 6.2.

@johnnynunez
Collaborator

johnnynunez commented Feb 5, 2025

@johnnynunez - The logs directory has been created and fixed. No need to create the "logs" folder manually, but still not working for me yet as inference is being loaded into the CPU. Let me reflash my AGX again with JP 6.2.

do you modify docker json ?

Image

@kalustian
Author

kalustian commented Feb 5, 2025

do you modify docker json ?

Image

Yes I did, and I double-checked it is running as "nvidia". Let me reflash and start over.

@kalustian
Author

kalustian commented Feb 5, 2025

@johnnynunez, hi Johnny, I have reflashed my AGX (JP6.2):

  1. sudo apt-get update
  2. sudo apt-get upgrade -y
  3. Default Runtime: nvidia is running
  4. git clone https://github.com/dusty-nv/jetson-containers.git
  5. jetson-containers/install.sh
  6. CUDA_VERSION=12.6 jetson-containers build ollama
  7. jetson-containers run $(autotag ollama)
  8. http://192.168.1.34:11434/ ... Ollama is running
  9. ollama run llama3.2
  10. ran some inference, but the workload still shows on the CPU and not the GPU.
  11. NOTE: if I install ollama from the ollama webpage, I see the GPU being used instead of the CPU
Image

@albertzsigovits

albertzsigovits commented Feb 5, 2025

@kalustian Experiencing the same, and also with DEBUG logs:

level=INFO source=gpu.go:226 msg="looking for compatible GPUs"
level=INFO source=gpu.go:392 msg="no compatible GPUs were discovered"

/etc/docker/daemon.json is set to nvidia.
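For anyone checking the same thing: the commonly documented Jetson setup registers the nvidia runtime and makes it the default in /etc/docker/daemon.json. This is a sketch of the usual contents, not this user's exact file; restart the docker service after editing it.

```json
{
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  },
  "default-runtime": "nvidia"
}
```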

@traklo

traklo commented Feb 6, 2025

Same here; reverted to dustynv/ollama:0.5.1-r36.4.0 and GPU acceleration works again.

@makoit

makoit commented Feb 6, 2025

@kalustian why did you close the issue? The core issue is not fixed!?

@johnnynunez johnnynunez reopened this Feb 6, 2025
@makoit

makoit commented Feb 9, 2025

@dusty-nv @johnnynunez is there any update? I have seen that you pushed changes to docker hub for r36.4.0. I pulled it and my workarounds to run ollama and load the model no longer work. It would be nice if there were a stable tag I could use; otherwise each change will break everything again.

@johnnynunez
Collaborator

johnnynunez commented Feb 9, 2025

@dusty-nv @johnnynunez is there any update? I have seen that you pushed changes on docker hub on r36.4.0. I pulled it and my workarounds to run ollama and load the model is not working anymore. Would be nice if there is a stable tag which I can use otherwise each change will break everything again.

It's fixed; just build the docker image yourself.
A new image will be uploaded to docker hub during the week.

@makoit

makoit commented Feb 9, 2025

It’s fixed. Only compile docker by yourself. During weekend, new docker will be updated

Amazing news! What do you mean by compiling docker myself? You mean building the image locally?

@johnnynunez
Collaborator

Amazing news! What do you mean with compile docker myself? You mean building the image locally myself?

  1. git pull
  2. CUDA_VERSION=12.6 jetson-containers build ollama
  3. jetson-containers run $(autotag ollama)

On point 2, you can try CUDA_VERSION=12.8; it's now in master.

@kalustian
Author


@johnnynunez

YES!! It now works as expected for me; I see the GPU being used instead of the CPU as before. I used CUDA_VERSION=12.6 jetson-containers build ollama. Thanks a lot for your help.

@johnnynunez
Collaborator

@kalustian you can test CUDA 12.8. I uploaded it for you: https://hub.docker.com/r/johnnync/ollama/tags

@kalustian
Author

@johnnynunez

Ok, here is my overall test. Again, this is on my AGX Orin 32GB with JP 6.2:

  1. I tried CUDA_VERSION=12.8 jetson-containers build ollama using jetson-containers for ollama, and it also worked as expected.

  2. Regarding your docker image:

  • sudo docker pull johnnync/ollama:r36.4.3
  • sudo docker run -it -p 11434:11434 johnnync/ollama:r36.4.3
  • tried llama3.2: downloaded without any issues.
  • checked ollama is running on port 11434.
  • looking at jtop, I see all or most of the workload on the GPU.

@kalustian
Author

kalustian commented Feb 9, 2025

If I may suggest: update the https://github.com/dusty-nv/jetson-containers documentation to include CUDA_VERSION=12.6 jetson-containers build ollama. My 2 cents.

@makoit

makoit commented Feb 9, 2025


Amazing, thanks a lot for your effort!

I tested it also on my Jetson Orin Nano (8GB) and it worked for me too. I was able to build the image with CUDA 12.6; the container started and the ollama service was running automatically:

Image

Do you also plan to release the image on docker hub with a kind of stable release tag? In my case I have 15 Jetson devices and don't want to build the image manually on each of them.

@johnnynunez
Collaborator


Do you also plan to release the image on docker hub with a kind of stable release tag? In my case I have 15 jetson devices and do not want to build on each on them the image manually.

Yeah for sure! @dusty-nv

Development

No branches or pull requests

8 participants