running docker image #36

Open
daobilige-su opened this issue Sep 26, 2018 · 7 comments

@daobilige-su

Hi,

First, thanks for the software. Looks very cool.

I am using the docker image provided. But I have 2 questions about it.

First, after reading the Dockerfile in the image, it looks to me like all dependencies are built into the image, but bonnet itself is not installed inside it. Is that correct? I could not find any lines that install the bonnet code. If so, do I have to install it myself on top of the image?

Second, I ran helloworld.py under the \bonnet-docker folder of the image and got a "Segmentation fault (core dumped)" error. Executing the code in helloworld.py line by line, I found that it is import tensorrt that causes the error. Does it work fine on your machine?
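
A crash like this can be narrowed down without stepping through the script by hand. The following is a minimal sketch, not part of the bonnet image: it imports each module in a separate child interpreter so that a segmentation fault in one import does not take down the checker itself. The helper name try_import and the list of modules are chosen here for illustration.

```python
import subprocess
import sys


def try_import(module_name):
    """Import module_name in a child interpreter; return (ok, description)."""
    result = subprocess.run(
        [sys.executable, "-c", "import " + module_name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    code = result.returncode
    if code == 0:
        return True, "imported cleanly"
    if code < 0:
        # On POSIX a negative return code means the child was killed by a
        # signal (11 is SIGSEGV, 4 is SIGILL).
        return False, "killed by signal %d" % -code
    # A positive status usually means a Python exception such as ImportError.
    return False, "exited with status %d" % code


if __name__ == "__main__":
    # Module names taken from this thread; adjust as needed.
    for name in ("tensorflow", "tensorrt"):
        ok, detail = try_import(name)
        print("%-12s %s" % (name, detail))
```

A module reported as killed by signal 11 is the one that segfaults; signal 4 corresponds to "Illegal instruction".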

Thanks for any help or suggestion.

Cheers,
Su

@tano297
Contributor

tano297 commented Sep 29, 2018

Hello,

I am not having any problems running the hello world. Can you check that you are using the proper Docker version? You should be using nvidia-docker, not the vanilla one.

You can check whether your GPU functions are working inside the Docker container by running

$ nvidia-smi

@daobilige-su
Author

Hi @tano297 ,

Thanks for your reply.

I finally got everything running. It turns out that I needed to delete the /usr/local/cuda/lib64/stubs/libcuda.so.1 file to make tensorrt and tensorflow work. I also needed to recompile the tensorflow C++ API, setting CC_OPT_FLAGS="-march=native" before compiling, so that it supports my CPU.

It is really nice software, and I am enjoying it now. Thanks.
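
Before deleting anything, it can help to see which libcuda files the dynamic linker knows about. The sketch below is an illustration only: it assumes the stub is registered in the linker cache (it might instead be picked up through LD_LIBRARY_PATH, which this does not check), and the helper name find_libcuda_entries is made up here. It parses the output of ldconfig -p and marks entries that live in a stubs directory.

```python
import subprocess


def find_libcuda_entries(ldconfig_output):
    """Return (path, is_stub) for every libcuda entry in `ldconfig -p` output."""
    entries = []
    for line in ldconfig_output.splitlines():
        # Cache lines look like: "libcuda.so.1 (libc6,x86-64) => /path/libcuda.so.1"
        if "libcuda.so" not in line or "=>" not in line:
            continue
        path = line.split("=>", 1)[1].strip()
        entries.append((path, "/stubs/" in path))
    return entries


if __name__ == "__main__":
    try:
        output = subprocess.run(
            ["ldconfig", "-p"],
            stdout=subprocess.PIPE,
            universal_newlines=True,
        ).stdout
    except OSError:
        # ldconfig is not available on this system.
        output = ""
    for path, is_stub in find_libcuda_entries(output):
        print(("STUB  " if is_stub else "real  ") + path)
```

If a stub entry is listed alongside the real driver library, that is consistent with the workaround described above, though it does not prove the stub is the one being loaded.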

Cheers,
Su

@tano297
Contributor

tano297 commented Sep 29, 2018

I'm glad to hear that! Each architecture sometimes has its own caveats, which I try to minimize, but some slip through.

The /usr/local/cuda/lib64/stubs/libcuda.so.1 thing should definitely not be happening, so I will have a look into it. Leaving this issue open until I can reproduce it and fix it.

@hyejun

hyejun commented Oct 18, 2018

Hello.

I have the same error in Docker.

In my case, the standalone examples don't work.

When I execute ./build/bonnet_standalone/session, I get "Illegal instruction (core dumped)".

I checked nvidia-smi:
Thu Oct 18 01:23:58 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48                 Driver Version: 410.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX TITAN   Off  | 00000000:01:00.0  On |                  N/A |
| 30%   41C    P8    18W / 250W |    585MiB /  6075MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1558      G   /usr/lib/xorg/Xorg                           233MiB |
|    0      2304      G   /opt/teamviewer/tv_bin/TeamViewer             20MiB |
|    0      2578      G   compiz                                       209MiB |
|    0     23643      G   ...quest-channel-token=6920769117415252391    72MiB |
|    0     32701      G   ...-token=B9940EAD24EB7BFE7CB48B880BC0A2AE    43MiB |
+-----------------------------------------------------------------------------+

Running import tensorflow in python3 works fine.

Running helloworld.py under the \bonnet-docker folder also works fine.

So I think the problem is in the tensorflow C++ API.

How do I rebuild the tensorflow C++ API with the CC_OPT_FLAGS="-march=native" flag?

@daobilige-su
Author

Hi,

First, you need to make sure the problem arises from the tensorflow C++ API. To do that, just build and run its test program.

$ cd /tools/tensorflow_cc/example
$ mkdir build && cd build
$ cmake ..
$ make
$ ./example

If the test program above gives you the same error, then the tensorflow C++ API is surely the source of the error. Your CPU version is too old to be supported by the default configuration of tensorflow C++ API. To recompile the tensorflow C++ API, do the following:

$ cd /tools/tensorflow_cc/tensorflow_cc
$ mkdir build
$ cd build
$ export CC_OPT_FLAGS="-march=native"
$ cmake -DTENSORFLOW_STATIC=OFF -DTENSORFLOW_SHARED=ON -DTENSORFLOW_TAG="v1.7.0" ..
$ make -j
$ make install
$ rm -rf ~/.cache && cd .. && rm -rf build

After that, you might also need to reinstall tensorflow, since installing the tensorflow C++ API installs a different version of tensorflow. Uninstall that version and install the correct one again:

$ pip3 uninstall numpy tensorflow-gpu tensorflow matplotlib -y && \
  pip3 install -U tensorflow-gpu==1.7.0 protobuf==3.5.1 matplotlib==2.2.2

Hopefully that's it.

Cheers,
Su

@hyejun

hyejun commented Oct 24, 2018

I confirmed that the problem arises from the tensorflow C++ API.
Then I tried to install tensorflow again,
but there were some errors while building tensorflow.

So I installed Docker on another computer, and it works there.
You said "Your CPU version is too old to be supported by the default configuration of tensorflow C++ API."; maybe that is right.

Thank you for answering.

@blubbi321

blubbi321 commented Nov 3, 2018

I can confirm the issue. I had no problems following the instructions on a more recent machine. However, I have not yet been able to resolve all the dependencies for the steps @daobilige-su mentioned above. (Apparently one also needs to install g++-7, which in turn is incompatible with the CUDA libs: "/usr/local/cuda-9.0/bin/../targets/x86_64-linux/include/crt/host_config.h:119:2: error: #error -- unsupported GNU version! gcc versions later than 6 are not supported!")

The machine with the trouble has an Intel i7-2600K, in case that helps anybody.
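
An "Illegal instruction" crash generally means the binary uses a CPU instruction the processor does not implement. One way to check whether a given machine is likely to be affected is to look at its feature flags. The sketch below is a rough illustration for Linux only; the helper names are made up here, and the list of wanted flags (avx, avx2, fma) is an assumption about what a prebuilt tensorflow might use, not something confirmed in this thread.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        # The first "flags : ..." line is enough; all cores normally match.
        if line.startswith("flags") and ":" in line:
            return set(line.split(":", 1)[1].split())
    return set()


def missing_flags(cpuinfo_text, wanted=("avx", "avx2", "fma")):
    """Return the wanted flags that the CPU does not report, in order."""
    flags = cpu_flags(cpuinfo_text)
    return [name for name in wanted if name not in flags]


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            text = handle.read()
    except OSError:
        # Not Linux, or /proc is not mounted.
        text = ""
    missing = missing_flags(text)
    if missing:
        print("CPU does not report: " + ", ".join(missing))
    else:
        print("CPU reports all of the checked flags")
```

If flags are reported missing, rebuilding with CC_OPT_FLAGS="-march=native" as described earlier in this thread targets only what the local CPU supports.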
