This repository has been archived by the owner on Jan 22, 2019. It is now read-only.

Linking problem #20

Open
ddresser opened this issue Jan 3, 2018 · 10 comments

@ddresser

ddresser commented Jan 3, 2018

Hello,
I am trying to cross-compile for armv7 using a custom toolchain. I am using bazel 0.8.1 and trying to compile TensorFlow HEAD (30b64a8d78b32db8f30957294efc9cac902b9fd3).

I made a small change to 'tf-crosscompile.patch' so that it applies cleanly, and ran the following command: ./cross-compile.sh /home/ddresser/.gradle/var/idexx/compilers/acadia arm-linux-gnueabihf HEAD

I also updated the cross-compile.sh script with '-march=armv7'

Here is the beginning of the build:

using gcc : /home/ddresser/.gradle/var/idexx/compilers/acadia/bin/arm-linux-gnueabihf-gcc version 6.3.1
Cloning into 'tensorflow'...
remote: Counting objects: 281972, done.
remote: Compressing objects: 100% (20/20), done.
remote: Total 281972 (delta 7), reused 9 (delta 2), pack-reused 281950
Receiving objects: 100% (281972/281972), 139.08 MiB | 1.18 MiB/s, done.
Resolving deltas: 100% (220740/220740), done.
Checking connectivity... done.
Your branch is up-to-date with 'origin/master'.
Extracting Bazel installation...
You have bazel 0.8.1 installed.
Please specify the location of python. [Default is /usr/bin/python]:

Found possible Python library paths:
/usr/local/lib/python2.7/dist-packages
/usr/lib/python2.7/dist-packages
Please input the desired Python library path to use. Default is [/usr/local/lib/python2.7/dist-packages]
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: jemalloc as malloc support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: Google Cloud Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: Hadoop File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: Amazon S3 File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with XLA JIT support? [y/N]: No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with GDR support? [y/N]: No GDR support will be enabled for TensorFlow.

Do you wish to build TensorFlow with VERBS support? [y/N]: No VERBS support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: No CUDA support will be enabled for TensorFlow.

Do you wish to build TensorFlow with MPI support? [y/N]: No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:

Add "--config=mkl" to your bazel command to build with MKL support.
Please note that MKL on MacOS or windows is still not supported.
If you would like to use a local MKL instead of downloading, please set the environment variable "TF_MKL_ROOT" every time before build.

Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: Not configuring the WORKSPACE for Android builds.

Configuration finished
launching bazel with flags ''
........
Analyzing: target //tensorflow:libtensorflow.so (16 packages loaded)

Everything seems to compile fine, but I get this error when it tries to link:

ERROR: /home/ddresser/src/tensorflow-build/target/tensorflow/tensorflow/cc/BUILD:422:1: Linking of rule '//tensorflow/cc:ops/logging_ops_gen_cc' failed (Exit 1): gcc failed: error executing command
(cd /home/ddresser/.cache/bazel/_bazel_ddresser/07218640e155d2003dbb6761ed58c2d0/execroot/org_tensorflow &&
exec env -
PATH=/home/ddresser/bin:/home/ddresser/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
PWD=/proc/self/cwd
/usr/bin/gcc -o bazel-out/host/bin/tensorflow/cc/ops/logging_ops_gen_cc '-Wl,-rpath,$ORIGIN/../../../_solib_local/_U_S_Stensorflow_Scc_Cops_Slogging_Uops_Ugen_Ucc___Utensorflow' -Lbazel-out/host/bin/_solib_local/_U_S_Stensorflow_Scc_Cops_Slogging_Uops_Ugen_Ucc___Utensorflow '-Wl,-rpath,$ORIGIN/,-rpath,$ORIGIN/..,-rpath,$ORIGIN/../..' -pthread -B/usr/bin/ -Wl,-z,relro,-z,now -no-canonical-prefixes -pass-exit-codes '-Wl,--build-id=md5' '-Wl,--hash-style=gnu' -Wl,--gc-sections -Wl,-S -Wl,@bazel-out/host/bin/tensorflow/cc/ops/logging_ops_gen_cc-2.params)
bazel-out/host/bin/tensorflow/cc/libcc_op_gen_main.a(cc_op_gen_main.o): In function `main':
cc_op_gen_main.cc:(.text.startup.main+0x45): undefined reference to `tensorflow::port::InitMain(char const*, int*, char***)'
cc_op_gen_main.cc:(.text.startup.main+0x130): undefined reference to `tensorflow::StringPiece::find(char, unsigned long) const'
cc_op_gen_main.cc:(.text.startup.main+0x137): undefined reference to `tensorflow::StringPiece::npos'
cc_op_gen_main.cc:(.text.startup.main+0x24b): undefined reference to `tensorflow::OpList::OpList()'
cc_op_gen_main.cc:(.text.startup.main+0x250): undefined reference to `tensorflow::OpRegistry::Global()'
cc_op_gen_main.cc:(.text.startup.main+0x265): undefined reference to `tensorflow::OpRegistry::Export(bool, tensorflow::OpList*) const'
cc_op_gen_main.cc:(.text.startup.main+0x291): undefined reference to `tensorflow::Env::Default()'
cc_op_gen_main.cc:(.text.startup.main+0x496): undefined reference to `tensorflow::io::internal::JoinPathImpl[abi:cxx11](std::initializer_list<tensorflow::StringPiece>)'
cc_op_gen_main.cc:(.text.startup.main+0x4dd): undefined reference to `tensorflow::Env::FileExists(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
cc_op_gen_main.cc:(.text.startup.main+0x52d): undefined reference to `tensorflow::TfCheckOpHelperOutOfLine[abi:cxx11](tensorflow::Status const&, char const*)'
cc_op_gen_main.cc:(.text.startup.main+0x575): undefined reference to `tensorflow::internal::LogMessageFatal::LogMessageFatal(char const*, int)'
cc_op_gen_main.cc:(.text.startup.main+0x595): undefined reference to `tensorflow::internal::LogMessageFatal::~LogMessageFatal()'
cc_op_gen_main.cc:(.text.startup.main+0x624): undefined reference to `tensorflow::OpList::~OpList()'
bazel-out/host/bin/tensorflow/cc/libcc_op_gen_main.a(cc_op_gen.o): In function `tensorflow::(anonymous namespace)::MakeComment(tensorflow::StringPiece, tensorflow::StringPiece)':
cc_op_gen.cc:(.text._ZN10tensorflow12_GLOBAL__N_111MakeCommentENS_11StringPieceES1_+0xd7): undefined reference to `tensorflow::StringPiece::substr(unsigned long, unsigned long) const'
cc_op_gen.cc:(.text._ZN10tensorflow12_GLOBAL__N_111MakeCommentENS_11StringPieceES1_+0x12a): undefined reference to `tensorflow::strings::StrAppend(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, tensorflow::strings::AlphaNum const&, tensorflow::strings::AlphaNum const&, tensorflow::strings::AlphaNum const&, tensorflow::strings::AlphaNum const&)'
cc_op_gen.cc:(.text._ZN10tensorflow12_GLOBAL__N_111MakeCommentENS_11StringPieceES1_+0x1b8): undefined reference to `tensorflow::strings::StrAppend(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, tensorflow::strings::AlphaNum const&, tensorflow::strings::AlphaNum const&)'
bazel-out/host/bin/tensorflow/cc/libcc_op_gen_main.a(cc_op_gen.o): In function `tensorflow::(anonymous namespace)::PrintString(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)':
cc_op_gen.cc:(.text._ZN10tensorflow12_GLOBAL__N_111PrintStringERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x40): undefined reference to `tensorflow::str_util::CEscape[abi:cxx11]'
cc_op_gen.cc:(.text._ZN10tensorflow12_GLOBAL__N_111PrintStringERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x83): undefined reference to `tensorflow::strings::StrCat[abi:cxx11](tensorflow::strings::AlphaNum const&, tensorflow::strings::AlphaNum const&, tensorflow::strings::AlphaNum const&)'
bazel-out/host/bin/tensorflow/cc/libcc_op_gen_main.a(cc_op_gen.o): In function `tensorflow::(anonymous namespace)::OpInfo::GetConstructorDecl(tensorflow::StringPiece, bool) const':
cc_op_gen.cc:(.text._ZNK10tensorflow12_GLOBAL__N_16OpInfo18GetConstructorDeclENS_11StringPieceEb+0x77): undefined reference to `tensorflow::strings::StrCat[abi:cxx11](tensorflow::strings::AlphaNum const&, tensorflow::strings::AlphaNum const&, tensorflow::strings::AlphaNum const&)'
cc_op_gen.cc:(.text._ZNK10tensorflow12_GLOBAL__N_16OpInfo18GetConstructorDeclENS_11StringPieceEb+0x129): undefined reference to `tensorflow::strings::StrAppend(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, tensorflow::strings::AlphaNum const&, tensorflow::strings::AlphaNum const&, tensorflow::strings::AlphaNum const&)'
cc_op_gen.cc:(.text._ZNK10tensorflow12_GLOBAL__N_16OpInfo18GetConstructorDeclENS_11StringPieceEb+0x166): undefined reference to `tensorflow::strings::StrAppend(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, tensorflow::strings::AlphaNum const&)'

It is curious to me that it is using /usr/bin/gcc to link when I have specified another toolchain. I have confirmed that the .o files that are created are ARM binaries. For example:

file /home/ddresser/.cache/bazel/_bazel_ddresser/07218640e155d2003dbb6761ed58c2d0/execroot/org_tensorflow/bazel-out/armeabi-opt/bin/tensorflow/core/_objs/framework_internal_impl/tensorflow/core/util/tensor_format.pic.o
/home/ddresser/.cache/bazel/_bazel_ddresser/07218640e155d2003dbb6761ed58c2d0/execroot/org_tensorflow/bazel-out/armeabi-opt/bin/tensorflow/core/_objs/framework_internal_impl/tensorflow/core/util/tensor_format.pic.o: ELF 32-bit LSB relocatable, ARM, EABI5 version 1 (SYSV), not stripped

However, the .so file I see in the build output seems to be for x86_64:

file /home/ddresser/.cache/bazel/_bazel_ddresser/07218640e155d2003dbb6761ed58c2d0/execroot/org_tensorflow/bazel-out/host/bin/tensorflow/libtensorflow_framework.so

/home/ddresser/.cache/bazel/_bazel_ddresser/07218640e155d2003dbb6761ed58c2d0/execroot/org_tensorflow/bazel-out/host/bin/tensorflow/libtensorflow_framework.so: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, BuildID[md5/uuid]=ae5ad2a42b549582f28457f02a1d5932, not stripped
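The two `file` results above come down to a few bytes in the ELF header, so the same check can be scripted when many build outputs need to be inspected. The following is only a minimal sketch: the function names are hypothetical, and the machine table covers just the handful of `e_machine` values relevant here.

```python
import struct

# A small subset of e_machine values from the ELF specification.
ELF_MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def describe_elf(header: bytes) -> str:
    """Summarize class, byte order and machine from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # Byte 4 (EI_CLASS): 1 means 32-bit, 2 means 64-bit.
    bits = {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown class")
    # Byte 5 (EI_DATA): 1 means little-endian (LSB), 2 means big-endian (MSB).
    if header[5] == 1:
        order, fmt = "LSB", "<H"
    elif header[5] == 2:
        order, fmt = "MSB", ">H"
    else:
        raise ValueError("unknown byte order")
    # e_machine is a 16-bit field at offset 18, stored in the file's byte order.
    (machine,) = struct.unpack_from(fmt, header, 18)
    name = ELF_MACHINES.get(machine, "machine %d" % machine)
    return "ELF %s %s, %s" % (bits, order, name)


def describe_elf_file(path: str) -> str:
    """Hypothetical helper: read the header of the file at `path` and describe it."""
    with open(path, "rb") as f:
        return describe_elf(f.read(20))
```

Calling `describe_elf_file` on an object under bazel-out/armeabi-opt and on a library under bazel-out/host should show the same ARM versus x86-64 split that `file` reports.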

Any ideas on this error and next steps?

Thanks,
Derek

@fredszaq
Collaborator

fredszaq commented Jan 4, 2018

Hello!

I haven't had time to update the scripts for the latest versions of TF, so this is quite possibly linked to that. The last version I built is git hash 107cc777af7880c140d089e44ad898a6ba929286, which is basically 1.3.1 with some bazel fixes.

Could you try to build this version with your toolchain, so that we know whether the problem is related to the toolchain or to the patch / scripts?

Regarding the use of /usr/bin/gcc: this does seem odd for that file (although it is normal to use the build system's gcc during the build, since some of the programs that are built are executed later in the build). Maybe the bazel configuration for the crosstool has changed? The last time I built, I was using bazel 0.5.1.

Do not hesitate to open a pull request if you manage to get it working!

@MarcTreySonos
Contributor

Hello Derek,

I have just built HEAD using the provided script. It needed some minor changes:

In tensorflow/BUILD, set S3 support to false (I was getting undefined references to AWS symbols):

define_values = {"with_s3_support": "false"}

Make sure there is a define for RASPBERRY_PI in

tensorflow/core/platform/platform.h

This last part may be the root cause of your issue.

@ddresser
Author

ddresser commented Jan 4, 2018

Thanks so much for your helpful responses. I have to admit I'm new to bazel and tensorflow so I'm not quite sure how to accomplish what you suggested.

I wasn't sure what 'config_setting' to add the 'define_values = {"with_s3_support": "false"}' to so I tried adding a new one:

config_setting(
    name = "without_s3_support",
    define_values = {"with_s3_support": "false"},
    visibility = ["//visibility:public"],
)

and referencing it on the bazel command line in cross-compile.sh with '--config=without_s3_support'

but got this warning.

WARNING: Config values are not defined in any .rc file: without_s3_support
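The warning follows from how bazel resolves `--config=NAME`: it expands options listed on rc-file lines of the form `command:NAME ...`, whereas a `config_setting` with `define_values` is matched by `--define` flags and is never looked up by `--config`. As a rough illustration, the sketch below lists the config names that a given rc text defines. The rc contents are hypothetical and this is not how bazel itself parses rc files.

```python
def configs_defined(rc_text: str) -> set:
    """Collect NAME from rc lines of the form `command:NAME options...`."""
    names = set()
    for raw in rc_text.splitlines():
        line = raw.strip()
        # Skip blank lines and full-line comments.
        if not line or line.startswith("#"):
            continue
        first = line.split()[0]
        if ":" in first:
            names.add(first.split(":", 1)[1])
    return names


# Hypothetical rc contents, for illustration only.
SAMPLE_RC = """
# options applied to every build
build --copt=-O2
build:without_s3_support --define with_s3_support=false
"""

if __name__ == "__main__":
    print(sorted(configs_defined(SAMPLE_RC)))
```

With a line such as the last one in `SAMPLE_RC` present in an rc file, `--config=without_s3_support` would expand to a `--define` flag, which is the kind of flag a `config_setting` actually matches.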

Any chance you can provide the patch that worked for you building HEAD?

Thanks again for your help.
-Derek

@MarcTreySonos
Contributor

I will make a clean patch this weekend; I am already out of time here :)

In the meantime, you can simply edit the tensorflow/workspace.bzl file and change every with_s3_support definition as follows:

-    define_values = {"with_s3_support": "true"},
+    define_values = {"with_s3_support": "false"},
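If the same substitution has to be repeated on fresh checkouts, it can be scripted. The sketch below works on the file's text only and assumes the setting always appears in the `"with_s3_support": "true"` form shown in the diff. The function name is hypothetical.

```python
import re

# Matches the key and keeps it, so only the quoted value is rewritten.
S3_DEFINE = re.compile(r'("with_s3_support"\s*:\s*)"true"')


def disable_s3(build_text: str) -> str:
    """Return build_text with every with_s3_support value flipped from "true" to "false"."""
    return S3_DEFINE.sub(r'\1"false"', build_text)


if __name__ == "__main__":
    print(disable_s3('    define_values = {"with_s3_support": "true"},'))
```

Reading the file, passing its contents through `disable_s3` and writing the result back reproduces the manual edit.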

@ddresser
Author

ddresser commented Jan 4, 2018

Thanks a lot. A clean patch would be appreciated. I'll keep plugging away at it. The problem I seem to be running into is that some dependencies are being compiled with the cross compiler, and some are being compiled for x86_64 by /usr/bin/gcc. For example, in my build log, protobuf_archive is being built for x86_64:

SUBCOMMAND: # @protobuf_archive//:js_embed [action 'Compiling external/protobuf_archive/src/google/protobuf/compiler/js/embed.cc [for host]']
   (cd /home/ddresser/.cache/bazel/_bazel_ddresser/07218640e155d2003dbb6761ed58c2d0/execroot/org_tensorflow && \
     exec env - \
       PATH=/home/ddresser/.gradle/var/idexx/compilers/acadia/bin:/home/ddresser/bin:/home/ddresser/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin \
       PWD=/proc/self/cwd \
     /usr/bin/gcc -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 -DNDEBUG -ffunction-sections -fdata-sections -g0 -DGEMMLOWP_ALLOW_SLOW_SCALAR_FALLBACK '-std=c++0x' -g0 -MD -MF bazel-out/host/bin/external/protobuf_archive/_objs/js_embed/external/protobuf_archive/src/google/protobuf/compiler/js/embed.d '-frandom-seed=bazel-out/host/bin/external/protobuf_archive/_objs/js_embed/external/protobuf_archive/src/google/protobuf/compiler/js/embed.o' -iquote external/protobuf_archive -iquote bazel-out/host/genfiles/external/protobuf_archive -iquote external/bazel_tools -iquote bazel-out/host/genfiles/external/bazel_tools -isystem external/bazel_tools/tools/cpp/gcc3 -no-canonical-prefixes -fno-canonical-system-headers -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -c external/protobuf_archive/src/google/protobuf/compiler/js/embed.cc -o bazel-out/host/bin/external/protobuf_archive/_objs/js_embed/external/protobuf_archive/src/google/protobuf/compiler/js/embed.o)

...while other dependencies are being compiled for arm using the 'arm-linux-gnueabihf-gcc' compiler:

SUBCOMMAND: # @sqlite_archive//:sqlite [action 'Compiling external/sqlite_archive/sqlite3.c']
 (cd /home/ddresser/.cache/bazel/_bazel_ddresser/07218640e155d2003dbb6761ed58c2d0/execroot/org_tensorflow && \
   exec env - \
     PWD=/proc/self/cwd \
     PYTHON_BIN_PATH=/usr/bin/python \
     PYTHON_LIB_PATH=/usr/local/lib/python2.7/dist-packages \
     TF_NEED_CUDA=0 \
     TF_NEED_OPENCL_SYCL=0 \
   /home/ddresser/.gradle/var/idexx/compilers/acadia/bin/arm-linux-gnueabihf-gcc -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -g0 -O2 -DNDEBUG -ffunction-sections -fdata-sections -DGEMMLOWP_ALLOW_SLOW_SCALAR_FALLBACK '-march=armv5' -DRASPBERRY_PI '-mfpu=vfp' -funsafe-math-optimizations -ftree-vectorize -fomit-frame-pointer -MD -MF bazel-out/armeabi-opt/bin/external/sqlite_archive/_objs/sqlite/external/sqlite_archive/sqlite3.pic.d -fPIC -iquote external/sqlite_archive -iquote bazel-out/armeabi-opt/genfiles/external/sqlite_archive -iquote external/bazel_tools -iquote bazel-out/armeabi-opt/genfiles/external/bazel_tools -isystem external/sqlite_archive -isystem bazel-out/armeabi-opt/genfiles/external/sqlite_archive -isystem external/bazel_tools/tools/cpp/gcc3 -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -no-canonical-prefixes -fno-canonical-system-headers -c external/sqlite_archive/sqlite3.c -o bazel-out/armeabi-opt/bin/external/sqlite_archive/_objs/sqlite/external/sqlite_archive/sqlite3.pic.o)
 SUBCOMMAND: # @lmdb//:lmdb [action 'Compiling external/lmdb/midl.c']
 (cd /home/ddresser/.cache/bazel/_bazel_ddresser/07218640e155d2003dbb6761ed58c2d0/execroot/org_tensorflow && \
   exec env - \
     PWD=/proc/self/cwd \
     PYTHON_BIN_PATH=/usr/bin/python \
     PYTHON_LIB_PATH=/usr/local/lib/python2.7/dist-packages \
     TF_NEED_CUDA=0 \
     TF_NEED_OPENCL_SYCL=0 \
   /home/ddresser/.gradle/var/idexx/compilers/acadia/bin/arm-linux-gnueabihf-gcc -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -g0 -O2 -DNDEBUG -ffunction-sections -fdata-sections -DGEMMLOWP_ALLOW_SLOW_SCALAR_FALLBACK '-march=armv5' -DRASPBERRY_PI '-mfpu=vfp' -funsafe-math-optimizations -ftree-vectorize -fomit-frame-pointer -MD -MF bazel-out/armeabi-opt/bin/external/lmdb/_objs/lmdb/external/lmdb/midl.pic.d -fPIC -iquote external/lmdb -iquote bazel-out/armeabi-opt/genfiles/external/lmdb -iquote external/bazel_tools -iquote bazel-out/armeabi-opt/genfiles/external/bazel_tools -isystem external/bazel_tools/tools/cpp/gcc3 -w -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -no-canonical-prefixes -fno-canonical-system-headers -c external/lmdb/midl.c -o bazel-out/armeabi-opt/bin/external/lmdb/_objs/lmdb/external/lmdb/midl.pic.o)

Not sure why.

Thanks for your help.
-Derek

@MarcTreySonos
Contributor

MarcTreySonos commented Jan 4, 2018

Building protobuf for the host is expected: the protobuf compiler has to run on the build machine as part of the build.
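In the logs quoted in this thread, the two kinds of action can be told apart by the directory that follows `bazel-out/`: host tools are written under `bazel-out/host`, while target objects are written under `bazel-out/armeabi-opt`. The sketch below pulls that component out of an output path. The function name is hypothetical, and the directory names are simply the ones seen in this build.

```python
def output_config(path: str):
    """Return the configuration directory that follows `bazel-out/`, or None."""
    parts = path.split("/")
    if "bazel-out" in parts:
        i = parts.index("bazel-out")
        if i + 1 < len(parts):
            return parts[i + 1]
    return None


if __name__ == "__main__":
    # Output paths copied from the build log in this thread.
    for p in (
        "bazel-out/host/bin/tensorflow/cc/ops/logging_ops_gen_cc",
        "bazel-out/armeabi-opt/bin/external/lmdb/_objs/lmdb/external/lmdb/midl.pic.o",
    ):
        print(output_config(p), p)
```

Note that the failing link step writes to bazel-out/host/bin/tensorflow/cc/ops/logging_ops_gen_cc, which suggests it is a host tool, so seeing /usr/bin/gcc there is consistent with the explanation above.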

@ddresser
Author

ddresser commented Jan 4, 2018

Thanks. I am having much more success with bazel 0.5.1. Previously I was using the Ubuntu default of 0.8.1. I have been able to compile libtensorflow.so with the 'arm-bcm2708' compiler and with my 'arm-linux-gnueabihf' compiler at the specified commit (107cc777af7880c140d089e44ad898a6ba929286). I'll try HEAD next.

@ddresser
Author

ddresser commented Jan 5, 2018

It seems tensorflow HEAD has a check for bazel version >= 0.5.4.

I upgraded to bazel 0.5.4 and am still able to build 107cc777af7880c140d089e44ad898a6ba929286, but am seeing the original linking issues when trying to build head.
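A minimum-version check like this one has to compare versions component by component, since comparing the raw strings would rank "0.10.0" below "0.9.0". The sketch below shows that comparison for the versions mentioned in this thread. It is not TensorFlow's own check, the function names are hypothetical, and it assumes plain dotted numeric versions.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '0.5.4' into a tuple of ints: (0, 5, 4)."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed: str, minimum: str = "0.5.4") -> bool:
    """True when `installed` is at least `minimum`, comparing numeric components."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    for version in ("0.5.1", "0.5.4", "0.8.1"):
        print(version, meets_minimum(version))
```

Under this comparison 0.5.1 fails the check, while 0.5.4 and 0.8.1 both pass.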

mtrey, I am curious what version of bazel you used to successfully build HEAD.

I appreciate the help. I'm learning a bunch about bazel in this process.

@ddresser
Author

ddresser commented Jan 8, 2018

Today I discovered that TensorFlow cross-compiles for the Raspberry Pi as part of its CI build.

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/ci_build/pi/build_raspberry_pi.sh

I was able to easily modify that script to use my compiler and build tensorflow HEAD. That satisfies my need for now. I have included my patch if it is helpful to anyone.

Thank you both very much for your support on this.
-Derek

build_raspberry_pi.sh.patch.txt

@AntoineWeber

AntoineWeber commented Jan 21, 2019

Hi @ddresser,
I know this post is very old, but I'm running into problems cross-compiling TensorFlow for the Raspberry Pi.
I also want to use my own compiler, so I modified the build_raspberry_pi.sh script, but I get the following error:

C Compiler (/opt/cross-pi-gcc/bin/arm-linux-gnueabihf-gcc) is something wrong.

Do you remember encountering such a problem?
