Runtime error: Check failed: IsAligned() #49
Comments
Hi @jarrellmark Thanks for reporting this! That has been addressed in 5cc8cdd. Could you please try it out? The SYCL default allocator did not take alignment into consideration. That has now been addressed in Eigen, where we pass the required alignment to the custom allocator. C++ is great! Thanks, |
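For readers unfamiliar with what an alignment check like IsAligned() guards against, here is a minimal sketch in plain Python. It is not the TensorFlow or Eigen implementation. The 64-byte ALIGNMENT value and the helper names is_aligned and aligned_buffer are assumptions made up for illustration. The idea is that an allocator which ignores alignment can return any address, while an alignment-aware one over-allocates and hands back a pointer rounded up to the required boundary.

```python
import ctypes

# Assumed value for illustration only; the real requirement depends on the build.
ALIGNMENT = 64


def is_aligned(address, alignment=ALIGNMENT):
    """Return True if the address is a multiple of the alignment."""
    return address % alignment == 0


def aligned_buffer(size, alignment=ALIGNMENT):
    """Over-allocate, then expose a view that starts on an aligned address.

    Returns (raw, view); raw is returned too so the caller keeps the
    backing allocation alive for as long as the view is in use.
    """
    raw = ctypes.create_string_buffer(size + alignment)
    # Number of bytes to skip so that the view starts on a boundary.
    offset = (-ctypes.addressof(raw)) % alignment
    view = (ctypes.c_char * size).from_buffer(raw, offset)
    return raw, view


raw, view = aligned_buffer(1200 * 4)
print(is_aligned(ctypes.addressof(view)))
```

The over-allocate-and-offset trick is only one way to satisfy an alignment requirement; a native allocator would more likely call an aligned allocation routine directly.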
Hey @lukeiwanski, The IsAligned() message went away, but I'm getting this message now:
Is there a way to force the GPU? |
Currently we have an issue with memory alignment on Intel GPUs and have set the Intel GPU as "blacklisted" in Eigen. This means Eigen will not try to target Intel GPUs at the moment. We are working on a resolution for this and will update you when we have a fix available. |
Thanks, Luke. I appreciate it and am excited about the progress that tensorflow-opencl is making. |
Hi @lukeiwanski, I'm having the same issue and was wondering if you have added Eigen support for Intel GPUs yet. If not, is there some way I can un-blacklist the Intel GPU? Thanks for your hard work on this project! |
Can you give it a spin on this branch: https://github.com/lukeiwanski/tensorflow/tree/dev/eigen_mehdi ? |
That fixed the error, thanks a lot! An unrelated question, tensorflow keeps telling me I'm running on a SYCL device, but then it calls that device a CPU. When I run sess = tf.Session(config=tf.ConfigProto(log_device_placement=True)), I get the following output:
/job:localhost/replica:0/task:0/device:SYCL:0 -> id: 0, type: CPU, name: Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz, vendor: Intel(R) Corporation, profile: FULL_PROFILE
Running tensorflow.python.client.device_lib.list_local_devices() gives me the following:
[name: "/cpu:0"
However, this device is NOT my GPU, as can be seen from when I run clinfo:
Platform Name Intel(R) OpenCL . Device Name Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz
Thanks for all your help already! |
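A quick way to confirm what kind of OpenCL device sits behind SYCL:0 is to pull the fields out of the placement log line quoted above. The sketch below uses only the Python standard library. The helper name parse_placement_line is hypothetical, and the parsing assumes the "key: value, key: value" layout seen in that one line, so it may need adjusting for other builds.

```python
def parse_placement_line(line):
    """Split a device placement log line into a dict of its fields.

    Assumes the layout '<device> -> key: value, key: value, ...' and that
    no value contains the separator ', '.
    """
    device, _, description = line.partition(" -> ")
    fields = dict(part.split(": ", 1) for part in description.split(", "))
    fields["device"] = device
    return fields


# Log line copied from the comment above.
LINE = (
    "/job:localhost/replica:0/task:0/device:SYCL:0 -> id: 0, type: CPU, "
    "name: Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz, "
    "vendor: Intel(R) Corporation, profile: FULL_PROFILE"
)

info = parse_placement_line(LINE)
if info["type"] != "GPU":
    print("SYCL device is backed by a", info["type"], "device:", info["name"])
```

For the line quoted in this thread, the type field is CPU, which matches the clinfo output: the SYCL device is the OpenCL CPU device, not the Intel GPU.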
However, I am later getting this error when I try to run a simple keras model (just 2 dense layers): InternalError: Unknown error detected on device /job:localhost/replica:0/task:0/device:SYCL:0 |
That's interesting.. could you provide code to reproduce that issue? |
I'm having trouble reproducing this issue because the code seems to just be hanging (I'm getting a lot of these messages: But here's my code: |
Ah, ok I've reproduced the earlier error by using LSTM layers. It may be unreasonable for me to expect LSTM layers to work, but I am also having trouble with just dense layers (see above). Here's my code:
from keras.models import Sequential
And here's a trace of the error message:
/home/nicholas/.virtualenvs/tensorflow-luke/local/lib/python2.7/site-packages/keras/models.pyc in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, **kwargs)
/home/nicholas/.virtualenvs/tensorflow-luke/local/lib/python2.7/site-packages/keras/engine/training.pyc in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, **kwargs)
/home/nicholas/.virtualenvs/tensorflow-luke/local/lib/python2.7/site-packages/keras/engine/training.pyc in _fit_loop(self, f, ins, out_labels, batch_size, epochs, verbose, callbacks, val_f, val_ins, shuffle, callback_metrics, initial_epoch)
/home/nicholas/.virtualenvs/tensorflow-luke/local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.pyc in call(self, inputs)
/home/nicholas/.virtualenvs/tensorflow-luke/local/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in run(self, fetches, feed_dict, options, run_metadata)
/home/nicholas/.virtualenvs/tensorflow-luke/local/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in _run(self, handle, fetches, feed_dict, options, run_metadata)
/home/nicholas/.virtualenvs/tensorflow-luke/local/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
/home/nicholas/.virtualenvs/tensorflow-luke/local/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in _do_call(self, fn, *args)
InternalError: Unknown error detected on device /job:localhost/replica:0/task:0/device:SYCL:0 |
Hi @lukeiwanski I am having the same issue ( Do you have a patch I could try to fix this? Or any advice on how to go about fixing it? |
Ping on this issue |
Thanks for the great work on tensorflow-opencl. It's really great.
Summary
I'm getting a runtime error for almost all tensorflow programs:
2017-03-18 12:52:52.241954: F ./tensorflow/core/framework/tensor.cc:488] Check failed: IsAligned()
Aborted (core dumped)
Environment Description
I have an Intel HD Graphics 5500 GPU with the Intel Broadwell i5 CPU x64. I'm using Intel's OpenCL drivers from here: https://software.intel.com/en-us/articles/opencl-drivers .
The OS is Ubuntu 16.04 LTS.
Python version is 3.5. It's running in a conda environment using Anaconda's versions of python, numpy, scipy, pyyaml, h5py, pandas, and jupyter.
However, the tensorflow pip package was compiled using Ubuntu's version of everything as per the compile from source instructions. I disabled Anaconda by removing it from ~/.bashrc, compiled the pip package, re-enabled Anaconda, activated the conda environment, and installed the pip package into the conda environment.
Steps to Reproduce
Here's the only tensorflow program I tried that did not fail:
Changing NUM_ROWS and NUM_COLUMNS to even 1200 resulted in the error above.
I also installed keras into the same conda environment using pip install keras and ran this script: https://github.com/fchollet/keras/blob/master/examples/lstm_text_generation.py . This resulted in the same error: Check failed: IsAligned(). The error is displayed after Build model... is outputted to the console.
Commit Hash (git rev-parse HEAD)
Bazel Version
clinfo
computecpp_info