Fix compile error in OpenCL shader #513

Open
wants to merge 3 commits into
base: master

@skalldri skalldri commented Nov 8, 2018

On a fresh install of Beignet on Ubuntu 18.04, I get an "ambiguous function" OpenCL compile error when running the kinect2 depth registration node.

The error occurs because sqrt() is overloaded for several argument types. Since the type of the constant passed to it is not explicit, the compiler cannot pick a single overload and compilation fails.

The fix is to explicitly mark the constant value as a float so the OpenCL compiler can pick the correct version of the sqrt() function.
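
To illustrate why the literal's suffix matters, here is a minimal host-side C11 sketch. It is not OpenCL C and not code from this PR. `SQRT_VARIANT` is a hypothetical macro that only reports which variant an argument's type would select; the real OpenCL `sqrt()` overload set is larger (vector types, and double where supported).

```c
/* Hypothetical stand-in for an overloaded sqrt(): reports which variant
   the argument's type would select.
   1 = float variant, 2 = double variant, 0 = no exact match. */
#define SQRT_VARIANT(x) _Generic((x), float: 1, double: 2, default: 0)

/* A literal with the 'f' suffix has type float. */
int variant_for_suffixed(void)   { return SQRT_VARIANT(2.0f); }

/* An unsuffixed floating-point literal has type double. */
int variant_for_unsuffixed(void) { return SQRT_VARIANT(2.0); }

/* An integer literal matches neither variant exactly, so an overloaded
   function would need an implicit conversion to choose one. */
int variant_for_integer(void)    { return SQRT_VARIANT(2); }
```

Only the `f`-suffixed literal selects the float variant directly, which is the reasoning behind marking the constant as a float.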

@skalldri
Author

@bbferka are you able to merge this PR?

@klokik

klokik commented Apr 6, 2019

@skalldri can you also update your PR with the following changes?
Without them, the kernel compiles successfully (with Beignet OpenCL), but I then hit this assertion:

ASSERTION FAILED: Double precision not supported on this device (if this is a literal, use '1.0f' not '1.0')
at file /build/beignet-Bevceu/beignet-1.3.2/backend/src/backend/gen_insn_selection.cpp, function void gbe::ConvertInstructionPattern::convertDoubleToSmallInts(gbe::Selection::Opaque&, const gbe::ir::ConvertInstruction&, bool&) const, line 6386

@@ -111,7 +111,7 @@ void kernel checkDepth(global const int4 *idx, global const ushort *zImg, global
 
   const int4 index = idx[i];
   const ushort zI = zImg[i];
-  const ushort thres = 0.01 * zI;
+  const ushort thres = 0.01f * zI;
   const ushort zIThres = zI + thres;
   const float4 dist2 = dists[i];
 
@@ -176,7 +176,7 @@ void kernel remapDepth(global const ushort *in, global ushort *out, global const
   }
 
   const float avg = (p.s0 + p.s1 + p.s2 + p.s3) / count;
-  const float thres = 0.01 * avg;
+  const float thres = 0.01f * avg;
   valid = isless(fabs(p - avg), (float4)(thres));
   count = abs(valid.s0 + valid.s1 + valid.s2 + valid.s3);
 
@@ -192,5 +192,5 @@ void kernel remapDepth(global const ushort *in, global ushort *out, global const
   const float4 dist = select((float4)(0), tmp - sqrt(dist2), valid);
   const float sum = dist.s0 + dist.s1 + dist.s2 + dist.s3;
 
-  out[i] = (dot(p, dist) / sum) + 0.5;
+  out[i] = (dot(p, dist) / sum) + 0.5f;
 }
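
For context on why these three literals trigger the double-precision assertion, here is a small host-side C11 sketch. It assumes OpenCL C follows the same C99 rule that an unsuffixed floating-point literal is a double, which is what the assertion message itself suggests. `FP_BITS` and the function names are hypothetical, and the 32/64 values are nominal labels for float and double rather than measured sizes.

```c
/* Hypothetical helper: label the type of an expression at compile time.
   32 = float, 64 = double, 0 = anything else. */
#define FP_BITS(x) _Generic((x), float: 32, double: 64, default: 0)

/* Mirrors `0.01 * zI`: the unsuffixed literal is a double, so the
   ushort operand is converted and the whole product is a double. */
int bits_unsuffixed(unsigned short zI) { return FP_BITS(0.01 * zI); }

/* Mirrors `0.01f * zI`: the product stays in single precision. */
int bits_suffixed(unsigned short zI)   { return FP_BITS(0.01f * zI); }

/* Mirrors `... + 0.5` versus `... + 0.5f` with a float left operand. */
int bits_add_unsuffixed(float v) { return FP_BITS(v + 0.5); }
int bits_add_suffixed(float v)   { return FP_BITS(v + 0.5f); }
```

In each unsuffixed case the expression is evaluated as a double before being narrowed on assignment, so a device without double support has to reject or emulate it; the `f` suffix keeps the arithmetic in float throughout.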

@skalldri
Author

@klokik absolutely. Doing that now.
