Native pointer address! #45
Comments
Sure, I'm always open to talking about this. But maybe with a small disclaimer: It has been quiet around JOCL recently, also because it has been relatively quiet around OpenCL in general. I might actually have to look up certain points at a certain technical level...
I'm roughly aware of the efforts for providing foreign memory access functionality in the JDK. Back when https://openjdk.org/projects/panama/ started, I actually talked a bit with Maurizio Cimadamore about possible uses of the new API, although this rather referred to JCuda and jextract. I wrote a few notes at https://mail.openjdk.org/pipermail/panama-dev/2019-February/004443.html. I tried to follow some of the discussions on the Panama mailing list (regarding the memory interface and the vector API), but haven't been able to track them closely for a while now.
I always tried to avoid the Unsafe API. Not so much because it is 'unsafe'. (I mean, JNI is as unsafe as it can be. One could do everything in these C functions...). But it always felt a bit clumsy, potentially incompatible (due to (An aside: I'm a bit surprised to see that the article does not mention intrinsics - but that's also a point where one could quickly dive into the deepest guts of modern VMs...)
The But maybe that is the core of the question:
The But this address only makes sense for One difficulty for the
Back when I started JOCL, I tried to handle these cases transparently (including things like pointer arithmetic, with If I had to re-design JOCL from scratch, I'd probably re-consider this. I would certainly change some details of the implementation. But maybe I would even drop the support for non-direct buffers altogether: Most JNI libraries only support direct buffers, because everything else is just very complicated. But I really wanted to be able to access something like a On the JNI side, such a Some higher-level thoughts:
I'll try to allocate some time for that. I occasionally pointed to your repositories as an example of a project that uses JOCL and structs (!), and we talked a bit about that back in 2016/2017, but I'm sure that a lot has changed in the meantime.
Thank you Marco. This is a detailed explanation, and thank you for taking the time to respond. It made me scratch my head to understand the concepts behind the implementation, and I had to dig deep into your code. The problem with my ray tracer was that it threw an out-of-memory exception when I tried to create an image larger than 2000 x 2000, due to the large direct buffers allocated through the custom structs I've implemented. I noticed that your Pointer class has to have a buffer regardless, and you explained this well in the numbered list above. Hence I couldn't avoid using the buffer class (the JNI side requires the buffer object). The downside is that I have to conform to int-sized buffer capacities instead of the desired long-sized buffers. My code in the ray tracer has to create some complex structs such as:
which is around 80 bytes. So creating such structs for a 2000 x 2000 image means allocating a lot of bytes in total, but it would certainly fit in an array. So the question is: why was the virtual machine throwing an out-of-memory error? After scouring the internet, I discovered that the nio classes tend to show this kind of behaviour (out-of-memory exceptions) even with a large -Xmx heap size or even -XX:MaxDirectMemorySize. After further research, I presumed the cause was the large amount of work done in the bounds checks of the direct buffer allocation code, as shown on the website I referenced previously. With that in mind, I allocated the direct buffer with a hack using the Unsafe class, replacing the buffer's address with one allocated by Unsafe.
This resolved the allocation problem (temporarily) by bypassing the expensive direct buffer allocation. The problem then came from garbage collection: when I re-initialized everything, my ray tracer crashed every time. It seems the VM does some magic behind the scenes to track and garbage collect direct buffers. I almost gave up, until I noticed that JNA has similar classes for creating direct buffers without the expensive approach taken by the standard classes (much like the approach above). I didn't check their implementation; I just used their API to call the necessary classes, as shown below.
This resolved everything. Crazy how memory management works in Java, especially for direct buffers. Now I can rest. Lol.
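For scale, the size arithmetic mentioned in this comment can be checked directly. The sketch below assumes the figures given in the thread (80 bytes per struct, a 2000 x 2000 image); the class and method names are hypothetical and not part of the ray tracer or JOCL.

```java
// Back-of-the-envelope size check for one struct per pixel.
// The 80-byte struct size and the 2000 x 2000 image come from the thread;
// the class and method names are hypothetical.
final class StructBudget {

    /** Total bytes for width * height structs, computed in long to avoid int overflow. */
    static long totalBytes(int width, int height, int structSize) {
        return (long) width * height * structSize;
    }

    /** True if the total fits into a single ByteBuffer, whose capacity is an int. */
    static boolean fitsInOneBuffer(long bytes) {
        return bytes <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long bytes = totalBytes(2000, 2000, 80);
        System.out.println(bytes + " bytes, fits in one buffer: " + fitsInOneBuffer(bytes));
    }
}
```

With these numbers the total is 320,000,000 bytes, which is well below the int capacity limit of a single buffer, so the failure described here is not a simple capacity overflow.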
I haven't had the chance to really look more closely at the latest state of the ray tracer. But some quick thoughts for now.
My first thought here was: Do you really have to allocate this at once for all the pixels? I could imagine that there was an approach to divide this into "chunks", and actually think that this would have made sense for an OpenCL-based implementation regardless of whether there is memory pressure. So, roughly speaking, I'd have expected some pseudocode like
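A minimal sketch of such a chunked loop, written in plain Java rather than as the pseudocode from the comment (which is not preserved here). It assumes the 80-byte struct size mentioned in the thread; the class name, the chunk size, and the placeholder where an OpenCL kernel would be enqueued are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch: process the image in fixed-size chunks and reuse
// one small direct buffer instead of allocating one buffer for all pixels.
final class ChunkedRender {
    static final int STRUCT_SIZE = 80; // bytes per pixel struct, as stated in the thread

    /** Number of chunks needed to cover totalPixels with chunks of chunkPixels (rounded up). */
    static int chunkCount(int totalPixels, int chunkPixels) {
        return (totalPixels + chunkPixels - 1) / chunkPixels;
    }

    /** Walks over all pixels chunk by chunk and returns the number of pixels visited. */
    static long renderInChunks(int totalPixels, int chunkPixels) {
        ByteBuffer chunk = ByteBuffer
                .allocateDirect(chunkPixels * STRUCT_SIZE)
                .order(ByteOrder.nativeOrder());
        long processed = 0;
        for (int start = 0; start < totalPixels; start += chunkPixels) {
            int count = Math.min(chunkPixels, totalPixels - start);
            chunk.clear();
            chunk.limit(count * STRUCT_SIZE); // the last chunk may be shorter
            // Placeholder: a real implementation would run the kernel for
            // pixels [start, start + count) and read the results from 'chunk'.
            processed += count;
        }
        return processed;
    }
}
```

For a 2000 x 2000 image and chunks of 65,536 pixels, this needs 62 iterations and a direct buffer of about 5 MB instead of one of about 320 MB.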
But of course, this is a very naive sketch, and there certainly are many reasons why it is not as simple as suggested in this pseudocode... My second thought was: This may be related to memory fragmentation. This is going down to a level of memory management that I'm not deeply familiar with (because in 99.99% of all cases, this is not a concern for Java developers). But as far as I know, memory allocation on the operating system level can fail to allocate the required memory in some cases. Again, VERY roughly as in
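The fragmentation idea can be illustrated with a toy model. This is only a sketch under strong simplifying assumptions: memory is modelled as a row of equally sized slots, and an allocation needs a run of consecutive free slots. Real allocators and virtual memory are far more sophisticated, and all names are hypothetical.

```java
// Toy model of fragmentation: enough free slots in total,
// but no single contiguous run that is large enough.
final class FragmentationToy {

    /** Total number of free slots (false means free). */
    static int totalFree(boolean[] used) {
        int free = 0;
        for (boolean u : used) {
            if (!u) {
                free++;
            }
        }
        return free;
    }

    /** Length of the longest run of consecutive free slots. */
    static int largestFreeRun(boolean[] used) {
        int best = 0;
        int run = 0;
        for (boolean u : used) {
            run = u ? 0 : run + 1; // a used slot ends the current run
            best = Math.max(best, run);
        }
        return best;
    }

    /** A contiguous allocation succeeds only if some free run is long enough. */
    static boolean canAllocate(boolean[] used, int slots) {
        return largestFreeRun(used) >= slots;
    }
}
```

In a layout such as used, free, free, used, free, free, used, free, free, six slots are free in total, yet a request for four contiguous slots fails because the longest free run is two.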
But again, it might be the case that this is handled by some magic of the operating system nowadays, even if the allocation takes place on the level of JNI.
Yes, I know that there is some 'magic' involved. There was the There is one general problem with native allocations: They are outside the scope of the Garbage Collector. In fact, people occasionally suggested that I could tweak the
one could assume that the pointer is garbage collected at the end of the loop (meaning that Now, ... this doesn't help you much. But when you refer to JNA and say
A word of warning: JNA actually did try to use the Again, I'm not up to date with the details. But I wanted to mention that you still might have to think about the proper freeing/deallocation even if you use |
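The point about freeing native memory explicitly, instead of relying on the garbage collector, can be sketched with try-with-resources. The class below is a hypothetical stand-in: it allocates no real native memory and is not JOCL or JNA API. A counter only tracks how many bytes would currently be live.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a natively allocated block. No real native
// memory is involved; LIVE_BYTES only simulates the bookkeeping.
final class NativeBlock implements AutoCloseable {
    static final AtomicLong LIVE_BYTES = new AtomicLong();

    private final long size;
    private boolean freed;

    NativeBlock(long size) {
        this.size = size;
        LIVE_BYTES.addAndGet(size); // a real implementation would allocate here
    }

    @Override
    public void close() {
        if (!freed) { // makes a second close() harmless
            freed = true;
            LIVE_BYTES.addAndGet(-size); // a real implementation would free here
        }
    }

    /** Allocates one block per iteration and returns the peak number of live bytes. */
    static long allocateInLoop(int iterations, long size) {
        long peak = 0;
        for (int i = 0; i < iterations; i++) {
            try (NativeBlock block = new NativeBlock(size)) {
                peak = Math.max(peak, LIVE_BYTES.get());
            } // close() runs here, independent of any garbage collection
        }
        return peak;
    }
}
```

Because each block is released at the end of its own iteration, the peak never exceeds one block, no matter when or whether the garbage collector runs.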
Hi Marco!
Not sure if this is appropriate to ask here, but I'm curious about how the native pointer address concept is implemented in JOCL.
Since Java might get a safe API for foreign memory access (currently a preview API in JDK 19, proposed in JEP 424), this might open the door to off-heap access to data (outside the JVM heap), with memory sizes up to terabytes depending on the physical memory available. Of course, native memory can already be accessed through unconventional methods, such as the infamous Unsafe API (this API should have been made official aeons ago, but it is good that the OpenJDK team is making strides to implement similar methods), and also through a direct buffer, with the possibility of using a hack to get its pointer...
public static long addressOfDirectBuffer(ByteBuffer buffer) {return ((DirectBuffer) buffer).address();}
(caution, this relies on the internal sun.nio.ch.DirectBuffer interface, hence it doesn't work in every Java version), where the buffer parameter is a direct buffer. I'm aware that a direct ByteBuffer uses Unsafe internally, but unfortunately the off-heap data is still read into the heap first, and the size is still limited to 2 GB (2^31 - 1 bytes). Some great insights here
Based on the above, I was thinking of implementing "addressOfDirectBuffer" and using the resulting long address as a native pointer address, as a way to experiment with native pointers and OpenCL's CL_MEM_USE_HOST_PTR flag. This is shown below.
Unfortunately, this approach doesn't work, and the data ends up corrupted. Would you mind giving some insight into how native pointers are implemented in JOCL? For example, are the pointers accessed through JNI in a way similar to the hacks described above? I know this might seem out of scope for what JOCL tries to provide (safe code), but I was curious. A project I'm implementing might require off-heap data only, due to out-of-memory issues on the heap.
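One general pitfall worth ruling out when native code sees garbled values from a Java buffer is byte order: a newly created ByteBuffer is always big-endian, while native code reads memory in the platform's order. Whether this applies to the code in question cannot be determined from the thread; the sketch below only demonstrates the pitfall, and its class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Shows how the byte order of a direct buffer changes the raw bytes
// that native code would see at the buffer's address.
final class ByteOrderDemo {

    /** Writes the int 1 with the given order and returns the raw byte at offset 0. */
    static byte firstByteOfOne(ByteOrder order) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(4).order(order);
        buffer.putInt(0, 1);
        return buffer.get(0);
    }

    /** A direct buffer set to the platform's byte order, matching what native code expects. */
    static ByteBuffer allocateNative(int capacity) {
        return ByteBuffer.allocateDirect(capacity).order(ByteOrder.nativeOrder());
    }
}
```

With big-endian order, the int 1 is stored with a leading 0 byte; with little-endian order, the first byte is 1. Calling order(ByteOrder.nativeOrder()) before writing keeps the Java side and the native side consistent.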
You can play around with this project. It uses JDK 1.8 (Windows only, since the file dialog explorer is quite custom), but it is good for debugging my other ray tracing code (for example, for implementing new concepts rapidly).