OpenCL
Veil includes a full OpenCL runtime pipeline for running general compute on different devices. Unlike compute shaders, OpenCL is not specific to the GPU, and it allows faster, more feature-rich compute programs.
The API allows fairly low-level access to OpenCL. Most methods throw a CLException so that the caller decides how errors are handled. This is a deliberate design choice that makes development much easier. There are some general conventions that should be followed:
- Each mod should generally request a single static OpenCL environment. This isn't required, but it prevents unexpectedly running code on multiple devices.
- The environment should never be freed by the mod. Veil will free the environment when needed.
- Most calls should be wrapped in a try/catch that handles CLException.
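The last convention can be kept tidy with a small helper. The sketch below is plain Java with no Veil dependency: `CLGuard` and `CLAction` are hypothetical names, and the generic `Exception` stands in for `CLException` so that the block runs on its own.

```java
import java.util.Optional;

/** Hypothetical helper; not part of Veil. */
final class CLGuard {

    /** A unit of work that may fail, e.g. a call that throws CLException. */
    @FunctionalInterface
    interface CLAction<T> {
        T run() throws Exception;
    }

    private CLGuard() {
    }

    /**
     * Runs the action and returns its result, or an empty Optional if it threw.
     * The failure is reported instead of propagating, so a single failed CL call
     * does not have to take the rest of the mod down with it.
     */
    static <T> Optional<T> attempt(String description, CLAction<T> action) {
        try {
            return Optional.ofNullable(action.run());
        } catch (Exception e) {
            System.err.println("OpenCL call failed (" + description + "): " + e.getMessage());
            return Optional.empty();
        }
    }
}
```

With Veil, the action would be something like `() -> environment.createKernel(name, "cool_kernel")`, and the caller decides what an empty result means, for example skipping the compute step.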
The environment provides access to all other functionality. VeilOpenCL#getEnvironment() comes in multiple variants for requesting environments with specific properties. The default method is fine for most use cases.
NOTE: Some computers do not support any OpenCL devices. In that case getEnvironment() returns null.
It may also return null when specific features are requested that no device supports. In that case a more generalized environment should be requested.
import foundry.veil.opencl.CLEnvironment;
import foundry.veil.opencl.VeilOpenCL;

public class ModClass {
    // May be null if the computer has no usable OpenCL device
    public static final CLEnvironment ENVIRONMENT = VeilOpenCL.get().getEnvironment();
}
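Because a request can return null, one option is to try the most specific request first and fall back to more general ones. A minimal sketch of that fallback follows, using only the standard library. `EnvironmentFallback` and `firstAvailable` are hypothetical names, and the suppliers stand in for whichever `getEnvironment` variants a mod chooses to call.

```java
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical helper; not part of Veil. */
final class EnvironmentFallback {

    private EnvironmentFallback() {
    }

    /**
     * Calls each request in order and returns the first non-null result,
     * or null if every request came back empty.
     */
    static <T> T firstAvailable(List<Supplier<? extends T>> requests) {
        for (Supplier<? extends T> request : requests) {
            T result = request.get();
            if (result != null) {
                return result;
            }
        }
        return null;
    }
}
```

The calling code still has to handle a final null, since some computers have no OpenCL device at all.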
Veil does not have any built-in API for loading CL programs from a resource pack. This makes it much easier to run the same program on the client and server. There is nothing stopping the user from implementing their own version though.
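One way to roll your own loader is to read the kernel source from a file bundled with the mod and pass the resulting string to `CLEnvironment#loadProgram`. The sketch below covers only the reading step, using the standard library. `KernelSource` and the example resource path are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; not part of Veil. */
final class KernelSource {

    private KernelSource() {
    }

    /** Reads an entire stream as UTF-8 text and closes the stream afterwards. */
    static String read(InputStream stream) throws IOException {
        try (InputStream in = stream) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    /** Reads a classpath resource, e.g. a made-up path like "/modid/cl/coolprogram.cl". */
    static String readResource(String path) throws IOException {
        InputStream stream = KernelSource.class.getResourceAsStream(path);
        if (stream == null) {
            throw new IOException("Missing kernel source: " + path);
        }
        return read(stream);
    }
}
```

Reading from the classpath works the same way on the client and the server, which preserves the property described above.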
CLEnvironment#loadProgram must be called before attempting to create a kernel, or an exception will be raised. After that, CLEnvironment#createKernel can be called. These two calls are the most error-prone, so their exceptions must be properly handled.
Kernels should be freed when they are no longer needed. Once all kernels for a specific program have been freed, the program is freed as well. Programs can also be freed explicitly with CLEnvironment#freeProgram.
NOTE: Loading another program under the same name will free the old program and all associated kernels.
import foundry.veil.opencl.CLEnvironment;
import foundry.veil.opencl.CLException;
import foundry.veil.opencl.CLKernel;
import net.minecraft.resources.ResourceLocation;

public class ModCL {
    public static void doThing() throws CLException {
        CLEnvironment environment = ModClass.ENVIRONMENT;
        environment.loadProgram(new ResourceLocation("modid", "coolprogram"), """
                void kernel cool_kernel(global const int* A, global const int* B, global const int* C, global int* D) {
                    int i = get_global_id(0);
                    D[i] = A[i] + B[i] + C[i];
                }

                // B is written to, so it must not be const
                void kernel cooler_kernel(global const int* A, global int* B) {
                    int i = get_global_id(0);
                    // sqrt works on floating-point types, so convert explicitly
                    B[i] = (int) sqrt((float) A[i]);
                }
                """);

        CLKernel kernel1 = environment.createKernel(new ResourceLocation("modid", "coolprogram"), "cool_kernel");
        // Multiple kernels can be created from the same program
        CLKernel kernel2 = environment.createKernel(new ResourceLocation("modid", "coolprogram"), "cooler_kernel");

        // do stuff

        // These should be freed when they aren't needed anymore. A try-with-resources is the best way to create a single-use kernel
        kernel1.free();
        kernel2.free();
    }
}
Kernels allow code from a program to be executed. Data is uploaded to and read from the CL device using CL buffers and kernel parameters. Single primitives can be loaded into parameter slots with the methods in CLKernel.
To load more than one value into a parameter slot, a CLBuffer must be used. Buffers function as arbitrary blocks of memory on the CL device that can be read from and written to.
import foundry.veil.opencl.CLBuffer;
import foundry.veil.opencl.CLEnvironment;
import foundry.veil.opencl.CLException;
import foundry.veil.opencl.CLKernel;
import net.minecraft.resources.ResourceLocation;
import org.lwjgl.system.MemoryStack;

import java.nio.IntBuffer;

import static org.lwjgl.opencl.CL10.*;

public class Test {
    public static void run() throws CLException {
        CLEnvironment environment = ModClass.ENVIRONMENT;
        environment.loadProgram(new ResourceLocation("modid", "testprogram"), """
                // Scalar parameters are passed by value and take no address space qualifier
                void kernel example_kernel(const int EpicParameter, global const int* A, global const int* B, global const int* C, global int* D) {
                    int i = get_global_id(0);
                    D[i] = A[i] + B[i] + C[i];
                }
                """);

        // The program and kernel names must match what was loaded above
        try (CLKernel kernel = environment.createKernel(new ResourceLocation("modid", "testprogram"), "example_kernel")) {
            // These can either be freed manually, or automatically when the kernel is freed
            CLBuffer bufferA = kernel.createBuffer(CL_MEM_READ_ONLY, Integer.BYTES * 4); // size is in bytes
            CLBuffer bufferB = kernel.createBuffer(CL_MEM_READ_ONLY, Integer.BYTES * 4);
            CLBuffer bufferC = kernel.createBuffer(CL_MEM_READ_ONLY, Integer.BYTES * 4);
            CLBuffer bufferD = kernel.createBuffer(CL_MEM_WRITE_ONLY, Integer.BYTES * 4);

            try (MemoryStack stack = MemoryStack.stackPush()) {
                IntBuffer dataA = stack.ints(1, 2, 3, 4);
                IntBuffer dataB = stack.ints(5, 6, 7, 8);
                IntBuffer dataC = stack.ints(9, 10, 11, 12);
                IntBuffer dataD = stack.mallocInt(4); // This is the output, so we don't fill with data

                // The async methods are better, but the synchronous call can be used to wait until complete
                bufferA.writeAsync(0, dataA, null);
                bufferB.writeAsync(0, dataB, null);
                bufferC.writeAsync(0, dataC, null);
                // Set the kernel parameters. The pointer slots take the device buffers, not the host-side IntBuffers
                kernel.setInt(0, 4);
                kernel.setPointers(1, bufferA);
                kernel.setPointers(2, bufferB);
                kernel.setPointers(3, bufferC);
                kernel.setPointers(4, bufferD);

                // Run the actual program
                kernel.execute(4, 1);

                // This is not required, but will block until all commands have been processed
                environment.finish();

                // This will block until the data is present
                bufferD.read(0, dataD);

                System.out.printf("%d, %d, %d, %d%n", dataD.get(0), dataD.get(1), dataD.get(2), dataD.get(3));
            }
        }
    }
}
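When debugging a kernel, it can help to compute the same result on the CPU and compare. The following is a plain-Java reference for the element-wise sum performed by the example kernel. The class and method names are hypothetical, and each loop iteration plays the role of one work item with `get_global_id(0) == i`.

```java
/** Hypothetical CPU reference for the element-wise sum kernel; not part of Veil. */
final class ExampleKernelReference {

    private ExampleKernelReference() {
    }

    /** Computes D[i] = A[i] + B[i] + C[i] for every index, as the kernel does per work item. */
    static int[] run(int[] a, int[] b, int[] c) {
        if (a.length != b.length || a.length != c.length) {
            throw new IllegalArgumentException("Input arrays must have the same length");
        }
        int[] d = new int[a.length];
        for (int i = 0; i < d.length; i++) {
            d[i] = a[i] + b[i] + c[i];
        }
        return d;
    }
}
```

For the inputs used above (1 to 4, 5 to 8 and 9 to 12) this yields 15, 18, 21, 24, which is what the device-side run is expected to print.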
OpenCL provides a way of subscribing to events fired by the compute device. CLEnvironment#getEventDispatcher() returns the event dispatcher for the environment, which can subscribe to any returned CL event. In most cases this functionality can be ignored, but it is exposed to allow full programmer freedom.
import foundry.veil.opencl.CLEnvironment;
import foundry.veil.opencl.CLException;
import foundry.veil.opencl.VeilOpenCL;
import org.jetbrains.annotations.Nullable;
import org.lwjgl.PointerBuffer;
import org.lwjgl.system.MemoryStack;

import java.nio.IntBuffer;

import static org.lwjgl.opencl.CL10.CL_COMPLETE;
import static org.lwjgl.opencl.CL10.clEnqueueWriteBuffer;

// Code from CLBuffer.java
public void writeAsync(long offset, IntBuffer data, @Nullable Runnable onComplete) throws CLException {
    try (MemoryStack stack = MemoryStack.stackPush()) {
        // An event is only allocated when a callback was requested
        PointerBuffer event = onComplete != null ? stack.mallocPointer(1) : null;
        // Non-blocking write: the call returns as soon as the command is queued
        VeilOpenCL.checkCLError(clEnqueueWriteBuffer(this.environment.getCommandQueue(), this.pointer, false, offset, data, null, event));
        if (event != null) {
            // This subscribes to the event, and fires the specified runnable callback
            this.environment.getEventDispatcher().listen(event.get(0), CL_COMPLETE, onComplete);
        }
    }
}
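Completion callbacks like `onComplete` compose more easily when wrapped in a `CompletableFuture`. The sketch below is independent of Veil: `AsyncBridge` and `toFuture` are hypothetical names, and the `Consumer<Runnable>` parameter stands in for any call that accepts a completion callback, such as `writeAsync`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

/** Hypothetical helper; not part of Veil. */
final class AsyncBridge {

    private AsyncBridge() {
    }

    /**
     * Starts an operation that reports completion through a Runnable callback
     * and returns a future that completes when that callback fires.
     */
    static CompletableFuture<Void> toFuture(Consumer<Runnable> operation) {
        CompletableFuture<Void> future = new CompletableFuture<>();
        try {
            operation.accept(() -> future.complete(null));
        } catch (RuntimeException e) {
            // The operation failed to start, so the callback will never fire
            future.completeExceptionally(e);
        }
        return future;
    }
}
```

Since `writeAsync` throws the checked `CLException`, a lambda passed to `toFuture` would need to catch it and rethrow it as an unchecked exception.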