Custom callable function from within the C++ API #1614
-
If we define a function in Python, it can then be passed to mlx.core.compile along with the inputs for the invocation. However, it is not clear to me if there is an API for, say, defining symbolic MLX arrays that correspond to each function argument that would then trace the graph, or how we would deal with tuple results in this workflow. Since MLX is mostly lazy-eval, not being able to achieve this isn't a dealbreaker, as we have the …
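To make the question concrete, here is a minimal sketch of the kind of tracing being asked about. This is not an MLX API; `SymbolicArray` and `trace` are hypothetical names, and the sketch only illustrates how placeholder arguments could record a graph, including tuple results:

```python
# Illustrative only: MLX does not expose this API. A placeholder object
# records operations instead of computing them, so calling the function
# once with placeholders traces its graph.

class SymbolicArray:
    """Hypothetical symbolic array that records the ops applied to it."""
    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = inputs

    def __add__(self, other):
        return SymbolicArray(f"add({self.name}, {other.name})", (self, other))

    def __mul__(self, other):
        return SymbolicArray(f"mul({self.name}, {other.name})", (self, other))

def trace(fun, arg_names):
    # One symbolic placeholder per argument; normalize the result to a
    # tuple so functions returning tuples are handled uniformly.
    placeholders = [SymbolicArray(n) for n in arg_names]
    out = fun(*placeholders)
    return out if isinstance(out, tuple) else (out,)

def fun(a, b):
    s = a + b
    return s, s * a  # tuple result

outs = trace(fun, ["a", "b"])
print([o.name for o in outs])
# ['add(a, b)', 'mul(add(a, b), a)']
```

The key point is that the traced graph falls out of ordinary operator overloading, which is also roughly how tracing-based compilers capture a Python function without a separate symbolic-array API.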
Replies: 1 comment 4 replies
-
I'm not familiar at all with Nx and how compilation works there, but I can say a bit more about how it works in MLX:

```python
def fun(a, b, c):
    return a + b + c

# Step 0: Nothing much has happened here yet other than wrapping `fun`
# in another function which knows to compile it
compiled_fun = mx.compile(fun)

# Step 1: The first time the compiled function is called it gets partially
# compiled. We trace the graph using the provided inputs and do some
# optimization passes on the graph
out = compiled_fun(a, b, c)

# Step 2: The rest of the compilation happens the first time you call
# mx.eval. This is where kernel source is actually JIT compiled
mx.eval(out)

# Calling it again on inputs with the same shape and type doesn't recompile
mx.eval(compiled_fun(a, b, c))
```

I'm not sure if it makes sense to do step 0 (wrapping the …

For example, if you are changing the shape of the input from call to call, say increasing the shape of an input by one at each call, then compiling can slow you down. Compiling a large graph can take some time (milliseconds), and for latency-sensitive applications that can add up. You typically want to amortize the cost of compiling over repeated applications of the compiled function.