Compilation: [eval] Attempting to eval an array without a primitive. #690
-
Can a training step with the Mixtral model be compiled?

```python
state = [model.state, optimizer.state]

@partial(mx.compile, inputs=state, outputs=state)
def train_step(batch):
    loss_value_and_grad = nn.value_and_grad(model, default_loss)
    (lvalue, toks), grad = loss_value_and_grad(model, *batch)
    optimizer.update(model, grad)
    return lvalue.item(), toks.item()
```

When attempting to compile, the training step fails with the `[eval] Attempting to eval an array without a primitive` error shown in the title.
-
What code are you using there? It should be compilable, but you have to be careful to make sure all the implicit state is captured. You usually see that error message when you forget to include state in the inputs and/or outputs. That said, I don't think you will see much gain (yet) from compiling it, since most of the work should be in the matrix multiplications in the MLPs and attention, and those are not affected by compile.
-
The latest version of mlx-lm introduced compilation of the training step, which does not work with the Mixtral model. You can try removing that line from the source code or downgrading mlx-lm.
Ah, sorry, I should have realized earlier: you cannot compile MoE models right now, as they do an implicit graph eval to determine which expert to route to. That needs a workaround which we have not implemented yet.