Is training supposed to take weeks? #50
Comments
You'll have more luck with a GPU, but this code is old and unmaintained. You can probably write something better from scratch these days!
I simplified your equations and now it finds a solution in seconds. Are you still involved in plasma research, even if not with this repo?
How did you simplify them? I'm no longer doing physics work :)
I'm actually not sure. At first I simply set E and B to 0 and then it was lightning fast, but even with E active in the 1D case training takes about a second per step. I did use the approximation f(t, x, |v|) and set the integral domain to (0, 1), but that couldn't possibly account for an improvement factor of 30000...

The BFGS algorithm starts out 5x slower than ADAM and sometimes spikes to 100x slower, whereas ADAM is consistent at 1 s/step, so I used ADAM instead. You can inspect the code here: https://gitlab.com/marcus.appelros/fusion/-/blob/main/neural/Vlasov0.jl

I'm still working on the |v| approximation in 3D. If I add v_norm(vs...) as a variable together with the equation v_norm = norm(vs), then the problem construction complains that v_norm in f(t, xs..., v_norm(vs...)) in sys.dvs doesn't have a name, i.e. nameof(v_norm) errors. Do you know of a workaround?
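The flat-Adam-versus-spiky-BFGS cost pattern described above is easy to reproduce on a toy problem. The sketch below (plain NumPy/SciPy, not the repo's Julia code; the quadratic `loss` is a hypothetical stand-in for the PINN loss) counts loss evaluations: Adam performs exactly one gradient evaluation per step, so its per-step cost is constant, while each BFGS iteration runs a line search that may evaluate the loss several times, which is one plausible source of the variable per-step timing.

```python
import numpy as np
from scipy.optimize import minimize

# Toy least-squares objective standing in for the PINN loss (hypothetical).
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 10))
b = rng.normal(size=50)

evals = {"n": 0}

def loss(w):
    evals["n"] += 1            # count objective evaluations
    r = A @ w - b
    return 0.5 * r @ r

def grad(w):
    return A.T @ (A @ w - b)

def adam(w, steps=100, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """Minimal Adam: exactly one gradient evaluation per step."""
    m = np.zeros_like(w)
    v = np.zeros_like(w)
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        w = w - lr * (m / (1 - b1**t)) / (np.sqrt(v / (1 - b2**t)) + eps)
    return w

w_adam = adam(np.zeros(10))

# BFGS: each iteration's line search may evaluate the loss more than once,
# so nfev typically exceeds nit -- the uneven per-step cost seen in training.
res = minimize(loss, np.zeros(10), jac=grad, method="BFGS")
print("iterations:", res.nit, "loss evaluations:", res.nfev)
```

On a near-quadratic loss BFGS still converges in far fewer iterations; the point is only that its cost per iteration is variable while Adam's is fixed.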
Hi!
I cloned your repo (I couldn't add it from the package manager because of conflicting compat requirements involving CUDA) and after some tinkering managed to get it to work. However, the initial example in the README takes 10 hours per step (the callback triggered once overnight). If I change the strategy's max_iter from 1000 to 1 it takes about 5 minutes per step. I also simplified the model chain to just 3 small layers, so it's surprising that it takes that long.

My laptop is a few years old with 8 GB RAM. Is the initial example supposed to take weeks to complete on such hardware, or did I introduce a bug with my tinkering?