BUG: Memory accumulation over the first 30 minutes of runtime #858

Open
ryan-budde opened this issue Jan 27, 2025 · 2 comments
@ryan-budde

ryan-budde commented Jan 27, 2025

Describe the issue:

The hardware requirements recommend 32 GB of RAM, with not much performance gain beyond that. I have 128 GB, and over the first 30 minutes of runtime memory usage climbs gradually from ~15 GB to 128 GB. See the attached log. The recording is 168 minutes of NPX 1.0 data with ~50 channels excluded for being outside the brain. Total runtime was ~132 minutes.

Is this normal?

In my experience with SpikeInterface / scipy, this kind of gradual memory accumulation suggests a memory leak, and there may be a benefit to adding some garbage collection between batches. Runtime feels about the same as KS2.5, so the accumulation is perhaps not hurting KS4 runtime, but garbage collection would still help in two ways: 1) in my experience with SI, this kind of accumulation tends to lead to OOM crashes on even longer files, and 2) it would free up resources for the rest of the PC when KS4 runs in the background.
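A minimal sketch of what "garbage collection between batches" could look like, assuming a generic batch loop. The names `run_batches`, `process_batch`, and `collect_every` are hypothetical and are not part of Kilosort4; this only illustrates the idea, it is not how KS4 actually iterates over data.

```python
import gc


def run_batches(batches, process_batch, collect_every=1):
    """Process each batch, forcing a garbage-collection pass periodically.

    `process_batch` is a hypothetical stand-in for whatever per-batch
    computation the sorter performs; it is not a Kilosort4 function.
    """
    results = []
    for i, batch in enumerate(batches, start=1):
        results.append(process_batch(batch))
        # Reclaim unreachable reference cycles left behind by this batch
        # instead of waiting for the collector's automatic thresholds.
        if i % collect_every == 0:
            gc.collect()
    return results
```

Note that `gc.collect()` only reclaims objects that are no longer reachable (mainly reference cycles). If something still holds a reference to per-batch data, or the memory is held by a native library, this alone will not release it.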

kilosort4.log

Version information:

See log file.
Windows CUDA 12.6
Pytorch 11.8
Windows 11
Python 3.10
Miniforge

@jacobpennington
Collaborator

Just clarifying before I look into this: did you run KS4 through SpikeInterface, or did you run it on its own?

@ryan-budde
Author

ryan-budde commented Jan 29, 2025

On its own, through the GUI. I haven't dug into the Python side of it yet.

The prime suspect is scipy. Many of its functions have leaked memory for me on very long NPX files. In SpikeInterface the problem was lsqr and similar functions. I assume that with the move to Python, some scipy functions are now being used. To fix it in SI, I just added gc.collect() calls inside computational loops, more or less at random, until it worked. I assume the same will work here.

@jacobpennington jacobpennington self-assigned this Mar 6, 2025