IO Error: 'too many open files' when removing many corrupted runs #3224

Open
Engrammae opened this issue Sep 20, 2024 · 0 comments
Labels
  • help wanted (Extra attention is needed)
  • type / bug (Issue type: something isn't working)

🐛 Bug: Removal of many corrupted runs in one go

I ran a larger experiment tracking a lot of runs and apparently I had quite a few corrupted runs (in my case 539).

I tried removing them all at once by calling aim runs rm --corrupted, but got the error "IO Error: too many open files".
I could still remove single corrupted runs with aim runs rm ${hash}.
I tried raising the open-file limit with ulimit -n, up to 2048, but to no effect.
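
For anyone trying to reproduce this, it may help to confirm which limit is actually in effect in the shell that launches aim. This is a minimal sketch using only the bash builtin ulimit; the variable names are illustrative, and the value 2048 simply mirrors the one tried above.

```shell
#!/bin/bash
# Read the soft and hard limits on open file descriptors for this shell.
soft_limit=$(ulimit -Sn)
hard_limit=$(ulimit -Hn)
echo "soft limit: ${soft_limit}"
echo "hard limit: ${hard_limit}"

# Raising the soft limit only works up to the hard limit, and it applies
# only to this shell and to processes started from it afterwards.
# ulimit -Sn 2048
```

Because the limit is inherited per process, a change made in one terminal does not affect an aim command started from another shell or from a service.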

To reproduce

Have a repository with a large number of corrupted runs (539 in my case) and try to remove them all at once with aim runs rm --corrupted.

Expected behavior

Run removal should respect the limit on open files, so that aim runs rm --corrupted also works when there are many corrupted runs.

Environment

  • Aim v3.24.0
  • Python 3.11.6
  • pip 24.0
  • OS Ubuntu 22.04.4 LTS

Additional context

As a workaround I wrote a short bash script that removes the corrupted runs one by one, but this is still quite cumbersome.

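#!/bin/bash

# Take the first line of the listing and split it at tabs,
# writing one run hash per line to the file corrupted_runs.
aim runs ls --corrupted | head -n 1 | sed 's/\t/\n/g' > corrupted_runs

# Remove the corrupted runs one at a time.
while read -r run; do
    echo "Removing corrupted run: ${run}"
    aim runs rm "${run}" -y
done < corrupted_runs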
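
A possible middle ground between one call for all runs and one call per run is to remove the runs in batches. The sketch below is not part of Aim: run_in_batches is a hypothetical helper, and echo stands in for the real command so the batching can be checked without touching a repository.

```shell
#!/bin/bash
# Hypothetical helper: read items from stdin (one per line) and pass them
# to the given command in batches of at most BATCH_SIZE arguments.
# Usage: run_in_batches BATCH_SIZE COMMAND [ARGS...]
run_in_batches() {
    local batch_size="$1"
    shift
    # xargs starts a separate process for every batch.
    # -r (GNU xargs) skips the command entirely when the input is empty.
    xargs -r -n "${batch_size}" "$@"
}

# Dry run: echo stands in for the real removal command.
printf '%s\n' run1 run2 run3 run4 run5 | run_in_batches 2 echo "would remove:"
```

With Aim, the call might look like run_in_batches 50 aim runs rm -y < corrupted_runs. That assumes aim runs rm accepts several hashes in one call and that each batch stays below the open-file limit; neither is confirmed in this report.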

Engrammae added the help wanted and type / bug labels on Sep 20, 2024
Engrammae changed the title from "Enable removing of many corrupted runs in one go" to "IO Error: 'too many open files' when removing many corrupted runs" on Sep 20, 2024