Infinite Loops in Student Code #34

Open
YoshikiTakashima opened this issue Mar 6, 2021 · 4 comments

@YoshikiTakashima

Hi.

Thanks to your help in #33, we're up and running.

We're now encountering an issue where, if a student submits code that loops infinitely, we are left with an orphaned process.

The process starts in the container, but the container somehow dies before the process does. What remains is a very strange process whose UID does not match any user on the host machine and is valid only inside the container. The process stays alive even though the container has stopped.

We've found a rough mitigation, but I would like to find a patch. It would be great if you can provide some insight with respect to this mystery behavior.

Thanks

~Yoshi

@YoshikiTakashima
Author

[Screenshot: 2021-03-06-115149_1920x1080_scrot]

This process has been running all night, but the only running container is barely 4 minutes old. Somehow, it escaped the container.

@jprider63
Member

Hi @YoshikiTakashima. It seems odd that the process is running outside the container. Here are a few thoughts:

  • Something is misconfigured and the runner is connecting via ssh to the host machine (we usually connect to different machines running a docker swarm).
  • There's a bug or exploit that allows the process to escape the container.

Also, are there multiple runner processes running? You should only run one at a time. You can run multiple webapp processes on different machines connected to the database, but it's not necessary.

@YoshikiTakashima
Author

YoshikiTakashima commented Mar 11, 2021

Hi @jprider63. Sorry for the delay.

So the "escape" part appears to be an observation error on our part. Rather, the loop occurs because the runner keeps re-running the student's code after it times out.

So the loop looks like this:

Student submits -> student code runs -> student code times out -> runner receives some info -> runner re-runs student code -> loop.

I am not sure what the runner receives here, but it is making the wrong choice about what to re-run.

We are only running one instance of the runner.

@jprider63
Member

There is a known issue with timeouts that I just created an issue for (#35). This would cause the thread handling a specific team to block and sometimes leave their docker image running (so administrators would have to kill and restart the runner). This sounds like a different issue from what you're seeing though.

The runner chooses to run the latest submission with a pending status. Can you describe how the status in the database changes during the loop? It might help to add prints/logs of the status as the submission changes. Are there any other logs from the runner you can share? If the runner experiences an error, it resets the submission's state to pending, which causes it to be run again. Also, which version/commit are you running?
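To make the re-run path described above concrete, here is a minimal Python sketch of a runner that picks the latest pending submission, logs every status change, and resets the status to pending on any error. Everything in it (`Submission`, `set_status`, `latest_pending`, `run_once`, the status strings) is a hypothetical name chosen for illustration and is not taken from the runner's code; it also assumes, purely for the sake of the example, that a timeout surfaces as an ordinary exception. Whether that is what actually happens in your deployment is exactly what the logs would need to confirm.

```python
import logging
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("runner-sketch")

# Hypothetical status values; the real database schema may differ.
PENDING, RUNNING, FINISHED = "pending", "running", "finished"


@dataclass
class Submission:
    id: int  # assumed to increase with submission time
    status: str = PENDING


def set_status(sub, new_status):
    """Log every transition so a re-run loop is visible in the output."""
    log.info("submission %d: %s -> %s", sub.id, sub.status, new_status)
    sub.status = new_status


def latest_pending(submissions):
    """Return the most recent submission that is still pending, or None."""
    pending = [s for s in submissions if s.status == PENDING]
    return max(pending, key=lambda s: s.id, default=None)


def run_once(submissions, execute):
    """Run one scheduling step; any error puts the submission back to pending."""
    sub = latest_pending(submissions)
    if sub is None:
        return None
    set_status(sub, RUNNING)
    try:
        execute(sub)
        set_status(sub, FINISHED)
    except Exception:
        # Assumption: a timeout is reported like any other error, so the
        # submission becomes eligible again on the next step.
        set_status(sub, PENDING)
    return sub


if __name__ == "__main__":
    def always_times_out(sub):
        raise TimeoutError("student code did not terminate")

    subs = [Submission(1), Submission(2)]
    for _ in range(3):  # bounded here; an unbounded runner would never stop
        run_once(subs, always_times_out)
```

Under these assumptions, the log shows the same submission cycling between running and pending, which matches the loop described earlier in the thread. Comparable logging around the real status updates should show whether the submission is being reset the same way.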
