Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Looping llthreads issue #13

Open
BigJim opened this issue Mar 21, 2017 · 4 comments
Open

Looping llthreads issue #13

BigJim opened this issue Mar 21, 2017 · 4 comments

Comments

@BigJim
Copy link

BigJim commented Mar 21, 2017

First of all again thanks for all your work on this. My aim is to help you make it better.

There is some design issues with llthreads (going back to the original) when fit into the Lua model.
First of all a minor thing, there's IMHO little reason to have a "detach" carry over pthread concept.
Why not just "thread:start()"? If you don't care when thread completes it should be the same, just don't need to call "thread:join()" unless you really want to wait for it to be done.
In either case the thread will just exit and "thread" will be around until you don't need it any more (then normally garbage collected).

Now the bigger issue, consider this little test case:

local thread = llthreads.new("while true do sleep(1) end")
thread:start()
thread = nil
collectgarbage()

You have to ask yourself "What should happen here?"
IMHO ideally in the Lua paradigm the thread would terminate itself (wherever it's at in the script) and then clean up nicely like any other Lua variable (be it a number, table, etc).

This parallels what happens if you run your "test_alive.lua" test.
As I see it now it will cause a hang in garbage collector for that child thread on exit.
The problem has always been there, but you have to catch it with a debugger to see it.
Also It might just hang your lua.exe on the "lua_close(L);" (for main Lua state).

A solution might be to terminate thread (the handle must be there at the end without the "detach" flag nonsense). But then this a problem maybe, can you just terminate a lua pcall() wherever it's at and then call lua_close() it for final clean up? This would IMHO be ideal.

If someone needs to keep a looping script thread like this this they could probably use one of the sharing libs like ZeroMQ and sync up a shutdown path.

At any rate what I suggest, and what I will add to my local version is to simply catch on garbage collection if a thread is still running. This way at least the user will know "Hey you are not doing right!". The user will be aware that there is a problem. Their lua.exe will not just sit there hung up with a clue to why it's like that.
Tests like "test_alive.lua" will have to be fixed so the script loop will only loop for a max count.
Then a "thread:join()" or a looping "thread:alive()" needs to be there to make sure the script thread has actually terminated.

I will post more in this issue once I've coded my "error on thread still running" solution.

P.S. Ты хорошо знаешь английский?

@moteus
Copy link
Owner

moteus commented Mar 22, 2017

So many words :)
This all known problems in original llthreads library.
I faced with this when forget save thread object and at some point my app just hangs up.
It was ever worst because llthread and lua-zmq have some bugs which also lead to deadlock.
(Neopallium/lua-llthreads#5 and Neopallium/lua-llthreads#6).
In my lib I leave compatiable behavior by default.
But you can specify detached flag in start function like

local thread = llthreads.new("while true do sleep(1) end")
thread:start(true)
thread = nil
collectgarbage()

In readme file I describe how this flags works.

PS. Not хорошо

@BigJim
Copy link
Author

BigJim commented Mar 22, 2017

Understood, and thanks.

I think it is a good idea though to add to the garbage ('__gc') collector detection for this.
Detect if the thread is still running. If the thread is still running it's a bad condition. You will not be able to kill it, and again it might hang up the host (with embedded Lua) process.
Right now the problem is being ignored. Maybe we can't solve it now, but we can detect it.

If you report this to the user than they will be aware of the issue and they will know not to make threads that loop forever, use :join() for long threads, and/or make some kind of shutdown signal sync mechanism.

Also could just before showing the error a direct kill thread. Sure maybe it will be a memory leak, but it might keep the process from being a stuck "zombie".

@moteus
Copy link
Owner

moteus commented Mar 23, 2017

It all possible. What you describe in first message is just detached/joinable threads.
You can start thouse like thread:start(true, true).
In this case __gc method will call detach. But it also possible call join method.
But default threads have some unique feature. join method returns Lua values from child thread.
It requires some sync mechanism to support this in detached/joinable threads.
We have to guard child Lua state. It can be done with basic atomic counter I think.

@BigJim
Copy link
Author

BigJim commented Mar 23, 2017

I got it working.
But it's a little hard to show because my new stripped down fork has the "detach" option removed because I don't use it and it makes the code much simpler.

I added a FLAG_EXITED that's set when a thread exits.

In "l_llthread_delete" just before the "llthread_destroy(L, obj)" I added this test:

llthread_t *obj = *pobj;
if (!IS(obj, EXITED) || (!IS(obj, STARTED) && ((intptr_t) obj->thread != 0)))
{
    // Do emergency thread kill. 
    // Yes It's dangerous, might leave memory leak but it's better than a hung zombie process.
    if ((intptr_t) obj->handle != 0)
    {
        #ifndef USE_PTHREAD
	TerminateThread(obj->thread, 1);
	Sleep(0);  // To cause thread context switch
	#else
	pthread_cancel(obj->thread);
	pthread_yield();
	#endif
	obj->handle = 0;
    }

    // Tell user of this bad condition
    return luaL_error(L, "Thread found still alive on garbage collect!");
}

And at the top of "llthread_child_thread_run() add so that the thread can be canceled.

#ifdef USE_PTHREAD
pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL);
pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, NULL);
#endif

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants