Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash when using more than one user #897

Closed
geotro opened this issue Jan 31, 2017 · 24 comments
Closed

Crash when using more than one user #897

geotro opened this issue Jan 31, 2017 · 24 comments
Assignees
Milestone

Comments

@geotro
Copy link

geotro commented Jan 31, 2017

We have an instance of proxysql v1.3.3 which worked very well in production, until the very moment we started using a second mysql user/db utilizing the same config.

As soon as the user was used, proxysql would crash, and this appears to be happening consistently, backtrace and associated info below:

2017-01-31 16:45:55 [INFO] New mysql_replication_hostgroups table
Error: signal 11:
proxysql(_Z13crash_handleri+0x1a)[0x439a1a]
/lib/x86_64-linux-gnu/libc.so.6(+0x354b0)[0x7fd6fdbab4b0]
proxysql(mysql_change_user+0x199)[0x5af0f9]
proxysql[0x5b51e3]
proxysql(my_context_spawn+0x41)[0x5b4491]
2017-01-31 16:46:58 main.cpp:642:ProxySQL_daemonize_phase3(): [ERROR] ProxySQL crashed. Restarting!
2017-01-31 16:46:58 [INFO] Angel process is waiting 1 seconds before starting a new ProxySQL process
2017-01-31 16:46:59 [INFO] Angel process started ProxySQL process 6431
2017-01-31 16:46:59 ProxySQL_Admin.cpp:2853:flush_admin_variables___database_to_runtime(): [ERROR] Impossible to set variable version with value "1.3.1-2-gc762ae9". Resetting to current "1.3.3-0-gf2b677a".
2017-01-31 16:46:59 ProxySQL_Admin.cpp:2902:flush_mysql_variables___database_to_runtime(): [ERROR] Impossible to set not existing variable session_debug with value "(null)". Deleting

@renecannao renecannao self-assigned this Jan 31, 2017
@renecannao renecannao added this to the 1.3.4 milestone Jan 31, 2017
@renecannao
Copy link
Contributor

Hi @geotro . Is it possible to have a core dump?

@geotro
Copy link
Author

geotro commented Jan 31, 2017

I have a core dump, but it's massive and I'd rather not share, but please let me know what you'd like me to run in gdb.

I hope this helps a little:

(gdb) bt full
#0  mysql_change_user (mysql=0x7fd6c0dd5500, user=0x7fd6d11eb258 "test", passwd=<optimized out>, db=<optimized out>)
    at /opt/proxysql/deps/mariadb-client-library/mariadb_client/libmariadb/libmariadb.c:2242
        stmt = 0x771c27f6eefe86f
        li_stmt = 0x7fd6e7316490
        s_cs = <optimized out>
        s_user = 0x7fd6bcbec6e8 "test"
        s_passwd = 0x7fd6c8df7e50 "*11f11e438be1cd5992178a66014dc1c69b4129cf"
        s_db = 0x7fd6bcbec6f0 "test"
        rc = <optimized out>
#1  0x00000000005b51e3 in mysql_change_user_start_internal (d=<optimized out>)
    at /opt/proxysql/deps/mariadb-client-library/mariadb_client/libmariadb/mysql_async.c:846
        parms = <optimized out>
        ret = 48 '0'
        b = 0x7fd6b5bd9500
#2  0x00000000005b4491 in my_context_spawn (c=0x7fd6b5bd9528, f=0x7fd6d11eb278, d=0x771c27f6eefe86f)
    at /opt/proxysql/deps/mariadb-client-library/mariadb_client/libmariadb/my_context.c:205
        ret = 48
(gdb) 

@renecannao
Copy link
Contributor

Hi @geotro : no problem, I think I already understood where the issue is.

at /opt/proxysql/deps/mariadb-client-library/mariadb_client/libmariadb/libmariadb.c:2242

If my guess is right, I need to patch mariadb client library.
Please do not throw away the core dump for now.
I will work on a fix next week: as a temporary workaround, can you use different proxies for different users?

@geotro
Copy link
Author

geotro commented Jan 31, 2017

I can wait with it. Not deleting the core and it's very easy for me to produce a new one.

Which version of the mariadb client library are you using? I don't see anything interesting on line 2242 of v2.3.1

@renecannao
Copy link
Contributor

@geotro : it is using version 2.3.1 , but libmariadb.c is patched, that's is why you don't see anything on line 2242 : the line doesn't match.
My hypothesis is the follows:
ProxySQL tracks prepared statements, it needs to access (and modify) structures inside the library itself. Because of this, when the library executes CHANGE_USER, is now accessing invalid pointers, as they were modified by ProxySQL itself. In fact, stmt = 0x771c27f6eefe86f is clearly not a valid pointer.

@geotro
Copy link
Author

geotro commented Feb 1, 2017

I'm sorry but apparently this was a false alarm. I used the wrong build (debian 8) instead of the ubuntu16 one, that build worked perfectly fine until I started using a second user.

It looks like the right build works perfectly so far so this can be closed for now.

@geotro geotro closed this as completed Feb 1, 2017
@renecannao renecannao changed the title Crash when using more than one user [INVESTIGATE] Crash when using more than one user and wrong build Feb 1, 2017
@renecannao
Copy link
Contributor

@geotro : thank you for letting me know it was a false alarm.
I am anyway re-opening it and I will try to understand why using the wrong build causes this.
Thanks

@renecannao renecannao reopened this Feb 1, 2017
@renecannao renecannao removed the bug label Feb 1, 2017
@renecannao renecannao removed this from the v1.3.4 milestone Feb 1, 2017
@geotro
Copy link
Author

geotro commented Feb 1, 2017

Damn. It crashed again but this time it took a lot longer. I closed it too soon.

@renecannao
Copy link
Contributor

Ok, reserving the subject and the tags.
Thank you

@renecannao renecannao added the bug label Feb 1, 2017
@renecannao renecannao added this to the v1.3.4 milestone Feb 1, 2017
@renecannao renecannao changed the title [INVESTIGATE] Crash when using more than one user and wrong build Crash when using more than one user Feb 1, 2017
@renecannao
Copy link
Contributor

Cross reference to #752

renecannao added a commit that referenced this issue Feb 12, 2017
COM_CHANGE_USER could try to reset PS already invalidated from the PS manager.
renecannao added a commit that referenced this issue Feb 12, 2017
COM_CHANGE_USER could try to reset PS already invalidated from the PS manager.
@renecannao
Copy link
Contributor

@geotro : this bug should be fixed in 1.3.4 : can you please try branch v1.3.4-dev before I release ?
Thanks

@krupal-raiyani
Copy link

krupal-raiyani commented Feb 16, 2017 via email

@geotro
Copy link
Author

geotro commented Feb 16, 2017

It's working fine now. I'll keep running this version until the next release. Good job!

@geotro geotro closed this as completed Feb 16, 2017
@renecannao
Copy link
Contributor

Thanks!

@geotro
Copy link
Author

geotro commented Feb 21, 2017

I've been running the new version, 1.3.4 in production since yesterday, it was working fine until minutes ago I had the crash below. I'm reopening the ticket as it may be related to the previous issue.

I have the core in case you need anything else.

Interestingly, the patched version worked for many days without a crash.

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `proxysql -c /etc/proxysql.cnf -D /var/lib/proxysql'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000000005cb035 in strcpy (__src=<optimized out>, __dest=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/string3.h:110
110       return __builtin___strcpy_chk (__dest, __src, __bos (__dest));
[Current thread is 1 (Thread 0x7ff60cff8700 (LWP 2900))]
(gdb) bt
#0  0x00000000005cb035 in strcpy (__src=<optimized out>, __dest=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/string3.h:110
#1  madb_reset_stmt (stmt=0x7ff5e977ad80, flags=flags@entry=27)
    at /opt/proxysql/deps/mariadb-client-library/mariadb_client/libmariadb/my_stmt.c:1629
#2  0x00000000005cd24a in mysql_stmt_free_result (stmt=<optimized out>)
    at /opt/proxysql/deps/mariadb-client-library/mariadb_client/libmariadb/my_stmt.c:1121
#3  0x000000000051b719 in MySQL_Connection::async_free_result (this=this@entry=0x7ff5d2586c40) at mysql_connection.cpp:1390
#4  0x000000000051bc48 in MySQL_Connection::~MySQL_Connection (this=0x7ff5d2586c40, __in_chrg=<optimized out>) at mysql_connection.cpp:198
#5  0x000000000044f449 in MySQL_HostGroups_Manager::destroy_MyConn_from_pool (this=0x7ff610865000, c=0x7ff5d2586c40)
    at MySQL_HostGroups_Manager.cpp:907
#6  0x0000000000465af0 in MySQL_Data_Stream::destroy_MySQL_Connection_From_Pool (sq=true, this=0x7ff5dd085c00)
    at ../include/MySQL_Data_Stream.h:204
#7  MySQL_Session::handler_again___status_CHANGING_USER_SERVER (this=this@entry=0x7ff5dd16d940, _rc=_rc@entry=0x7ff60cff7980)
    at MySQL_Session.cpp:1244
#8  0x0000000000468567 in MySQL_Session::handler (this=this@entry=0x7ff5dd16d940) at MySQL_Session.cpp:2406
#9  0x0000000000456af0 in MySQL_Thread::process_all_sessions (this=this@entry=0x7ff60b00d000) at MySQL_Thread.cpp:2715
#10 0x000000000045ed56 in MySQL_Thread::run (this=this@entry=0x7ff60b00d000) at MySQL_Thread.cpp:2555
#11 0x0000000000439224 in mysql_worker_thread_func (arg=0x7ff6108b5eb0) at main.cpp:168
#12 0x00007ff6125056ba in start_thread (arg=0x7ff60cff8700) at pthread_create.c:333
#13 0x00007ff610ee182d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

@renecannao
Copy link
Contributor

Hi @geotro : thank you for the report. I am not sure if it is related to the previous issue or not.
Is it possible to have a core dump?
Thanks

@geotro
Copy link
Author

geotro commented Feb 21, 2017

It's 1.8GB in size and I would prefer not to share it as it contains sensitive info, however I can run any commands you may need to explore it.

@renecannao
Copy link
Contributor

It may have a lot of empty buffers, and easily compressible.
But I understand your concern about sensitive info.
Can you run bt full in gdb ?

@geotro
Copy link
Author

geotro commented Feb 21, 2017

I'm running the debug version.

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `proxysql -c /etc/proxysql.cnf -D /var/lib/proxysql'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000000005cb035 in strcpy (__src=<optimized out>, __dest=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/string3.h:110
110       return __builtin___strcpy_chk (__dest, __src, __bos (__dest));
[Current thread is 1 (Thread 0x7ff60cff8700 (LWP 2900))]
(gdb) bt full
#0  0x00000000005cb035 in strcpy (__src=<optimized out>, __dest=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/string3.h:110
No locals.
#1  madb_reset_stmt (stmt=0x7ff5e977ad80, flags=flags@entry=27)
    at /opt/proxysql/deps/mariadb-client-library/mariadb_client/libmariadb/my_stmt.c:1629
        mysql = 0xa003f0c6469
        ret = 0 '\000'
#2  0x00000000005cd24a in mysql_stmt_free_result (stmt=<optimized out>)
    at /opt/proxysql/deps/mariadb-client-library/mariadb_client/libmariadb/my_stmt.c:1121
No locals.
#3  0x000000000051b719 in MySQL_Connection::async_free_result (this=this@entry=0x7ff5d2586c40) at mysql_connection.cpp:1390
        __PRETTY_FUNCTION__ = "void MySQL_Connection::async_free_result()"
#4  0x000000000051bc48 in MySQL_Connection::~MySQL_Connection (this=0x7ff5d2586c40, __in_chrg=<optimized out>) at mysql_connection.cpp:198
No locals.
#5  0x000000000044f449 in MySQL_HostGroups_Manager::destroy_MyConn_from_pool (this=0x7ff610865000, c=0x7ff5d2586c40)
    at MySQL_HostGroups_Manager.cpp:907
        mysrvc = <optimized out>
#6  0x0000000000465af0 in MySQL_Data_Stream::destroy_MySQL_Connection_From_Pool (sq=true, this=0x7ff5dd085c00)
    at ../include/MySQL_Data_Stream.h:204
        mc = 0x7ff5d2586c40
#7  MySQL_Session::handler_again___status_CHANGING_USER_SERVER (this=this@entry=0x7ff5dd16d940, _rc=_rc@entry=0x7ff60cff7980)
    at MySQL_Session.cpp:1244
        sqlstate = "#40001\000-21"
        myerr = <optimized out>
        __PRETTY_FUNCTION__ = "bool MySQL_Session::handler_again___status_CHANGING_USER_SERVER(int*)"
        myds = 0x7ff5dd085c00
        myconn = <optimized out>
        st = <optimized out>
        rc = <optimized out>
        __func__ = "handler_again___status_CHANGING_USER_SERVER"
#8  0x0000000000468567 in MySQL_Session::handler (this=this@entry=0x7ff5dd16d940) at MySQL_Session.cpp:2406
        rc = 0
        wrong_pass = false
---Type <return> to continue, or q <return> to quit--- 
        pkt = {ptr = 0x7ff5cd9ca040, size = 293}
        c = <optimized out>
        __PRETTY_FUNCTION__ = "int MySQL_Session::handler()"
        __func__ = "handler"
#9  0x0000000000456af0 in MySQL_Thread::process_all_sessions (this=this@entry=0x7ff60b00d000) at MySQL_Thread.cpp:2715
        n = 73
        total_active_transactions_ = 0
        idle_maintenance_thread = false
        rc = <optimized out>
        total_active_transactions_tmp = <optimized out>
#10 0x000000000045ed56 in MySQL_Thread::run (this=this@entry=0x7ff60b00d000) at MySQL_Thread.cpp:2555
        num_idles = <optimized out>
        maintenance_interval = <optimized out>
        n = <optimized out>
        rc = <optimized out>
        idle_maintenance_thread = <optimized out>
        __PRETTY_FUNCTION__ = "void MySQL_Thread::run()"
        __func__ = "run"
#11 0x0000000000439224 in mysql_worker_thread_func (arg=0x7ff6108b5eb0) at main.cpp:168
        mysql_thread = 0x7ff6108b5eb0
        worker = 0x7ff60b00d000
#12 0x00007ff6125056ba in start_thread (arg=0x7ff60cff8700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7ff60cff8700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140694756755200, -8895907623371664948, 0, 140731597951359, 140694756755904, 0, 
                8892211061113885132, 8892154632378520012}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, 
              cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#13 0x00007ff610ee182d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
No locals.

@renecannao
Copy link
Contributor

Thank you.
This doesn't look a valid pointer.

mysql = 0xa003f0c6469

I will investigate it and I will probably ask you more info, but probably not today.

@renecannao
Copy link
Contributor

@geotro : can you please check if there was any error in the log file just before the crash?
Should be something like:

Error during change user: %d, %s

@geotro
Copy link
Author

geotro commented Feb 22, 2017

Yes:

Failed to mysql_change_user()2017-02-21 09:36:01 MySQL_Session.cpp:1237:handler_again___status_CHANGING_USER_SERVER(): [WARNING] Error during change user: 1213, Deadlock found when trying to get lock; try restarting transaction
2017-02-21 09:36:01 MySQL_Session.cpp:2269:handler(): [WARNING] Error during query on (0,10.0.0.16,3306): 1180, Got error 35 "Resource deadlock avoided" during COMMIT
Failed to mysql_change_user()2017-02-21 09:36:01 MySQL_Session.cpp:1237:handler_again___status_CHANGING_USER_SERVER(): [WARNING] Error during change user: 1213, Deadlock found when trying to get lock; try restarting transaction
Error: signal 11:
proxysql(_Z13crash_handleri+0x2d)[0x43ac1d]
/lib/x86_64-linux-gnu/libc.so.6(+0x354b0)[0x7ff610e104b0]
proxysql[0x5cb035]
proxysql(_ZN16MySQL_Connection17async_free_resultEv+0x59)[0x51b719]
proxysql(_ZN16MySQL_ConnectionD2Ev+0x78)[0x51bc48]
proxysql(_ZN24MySQL_HostGroups_Manager24destroy_MyConn_from_poolEP16MySQL_Connection+0xa9)[0x44f449]
proxysql(_ZN13MySQL_Session43handler_again___status_CHANGING_USER_SERVEREPi+0x3d0)[0x465af0]
proxysql(_ZN13MySQL_Session7handlerEv+0xe47)[0x468567]
proxysql(_ZN12MySQL_Thread20process_all_sessionsEv+0x1d0)[0x456af0]
proxysql(_ZN12MySQL_Thread3runEv+0x1596)[0x45ed56]
proxysql(_Z24mysql_worker_thread_funcPv+0x54)[0x439224]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7ff6125056ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7ff610ee182d]
2017-02-21 09:37:02 main.cpp:642:ProxySQL_daemonize_phase3(): [ERROR] ProxySQL crashed. Restarting!

@renecannao
Copy link
Contributor

Thank you @geotro !
Can you please check if there are other error entries with MySQL_Session.cpp:1237:handler_again___status_CHANGING_USER_SERVER, what error code is returned (1213 in this example) if there is a crash after each of them?
Thank you.

@geotro
Copy link
Author

geotro commented Feb 22, 2017

Interestingly it happened once after the crash, but didn't cause a crash:

2017-02-21 09:37:09 MySQL_Session.cpp:2269:handler(): [WARNING] Error during query on (0,10.0.0.16,3306): 1180, Got error 35 "Resource deadlock avoided" during COMMIT
Failed to mysql_change_user()2017-02-21 09:37:09 MySQL_Session.cpp:1237:handler_again___status_CHANGING_USER_SERVER(): [WARNING] Error during change user: 1213, Deadlock found when trying to get lock; try restarting transaction
2017-02-21 09:37:13 [INFO] Received LOAD MYSQL SERVERS TO RUNTIME command

renecannao added a commit that referenced this issue Feb 22, 2017
minichate pushed a commit to minichate/proxysql that referenced this issue Mar 6, 2017
COM_CHANGE_USER could try to reset PS already invalidated from the PS manager.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants