[Home]

Summary:ASTERISK-12648: [patch] pthread_cancel segmentation faults
Reporter:matti (matti)Labels:
Date Opened:2008-08-27 01:20:37Date Closed:2008-08-29 07:40:02
Priority:MinorRegression?No
Status:Closed/CompleteComponents:Channels/chan_h323
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) 20080828__bug13380.diff.txt
( 1) patch
Description:pthread_cancel segmentation faults when the bind address in h323.conf is not configured in Linux. A detached thread must not be cancelled.
Comments:By: Tilghman Lesher (tilghman) 2008-08-27 09:04:47

Can you show documentation that supports your theory that a detached thread cannot be cancelled?

By: matti (matti) 2008-08-27 13:10:02

http://linux.derkeiler.com/Newsgroups/comp.os.linux.development.apps/2004-04/0632.html

By: Tilghman Lesher (tilghman) 2008-08-27 14:19:19

That's a nice message asserting that it is so, but I don't know of any real documentation that detached threads are non-cancellable.  In fact, the documentation of pthread_cancel both on the manpage, as well as in the POSIX specification make no such restriction on the use of pthread_cancel.  So I'm puzzled as to where that bare assertion came from.

In fact, we cancel other detached threads in Asterisk with no problem at all.  So I think you're finding the wrong solution to the stated problem, and we need to find the real reason this is crashing.  Do you have a backtrace, perhaps?  I've noted in another place recently that there is a possible race when a thread is created and destroyed in the same context switch, although that is probably not happening here.

By: matti (matti) 2008-08-28 00:08:50

The linked messages says that the thread ID can become invalid after detachment. That is a sufficient reason. pthread_detach means that the thread ID can be reused after the exit of the thread.

bash-3.1# gdb /usr/sbin/asterisk /tmp/core.4328
GNU gdb 6.4
Copyright 2005 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i586-akira_i586nptl-linux-gnu"...
(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".

(no debugging symbols found)
Core was generated by `/usr/sbin/asterisk -f'.
Program terminated with signal 11, Segmentation fault.

warning: Can't read pathname for load map: Input/output error.
Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/libpthread.so.0...done.
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/libncurses.so.5...done.
Loaded symbols for /lib/libncurses.so.5
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /lib/libresolv.so.2...done.
Loaded symbols for /lib/libresolv.so.2
Reading symbols from /usr/lib/libssl.so.0.9.8...done.
Loaded symbols for /usr/lib/libssl.so.0.9.8
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /usr/lib/libcrypto.so.0.9.8...done.
Loaded symbols for /usr/lib/libcrypto.so.0.9.8
Reading symbols from /usr/lib/asterisk/modules/app_mp3.so...done.
Loaded symbols for /usr/lib/asterisk/modules/app_mp3.so
Reading symbols from /usr/lib/asterisk/modules/res_features.so...done.
Loaded symbols for /usr/lib/asterisk/modules/res_features.so
Reading symbols from /usr/lib/asterisk/modules/chan_sip.so...done.
Loaded symbols for /usr/lib/asterisk/modules/chan_sip.so
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from /lib/libnss_dns.so.2...done.
Loaded symbols for /lib/libnss_dns.so.2
Reading symbols from /usr/lib/asterisk/modules/res_musiconhold.so...done.
Loaded symbols for /usr/lib/asterisk/modules/res_musiconhold.so
Reading symbols from /usr/lib/asterisk/modules/app_meetme.so...done.
Loaded symbols for /usr/lib/asterisk/modules/app_meetme.so
Reading symbols from /usr/lib/asterisk/modules/res_indications.so...done.
Loaded symbols for /usr/lib/asterisk/modules/res_indications.so
Reading symbols from /usr/lib/asterisk/modules/res_snmp.so...done.
Loaded symbols for /usr/lib/asterisk/modules/res_snmp.so
Reading symbols from /usr/lib/libnetsnmpmibs.so.15...done.
Loaded symbols for /usr/lib/libnetsnmpmibs.so.15
Reading symbols from /usr/lib/libnetsnmpagent.so.15...done.
Loaded symbols for /usr/lib/libnetsnmpagent.so.15
Reading symbols from /usr/lib/libnetsnmphelpers.so.15...done.
Loaded symbols for /usr/lib/libnetsnmphelpers.so.15
Reading symbols from /usr/lib/libnetsnmp.so.15...done.
Loaded symbols for /usr/lib/libnetsnmp.so.15
Reading symbols from /usr/lib/libperl.so...done.
Loaded symbols for /usr/lib/libperl.so
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /lib/libutil.so.1...done.
Loaded symbols for /lib/libutil.so.1
Reading symbols from /usr/lib/asterisk/modules/res_esel.so...done.
Loaded symbols for /usr/lib/asterisk/modules/res_esel.so
Reading symbols from /usr/lib/asterisk/modules/res_watchdog.so...done.
Loaded symbols for /usr/lib/asterisk/modules/res_watchdog.so
Reading symbols from /usr/lib/asterisk/modules/res_agi.so...done.
Loaded symbols for /usr/lib/asterisk/modules/res_agi.so
Reading symbols from /usr/lib/asterisk/modules/res_crypto.so...done.
Loaded symbols for /usr/lib/asterisk/modules/res_crypto.so
Reading symbols from /usr/lib/asterisk/modules/res_adsi.so...done.
Loaded symbols for /usr/lib/asterisk/modules/res_adsi.so
Reading symbols from /usr/lib/asterisk/modules/res_valetparking.so...done.
Loaded symbols for /usr/lib/asterisk/modules/res_valetparking.so
Reading symbols from /usr/lib/asterisk/modules/res_monitor.so...done.
Loaded symbols for /usr/lib/asterisk/modules/res_monitor.so
Reading symbols from /usr/lib/asterisk/modules/pbx_dundi.so...done.
Loaded symbols for /usr/lib/asterisk/modules/pbx_dundi.so
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /usr/lib/asterisk/modules/pbx_functions.so...done.
Loaded symbols for /usr/lib/asterisk/modules/pbx_functions.so
Reading symbols from /usr/lib/asterisk/modules/pbx_ael.so...done.
Loaded symbols for /usr/lib/asterisk/modules/pbx_ael.so
Reading symbols from /usr/lib/asterisk/modules/pbx_spool.so...done.
Loaded symbols for /usr/lib/asterisk/modules/pbx_spool.so
Reading symbols from /usr/lib/asterisk/modules/pbx_loopback.so...done.
Loaded symbols for /usr/lib/asterisk/modules/pbx_loopback.so
Reading symbols from /usr/lib/asterisk/modules/pbx_config.so...done.
Loaded symbols for /usr/lib/asterisk/modules/pbx_config.so
Reading symbols from /usr/lib/asterisk/modules/pbx_realtime.so...done.
Loaded symbols for /usr/lib/asterisk/modules/pbx_realtime.so
Reading symbols from /usr/lib/asterisk/modules/chan_mgcp.so...done.
Loaded symbols for /usr/lib/asterisk/modules/chan_mgcp.so
Reading symbols from /usr/lib/asterisk/modules/chan_zap.so...done.
Loaded symbols for /usr/lib/asterisk/modules/chan_zap.so
Reading symbols from /usr/lib/libpri.so.1.0...done.
Loaded symbols for /usr/lib/libpri.so.1.0
Reading symbols from /usr/lib/libgsmat.so.1...done.
Loaded symbols for /usr/lib/libgsmat.so.1
Reading symbols from /usr/lib/libtonezone.so.1.0...done.
Loaded symbols for /usr/lib/libtonezone.so.1.0
Reading symbols from /usr/lib/asterisk/modules/chan_local.so...done.
Loaded symbols for /usr/lib/asterisk/modules/chan_local.so
Reading symbols from /usr/lib/asterisk/modules/chan_iax2.so...done.
Loaded symbols for /usr/lib/asterisk/modules/chan_iax2.so
Reading symbols from /usr/lib/asterisk/modules/chan_agent.so...done.
Loaded symbols for /usr/lib/asterisk/modules/chan_agent.so
Reading symbols from /usr/lib/asterisk/modules/chan_h323.so...done.
Loaded symbols for /usr/lib/asterisk/modules/chan_h323.so
Reading symbols from /usr/lib/libh323_linux_x86_r.so.1.19.0...done.
Loaded symbols for /usr/lib/libh323_linux_x86_r.so.1.19.0
Reading symbols from /usr/lib/libpt_linux_x86_r.so.1.11.0...done.
Loaded symbols for /usr/lib/libpt_linux_x86_r.so.1.11.0
Reading symbols from /usr/lib/libstdc++.so.6...done.
Loaded symbols for /usr/lib/libstdc++.so.6
Reading symbols from /usr/lib/libgcc_s.so.1...done.
Loaded symbols for /usr/lib/libgcc_s.so.1
Reading symbols from /usr/lib/libsasl2.so.2...done.
Loaded symbols for /usr/lib/libsasl2.so.2
#0  pthread_cancel (th=4294967295)
   at pthread_cancel.c:35
35      pthread_cancel.c: No such file or directory.
       in pthread_cancel.c
(gdb) bt
#0  pthread_cancel (th=4294967295) at pthread_cancel.c:35
#1  0xb6d9cc81 in unload_module () from /usr/lib/asterisk/modules/chan_h323.so
#2  0x08285f1a in ?? ()
#3  0xb7ef2dcc in _L_mutex_lock_165 () from /lib/libpthread.so.0
#4  0xb6daf79c in ?? () from /usr/lib/asterisk/modules/chan_h323.so
ASTERISK-1  0x08284718 in ?? ()
ASTERISK-2  0x08285f1a in ?? ()
ASTERISK-3  0x082841b0 in ?? ()
ASTERISK-4  0xffffffff in ?? ()
ASTERISK-5  0x0805c04e in ast_unload_resource ()
ASTERISK-6 0x08285f1a in ?? ()
ASTERISK-7 0xb79e0ae0 in ?? ()
ASTERISK-8 0x080962f1 in ast_cli ()
ASTERISK-9 0x08285f10 in ?? ()
ASTERISK-10 0xb7e2d800 in __malloc_initialize_hook () from /lib/libc.so.6
ASTERISK-11 0x00000002 in ?? ()
ASTERISK-12 0x00000003 in ?? ()
ASTERISK-13 0x00000001 in ?? ()
ASTERISK-14 0xb79e0ae0 in ?? ()
ASTERISK-15 0x0809477c in ast_cli ()
ASTERISK-16 0x00000001 in ?? ()
ASTERISK-17 0x00000000 in ?? ()



By: Tilghman Lesher (tilghman) 2008-08-28 08:14:16

Your backtrace actually provided the necessary information to figure out what's happening here.  Apparently, your compiler is rejecting the comparison of an unsigned type with a negative constant, and is thus allowing the if conditional to proceed, even though it was meant not to.  I have determined that there is a function designed for comparison of pthread_t type, which should make the comparison equal, as the value passed on the stack will transparently become equal due to (intentional) underflow.

Please let me know if this newly uploaded patch fixes your crash.

By: Tilghman Lesher (tilghman) 2008-08-28 08:16:03

BTW, the next Google search result provided a contradictory answer to your link, which agrees with the standard:

http://groups.google.com/group/comp.unix.internals/browse_thread/thread/1da5396af9e5dc9e

By: matti (matti) 2008-08-28 10:42:33

Your patch still has cancellation, killing etc. of a random thread, which is a clear error. After detachment, the thread ID must not be used.

By: Tilghman Lesher (tilghman) 2008-08-28 10:55:05

Matti:  I can find no support whatsoever in the documentation that supports your contention that we cannot cancel a detached thread.  The only restriction on a detached thread is that it cannot be joined.

The patch I proposed only prevents the code from running when the value passed in is the value AST_PTHREADT_NULL, which is what your backtrace shows is the value that the module is trying to cancel.  My analysis is that this could only happen if the "if" conditional failed to exclude the value AST_PTHREADT_NULL, which is likely only because the compiler decided that comparing a negative constant to an unsigned type was a non-starter.  So by using a function to do the comparison, we can prevent the compiler from overriding the (intentional) underflow.

By: Tilghman Lesher (tilghman) 2008-08-28 10:56:54

So in effect, I AM preventing the killing of a random thread, because I'm making the conditional function correctly, in spite of what the compiler (incorrectly) chose to interpret from the stated logic.

By: matti (matti) 2008-08-29 00:25:06

The specification of pthread_detach says that the storage for the detached thread can be reused. That means that if the thread erroneously ends, the cancellation could cancel a wrong thread. It is not the only restriction on a detached thread that it cannot be joined. The channel driver also calls pthread_kill on the thread ID after cancelling it, which means that it can kill a wrong thread. The driver also joins the thread after cancelling it, which is a grave error. NPTL defines the thread ID as a pointer, so using a thread ID after the thread has ended typically results in a segmentation fault.



By: matti (matti) 2008-08-29 02:53:16

The segmentation fault I had was because monitor_thread was used as a Boolean value instead of being compared with AST_PTHREADT_NULL. Asterisk 1.2 has such an error but trunk does not.

By: Tilghman Lesher (tilghman) 2008-08-29 07:39:45

Aha.  That makes sense.  Unfortunately, 1.2 is at EOL now, so we're not going to be able to make this fix for you.  Only security fixes are going into 1.2, not general bug fixes.  I encourage you to upgrade to the 1.4 series, as those versions are still supported at this time.