Summary: ASTERISK-02375: Asterisk Deadlock
Reporter: Dan Mahoney (gushi)
Labels:
Date Opened: 2004-09-09 14:04:32
Date Closed: 2011-06-07 14:00:57
Priority: Critical
Regression?: No
Status: Closed/Complete
Components: Core/General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
Description: I am getting an Asterisk deadlock.  This happens with SNOM 200 phones after only a couple of simultaneous calls.  Contact me for login information if you want to take a look.  Thread 4 below looks interesting.

****** ADDITIONAL INFORMATION ******

(gdb) bt
#0  0x00922c32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x00a0c6f7 in poll () from /lib/tls/libc.so.6
#2  0x080aae08 in ast_el_read_char (el=0x9e77938, cp=0xbfeb06bb "\0048yç\t\004") at asterisk.c:871
#3  0x080bf747 in el_getc (el=0x9e77938, cp=0xbfeb06bb "\0048yç\t\004") at read.c:347
#4  0x080bf5b6 in read_getcmd (el=0x9e77938, cmdnum=0xfffffffc <Address 0xfffffffc out of bounds>, ch=0xbfeb06bb "\0048yç\t\004")
   at read.c:243
#5  0x080bf876 in el_gets (el=0x9e77938, nread=0xbfeb0708) at read.c:443
#6  0x080a9fac in ast_remotecontrol (data=0x0) at asterisk.c:1409
#7  0x080a6cd6 in main (argc=0, argv=0x0) at asterisk.c:1696
(gdb) bt
#0  0x00922c32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x00a0c6f7 in poll () from /lib/tls/libc.so.6
#2  0x080aae08 in ast_el_read_char (el=0x9e77938, cp=0xbfeb06bb "\0048yç\t\004") at asterisk.c:871
#3  0x080bf747 in el_getc (el=0x9e77938, cp=0xbfeb06bb "\0048yç\t\004") at read.c:347
#4  0x080bf5b6 in read_getcmd (el=0x9e77938, cmdnum=0xfffffffc <Address 0xfffffffc out of bounds>, ch=0xbfeb06bb "\0048yç\t\004")
   at read.c:243
#5  0x080bf876 in el_gets (el=0x9e77938, nread=0xbfeb0708) at read.c:443
#6  0x080a9fac in ast_remotecontrol (data=0x0) at asterisk.c:1409
#7  0x080a6cd6 in main (argc=0, argv=0x0) at asterisk.c:1696
(gdb) info thread
 1 Thread -1084779808 (LWP 5156)  0x00922c32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) thread apply all bt

Thread 1 (Thread -1084779808 (LWP 5156)):
#0  0x00922c32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x00a0c6f7 in poll () from /lib/tls/libc.so.6
#2  0x080aae08 in ast_el_read_char (el=0x9e77938, cp=0xbfeb06bb "\0048yç\t\004") at asterisk.c:871
#3  0x080bf747 in el_getc (el=0x9e77938, cp=0xbfeb06bb "\0048yç\t\004") at read.c:347
#4  0x080bf5b6 in read_getcmd (el=0x9e77938, cmdnum=0xfffffffc <Address 0xfffffffc out of bounds>, ch=0xbfeb06bb "\0048yç\t\004")
   at read.c:243
#5  0x080bf876 in el_gets (el=0x9e77938, nread=0xbfeb0708) at read.c:443
#6  0x080a9fac in ast_remotecontrol (data=0x0) at asterisk.c:1409
#7  0x080a6cd6 in main (argc=0, argv=0x0) at asterisk.c:1696
#0  0x00922c32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
Comments:

By: Olle Johansson (oej) 2004-09-09 14:26:59

Always report according to the instructions.

* Platform?
* Which version of Asterisk?
Also include information if you have CAPI or ZAP cards installed.

Thank you.

By: Dan Mahoney (gushi) 2004-09-09 14:31:32

Platform: AMD Athlon 1800 Running Fedora Core 1
Connected to Asterisk CVS-HEAD-08/03/04-23:43:07 currently running on newast (pid = 13445)

No cards installed at all, this is purely a SIP gateway, which consolidates endpoints and passes them on to a "main" asterisk machine whose job it is to talk to the outside world.

By: Olle Johansson (oej) 2004-09-09 14:33:37

Can you please update to latest CVS head and confirm that you still have the bug in the current code?

The 08/03/04 version is over a month old. The code has been changed quite a lot since then.

By: Dan Mahoney (gushi) 2004-09-09 15:03:21

Connected to Asterisk CVS-HEAD-09/09/04-15:58:56 currently running on newast (pid = 20776)

Here goes nothing.  Let's see if this thing still crashes.  Watch this space.

By: Mark Spencer (markster) 2004-09-09 15:51:09

You need "thread apply all bt", not just "bt". Please read the bug reporting instructions.

By: Dan Mahoney (gushi) 2004-09-09 15:55:23

Yes, I pasted that in after the bt.

By: Mark Spencer (markster) 2004-09-09 16:01:06

No, the bug only contains a backtrace on thread 1, which is completely meaningless.  We need the *backtrace* for *all* threads, hence "thread apply all bt", not just "info threads".  What I believe you're referring to as "thread 4" is really just level 4 on the stack of the first thread.

Please understand that to debug a deadlock, you need someone to log in and analyze the deadlock *while it is happening*, and that trying to do it after the fact is basically useless.

You can try to find me on IRC and I'll do my best to help if you can make it happen.  Given that it happens with a low number of calls, it is likely related to some sort of unusual configuration or use of features, although that doesn't mean it's not a real bug nonetheless.
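
For anyone following this thread, here is a minimal sketch of how the all-thread backtrace requested above could be captured while the process is still hung. It assumes gdb is installed and that the PID passed in is that of the hung Asterisk server process itself. The helper names build_gdb_command and capture_all_backtraces are hypothetical and are not part of Asterisk or gdb; only the gdb options -p, -batch and -ex are real.

```python
import shutil
import subprocess
from typing import List


def build_gdb_command(pid: int) -> List[str]:
    """Build the argv for a non-interactive gdb run that dumps every thread's stack.

    Hypothetical helper for illustration only.
    """
    return [
        "gdb",
        "-p", str(pid),                # attach to the already-running process
        "-batch",                      # run the -ex commands, then detach and exit
        "-ex", "info threads",         # list the threads gdb can see
        "-ex", "thread apply all bt",  # backtrace for *all* threads, not just one
    ]


def capture_all_backtraces(pid: int) -> str:
    """Attach to `pid` while it is hung and return gdb's output as text."""
    if shutil.which("gdb") is None:
        raise RuntimeError("gdb not found on PATH")
    result = subprocess.run(
        build_gdb_command(pid),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero; the caller inspects the output
    )
    return result.stdout
```

Calling capture_all_backtraces(13445) with the PID from the "currently running on newast" line would return text suitable for pasting into a report, provided the caller has permission to attach to that process.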

By: Mark Spencer (markster) 2004-09-09 17:15:26

It was a bug in "rate_engine.so"; when that module wasn't loaded, the problem didn't occur.

In the future, please be sure to include any non-stock modules that you are using with Asterisk.  Thanks.