Summary: ASTERISK-17415: random deadlock
Reporter: wufan (wufan)
Labels:
Date Opened: 2011-02-16 10:38:21.000-0600
Date Closed: 2012-04-24 14:51:34
Priority: Major
Regression?: No
Status: Closed/Complete
Components: Channels/chan_sip/General
Versions: 1.6.2.15
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) 2011-03-09-bt_full.txt
( 1) 2011-03-09-core-show-locks.txt
( 2) 2011-03-09-thread_apply_all_bt.txt
( 3) asterisk_deadlock_23.02.2011.txt
( 4) bt_20110223.txt
( 5) bt_full_20110223.txt
( 6) bt_full.txt
( 7) bt.txt
( 8) core-show-locks_20110223.txt
( 9) core-show-locks.txt
(10) core-show-threads.txt
(11) netstat_20110223.txt
(12) thread_apply_all_bt
(13) thread_apply_all_bt_20110223.txt
Description: Asterisk (at least 1.6.2.13 through 1.6.2.15) deadlocks randomly, on average once per week,
but it also happened yesterday and today.
The process keeps running, but Asterisk doesn't talk to the phones anymore (no outgoing UDP traffic).
Incoming calls were accepted.

MAYBE it has something to do with transfers.
We have subscriptions and qualify enabled, and some people are connected over VPN.
Phones are SNOM 320 and 421.

Asterisk cannot be restarted with /etc/init.d/asterisk restart;
I have to use kill -9.

The people in the office say that the lamps on their phones are all blinking.

Outgoing calls are not possible (the phone tries to call but only retransmits packets).

****** ADDITIONAL INFORMATION ******


-- Accepting overlap call from '0XXXXX' to 'YYYYYY' on channel 0/2, span 1
   -- Starting simple switch on 'DAHDI/2-1'
   -- Executing [YYYYYY@custom:1] Set("DAHDI/2-1", "CALLERID(num)=00XXXXXX") in new stack
   -- Executing [YYYYYY@custom:2] Goto("DAHDI/2-1", "zentrale-in,YYYYYY,1") in new stack
   -- Goto (zentrale-in,YYYYYY,1)
   -- Executing [YYYYYY@zentrale-in:1] Set("DAHDI/2-1", "CHANNEL(language)=de") in new stack
   -- Executing [YYYYYY@zentrale-in:2] Goto("DAHDI/2-1", "in-dw-lang,zentrale,1") in new stack
   -- Goto (in-dw-lang,zentrale,1)
   -- Executing [zentrale@in-dw-lang:1] Answer("DAHDI/2-1", "") in new stack
   -- Executing [zentrale@in-dw-lang:2] Wait("DAHDI/2-1", "1") in new stack
   -- Executing [zentrale@in-dw-lang:3] Playback("DAHDI/2-1", "ansage") in new stack
   -- <DAHDI/2-1> Playing 'ansage.gsm' (language 'de')
   -- Executing [zentrale@in-dw-lang:4] Wait("DAHDI/2-1", "1") in new stack
   -- Executing [zentrale@in-dw-lang:5] Queue("DAHDI/2-1", "zentrale-de") in new stack
   -- Started music on hold, class 'default', on DAHDI/2-1
 == Using SIP RTP CoS mark 5
 == Extension Changed 71[custom] new state Ringing for Notify User 78 (queued)
 == Extension Changed 71[custom] new state Ringing for Notify User 83 (queued)
 == Extension Changed 71[custom] new state Ringing for Notify User 13 (queued)
 == Extension Changed 71[custom] new state Ringing for Notify User 10 (queued)
 == Extension Changed 71[custom] new state Ringing for Notify User 90 (queued)
 == Extension Changed 71[custom] new state Ringing for Notify User 92 (queued)
 == Extension Changed 71[custom] new state Ringing for Notify User 12 (queued)
 == Extension Changed 71[custom] new state Ringing for Notify User 81 (queued)
 == Extension Changed 71[custom] new state Ringing for Notify User 77 (queued)
 == Extension Changed 71[custom] new state Ringing for Notify User 73 (queued)
 == Extension Changed 71[custom] new state Ringing for Notify User 72 (queued)
 == Using SIP RTP CoS mark 5
   -- Called 71
   -- Nobody picked up in 20000 ms
   -- Stopped music on hold on DAHDI/2-1
 == Spawn extension (in-dw-lang, zentrale, 5) exited non-zero on 'DAHDI/2-1'
   -- Hungup 'DAHDI/2-1'


Here I am missing the "SIP/71 is ringing" line ...
Comments:
By: wufan (wufan) 2011-02-16 10:43:59.000-0600

The additional information is an example of the console output after the deadlock.

thanks

By: wufan (wufan) 2011-02-23 04:19:21.000-0600

had another deadlock ;(
any ideas?
thank you!

By: Tan Tuerel (thsgmbh) 2011-02-23 05:25:17.000-0600

Same problem here...
See attached "core show locks"...

By: wufan (wufan) 2011-02-23 05:53:38.000-0600

hey ths,
do you have the same symptoms?

Is anything "off the standard" in your setup?

In mine there is:
a VPN (with ca. 15 people behind it) ... and smaller SIP packets (because of the MTU)
BLF with snoms v8 (configured as 'Nebenstelle' (Extension) in the snom and not as BLF)
a macro for dial-in (to distinguish between the user states busy and nav)
realtime Postgres queue
qualify of the phones
fail2ban installed
fax over iaxmodem (twice) and HylaFAX
Postgres CDR
Swyx ISDN card

Maybe we can find something in common.



By: wufan (wufan) 2011-03-10 03:25:31.000-0600

and another deadlock ;/
In the console I could see a little VPN lag, and one user became unreachable at the moment he was being called.
thanks

By: David Vossel (dvossel) 2011-06-29 14:03:40.014-0500

https://reviewboard.asterisk.org/r/1255/ Fixed this.

r325673 in 1.8

By: David Vossel (dvossel) 2011-07-06 10:32:26.852-0500

It has been discovered that the proposed fix for timerfd caused a serious regression and has been reverted.

By: Matt Jordan (mjordan) 2012-04-10 14:06:58.212-0500

Per the Asterisk maintenance timeline page at http://www.asterisk.org/asterisk-versions maintenance (bug) support for the 1.4 and 1.6.x branches has ended. For continued maintenance support please move to the 1.8 branch which is a long term support (LTS) branch. For more information about branch support, please see https://wiki.asterisk.org/wiki/display/AST/Asterisk+Versions.  After testing with Asterisk 1.8, if you find this problem has not been resolved, please open a new issue against Asterisk 1.8.

This issue is a bit old - can you reproduce this still in the latest 1.8?

By: Matt Jordan (mjordan) 2012-04-24 14:51:27.836-0500

Suspended due to lack of activity. Please request a bug marshal in #asterisk-bugs on the IRC network irc.freenode.net to reopen the issue should you have the additional information requested.  Further information can be found at http://www.asterisk.org/developers/bug-guidelines
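
For anyone hitting this again, the kind of output attached to this issue (core show locks, core show threads, bt, bt full, thread apply all bt) can be captured from the hung process before resorting to kill -9. The following is only a rough sketch: the function name and output file layout are made up for illustration, it assumes gdb is installed and the asterisk binary is on the PATH, and "core show locks" is only available when Asterisk was built with DEBUG_THREADS. If the CLI itself no longer responds, the asterisk -rx calls may hang and only the gdb part will be useful.

```shell
# Hypothetical helper (name and file layout are illustrative, not part of Asterisk).
collect_deadlock_info() {
    pid="$1"      # PID of the hung asterisk process
    outdir="$2"   # directory to write the text files into
    mkdir -p "$outdir" || return 1

    # Lock and thread tables from the Asterisk CLI.
    # "core show locks" requires a build with DEBUG_THREADS.
    asterisk -rx "core show locks"   > "$outdir/core-show-locks.txt" 2>&1
    asterisk -rx "core show threads" > "$outdir/core-show-threads.txt" 2>&1

    # Attach gdb to the live process, dump backtraces, and detach again.
    gdb -p "$pid" -batch -ex "bt"                  > "$outdir/bt.txt" 2>&1
    gdb -p "$pid" -batch -ex "bt full"             > "$outdir/bt_full.txt" 2>&1
    gdb -p "$pid" -batch -ex "thread apply all bt" > "$outdir/thread_apply_all_bt.txt" 2>&1
}
```

It would be called with the PID of the running process and a target directory, for example collect_deadlock_info "$(pidof asterisk)" /tmp/asterisk-deadlock (the path is only an example), and the resulting files attached to a new issue.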