Summary: | ASTERISK-13705: [patch] sip channel freezed in ChanSpy() app | ||
Reporter: | caspy (caspy) | Labels: | |
Date Opened: | 2009-03-06 14:35:15.000-0600 | Date Closed: | 2010-07-27 13:25:21 |
Priority: | Minor | Regression? | No |
Status: | Closed/Complete | Components: | Applications/app_chanspy |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) bt_chanspy_lock_status_20091104.txt ( 1) channel_freezed_1.6.0.diff ( 2) diagnose_for_14618_1.txt ( 3) diagnose_for_14618_2.txt | |
Description: | i have a channel that freezed in a strange state. which i can't kill. Scenario: SIP/1234 dialed number (897795678) that do "{Answer(); ChanSpy(SIP/5678,q); }", and after some hours i see this: SIP/1234 - is sjphone. it's alredy free, do nothing! call ended on client. it's already even unreachable. but channel still exist: *CLI> core show channels Channel Location State Application(Data) SIP/1234-b587fc50 897795678@fromoffice Up ChanSpy(SIP/5678,q) 1 active channel 1 active call *CLI> core show channel SIP/1234-b587fc50 -- General -- Name: SIP/1234-b587fc50 Type: SIP UniqueID: 1236337972.459555 Caller ID: 1234 Caller ID Name: User Name DNID Digits: 897795678 Language: ru State: Up (6) Rings: 0 NativeFormats: 0x8 (alaw) WriteFormat: 0x40 (slin) ReadFormat: 0x8 (alaw) WriteTranscode: Yes ReadTranscode: No 1st File Descriptor: 106 Frames in: 123606 Frames out: 79133 Time to Hangup: 0 Elapsed Time: 9h5m25s Direct Bridge: <none> Indirect Bridge: <none> -- PBX -- Context: fromoffice Extension: 897795678 Priority: 2 Call Group: 32768 Pickup Group: 32768 Application: ChanSpy Data: SIP/5678,q Blocking in: (Not Blocking) Variables: RTPAUDIOQOS=ssrc=1088103444;themssrc=265647381;lp=0;rxjitter=0.023820;rxcount=123606;txjitter=0.000000;txcount=79133;rlp=0;rtt=0.000000 SIPCALLID=1AB484D1-80BF-4F1E-97DD-F3B9FC49AA27@10.x.x.x SIPDOMAIN=sipproxy.int.domain.tld SIPURI=sip:1234@10.x.x.x:1000 CDR Variables: level 1: clid="User Name" <1234> level 1: src=1234 level 1: dst=897795678 level 1: dcontext=fromoffice level 1: channel=SIP/1234-b587fc50 level 1: lastapp=ChanSpy level 1: lastdata=SIP/5678,q level 1: start=2009-03-06 14:12:52 level 1: answer=2009-03-06 14:12:52 level 1: duration=32724 level 1: billsec=32724 level 1: disposition=ANSWERED level 1: amaflags=DOCUMENTATION level 1: uniqueid=1236337972.459555 *CLI> sip show channel 1AB484D1-80BF-4F1E-97DD-F3B9FC49AA27 * SIP Call Curr. trans. direction: Incoming Call-ID: 1AB484D1-80BF-4F1E-97DD-F3B9FC49AA27@10.x.x.x Owner channel ID: SIP/1234-b587fc50 Our Codec Capability: 14 Non-Codec Capability (DTMF): 1 Their Codec Capability: 1038 Joint Codec Capability: 14 Format: 0x8 (alaw) T.38 support No Video support No MaxCallBR: 384 kbps Theoretical Address: 10.x.x.x:1000 Received Address: 10.x.x.x:1000 SIP Transfer mode: open NAT Support: Always Audio IP: 10.y.y.y (local) Our Tag: as6bde6d25 Their Tag: 10251567111166 SIP User agent: SJphone/1.60.289a (SJ Labs) Username: 1234 Peername: 1234 Original uri: sip:1234@10.x.x.x:1000 Caller-ID: 1234 Need Destroy: No Last Message: Rx: BYE Promiscuous Redir: No Route: sip:1234@10.x.x.x:1000 DTMF Mode: rfc2833 SIP Options: (none) Session-Timer: Inactive *CLI> soft hangup SIP/1234-b587fc50 Requested Hangup on channel 'SIP/1234-b587fc50' 'soft hangup' DO NOTHING. channel is still existing. i did not yet restart my system, so if i can do anything more for diagnostic - please tell. this is rare situation, so, if i can look smth else - i should do it till nearest reload ;) | ||
Comments: | By: Joshua C. Colp (jcolp) 2009-03-10 09:29:22 Please recompile with DONT_OPTIMIZE and DEBUG_THREADS enabled in menuselect under Compiler Flags. Once this happens please upload the output of "core show locks". I would also suggest using 1.6.0 from subversion. By: caspy (caspy) 2009-03-10 09:48:55 ok, i'll do it tonight. please, leave this bug open in feedback state for a couple of weeks, cause i don't know when it will freeze again. thnx. By: caspy (caspy) 2009-03-19 13:44:22 it tooks too long to wait until it freezes again. interesting observation (for many years): asterisk is more stable, as more debug/verbosity compiled in or turned on. really ;) ok, what about case: i compiled asterisk with DONT_OPTIMIZE and DEBUG_THREADS. 'core show locks' show nothing. before and after 'soft hangup'. even more, i did not seen any locks for two weeks at all. in 'diagnose_for_14618_1.txt' you can find all commands i have executed last time, plus commands for lock. ps: as usual, i'll wait for any more debug/diagnose cases without forcing restart of asterisk. By: caspy (caspy) 2009-03-20 00:50:30 ?! just now i catch this freeze myself. still no locks, all just the same. but, some new observations: 1) the issue is not in a type of spying channel. i issued ChanSpy from DAHDI: *CLI> core show channels Channel Location State Application(Data) DAHDI/4-1 897793333@fromoffice Up ChanSpy(SIP/3333,q) SIP/1111-0839cf48 897792222@fromoffice Up ChanSpy(SIP/2222,q) spy from dahdi/4 - was mine spy from sip/1111 - described yesterday. 2) while spying i do nothing. i simply dialed my office number from my mobile, then dialed spy add-on. successfully attached, and put phone on the table. after the current call was finished - silence come. and when a new bridge occured at spyed channel, i still hear a silence. 3) in log i have: [Mar 20 08:17:23] NOTICE[31608] app_chanspy.c: Attaching DAHDI/4-1 to SIP/3333-b6a3de08 [Mar 20 08:19:46] NOTICE[31608] app_chanspy.c: Attaching DAHDI/4-1 to SIP/3333-b6ae6030 [Mar 20 08:20:07] NOTICE[31608] app_chanspy.c: Attaching DAHDI/4-1 to SIP/3333-b6132e90 [Mar 20 08:20:21] NOTICE[31608] app_chanspy.c: Attaching DAHDI/4-1 to SIP/3333-08353fe8 [Mar 20 08:20:36] NOTICE[31608] app_chanspy.c: Attaching DAHDI/4-1 to SIP/3333-084c1710 the first one line - it was a bridged call i was successfully listen for 2min. the last one - it was a new bridged call, which i did not hear! thouse in a middle - was an unbridged tries to reach SIP/3333 from queue. (SIP/3333 answering a calls from queue). 4) after i drop a line on mobile, messages in log stopped. but in asterisk's env nothing changed. still 'active' channel. i'll upload some diagnostic in a few minutes. ps: with freezing of DAHDI, the case become 'major' or even 'block'. cause we need to restart asterisk to unfreeze DAHDI's channel. ;( By: caspy (caspy) 2009-03-20 11:20:20 more interesting things: *CLI> core show channels Channel Location State Application(Data) SIP/1111-08402ad8 897792222@fromoffice Up ChanSpy(SIP/2222,q) but SIP/1111 gone home, his computer turned off. (i do not know did he finished a call normally. but as you can see next - nothing in history) *CLI> sip show peer 1111 * Name : 1111 [...] Status : UNKNOWN Useragent : SJphone/1.60.289a (SJ Labs) [...] *CLI> sip show history 7EF7B2D9-87BB-4 * SIP Call 1. Rx INVITE / 1 INVITE / sip:897792222@sipproxy.int.domain.tld 2. AuthChal Auth challenge sent for - nc 0 3. TxRespRel SIP/2.0 / 1 INVITE - 401 Unauthorized 4. SchedDestroy 32000 ms 5. Rx ACK / 1 ACK / sip:897792222@sipproxy.int.domain.tld 6. Rx INVITE / 2 INVITE / sip:897792222@sipproxy.int.domain.tld 7. CancelDestroy 8. Invite New call: 7EF7B2D9-87BB-4F4E-9D12-776F587C3B00@10.x.x.x 9. AuthOK Auth challenge succesful for 1111 10. NewChan Channel SIP/1111-08402ad8 - from 7EF7B2D9-87BB-4F4E-9D12-776F58 11. TxResp SIP/2.0 / 2 INVITE - 100 Trying 12. TxRespRel SIP/2.0 / 2 INVITE - 200 OK 13. Rx ACK / 2 ACK / sip:897792222@10.y.y.y By: caspy (caspy) 2009-03-20 12:46:03 add-on for previous comment: asterisk is still trying to send packets (seems to be RTP) to spying host. # tcpdump -nv host 10.x.x.x 20:41:51.094313 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: UDP (17), length: 72) 10.y.y.y.20631 > 10.x.x.x.49195: UDP, length 44 20:41:56.096522 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: UDP (17), length: 72) 10.y.y.y.20631 > 10.x.x.x.49195: UDP, length 44 20:42:01.094799 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: UDP (17), length: 72) 10.y.y.y.20631 > 10.x.x.x.49195: UDP, length 44 20:42:06.097068 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: UDP (17), length: 72) 10.y.y.y.20631 > 10.x.x.x.49195: UDP, length 44 By: caspy (caspy) 2009-03-24 14:36:58 can i provide any more info, or may be forward bug to any next state from 'feedback'? :) By: caspy (caspy) 2009-04-01 14:13:06 the way to try to reproduce: call from any sip phone to ChanSpy() apps, listen for a while, and then disconnect this phone from network. spying channel will not be killed anymore. but in _exactly_ such scenario it is possible to do a 'soft hangup'. how to lock in a phase from which it can not be 'soft hanguped' - i do not know yet. i'm waiting for such lock to look for sip history. By: caspy (caspy) 2009-05-05 04:00:04 interesting observation: after crash of asterisk due to another bug with three locked ChanSpy()'s channels (like above), i can not find anything locked/waiting/etc in gdb's output. seems like all spying threads gone away, leaving channels marked as inuse. By: Matthew Nicholson (mnicholson) 2009-08-13 17:03:08 Do you have rtptimeout set in sip.conf? By: caspy (caspy) 2009-08-14 01:39:11 no. should i try? did it affect peoples sitting and waiting in queues or smth like? By: Matthew Nicholson (mnicholson) 2009-08-14 12:07:25 Yes try turning on rtptimeout, it should not affect people waiting in queues. It should properly disconnect a call if a phone is unplugged from the network. Because there are no locks held, I suspect your problem is network related and not actually a problem with asterisk. Please test with rtptimeout set and see if it fixes your issue. By: Leif Madsen (lmadsen) 2009-08-20 15:00:17 caspy: any word on whether the suggestions Mark gave have solved your issue? By: caspy (caspy) 2009-08-21 03:44:35 lmadsen, i put 'rtptimeout=60' in sip.conf ~ 5 days ago. Since that time i did not seen any freezed ChanSpy(). But i whould like to look for it for about 1 or 2 weeks more. I'll write result here. By: caspy (caspy) 2009-08-21 03:49:22 one more interesting fact: as i said early in comment 0102005, - i saw such freeze on a DAHDI channel. so, rtptimeout can be not a solution. By: Tilghman Lesher (tilghman) 2009-09-03 12:17:59 Given the lack of feedback, I'm assuming that the use of rtptimeout fixed this issue. By: caspy (caspy) 2009-10-08 04:55:47 rtptimeout did not help. CLI>sip show settings .. RTP Timeout: 60 .. CLI> core show channels Channel Location State Application(Data) .. SIP/1111-b430d648 xxxxxxxxx@fromoffice Up ChanSpy(SIP/1111,q) CLI> sip show channels Peer User/ANR Call ID Format Hold Last Message .. 10.100.xxx.xxx 1111 Njc1MzNiOTM2MTA 0x4 (ulaw) No Rx: BYE No locks are shown in 'core show locks'. By: Jeff Peeler (jpeeler) 2009-10-23 17:41:01 This is going to be really hard to track down without some information of how to reproduce. If you can work out steps to make the issue occur, please outline those here. Otherwise, I wonder if some live debugging might be an option once your system gets in the problem state. By: caspy (caspy) 2009-10-28 11:13:12 == Applications/app_chanspy => Channels/chan_sip/General == it seems not to be a sip problem, cause as i said early - any channel (DAHDI in my case) can be locked this way. what about way to reproduce - i dont know how. only wait for. it seems to be a combination of some factors. and, as soon as i return from holidays and lock happens - i'll write here and we can do any debugging. By: caspy (caspy) 2009-11-02 01:52:38.000-0600 My Asterisk is standing now with 16 locked in such way channels. We can online-debug anything now. By: Jeff Peeler (jpeeler) 2009-11-02 16:00:48.000-0600 I hope that install is running an unoptimized build. Attach gdb to it and post the output of "thread apply all bt". By: caspy (caspy) 2009-11-04 10:23:25.000-0600 bt of running asterisk with locked chanspy() attached. that was the only channel active in system: *CLI> core show channels Channel Location State Application(Data) SIP/1111-09c0af88 000001122@fromoffice Up ChanSpy(SIP/1122,q) 1 active channel 1 active call 17818 calls processed By: David Vossel (dvossel) 2009-11-09 13:52:15.000-0600 I'm able to reproduce this. I should have a patch for testing shortly. *edit* I take that back, setting rtptimeout made the channel go away for me. By: David Vossel (dvossel) 2009-11-09 15:04:40.000-0600 Can you provide more information on exactly what version of Asterisk you are using? Please provide the branch as well as the svn revision number. By: caspy (caspy) 2009-11-09 16:20:20.000-0600 1.6.0.9 By: Leif Madsen (lmadsen) 2009-11-09 18:42:09.000-0600 Since we're up to 1.6.0.18-rc2 now, perhaps you should give it a try again with a newer version in your test environment? By: caspy (caspy) 2009-11-10 03:45:53.000-0600 dvossel said, he was able to reproduce it. should i still need upgrade? By: David Vossel (dvossel) 2009-11-10 09:28:21.000-0600 Yes, please upgrade. I thought I had reproduced it, but setting rtptimeout fixed the issue for me, so it wasn't the same thing. By: David Vossel (dvossel) 2009-11-20 11:28:05.000-0600 I uploaded a patch that may resolve this issue. Will you be able to test it? By: Digium Subversion (svnbot) 2009-11-20 11:33:02.000-0600 Repository: asterisk Revision: 230583 U trunk/include/asterisk/audiohook.h U trunk/main/audiohook.c ------------------------------------------------------------------------ r230583 | dvossel | 2009-11-20 11:33:01 -0600 (Fri, 20 Nov 2009) | 6 lines audiohook signal trigger on every status change (issue ASTERISK-13705) Review: https://reviewboard.asterisk.org/r/434/ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=230583 By: Digium Subversion (svnbot) 2009-11-20 11:37:50.000-0600 Repository: asterisk Revision: 230585 _U branches/1.6.2/ U branches/1.6.2/include/asterisk/audiohook.h U branches/1.6.2/main/audiohook.c ------------------------------------------------------------------------ r230585 | dvossel | 2009-11-20 11:37:49 -0600 (Fri, 20 Nov 2009) | 13 lines Merged revisions 230583 via svnmerge from https://origsvn.digium.com/svn/asterisk/trunk ........ r230583 | dvossel | 2009-11-20 11:26:20 -0600 (Fri, 20 Nov 2009) | 6 lines audiohook signal trigger on every status change (issue ASTERISK-13705) Review: https://reviewboard.asterisk.org/r/434/ ........ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=230585 By: Digium Subversion (svnbot) 2009-11-20 11:38:57.000-0600 Repository: asterisk Revision: 230586 _U branches/1.6.1/ U branches/1.6.1/include/asterisk/audiohook.h U branches/1.6.1/main/audiohook.c ------------------------------------------------------------------------ r230586 | dvossel | 2009-11-20 11:38:57 -0600 (Fri, 20 Nov 2009) | 13 lines Merged revisions 230583 via svnmerge from https://origsvn.digium.com/svn/asterisk/trunk ........ r230583 | dvossel | 2009-11-20 11:26:20 -0600 (Fri, 20 Nov 2009) | 6 lines audiohook signal trigger on every status change (issue ASTERISK-13705) Review: https://reviewboard.asterisk.org/r/434/ ........ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=230586 By: Digium Subversion (svnbot) 2009-11-20 11:43:43.000-0600 Repository: asterisk Revision: 230587 _U branches/1.6.0/ U branches/1.6.0/include/asterisk/audiohook.h U branches/1.6.0/main/audiohook.c ------------------------------------------------------------------------ r230587 | dvossel | 2009-11-20 11:43:43 -0600 (Fri, 20 Nov 2009) | 13 lines Merged revisions 230583 via svnmerge from https://origsvn.digium.com/svn/asterisk/trunk ........ r230583 | dvossel | 2009-11-20 11:26:20 -0600 (Fri, 20 Nov 2009) | 6 lines audiohook signal trigger on every status change (issue ASTERISK-13705) Review: https://reviewboard.asterisk.org/r/434/ ........ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=230587 By: caspy (caspy) 2009-11-23 02:19:18.000-0600 David, thanks. I'm going to upgrade to 1.6.0.18+this_patch tonight. I'll report shortly. By: caspy (caspy) 2009-11-30 06:21:08.000-0600 with this patch (applied on 1.6.0.18) all seems to be ok. thanks! By: caspy (caspy) 2010-05-26 10:13:01 after upgrade to 1.6.2.7 i see that this issue is still actual. chanspy channels are still locking in the same state. By: David Vossel (dvossel) 2010-05-26 11:06:36 are you using MixMonitor as well? By: pablo umanzor (pablou) 2010-05-26 15:01:53 i've the same situation with asterisk 1.4.28, although the call has been finished before , the channel chanspy still goes on, as a matter a fact, some times asterisk make a segfault but i don't know if asterisk crash is for this reason SIP/1292-000007b2 context2 *81231 4 Up ChanSpy SIP/1231|q 1292 17:12:45 salida (None) By: caspy (caspy) 2010-05-27 01:18:27 dvossel, i use Monitor() call arrives on DAHDI (which is recorded by Monitor()), and then terminated to SIP channel, which is spyed by ChanSpy() By: caspy (caspy) 2010-07-06 03:20:36 can i provide anything else, to help to solve this issue? By: Leif Madsen (lmadsen) 2010-07-27 13:25:20 I'm closing this issue as fixed as the original report this issue was filed for has been fixed for quite some time. The additional issues brought up here should be filed under a new issue with recent debugging information. Thanks! |