[Home]

Summary:ASTERISK-13705: [patch] sip channel freezed in ChanSpy() app
Reporter:caspy (caspy)Labels:
Date Opened:2009-03-06 14:35:15.000-0600Date Closed:2010-07-27 13:25:21
Priority:MinorRegression?No
Status:Closed/CompleteComponents:Applications/app_chanspy
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) bt_chanspy_lock_status_20091104.txt
( 1) channel_freezed_1.6.0.diff
( 2) diagnose_for_14618_1.txt
( 3) diagnose_for_14618_2.txt
Description:i have a channel that freezed in a strange state. which i can't kill.

Scenario: SIP/1234 dialed number (897795678) that do "{Answer(); ChanSpy(SIP/5678,q); }", and after some hours i see this:

SIP/1234 - is sjphone. it's alredy free, do nothing! call ended on client. it's already even unreachable. but channel still exist:

*CLI> core show channels
Channel              Location             State   Application(Data)
SIP/1234-b587fc50    897795678@fromoffice Up      ChanSpy(SIP/5678,q)
1 active channel
1 active call

*CLI> core show channel SIP/1234-b587fc50
-- General --
          Name: SIP/1234-b587fc50
          Type: SIP
      UniqueID: 1236337972.459555
     Caller ID: 1234
Caller ID Name: User Name
   DNID Digits: 897795678
      Language: ru
         State: Up (6)
         Rings: 0
 NativeFormats: 0x8 (alaw)
   WriteFormat: 0x40 (slin)
    ReadFormat: 0x8 (alaw)
WriteTranscode: Yes
 ReadTranscode: No
1st File Descriptor: 106
     Frames in: 123606
    Frames out: 79133
Time to Hangup: 0
  Elapsed Time: 9h5m25s
 Direct Bridge: <none>
Indirect Bridge: <none>
--   PBX   --
       Context: fromoffice
     Extension: 897795678
      Priority: 2
    Call Group: 32768
  Pickup Group: 32768
   Application: ChanSpy
          Data: SIP/5678,q
   Blocking in: (Not Blocking)
     Variables:
RTPAUDIOQOS=ssrc=1088103444;themssrc=265647381;lp=0;rxjitter=0.023820;rxcount=123606;txjitter=0.000000;txcount=79133;rlp=0;rtt=0.000000
SIPCALLID=1AB484D1-80BF-4F1E-97DD-F3B9FC49AA27@10.x.x.x
SIPDOMAIN=sipproxy.int.domain.tld
SIPURI=sip:1234@10.x.x.x:1000

 CDR Variables:
level 1: clid="User Name" <1234>
level 1: src=1234
level 1: dst=897795678
level 1: dcontext=fromoffice
level 1: channel=SIP/1234-b587fc50
level 1: lastapp=ChanSpy
level 1: lastdata=SIP/5678,q
level 1: start=2009-03-06 14:12:52
level 1: answer=2009-03-06 14:12:52
level 1: duration=32724
level 1: billsec=32724
level 1: disposition=ANSWERED
level 1: amaflags=DOCUMENTATION
level 1: uniqueid=1236337972.459555

*CLI> sip show channel 1AB484D1-80BF-4F1E-97DD-F3B9FC49AA27
 * SIP Call
 Curr. trans. direction:  Incoming
 Call-ID:                1AB484D1-80BF-4F1E-97DD-F3B9FC49AA27@10.x.x.x
 Owner channel ID:       SIP/1234-b587fc50
 Our Codec Capability:   14
 Non-Codec Capability (DTMF):   1
 Their Codec Capability:   1038
 Joint Codec Capability:   14
 Format:                 0x8 (alaw)
 T.38 support            No
 Video support           No
 MaxCallBR:              384 kbps
 Theoretical Address:    10.x.x.x:1000
 Received Address:       10.x.x.x:1000
 SIP Transfer mode:      open
 NAT Support:            Always
 Audio IP:               10.y.y.y (local)
 Our Tag:                as6bde6d25
 Their Tag:              10251567111166
 SIP User agent:         SJphone/1.60.289a (SJ Labs)
 Username:               1234
 Peername:               1234
 Original uri:           sip:1234@10.x.x.x:1000
 Caller-ID:              1234
 Need Destroy:           No
 Last Message:           Rx: BYE
 Promiscuous Redir:      No
 Route:                  sip:1234@10.x.x.x:1000
 DTMF Mode:              rfc2833
 SIP Options:            (none)
 Session-Timer:          Inactive

*CLI> soft hangup SIP/1234-b587fc50
Requested Hangup on channel 'SIP/1234-b587fc50'


'soft hangup' DO NOTHING. channel is still existing.
i did not yet restart my system, so if i can do anything more for diagnostic - please tell. this is rare situation, so, if i can look smth else - i should do it till nearest reload ;)
Comments:By: Joshua C. Colp (jcolp) 2009-03-10 09:29:22

Please recompile with DONT_OPTIMIZE and DEBUG_THREADS enabled in menuselect under Compiler Flags. Once this happens please upload the output of "core show locks". I would also suggest using 1.6.0 from subversion.

By: caspy (caspy) 2009-03-10 09:48:55

ok, i'll do it tonight.

please, leave this bug open in feedback state for a couple of weeks, cause i don't know when it will freeze again.

thnx.

By: caspy (caspy) 2009-03-19 13:44:22

it tooks too long to wait until it freezes again. interesting observation (for many years): asterisk is more stable, as more debug/verbosity compiled in or turned on. really ;)

ok, what about case:
i compiled asterisk with DONT_OPTIMIZE and DEBUG_THREADS.
'core show locks' show nothing. before and after 'soft hangup'.
even more, i did not seen any locks for two weeks at all.

in 'diagnose_for_14618_1.txt' you can find all commands i have executed last time, plus commands for lock.


ps: as usual, i'll wait for any more debug/diagnose cases without forcing restart of asterisk.

By: caspy (caspy) 2009-03-20 00:50:30

?! just now i catch this freeze myself.
still no locks, all just the same.

but, some new observations:

1) the issue is not in a type of spying channel. i issued ChanSpy from DAHDI:
*CLI> core show channels
Channel              Location             State   Application(Data)
DAHDI/4-1            897793333@fromoffice Up      ChanSpy(SIP/3333,q)
SIP/1111-0839cf48    897792222@fromoffice Up      ChanSpy(SIP/2222,q)

spy from dahdi/4 - was mine
spy from sip/1111 - described yesterday.


2) while spying i do nothing.
i simply dialed my office number from my mobile, then dialed spy add-on.
successfully attached, and put phone on the table.
after the current call was finished - silence come.
and when a new bridge occured at spyed channel, i still hear a silence.

3) in log i have:
[Mar 20 08:17:23] NOTICE[31608] app_chanspy.c: Attaching DAHDI/4-1 to SIP/3333-b6a3de08
[Mar 20 08:19:46] NOTICE[31608] app_chanspy.c: Attaching DAHDI/4-1 to SIP/3333-b6ae6030
[Mar 20 08:20:07] NOTICE[31608] app_chanspy.c: Attaching DAHDI/4-1 to SIP/3333-b6132e90
[Mar 20 08:20:21] NOTICE[31608] app_chanspy.c: Attaching DAHDI/4-1 to SIP/3333-08353fe8
[Mar 20 08:20:36] NOTICE[31608] app_chanspy.c: Attaching DAHDI/4-1 to SIP/3333-084c1710

the first one line - it was a bridged call i was successfully listen for 2min.
the last one - it was a new bridged call, which i did not hear!
thouse in a middle - was an unbridged tries to reach SIP/3333 from queue. (SIP/3333 answering a calls from queue).


4) after i drop a line on mobile, messages in log stopped. but in asterisk's env nothing changed. still 'active' channel. i'll upload some diagnostic in a few minutes.


ps: with freezing of DAHDI, the case become 'major' or even 'block'. cause we need to restart asterisk to unfreeze DAHDI's channel. ;(

By: caspy (caspy) 2009-03-20 11:20:20

more interesting things:

*CLI> core show channels
Channel              Location             State   Application(Data)
SIP/1111-08402ad8    897792222@fromoffice Up      ChanSpy(SIP/2222,q)


but SIP/1111 gone home, his computer turned off.
(i do not know did he finished a call normally. but as you can see next - nothing in history)

*CLI> sip show peer 1111
 * Name       : 1111
[...]
 Status       : UNKNOWN
 Useragent    : SJphone/1.60.289a (SJ Labs)
[...]


*CLI> sip show history 7EF7B2D9-87BB-4
 * SIP Call
1. Rx              INVITE / 1 INVITE / sip:897792222@sipproxy.int.domain.tld
2. AuthChal        Auth challenge sent for  - nc 0
3. TxRespRel       SIP/2.0 / 1 INVITE - 401 Unauthorized
4. SchedDestroy    32000 ms
5. Rx              ACK / 1 ACK / sip:897792222@sipproxy.int.domain.tld
6. Rx              INVITE / 2 INVITE / sip:897792222@sipproxy.int.domain.tld
7. CancelDestroy
8. Invite          New call: 7EF7B2D9-87BB-4F4E-9D12-776F587C3B00@10.x.x.x
9. AuthOK          Auth challenge succesful for 1111
10. NewChan         Channel SIP/1111-08402ad8 - from 7EF7B2D9-87BB-4F4E-9D12-776F58
11. TxResp          SIP/2.0 / 2 INVITE - 100 Trying
12. TxRespRel       SIP/2.0 / 2 INVITE - 200 OK
13. Rx              ACK / 2 ACK / sip:897792222@10.y.y.y

By: caspy (caspy) 2009-03-20 12:46:03

add-on for previous comment:

asterisk is still trying to send packets (seems to be RTP) to spying host.

# tcpdump -nv host 10.x.x.x
20:41:51.094313 IP (tos 0x0, ttl  64, id 0, offset 0, flags [DF], proto: UDP (17), length: 72) 10.y.y.y.20631 > 10.x.x.x.49195: UDP, length 44
20:41:56.096522 IP (tos 0x0, ttl  64, id 0, offset 0, flags [DF], proto: UDP (17), length: 72) 10.y.y.y.20631 > 10.x.x.x.49195: UDP, length 44
20:42:01.094799 IP (tos 0x0, ttl  64, id 0, offset 0, flags [DF], proto: UDP (17), length: 72) 10.y.y.y.20631 > 10.x.x.x.49195: UDP, length 44
20:42:06.097068 IP (tos 0x0, ttl  64, id 0, offset 0, flags [DF], proto: UDP (17), length: 72) 10.y.y.y.20631 > 10.x.x.x.49195: UDP, length 44

By: caspy (caspy) 2009-03-24 14:36:58

can i provide any more info, or may be forward bug to any next state from 'feedback'? :)

By: caspy (caspy) 2009-04-01 14:13:06

the way to try to reproduce:

call from any sip phone to ChanSpy() apps, listen for a while, and then disconnect this phone from network. spying channel will not be killed anymore.
but in _exactly_ such scenario it is possible to do a 'soft hangup'.

how to lock in a phase from which it can not be 'soft hanguped' - i do not know yet. i'm waiting for such lock to look for sip history.

By: caspy (caspy) 2009-05-05 04:00:04

interesting observation:

after crash of asterisk due to another bug with three locked ChanSpy()'s channels (like above), i can not find anything locked/waiting/etc in gdb's output.

seems like all spying threads gone away, leaving channels marked as inuse.

By: Matthew Nicholson (mnicholson) 2009-08-13 17:03:08

Do you have rtptimeout set in sip.conf?

By: caspy (caspy) 2009-08-14 01:39:11

no.

should i try?
did it affect peoples sitting and waiting in queues or smth like?

By: Matthew Nicholson (mnicholson) 2009-08-14 12:07:25

Yes try turning on rtptimeout, it should not affect people waiting in queues.  It should properly disconnect a call if a phone is unplugged from the network.  Because there are no locks held, I suspect your problem is network related and not actually a problem with asterisk.  Please test with rtptimeout set and see if it fixes your issue.

By: Leif Madsen (lmadsen) 2009-08-20 15:00:17

caspy: any word on whether the suggestions Mark gave have solved your issue?

By: caspy (caspy) 2009-08-21 03:44:35

lmadsen,
i put 'rtptimeout=60' in sip.conf ~ 5 days ago. Since that time i did not seen any freezed ChanSpy().
But i whould like to look for it for about 1 or 2 weeks more. I'll write result here.



By: caspy (caspy) 2009-08-21 03:49:22

one more interesting fact:
as i said early in comment 0102005, - i saw such freeze on a DAHDI channel. so, rtptimeout can be not a solution.

By: Tilghman Lesher (tilghman) 2009-09-03 12:17:59

Given the lack of feedback, I'm assuming that the use of rtptimeout fixed this issue.

By: caspy (caspy) 2009-10-08 04:55:47

rtptimeout did not help.

CLI>sip show settings
..
 RTP Timeout:            60
..

CLI> core show channels
Channel              Location             State   Application(Data)
..
SIP/1111-b430d648    xxxxxxxxx@fromoffice Up      ChanSpy(SIP/1111,q)


CLI> sip show channels
Peer             User/ANR    Call ID          Format           Hold     Last Message
..
10.100.xxx.xxx   1111        Njc1MzNiOTM2MTA  0x4 (ulaw)       No       Rx: BYE


No locks are shown in 'core show locks'.

By: Jeff Peeler (jpeeler) 2009-10-23 17:41:01

This is going to be really hard to track down without some information of how to reproduce. If you can work out steps to make the issue occur, please outline those here. Otherwise, I wonder if some live debugging might be an option once your system gets in the problem state.

By: caspy (caspy) 2009-10-28 11:13:12

==
Applications/app_chanspy => Channels/chan_sip/General
==

it seems not to be a sip problem, cause as i said early - any channel (DAHDI  in my case) can be locked this way.


what about way to reproduce - i dont know how. only wait for. it seems to be a combination of some factors.
and, as soon as i return from holidays and lock happens - i'll write here and we can do any debugging.

By: caspy (caspy) 2009-11-02 01:52:38.000-0600

My Asterisk is standing now with 16 locked in such way channels.
We can online-debug anything now.

By: Jeff Peeler (jpeeler) 2009-11-02 16:00:48.000-0600

I hope that install is running an unoptimized build. Attach gdb to it and post the output of "thread apply all bt".

By: caspy (caspy) 2009-11-04 10:23:25.000-0600

bt of running asterisk with locked chanspy() attached.
that was the only channel active in system:

*CLI> core show channels
Channel              Location             State   Application(Data)
SIP/1111-09c0af88    000001122@fromoffice Up      ChanSpy(SIP/1122,q)
1 active channel
1 active call
17818 calls processed

By: David Vossel (dvossel) 2009-11-09 13:52:15.000-0600

I'm able to reproduce this.  I should have a patch for testing shortly.

*edit* I take that back, setting rtptimeout made the channel go away for me.



By: David Vossel (dvossel) 2009-11-09 15:04:40.000-0600

Can you provide more information on exactly what version of Asterisk you are using? Please provide the branch as well as the svn revision number.



By: caspy (caspy) 2009-11-09 16:20:20.000-0600

1.6.0.9

By: Leif Madsen (lmadsen) 2009-11-09 18:42:09.000-0600

Since we're up to 1.6.0.18-rc2 now, perhaps you should give it a try again with a newer version in your test environment?

By: caspy (caspy) 2009-11-10 03:45:53.000-0600

dvossel said, he was able to reproduce it.
should i still need upgrade?

By: David Vossel (dvossel) 2009-11-10 09:28:21.000-0600

Yes, please upgrade.  I thought I had reproduced it, but setting rtptimeout fixed the issue for me, so it wasn't the same thing.



By: David Vossel (dvossel) 2009-11-20 11:28:05.000-0600

I uploaded a patch that may resolve this issue.  Will you be able to test it?

By: Digium Subversion (svnbot) 2009-11-20 11:33:02.000-0600

Repository: asterisk
Revision: 230583

U   trunk/include/asterisk/audiohook.h
U   trunk/main/audiohook.c

------------------------------------------------------------------------
r230583 | dvossel | 2009-11-20 11:33:01 -0600 (Fri, 20 Nov 2009) | 6 lines

audiohook signal trigger on every status change

(issue ASTERISK-13705)

Review: https://reviewboard.asterisk.org/r/434/

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=230583

By: Digium Subversion (svnbot) 2009-11-20 11:37:50.000-0600

Repository: asterisk
Revision: 230585

_U  branches/1.6.2/
U   branches/1.6.2/include/asterisk/audiohook.h
U   branches/1.6.2/main/audiohook.c

------------------------------------------------------------------------
r230585 | dvossel | 2009-11-20 11:37:49 -0600 (Fri, 20 Nov 2009) | 13 lines

Merged revisions 230583 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

........
 r230583 | dvossel | 2009-11-20 11:26:20 -0600 (Fri, 20 Nov 2009) | 6 lines
 
 audiohook signal trigger on every status change
 
 (issue ASTERISK-13705)
 
 Review: https://reviewboard.asterisk.org/r/434/
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=230585

By: Digium Subversion (svnbot) 2009-11-20 11:38:57.000-0600

Repository: asterisk
Revision: 230586

_U  branches/1.6.1/
U   branches/1.6.1/include/asterisk/audiohook.h
U   branches/1.6.1/main/audiohook.c

------------------------------------------------------------------------
r230586 | dvossel | 2009-11-20 11:38:57 -0600 (Fri, 20 Nov 2009) | 13 lines

Merged revisions 230583 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

........
 r230583 | dvossel | 2009-11-20 11:26:20 -0600 (Fri, 20 Nov 2009) | 6 lines
 
 audiohook signal trigger on every status change
 
 (issue ASTERISK-13705)
 
 Review: https://reviewboard.asterisk.org/r/434/
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=230586

By: Digium Subversion (svnbot) 2009-11-20 11:43:43.000-0600

Repository: asterisk
Revision: 230587

_U  branches/1.6.0/
U   branches/1.6.0/include/asterisk/audiohook.h
U   branches/1.6.0/main/audiohook.c

------------------------------------------------------------------------
r230587 | dvossel | 2009-11-20 11:43:43 -0600 (Fri, 20 Nov 2009) | 13 lines

Merged revisions 230583 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

........
 r230583 | dvossel | 2009-11-20 11:26:20 -0600 (Fri, 20 Nov 2009) | 6 lines
 
 audiohook signal trigger on every status change
 
 (issue ASTERISK-13705)
 
 Review: https://reviewboard.asterisk.org/r/434/
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=230587

By: caspy (caspy) 2009-11-23 02:19:18.000-0600

David, thanks.
I'm going to upgrade to 1.6.0.18+this_patch tonight.
I'll report shortly.

By: caspy (caspy) 2009-11-30 06:21:08.000-0600

with this patch (applied on 1.6.0.18) all seems to be ok.
thanks!

By: caspy (caspy) 2010-05-26 10:13:01

after upgrade to 1.6.2.7 i see that this issue is still actual.
chanspy channels are still locking in the same state.

By: David Vossel (dvossel) 2010-05-26 11:06:36

are you using MixMonitor as well?

By: pablo umanzor (pablou) 2010-05-26 15:01:53

i've the same situation with asterisk 1.4.28, although  the call has been finished before , the channel chanspy still goes on, as a matter a fact, some times asterisk make a segfault but i don't know  if asterisk crash is for this reason

SIP/1292-000007b2    context2  *81231     4 Up  ChanSpy  SIP/1231|q 1292            17:12:45 salida       (None)

By: caspy (caspy) 2010-05-27 01:18:27

dvossel,
i use Monitor()

call arrives on DAHDI (which is recorded by Monitor()), and then terminated to SIP channel, which is spyed by ChanSpy()

By: caspy (caspy) 2010-07-06 03:20:36

can i provide anything else, to help to solve this issue?

By: Leif Madsen (lmadsen) 2010-07-27 13:25:20

I'm closing this issue as fixed as the original report this issue was filed for has been fixed for quite some time. The additional issues brought up here should be filed under a new issue with recent debugging information. Thanks!