Summary: ASTERISK-07742: [patch] Far end BYE causes weird Reinvite and zombie channel
Reporter: Matthew Simpson (matthewsimpson)
Labels:
Date Opened: 2006-09-13 21:53:25
Date Closed: 2006-11-12 10:06:54.000-0600
Priority: Minor
Regression? No
Status: Closed/Complete
Components: Channels/chan_sip/Interoperability
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) 7952.patched
( 1) 7952.unpatched
( 2) joe_originates_-_answer_supervision_-_i_terminate_-_BAD_CALL_-_full_debug.rtf
( 3) trunk-reinvite.2.patch
( 4) trunk-reinvite.3.patch
Description: Asterisk sits between a Lucent media gateway and a customer with a Nextone.

If the Nextone customer (ENDPOINT A) calls through Asterisk to an endpoint behind the Lucent media gateway (ENDPOINT B), and ENDPOINT B hangs up the call, Asterisk sends an unexpected reinvite and may spawn a zombie SIP channel.  After a while, these zombie SIP channels cause Asterisk to crash.

****** ADDITIONAL INFORMATION ******

Tested working:

* ENDPOINT A calls ENDPOINT B, ENDPOINT A hangs up.  This call completes normally.

Notable points:

At Sep 13 20:36:09 in the log file, Asterisk realizes the far end hung up.  Why, within that same second, does it then try to send a reinvite?  Shouldn't it just pass the BYE from the Lucent on to the Nextone and wrap things up?

I marked the zombie SIP channel creation in the capture and ran 'sip show channels', which shows the channel in state Rx: BYE.  The zombie in this capture is eventually destroyed, but under high load they pile up and kill the box (all the zombie channels will be in state Rx: BYE).

Sip settings:

canreinvite=yes
nat=no

This also misbehaves in the same fashion in 1.2.x.
Comments:By: phsultan (phsultan) 2006-09-14 03:39:07

Matthew, I believe this issue is related to ASTERISK-7106.

It appears the recent modifications to chan_sip in r41769 have reintroduced the sending of a re-invite during native bridge breakout, although it is immediately followed by a BYE request that closes the SIP dialog.

I experience problems under the following setup :
A : SJPhone
B : Cisco SIP 7960

- A calls B (both with canreinvite=yes)
- voice traffic flows directly between A and B
- B clears the call by sending a BYE request
- Asterisk sends an INVITE to A (re-invite), to let A know that voice traffic should now be sent to Asterisk
- A replies with a 200 OK response
- Asterisk tears down the call by sending a BYE request
- A continuously sends 481 code responses
- the dialog is closed on both A and Asterisk after some timeout

This causes the SJPhone SIP UA not to clear the call properly, as it keeps sending 481 responses to the final BYE. I believe SJPhone is not behaving properly, and a simple workaround is to disable reinvite on A. In your case Matthew, you could just disable 'reinvite' on either your Lucent or Nextone box.

But I am not sure that sending an INVITE packet during call teardown, just to break out of the bridge, is appropriate in this case.

As a proposal, I am attaching a patch that changes the sip_set_rtp_peer function and prevents Asterisk from sending a re-invite packet on reception of requests that don't carry any SDP, such as BYE.
The re-invite is still triggered through the process_sdp function, so that transferred and held calls (with MOH) will be processed correctly.
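
To illustrate the rule the patch description states, here is a minimal sketch in Python. It is not the chan_sip code and not the attached patch: the names SipRequest and should_send_reinvite are hypothetical, and the only behavior modelled is "no re-invite when the triggering request carries no SDP".

```python
from dataclasses import dataclass


@dataclass
class SipRequest:
    """Hypothetical, simplified view of a received SIP request."""
    method: str
    has_sdp: bool


def should_send_reinvite(trigger: SipRequest, native_bridged: bool) -> bool:
    """Decide whether to re-invite the remaining peer (illustrative rule only)."""
    if not native_bridged:
        # Media already flows through the server, nothing to redirect.
        return False
    # Requests without SDP, such as BYE, never trigger a re-invite here.
    return trigger.has_sdp


if __name__ == "__main__":
    bye = SipRequest(method="BYE", has_sdp=False)
    hold = SipRequest(method="INVITE", has_sdp=True)
    print(should_send_reinvite(bye, native_bridged=True))
    print(should_send_reinvite(hold, native_bridged=True))
```

Under these assumptions a BYE from the far end never produces a re-invite, while an SDP-bearing request (hold, transfer) still does.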

However, a problem similar to ASTERISK-7511 might still remain : Asterisk considers the channel broke out of a bridge, but the peer has not been informed of this event by a re-invite (see note 0050959 mentioning the very first seconds of the streamed file cannot be heard).

One lead might be to force a re-invite from Asterisk to the remaining end peer right before streaming a file to a SIP channel, just to ensure that voice traffic flows between Asterisk and the corresponding peer. But I guess handling this at the channel level with ast_queue_frame might not be a good or easy solution.

By: Matthew Simpson (matthewsimpson) 2006-09-16 22:26:41

Well in this situation the call has ended so why reinvite at all ?  All that should be done is forward the BYE on and close the call.

By: phsultan (phsultan) 2006-09-17 09:48:43

Have you tried the attached patch? I tested it with my setup, it should prevent Asterisk from re-inviting in your case too.

Serge: the disclaimer has been sent.

By: Matthew Simpson (matthewsimpson) 2006-09-17 09:57:55

phsultan, I am compiling your patch now. :)  will report back.

By: Matthew Simpson (matthewsimpson) 2006-09-20 11:58:18

your patch seems to help but i still have stuck calls.

By: Serge Vecher (serge-v) 2006-09-20 12:05:44

matthew: please provide at least some debug information to work with. For example, please run SIP debug as per instructions with and without the patch on latest trunk and attach the console output here: Thanks.

1) Prepare test environment (reduce the amount of unrelated traffic on the server);
2) Make sure your logger.conf has the following line:
  console => notice,warning,error,debug
3) Restart Asterisk.
4) Enable SIP transaction logging with the following CLI commands:
set debug 4
set verbose 4
sip debug
5) Save complete console log to file and _attach_ said file to the bug.

By: Matthew Simpson (matthewsimpson) 2006-09-20 12:34:22

serge, i've already attached what you just asked for.

By: Serge Vecher (serge-v) 2006-09-20 12:48:43

matthew: I'm assuming "joe originates - answer supervision - i terminate - BAD CALL - full debug.rtf" is the log without the patch, what about the log with the patch?

By: marioja (marioja) 2006-09-24 14:56:24

Serge, when you say to save the console to a file (note 0051911), can you tell me what command to use to achieve that?

By: Serge Vecher (serge-v) 2006-09-25 08:58:28

marioja: sure, a common way is to pipe the output to 'tee', i.e. 'asterisk -Tvvvvvdddddgc | tee /tmp/sipdebug.txt'

By: Olle Johansson (oej) 2006-10-26 14:55:35

philippe, matthewsimpson: What's the status of this issue? Please summarize. Thanks. /Olle

By: Matthew Simpson (matthewsimpson) 2006-10-26 16:17:49

still broke, and the patch does nothing.

By: Olle Johansson (oej) 2006-10-27 01:23:03

Can either of you reproduce this and capture a log from it (latest svn trunk)? I need all debugging: SIP packets, with debug and verbose set to 4.

By: phsultan (phsultan) 2006-10-27 04:46:19

The issue is still there. I believe it is related to ASTERISK-7106 (RE-INVITE sent instead of BYE, see note 0051707 in this bug). We need to be careful with this one because other issues came up after the initial correction to ASTERISK-7106.

Tested along with this setup :
A : Xlite - IP : 10.1.2.165 - no extension in Asterisk
B : Cisco SIP 7960 - IP : 10.1.1.25 - extension 7999

A is behind an OpenSER box (IP : 10.1.1.252), B is behind Asterisk (IP : 10.1.1.253). OpenSER is registered in sip.conf as type=peer, B is registered in sip.conf as type=friend. Both have canreinvite=yes.

Call flow is A --> OpenSER --> Asterisk --> B.
If A hangs up : no problem
If B hangs up : an INVITE is wrongly sent by Asterisk, instead of relaying the BYE received from B.

However, unlike Matthew, I can't see any zombie channel.

I have attached two debug captures, one taken with the unpatched version and one with the patched version.

An updated patch that applies to SVN revision 46328 is provided too.

By: zuo beifang (colinzuo) 2006-10-30 19:52:32.000-0600

The problem is that Asterisk reverses the From and To fields starting with the ACK of the Re-INVITE, so the ACK and the following BYE do not match the existing dialog,
and the remote side does not hang up but keeps retransmitting the 200 OK while waiting for a correct ACK to the Re-INVITE.
(BTW: it appears in asterisk-1.4.0-beta3 tarball also)
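
A minimal Python sketch of why swapped From/To headers break dialog matching. It follows the general RFC 3261 rule that a dialog is identified by Call-ID, local tag and remote tag; the class and function names and the tag values are hypothetical, and this is not how any particular SIP stack implements it.

```python
from typing import List, NamedTuple, Optional


class Dialog(NamedTuple):
    """Dialog state as kept by the receiving UA (here: endpoint A)."""
    call_id: str
    local_tag: str   # tag this UA put in its own header
    remote_tag: str  # tag chosen by the other side


class Request(NamedTuple):
    method: str
    call_id: str
    from_tag: str
    to_tag: str


def match_dialog(req: Request, dialogs: List[Dialog]) -> Optional[Dialog]:
    """An incoming request matches when its To tag is our local tag
    and its From tag is the remote tag, under the same Call-ID."""
    for d in dialogs:
        if (req.call_id == d.call_id
                and req.to_tag == d.local_tag
                and req.from_tag == d.remote_tag):
            return d
    return None


def status_for_bye(req: Request, dialogs: List[Dialog]) -> int:
    """200 if the BYE belongs to a known dialog, else 481."""
    return 200 if match_dialog(req, dialogs) is not None else 481


if __name__ == "__main__":
    dialogs_at_a = [Dialog("abc@host", local_tag="tagA", remote_tag="tagAst")]
    good = Request("BYE", "abc@host", from_tag="tagAst", to_tag="tagA")
    swapped = Request("BYE", "abc@host", from_tag="tagA", to_tag="tagAst")
    print(status_for_bye(good, dialogs_at_a))
    print(status_for_bye(swapped, dialogs_at_a))
```

With the tags swapped, the request carries the right Call-ID but no longer matches the stored dialog, which is consistent with the 481 responses reported earlier in this thread.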

By: Olle Johansson (oej) 2006-10-31 02:15:15.000-0600

This has been fixed after the beta3 release. please test with latest 1.4 from subversion. thank you.

By: phsultan (phsultan) 2006-10-31 10:30:21.000-0600

Here is my test report with SVN-branch-1.4-r46583 which works ok for me. I also successfully tested SVN-trunk-r46650.

Same set up :
A : Xlite - IP : 10.1.2.165 - no extension in Asterisk
B : Cisco SIP 7960 - IP : 10.1.1.25 - extension 7999

- Initiation :
 A --> OpenSER --> Asterisk --> B

- Call establishes :
 RTP stream is A <--> B

- B hangs up by sending BYE to Asterisk

- Asterisk relays an INVITE, to inform A of an RTP topology change
 This is needed to handle the case described in ASTERISK-7511, where an
 AGI script streams a file after having called an extension.
 RTP stream is A <--> Asterisk

- Asterisk cancels this newly transmitted INVITE to A as it has detected a two-party call that needs to be ended, and sends a CANCEL request

- OpenSER forwards both INVITE and CANCEL (in that order) to A
 OpenSER replies with a 487 Request Terminated to Asterisk's INVITE request
 (ACKed by Asterisk) and a 200 OK -- no more pending branches to Asterisk's
 CANCEL request

- A replies with 200 OK to B's INVITE *and* CANCEL

- A closes the call by sending a BYE request, which is ignored by Asterisk as it has already closed the call. Asterisk then replies with a 481 Call leg/transaction does not exist

This looks like a proper way of breaking out of a bridged channel (ASTERISK-7106). The cases where one can encounter an RTP topology change typically involve peers/users that can re-invite, but also need to listen to streamed files from Asterisk. Having music played when you're put on hold, or another file played after an AGI command (ASTERISK-7511), are good examples.

Both of these cases are correctly handled by the tested versions, ie. SVN-branch-1.4-r46583 and SVN-trunk-r46650.

So as for me, the bug can be closed. Maybe Matthew can test new revisions too.

By: Serge Vecher (serge-v) 2006-11-06 09:56:24.000-0600

matthewsimpson: is the latest 1.4/trunk working for you too?

By: Olle Johansson (oej) 2006-11-12 10:06:41.000-0600

If matthewsimpson has problems, please re-open the bug report. Thanks.