
Summary: ASTERISK-10429: 1.4.11 Stable - Polycom phones hang up when media is re-invited while resuming from an on-hold state
Reporter: Mark A Vince (mavince)
Labels:
Date Opened: 2007-10-02 09:23:11
Date Closed: 2007-10-08 14:43:55
Priority: Major
Regression?: No
Status: Closed/Complete
Components: Channels/chan_sip/Interoperability
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) Asterisk-HoldFailure-070928.cap
( 1) Attended.zip
( 2) Attended2_CLI-trace.txt
( 3) CentOS_compile_err.txt
( 4) CLI_Trace_Resume_Hangup
( 5) debug.thorium
( 6) edited_sip.conf
( 7) full_trace_polycom.txt
( 8) full.thorium
( 9) seg_fault_transfer.txt
(10) trace-1.zip
(11) UnAttended.zip
(12) UnAttended_CLI-trace.txt
Description: PSTN calls made with Polycom phones (several different firmware loads) hang up when resuming the call from a hold state.

The call flow connects a Polycom phone to a PSTN phone through a SONUS gateway, with the media passing directly to and from the Polycom phone.

Call Scenario: The call can be initiated and answered in either direction, to or from the PSTN. RTP is successfully provided. If the Polycom phone end places the call on hold, MOH will be heard. If the Polycom phone then attempts to resume the call, Asterisk issues two nearly simultaneous INVITEs with an incremented CSeq, resulting in a 491 - Request Pending (the appropriate response). Asterisk acknowledges the 491 and then hangs up the call! I can reproduce the call behavior consistently.
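
For reference, RFC 3261 section 14.1 describes what a UAC should do after a 491 to a re-INVITE: wait a randomized interval and retry, rather than tear the call down. The interval is 2.1 to 4 seconds if the UAC owns the Call-ID, and 0 to 2 seconds otherwise, both in 10 ms units. A minimal Python sketch of that back-off follows. It is not Asterisk code, and the function name is hypothetical.

```python
import random
from typing import Optional


def glare_retry_delay(owns_call_id: bool,
                      rng: Optional[random.Random] = None) -> float:
    """Return the delay in seconds before retrying a re-INVITE after a 491.

    Follows the RFC 3261 section 14.1 ranges, chosen in 10 ms units.
    """
    rng = rng or random.Random()
    if owns_call_id:
        # The side that generated the Call-ID waits 2.1 s to 4.0 s.
        return rng.randint(210, 400) / 100.0
    # The other side waits 0 s to 2.0 s.
    return rng.randint(0, 200) / 100.0
```

Because the two ranges do not overlap except by chance at the boundary of timing, the two sides are unlikely to collide again on the retry.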

Key points: media passes directly to the Polycom phone; the call can be placed on hold successfully; other phones (Snom, Aastra) work correctly with the same configuration. When media traverses Asterisk, a Polycom-based call works correctly.

See Bug Tracker Issue 0009921
Comments:

By: Mark A Vince (mavince) 2007-10-04 14:18:06

Snom and Aastra phones place the "Media Attribute: sendonly" at the end of the SDP in the INVITE from the HOLD request. These phones work properly, calls can be placed on hold and resumed successfully.

The Polycom phones place the "Media Attribute: sendonly" at the beginning of the SDP in the INVITE from the HOLD request. Use of the HOLD function terminates the call.

This seems to be the only difference in the messages.
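
One reading of "beginning" versus "end" of the SDP is that the Polycom sends the direction attribute at session level (before the first m= line) while Snom and Aastra send it at media level (after the m= line). That reading is an assumption and is not confirmed by the traces quoted here. RFC 4566 allows both placements, with a media-level value overriding a session-level one. The Python sketch below resolves the effective direction per stream for either placement. The function name, ports, and sample SDP bodies are illustrative only and are not taken from the attached captures.

```python
from typing import List, Optional

DIRECTIONS = {"sendrecv", "sendonly", "recvonly", "inactive"}


def stream_directions(sdp: str) -> List[str]:
    """Return the effective direction of each m= stream in an SDP body."""
    session_dir: Optional[str] = None
    media: List[Optional[str]] = []  # one slot per m= line, None until set
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            media.append(None)
        elif line.startswith("a=") and line[2:] in DIRECTIONS:
            if media:
                media[-1] = line[2:]   # media-level attribute
            else:
                session_dir = line[2:]  # session-level attribute
    # Media-level wins, then session-level, then the sendrecv default.
    return [d or session_dir or "sendrecv" for d in media]


# Hypothetical hold offers: same meaning, different attribute placement.
SESSION_LEVEL_HOLD = "\r\n".join([
    "v=0", "o=- 1 2 IN IP4 172.16.4.22", "s=-",
    "c=IN IP4 172.16.4.22", "t=0 0",
    "a=sendonly",
    "m=audio 2222 RTP/AVP 0", "a=rtpmap:0 PCMU/8000",
])
MEDIA_LEVEL_HOLD = "\r\n".join([
    "v=0", "o=- 1 2 IN IP4 172.16.4.26", "s=-",
    "c=IN IP4 172.16.4.26", "t=0 0",
    "m=audio 3000 RTP/AVP 0", "a=rtpmap:0 PCMU/8000",
    "a=sendonly",
])
```

A parser written this way reports "sendonly" for both offers, so attribute position alone would not change how the hold is interpreted.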

By: Andrew Lindh (andrew) 2007-10-04 16:51:35

I just tried this quickly and had no disconnects (using asterisk branch 1.4).

How do you have your "hold" set up on the Polycom phone? (It's in the phone config file.) Polycom has two types of "hold".

In my SIP.CONF setup I have for each device:
NAT=NO
CANREINVITE=YES

My test phones: Polycom 650 and 600 with 2.2.0 and 2.1.2 firmware
My test gateway: Cisco AS5300 with IOS 12.3
Two firewalls between the phones and asterisk/cisco (but no NAT used)

RTP media stream is going directly between the two phones, or directly between the polycom and the cisco. It is not using asterisk, as verified by tcpdump (but see more info).

I used hold and resume on each phone several times. When on hold the stream to the polycom stopped (as expected) and the other end had hold music. When I resumed off hold I had two way audio (as expected).

Now for the strange part. When "canreinvite=yes" I got an asymmetric RTP media stream between the polycom/asterisk/cisco. The polycom would send data directly to the cisco but the cisco sends data to asterisk (and then to the phone). After a hold/resume the media data is direct between the phone and the cisco gateway (asterisk is no longer in the middle)....strange... seems something may be off with the first invite with the cisco. It did not matter which side started the call. I don't normally allow reinvite and this is part of the reason. I have had issues when the path is not on the same LAN too.... I guess this should be a new bug.



By: Mark A Vince (mavince) 2007-10-05 10:01:34

Uploaded WireShark trace.. Asterisk is 172.16.4.4, phone is 172.16.4.22

By: Mark A Vince (mavince) 2007-10-05 11:37:13

I had forgotten about the different hold options. My initial settings were:
SIP voIpProt.SIP.useRFC2543hold="0" voIpProt.SIP.useSendonlyHold="1"

I tried different combinations with no change in results. Using the RFC 2543 hold,
I did see the media attribute change to "inactive" instead of "sendonly".

My sip.conf has

nat=no
canreinvite=yes

My call scenario is calls between an IP phone behind Asterisk and
a PSTN-based phone, in either direction.

I can make the same calls through a Cisco 2811 and get the same response as I observed with the Sonus GSX gateway.

Attached WireShark trace: Asterisk sends two identical INVITEs (numbers 17 and 19 in the trace) to the Session Border Controller, which triggers the behavior that leads to the hangup.

By: Digium Subversion (svnbot) 2007-10-05 13:35:25

Repository: asterisk
Revision: 84818

U   branches/1.4/main/rtp.c

------------------------------------------------------------------------
r84818 | file | 2007-10-05 13:35:25 -0500 (Fri, 05 Oct 2007) | 4 lines

Update the remembered RTP peer information when putting an endpoint on hold or taking it off hold so that the RTP stack does not initiate a needless reinvite.
(closes issue ASTERISK-10429)
Reported by: mavince

------------------------------------------------------------------------
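
The idea in the commit message above can be illustrated with a toy model. The sketch assumes a native-bridge loop that compares an endpoint's current media address with the address it last acted on and sends a re-INVITE when they differ. This is one reading of the commit message, not the actual rtp.c code, and every name in it is hypothetical.

```python
class NativeBridge:
    """Toy model of a bridge loop that re-INVITEs when a peer address changes."""

    def __init__(self, peer_addr):
        self.current = peer_addr     # where the endpoint expects media now
        self.remembered = peer_addr  # what the bridge loop last acted on
        self.reinvites = []          # (source, address) log of re-INVITEs sent

    def on_hold_change(self, new_addr, update_remembered):
        # The hold/unhold handler sends its own re-INVITE to the far end.
        self.reinvites.append(("hold-handler", new_addr))
        self.current = new_addr
        if update_remembered:
            # Keep the bridge loop's view in sync with the hold handler.
            self.remembered = new_addr

    def poll(self):
        # The bridge loop re-INVITEs whenever it sees an address it has
        # not acted on yet.
        if self.current != self.remembered:
            self.reinvites.append(("bridge-loop", self.current))
            self.remembered = self.current


def reinvites_after_resume(update_remembered):
    bridge = NativeBridge(("172.16.4.22", 2222))  # illustrative address/port
    bridge.on_hold_change(("172.16.4.4", 10000), update_remembered)
    bridge.poll()
    return len(bridge.reinvites)
```

In this model, leaving the remembered address stale produces two re-INVITEs for one hold change, which matches the duplicate INVITE and 491 seen in the report. Updating it produces one.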

By: Digium Subversion (svnbot) 2007-10-05 13:37:04

Repository: asterisk
Revision: 84819

_U  trunk/
U   trunk/main/rtp.c

------------------------------------------------------------------------
r84819 | file | 2007-10-05 13:37:03 -0500 (Fri, 05 Oct 2007) | 12 lines

Merged revisions 84818 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r84818 | file | 2007-10-05 15:55:36 -0300 (Fri, 05 Oct 2007) | 4 lines

Update the remembered RTP peer information when putting an endpoint on hold or taking it off hold so that the RTP stack does not initiate a needless reinvite.
(closes issue ASTERISK-10429)
Reported by: mavince

........

------------------------------------------------------------------------

By: Mark A Vince (mavince) 2007-10-05 16:09:49

Significant improvement but not complete.

Replaced stable 1.4.11 rtp.c with modified version. Upon compiling got the following errors...
[CC] rtp.c -> rtp.o
rtp.c: In function `ast_rtp_read':
rtp.c:1236: warning: passing arg 3 of `ast_sched_add' from incompatible pointer type
rtp.c: In function `ast_rtp_raw_write':
rtp.c:2654: warning: passing arg 3 of `ast_sched_add' from incompatible pointer type

Restarted Asterisk and get the following message on every inbound call
-- Incoming call: Got SIP response 500 "CSeq Number Out of order" back from 172.16.4.26 (an Aastra phone, not involved in the call)

On the positive side, calls originating at the Polycom phones behind Asterisk (as well as the Aastra) can be alternated between hold and resume without a call hangup. That's a big improvement.

Calls inbound from the PSTN are answered and can be put on hold. When I resume the call, the call is disconnected. I have seen behavior where I can resume once, then go to hold again. When I attempt to resume, the call is disconnected.

By: Mark A Vince (mavince) 2007-10-05 16:12:32

Thanks for working this problem. Much appreciated!

By: Joshua C. Colp (jcolp) 2007-10-05 16:14:34

1. The function declarations for the scheduler stuff changed between 1.4.11 and 1.4 in subversion, so the compiler is complaining about that.

2. The Aastra is unhappy because Asterisk was restarted and the subscriptions (while preserved) did not preserve the cseq. This happens with Polycoms too, except they return Internal Server Error.

3. Can you grab me an Asterisk side log of where things stand with debug enabled in logger.conf and a debug level of 9?
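
Point 2 above follows from RFC 3261 section 12.2.2: within a dialog, a request whose CSeq is lower than the last one seen from that peer is out of order and must be rejected with a 500. A minimal Python sketch of that check from the phone's side follows. The class name and the CSeq values are hypothetical, and the sketch does not model how Asterisk stores subscriptions.

```python
class DialogReceiver:
    """Tracks the remote CSeq for one dialog, as a phone would."""

    def __init__(self):
        self.remote_cseq = None  # no request seen yet in this dialog

    def receive(self, cseq: int) -> int:
        """Return the status code for an in-dialog request with this CSeq."""
        if self.remote_cseq is not None and cseq < self.remote_cseq:
            # Lower than what was already seen: out of order.
            return 500
        self.remote_cseq = cseq
        return 200


phone = DialogReceiver()
before_restart = [phone.receive(c) for c in (101, 102, 103)]
# Assumed scenario: the sender restarts, keeps the dialog, but its CSeq
# counter starts over from a lower value.
after_restart = phone.receive(102)
```

The phone keeps answering 500 until the sender's counter passes the value the phone remembers, or until the subscription is re-established.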

By: Mark A Vince (mavince) 2007-10-05 17:13:16

Added full, debug and CLI files for PSTN to Asterisk call. Did not trace Asterisk to PSTN as that is working. Thanks for the compiler error explanation as well as the Aastra error... had seen the Polycom ones before, they say Server Error code 500.. again, much appreciated

By: Ramon Peek-Fares (ramonpeek) 2007-10-08 06:43:00

Hello everyone..

My colleague (tbelder) and I entered issue 10696 in the bug tracker,
which, after some further investigation, seems to be related to this one.

However, in our case we have problems (running 1.4.12) in both attended and unattended transfers on Thomson phones.
Looking at the problems occurring when doing an attended transfer, I immediately see the same issue occurring as stated here (2x INVITE to put the active call-leg on hold). (Thanks to Olle for helping me out with this one... ;-) )

However, when I look at the traces made during an unattended transfer, I can see that putting the call-leg back into the MOH works fine, but that Asterisk starts re-inviting the transferrer after he has already transferred the RTP stream back to Asterisk... which I feel is incorrect.
After that it is clear to see that the phone doesn't expect the message and ignores it; however, Asterisk starts retransmitting the request after the T1 timer expires, and thus a call is initiated with no RTP stream... (ghost calls.)
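
For reference, the retransmission pattern for an unanswered INVITE over UDP in RFC 3261 section 17.1.1.2 is: retransmit after T1, double the interval each time, and give up at 64*T1. The Python sketch below computes that schedule with the RFC default of T1 = 500 ms. The function name is hypothetical, and the timers actually used by chan_sip in 1.4 may differ from the RFC defaults.

```python
from typing import List


def invite_retransmit_times(t1: float = 0.5) -> List[float]:
    """Seconds after the first INVITE at which retransmissions are sent."""
    times = []
    interval = t1       # Timer A starts at T1
    elapsed = t1
    timeout = 64 * t1   # Timer B: the transaction times out here
    while elapsed < timeout:
        times.append(elapsed)
        interval *= 2   # Timer A doubles after every retransmission
        elapsed += interval
    return times


schedule = invite_retransmit_times()
```

With the default T1 this gives retransmissions at 0.5, 1.5, 3.5, 7.5, 15.5 and 31.5 seconds, followed by a timeout at 32 seconds.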

I've uploaded the two different traces for you all to see.
Perhaps someone could find some use in them.

I am starting to believe that the problem is really occurring in an earlier state than where the current patch to rtp.c is written,
because in my system too this patch solves the problem only partially on attended transfers, and the problem still exists on unattended transfers.
Here we can see that there is still a second (unwanted) re-INVITE.

Anyone, any comments or suggestions...

By: Digium Subversion (svnbot) 2007-10-08 10:17:52

Repository: asterisk
Revision: 85023

U   branches/1.4/main/rtp.c

------------------------------------------------------------------------
r85023 | file | 2007-10-08 10:17:52 -0500 (Mon, 08 Oct 2007) | 4 lines

Update codec information as well as address when doing hold reinvites.
(issue ASTERISK-10429)
Reported by: mavince

------------------------------------------------------------------------

By: Digium Subversion (svnbot) 2007-10-08 10:18:54

Repository: asterisk
Revision: 85024

_U  trunk/
U   trunk/main/rtp.c

------------------------------------------------------------------------
r85024 | file | 2007-10-08 10:18:53 -0500 (Mon, 08 Oct 2007) | 12 lines

Merged revisions 85023 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r85023 | file | 2007-10-08 12:37:46 -0300 (Mon, 08 Oct 2007) | 4 lines

Update codec information as well as address when doing hold reinvites.
(issue ASTERISK-10429)
Reported by: mavince

........

------------------------------------------------------------------------

By: Joshua C. Colp (jcolp) 2007-10-08 10:19:19

Give the above revision of rtp.c for 1.4 a try.

By: Joshua C. Colp (jcolp) 2007-10-08 10:45:50

You pulled the rtp.c from trunk, not from 1.4 so it was incompatible.

By: Ramon Peek-Fares (ramonpeek) 2007-10-08 10:47:57

Mine compiled fine...
However it still solves the issue only partially..

If you look at my traces (trace-1), you will see that the INVITE with CSeq: 104
is being retransmitted almost endlessly by Asterisk, because Asterisk fails to reply with an ACK to the 200 OK response..

It seems we have moved the problem to another point,
although I can see the improvement :-)

By: Joshua C. Colp (jcolp) 2007-10-08 10:54:06

ramonpeek: Did you update your entire tree, or just rtp.c? I made a change in channel.c for the repeated native bridging issue.

By: Ramon Peek-Fares (ramonpeek) 2007-10-08 10:58:21

Oh sorry, no I didn't update channel.c
I'll do that right now...
Hold on!

By: Mark A Vince (mavince) 2007-10-08 11:01:07

Downloaded correct version of rtp.c... compiles clean (except for same warnings as last Friday) ..... my bad...

Getting channel.c now

By: Ramon Peek-Fares (ramonpeek) 2007-10-08 11:11:55

Mmmm...
If I apply the new channel.c, or just the diff to my channel.c,
Asterisk crashes right after I transfer party A to C.
They get connected, but after that it's bye bye Asterisk..  :-(

With the old channel.c the system didn't crash..

By: Joshua C. Colp (jcolp) 2007-10-08 11:14:32

Okay well then please open a new bug about this... your issue is different than this one.

By: Mark A Vince (mavince) 2007-10-08 11:17:45

Not sure what I am doing wrong... made sure to download from branch/1.4/main/ got the following errors

[CC] channel.c -> channel.o
channel.c: In function `ast_activate_generator':
channel.c:1874: warning: passing arg 3 of `ast_settimeout' from incompatible pointer type
channel.c: At top level:
channel.c:2061: error: conflicting types for 'ast_settimeout'
/usr/src/Asterisk-1_4/asterisk-1.4.11/include/asterisk/channel.h:1135: error: previous declaration of 'ast_settimeout' was here
channel.c:2061: error: conflicting types for 'ast_settimeout'
/usr/src/Asterisk-1_4/asterisk-1.4.11/include/asterisk/channel.h:1135: error: previous declaration of 'ast_settimeout' was here
channel.c: In function `ast_settimeout':
channel.c:2072: warning: assignment from incompatible pointer type
channel.c: In function `__ast_read':
channel.c:2256: warning: initialization from incompatible pointer type
channel.c: In function `ast_channel_masquerade':
channel.c:3434: error: structure has no member named `get_base_channel'
channel.c:3434: error: structure has no member named `get_base_channel'
make[1]: *** [channel.o] Error 1
make: *** [main] Error 2

By: Joshua C. Colp (jcolp) 2007-10-08 11:20:05

mavince: 1.4 changed so the channel.c can't just be dropped in, but for your issue you don't need it... ramonpeek's issue is different, thus why I sent him to open another bug about it.

By: Mark A Vince (mavince) 2007-10-08 11:36:58

file - I returned channel.c to original and recompiled cleanly... will test and report on results shortly... I probably will have to separately track ramonpeek's issue because transfers are next on my list... once past the hold/resume issues... Appreciate your efforts.

By: Ramon Peek-Fares (ramonpeek) 2007-10-08 11:37:39

For reference information related to other cases:

I created issue: 10915 to handle the somewhat different (but almost the same) bug concerning the call-transfers with Thomson or Polycom phones..

NOTICE: THIS DOES NOT REPLACE THIS ISSUE!!

By: Mark A Vince (mavince) 2007-10-08 14:35:43

I can place Polycom phone on HOLD and resume several times in a row... regardless of the direction of call origination. Looks good. Will test with Aastra and report later.

When I transfer the call, Asterisk seg faults! A simple CLI trace is attached.... I guess this problem should go under issue 10915 (ramonpeek) as per earlier discussion.

By: Joshua C. Colp (jcolp) 2007-10-08 14:43:54

Okay since this issue is fixed moving on to the next one... 10915.