Summary: ASTERISK-02501: Design flaw in chan_sip
Reporter: Roy Sigurd Karlsbakk (rkarlsba)
Labels:
Date Opened: 2004-09-29 08:34:13
Date Closed: 2011-06-07 14:10:48
Priority: Minor
Regression?: No
Status: Closed/Complete
Components: Core/General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: (0) chan_sip.patch
             (1) screenlog.0-20040930173532
Description:

Hi,

There seems to be a design flaw in chan_sip that makes slow SIP registrations cause audio dropouts. The problem, a mutex blocking all network traffic including RTP, is described below, thanks to Diana.

roy
Comments:

By: Mark Spencer (markster) 2004-09-29 09:21:00

If you look at the debug, and attach gdb right at the moment of the problem, you'll almost certainly find the audio hiccup happens during the gethostbyname call in "parse_contact" (note the "XXX This could block for a long time XXX" comment just before the gethostbyname).  The primary design flaw is that your DNS is taking too long to look up, and you're using a protocol which specifies that we have to gethostbyname a variety of parameters.

I recommend reconfiguring such that the gethostbyname only operates on an IP address and never on a host name.
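
As a minimal sketch of that recommendation (not code from chan_sip), a contact host can be checked for being a numeric IPv4 literal before any resolver call is made. The helper name host_is_numeric_ipv4 is hypothetical; inet_pton is the standard POSIX call and only parses the string, so it cannot block on DNS.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>

/* Hypothetical helper, not part of chan_sip.
 * Returns 1 and fills *out when host is a dotted-quad IPv4 literal,
 * 0 when it is anything else (for example a DNS name).
 * inet_pton() only parses the string; it never performs a lookup. */
int host_is_numeric_ipv4(const char *host, struct in_addr *out)
{
    assert(host != NULL && out != NULL);
    return inet_pton(AF_INET, host, out) == 1;
}
```

A caller could use the parsed address directly when this returns 1, and fall back to a (possibly slow) name lookup only when it returns 0.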

By: Brian West (bkw918) 2004-09-29 09:23:21

I don't feel this is a major bug.  If it were such a major bug, why hasn't it shown up before?  Just because someone says it's a design flaw doesn't mean that it is.  We need steps that PROVE this, not just because l-fy says so.  Granted she's a damn good coder but I want proof.

bkw

By: Mark Spencer (markster) 2004-09-29 09:43:58

Removed conversation with l-fy at the request of rkarlsba

By: Mark Spencer (markster) 2004-09-29 15:00:06

It would be nice to confirm that's the right place, still, just to be sure.  Can you attach gdb while this is happening?

By: Roy Sigurd Karlsbakk (rkarlsba) 2004-09-30 03:43:50

The problem behaves exactly the same when addressing the * server with an IP address as with a DNS name, so could this really be the problem?

How can I attach with gdb and get this info?

thanks

roy

By: Mark Spencer (markster) 2004-09-30 09:22:59

How frequently does the problem occur?  If it occurs frequently I can login to the box and we can share a screen session and I can have everything ready and you can just press "enter" at the moment you hear the breakup.

By: Roy Sigurd Karlsbakk (rkarlsba) 2004-09-30 09:40:16

It happens at every SIP register, so at a
   -- Registered SIP '1000002' at 80.239.107.87 port 5060 expires 1
   -- Saved useragent "100/000003" for peer 1000002
it's already there.
The dropout is local to the call/session, so another SIP client register will not block.

Diana's idea of what's happening was "The SIP client responds late to the initial SIP 401 in the registration, and the mutex (don't remember the name) locks network traffic including the voice"

roy

edited on: 09-30-04 09:42

By: Roy Sigurd Karlsbakk (rkarlsba) 2004-09-30 10:40:04

Attached a SIP DEBUG showing what's happening during the dropouts.

By: Roy Sigurd Karlsbakk (rkarlsba) 2004-10-01 04:09:55

> channels/chan_sip.c:7598
>
> there you have         p = find_call(&req, &sin);
> which is calling a locked call
> which returns a locked call
> not calling, but returns
> and is unlocked after calling handle_request()
> so while handle_request does not return the lock is held
> which is the same lock used by sip_write()
> which sip_write is writing things into the data channel

Thanks to Diana for the info above

roy

edited on: 10-01-04 07:38
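
The contention described in the quoted notes above can be sketched as follows. This is an illustration of the claim only, not the actual chan_sip code: struct call, call_init and audio_write_nonblocking are hypothetical names, and a default (non-recursive) pthread mutex is assumed. The write path uses pthread_mutex_trylock so that the example reports a busy lock instead of stalling.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-call structure; not the real chan_sip type. */
struct call {
    pthread_mutex_t lock;   /* shared by the request handler and the audio path */
    int frames_written;
};

void call_init(struct call *c)
{
    assert(c != NULL);
    pthread_mutex_init(&c->lock, NULL);   /* default, non-recursive mutex */
    c->frames_written = 0;
}

/* Audio path: writes one frame if the call lock is free.
 * Returns 1 on success, 0 if the lock is held by someone else
 * (the point where a blocking writer would have stalled). */
int audio_write_nonblocking(struct call *c)
{
    assert(c != NULL);
    if (pthread_mutex_trylock(&c->lock) != 0)
        return 0;
    c->frames_written++;
    pthread_mutex_unlock(&c->lock);
    return 1;
}
```

If the request handler holds the lock for the whole duration of a slow request, a writer that used a blocking pthread_mutex_lock would wait for just as long, which is the dropout mechanism the notes propose.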

By: Mark Spencer (markster) 2004-10-01 10:46:20

That would only occur if the same transaction is being used for REGISTER as for the call itself -- certainly extremely unlikely.

Unfortunately your SIP debug does not show the INVITE for the call in question, so there is no way to confirm that is the case.

By: Mark Spencer (markster) 2004-10-02 21:58:21

Can you post your sip debug including the invite and the register?  thanks.

By: Olle Johansson (oej) 2004-10-03 12:23:38

On a side note, we need to integrate an asynchronous DNS library.
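
To illustrate the idea of keeping lookups off the signalling path, here is a rough sketch that runs the resolver call in a worker thread. It is a thread-based stand-in, not an asynchronous DNS library, and the names lookup_job, lookup_start and lookup_finish are hypothetical. getaddrinfo and the pthread calls are standard POSIX.

```c
#define _POSIX_C_SOURCE 200809L   /* expose getaddrinfo() under strict C modes */
#include <assert.h>
#include <netdb.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical job record for one lookup running in the background. */
struct lookup_job {
    const char *host;          /* name or address to resolve */
    int status;                /* getaddrinfo() return code, 0 on success */
    struct addrinfo *result;   /* caller frees with freeaddrinfo() */
    pthread_t thread;
};

/* Worker: performs the potentially slow lookup away from the caller. */
static void *lookup_worker(void *arg)
{
    struct lookup_job *job = arg;
    struct addrinfo hints;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_DGRAM;
    job->status = getaddrinfo(job->host, NULL, &hints, &job->result);
    return NULL;
}

/* Starts the lookup and returns immediately; 0 means the thread was created. */
int lookup_start(struct lookup_job *job, const char *host)
{
    assert(job != NULL && host != NULL);
    job->host = host;
    job->status = -1;
    job->result = NULL;
    return pthread_create(&job->thread, NULL, lookup_worker, job);
}

/* Waits for the worker and returns the getaddrinfo() status. */
int lookup_finish(struct lookup_job *job)
{
    assert(job != NULL);
    pthread_join(job->thread, NULL);
    return job->status;
}
```

Between lookup_start and lookup_finish the caller remains free to service other traffic. A real asynchronous resolver would deliver the result through a callback or event loop rather than a join.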

By: Brian West (bkw918) 2004-10-03 19:53:19

By chance are you using mysql_friends?

By: Roy Sigurd Karlsbakk (rkarlsba) 2004-10-04 03:09:13

Yes, but the bug is the same with normal sip.conf usage. I've tried both and there is no noticeable difference.

By: Roy Sigurd Karlsbakk (rkarlsba) 2004-10-06 06:12:29

The attached chan_sip.patch is Diana's proposed solution to the problem.
All copyrights go to Digium as usual.

By: Brian West (bkw918) 2004-10-06 09:43:20

Does the solution fix the problem?

By: Roy Sigurd Karlsbakk (rkarlsba) 2004-10-06 10:29:20

AFAICS, yes
My problem isn't fixed yet, as that was a problem with the ATAs as well

roy

By: Mark Spencer (markster) 2004-10-06 11:15:14

Does Diana have an Asterisk disclaimer?  If she wrote the code, she would have to disclaim it, right?

By: Roy Sigurd Karlsbakk (rkarlsba) 2004-10-07 02:53:12

She told me "I wrote the code, you paid me, it's your code"
So what do I have to write?

By: Mark Spencer (markster) 2004-10-07 14:39:12

If you bought it it's fine, now it's just getting through the technical part.

I'm trying to understand how the patch could possibly be related to the problem being described...

One section adds a lock and another section frees a lock during the call setup.  What does any of this have to do with a register blocking audio in the middle of a call?

By: Brian West (bkw918) 2004-10-14 01:12:32

Any update?

By: Mark Spencer (markster) 2004-10-24 09:16:16

The bug placer seems to have lost interest in this bug and is no longer responding to messages, therefore we are suspending the bug until they comment.