Summary: ASTERISK-09431: [patch] Disposition still set to FAIL when it should be NO ANSWER or BUSY when dialling multiple SIP peers
Reporter: Sam Deller (samdell3)
Labels:
Date Opened: 2007-05-12 23:52:19
Date Closed: 2007-05-18 17:41:42
Versions:Frequency of
Environment:
Attachments: ( 0) FAILfix
Description: I believe this is related to (now closed) bug 5918.

Bug 5918 corrected the disposition from FAILED to ANSWERED when the line is answered.

However, just like 5918, the problem shows up when dialling multiple SIP peers, e.g.

exten => s,6,Dial(SIP/${PEER1}@voip1&SIP/${PEER1}@voip2,120)

If the line is not answered, and the caller hangs up, the disposition still shows FAILED even though at least one of the peers was alive and happily ringing.
Likewise, the problem also appears on busy: the disposition shows FAILED when it should show BUSY.

Confirmed on 1.2.10 and very likely all versions of 1.2.x judging by the changelogs/code

Comments:

By: Steve Murphy (murf) 2007-05-14 12:26:36

Hmmmm. Weird. Thought I replied to this! I just tried to lab this up: I set up an extension to dial two SIP phones, picked up a Zap phone, dialed the extension, let the Dial expire without answering, and just saw NO ANSWER in the CDR. So, I think I need more input. Maybe it's the @voip1 notation... what are your voip1 and voip2 hosts? Asterisk boxes? How do they respond to an expired dial? Tell me more.

By: Sam Deller (samdell3) 2007-05-14 12:44:52

Sorry, I should have elaborated more.

The caller needs to hang up before the Dial expires.

@voip1 and @voip2 notations are for loadsharing between 2 Asterisk boxes - the peer will only ever be logged into one of the boxes as the CPE devices are using round robin DNS SRV

Try hanging up prior to dial timer expiry, and make sure at least one peer in your dialplan is unavailable

By: Steve Murphy (murf) 2007-05-14 13:40:26

OK, on one of the two SIP phones, I called a different extension, answered, and let them both sit. I then tried to call both, waited for the dial to time out, and got NO ANSWER. OK, so I tried again, and hung up the dialing extension before the timeout: still NO ANSWER.

I guess having it busy isn't the same as UNAVAILABLE. I unplugged one of the two sip phones. Now **that** is unavailable. Still result is NO ANSWER... I need more.

By: Sam Deller (samdell3) 2007-05-14 14:01:51

Calling in from a PSTN gateway, effectively a SIP to SIP call, I let the peer ring for approx 15 seconds and then hang up (before the dial timeout).
Peer is registered on localhost (voip-nap1) and not the failover voip-nap2 box that is presently unavailable.

CLI Output:

Connected to Asterisk 1.2.10 currently running on voip-nap1 (pid = 29234)
Verbosity is at least 7
May 15 06:42:32 WARNING[29238]: channel.c:787 channel_find_locked: Avoided initial deadlock for '0x81724a8', 10 retries!
   -- Executing Macro("SIP/202.x.x.60-08176948", "call-sip-nap|66500001") in new stack
   -- Executing Wait("SIP/202.x.x.60-08176948", "1") in new stack
   -- Executing Progress("SIP/202.x.x.60-08176948", "") in new stack
   -- Executing Set("SIP/202.x.x.60-08176948", "remoteparty=68781269<sip:68781269@202.x.x.60;user=phone>;party=calling;id-type=subscriber;privacy=off;screen=yes") in new stack
   -- Executing Set("SIP/202.x.x.60-08176948", "cutremote=68781269") in new stack
   -- Executing Set("SIP/202.x.x.60-08176948", "CDR(accountcode)=68781269") in new stack
   -- Executing Dial("SIP/202.x.x.60-08176948", "SIP/66500001&SIP/66500001@voip-nap2|60|L(14400000)") in new stack
   -- Called 66500001
May 15 06:42:33 NOTICE[7318]: app_dial.c:1049 dial_exec_full: Unable to create channel of type 'SIP' (cause 3 - No route to destination)
   -- SIP/66500001-0817be58 is ringing
 == Spawn extension (macro-call-sip-nap, s, 6) exited non-zero on 'SIP/202.x.x.60-08176948' in macro 'call-sip-nap'
 == Spawn extension (macro-call-sip-nap, s, 6) exited non-zero on 'SIP/202.x.x.60-08176948'


exten => s,1,Wait,1
exten => s,2,Progress()
exten => s,3,Set(remoteparty=${SIP_HEADER(Remote-Party-ID,1)})
exten => s,4,Set(cutremote=${CUT(remoteparty,\<,1)})
exten => s,5,Set(CDR(accountcode)=${cutremote})
exten => s,6,Dial(SIP/${ARG1}&SIP/${ARG1}@voip-nap2,60,L(14400000))
exten => s,7,Goto(ct-status,s-${DIALSTATUS}-${HANGUPCAUSE},1)
exten => s,107,NoOp(Call SIP Napier Priority 107, DialStatus:${DIALSTATUS} HangupCause:${HANGUPCAUSE})
exten => s,108,Goto(ct-status,s-${DIALSTATUS}-${HANGUPCAUSE},1)

Resulting CDR Record:
uniqueid userfield accountcode src dst dcontext clid channel dstchannel lastapp lastdata calldate duration billsec disposition amaflags processed
* * 68781269 68781269 066500001 default "68781269"<68781269> SIP/202.x.x.60-08176948 SIP/66500001-0817be58 Dial SIP/66500001&SIP/66500001@voip-nap2|60|L(14400000) 2007-05-15 06:42:32 18 0 FAILED 3 0

By: Steve Murphy (murf) 2007-05-16 11:41:40

The key to reproducing this is to somehow arrange things so I get the

app_dial.c:1049 dial_exec_full: Unable to create channel of type 'SIP' (cause 3 - No route to destination)

message, and right now, I'm not having any luck! I modified my dial to have a real sip phone, and a fake phone at a host that doesn't exist.

I get:

May 16 09:18:24 DEBUG[30096]: chan_sip.c:2083 sip_call: Outgoing Call for polycom430
   -- Called polycom430
May 16 09:18:24 DEBUG[30096]: chan_sip.c:2083 sip_call: Outgoing Call for x23434
   -- Called x23434@desktop
   -- SIP/polycom430-081f14d8 is ringing
May 16 09:18:40 DEBUG[30096]: chan_sip.c:2448 sip_hangup: update_call_counter(x23434) - decrement call limit counter
May 16 09:18:40 DEBUG[30096]: chan_sip.c:2448 sip_hangup: update_call_counter(polycom430) - decrement call limit counter
May 16 09:18:40 DEBUG[30096]: app_dial.c:1654 dial_exec_full: Exiting with DIALSTATUS=CANCEL.

I tried using hostname 'desktop' which no longer is connected to my network, or a fictitious ip addr '', and still, all I get is NO ANSWER.
I'm working with the svn 1.2 release....

When I ping those addresses, I get 'unreachable' messages for that addr.

I'll try to see if I can somehow get the "No route to destination" message...

By: Steve Murphy (murf) 2007-05-16 12:11:10

Many thanks to file; I have reproduced it! There now, let's dive in.

By: Sam Deller (samdell3) 2007-05-16 13:49:46

Cool - so it wasn't just me going mad ;-)

There is a messy workaround as such:

exten => s,9,Dial(SIP/${ARG1},60,L(14400000))
exten => s,10,Goto(ct-status,s-${DIALSTATUS}-${HANGUPCAUSE},1)
exten => s,110,NoOp(Call SIP Napier2 Priority 110, DialStatus:${DIALSTATUS} HangupCause:${HANGUPCAUSE})
exten => s,111,Dial(SIP/${ARG1}@voip-nap2,60,L(14400000))
exten => s,112,Goto(ct-status,s-${DIALSTATUS}-${HANGUPCAUSE},1)
exten => s,212,Goto(ct-status,s-${DIALSTATUS}-${HANGUPCAUSE},1)

But, ideally - being able to dial multiple peers in the same statement without error is a huge bonus.

What was the key to replicating the issue in the end? (I couldn't NOT replicate it...)

By: Steve Murphy (murf) 2007-05-16 14:12:59

Achieving reproduction (that sounds bad) was done by creating a "nonsense" acct in sip.conf, type=peer, host=dynamic, etc. Then, in the dial, I did SIP/nonsense with no @ stuff.

OK, I think I understand this, and see what needs to be done. But no-one will like it, maybe.

For each phone number to ring, we call ast_cdr_answer, ast_cdr_failed, ast_cdr_busy, etc. Each call is set up so the disposition is prioritized.
Unfortunately, FAILED is pretty high priority. It's above NOANSWER and BUSY.
So, I propose this: Add an "AST_CDR_NULL" state for disposition. Init to that instead of NOANSWER.

Then, arrange the priorities so:

#define AST_CDR_NULL      0
#define AST_CDR_FAILED    (1 << 0)
#define AST_CDR_BUSY      (1 << 1)
#define AST_CDR_NOANSWER  (1 << 2)
#define AST_CDR_ANSWERED  (1 << 3)

Among a group of channels rung, we set the call cdr disposition to the highest priority achieved on any one of them. If one got answered, the CDR records ANSWERED. If they were all BUSY, then you get BUSY, but if one rang and rang and didn't answer, then you want a NOANSWER. If they all failed for some reason, like UNREGISTERED or NO_ROUTE, then you want FAILED for the entire call. But if one was BUSY, that should win.
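
The rule described above can be sketched in a few lines of C. This is only an illustration of the "highest priority wins" idea, not the code in the FAILfix patch: the helper names merge_disposition and call_disposition are hypothetical, and the sketch assumes each dialed channel reports exactly one of the values from the #define list.

```c
#include <assert.h>

/* Values copied from the proposal above: a larger value outranks a smaller one. */
#define AST_CDR_NULL      0
#define AST_CDR_FAILED    (1 << 0)
#define AST_CDR_BUSY      (1 << 1)
#define AST_CDR_NOANSWER  (1 << 2)
#define AST_CDR_ANSWERED  (1 << 3)

/* Hypothetical helper: keep the current disposition unless the new outcome outranks it. */
static int merge_disposition(int current, int outcome)
{
    return outcome > current ? outcome : current;
}

/* Hypothetical helper: fold the per-channel outcomes of one Dial() into a single value. */
static int call_disposition(const int *outcomes, int count)
{
    int disposition = AST_CDR_NULL;
    int i;

    assert(count == 0 || outcomes != 0);
    for (i = 0; i < count; i++)
        disposition = merge_disposition(disposition, outcomes[i]);
    return disposition;
}
```

With this ordering, one unreachable peer (FAILED) plus one ringing peer (NOANSWER) folds to NOANSWER, and FAILED plus BUSY folds to BUSY. FAILED only survives when every dialed channel failed.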

That's my opinion. Anyway, there's some work, because the disposition is init'd to NOANSWER by default, and I'll have to make sure that such gets reported purposely.  The new disposition state, NULL, might actually show up in the CDRs under weird circumstances, but shouldn't normally.

The priority list is my gut speaking. If anyone thinks differently, let me know. I may have eaten too many Oreos today.

By: Steve Murphy (murf) 2007-05-16 16:22:02

I've uploaded the FAILfix patch. I figure that AST_CDR_NULL can be reported as "NO ANSWER" for disposition, which should be fairly backwards compatible. I made sure that, even if one of the dialed channels reports FAIL, that "NO ANSWER" gets returned, whether it times out, or if the dialer hangs up (CANCEL).

It should report BUSY if all the phones not reporting FAIL are BUSY-- I tried to test this, but I guess I set the call count too high for my polycom... I'll leave it to you to verify this.
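
As a rough sketch of the reporting side described in this comment: the new AST_CDR_NULL state maps to the same "NO ANSWER" text as AST_CDR_NOANSWER, so existing CDR consumers see no new string. The function name disposition_label and the "UNKNOWN" fallback are assumptions made for illustration; the patch itself may differ.

```c
/* Values from the proposed priority list earlier in this ticket. */
#define AST_CDR_NULL      0
#define AST_CDR_FAILED    (1 << 0)
#define AST_CDR_BUSY      (1 << 1)
#define AST_CDR_NOANSWER  (1 << 2)
#define AST_CDR_ANSWERED  (1 << 3)

/* Hypothetical helper: turn a disposition value into the text written to the CDR. */
static const char *disposition_label(int disposition)
{
    switch (disposition) {
    case AST_CDR_NULL:      /* nothing recorded yet: reported like NOANSWER */
    case AST_CDR_NOANSWER:
        return "NO ANSWER";
    case AST_CDR_FAILED:
        return "FAILED";
    case AST_CDR_BUSY:
        return "BUSY";
    case AST_CDR_ANSWERED:
        return "ANSWERED";
    default:                /* assumed fallback for values outside the list */
        return "UNKNOWN";
    }
}
```

Because NULL and NOANSWER share one label, a call where the caller hangs up (CANCEL) before any channel reports a result would still read "NO ANSWER" in the CDR under this sketch.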

By: Sam Deller (samdell3) 2007-05-16 17:21:14

Nice work, thanks guys.

Will apply the patch tonight and advise results soon thereafter

By: Sam Deller (samdell3) 2007-05-18 13:58:57

Patch applied without errors, recompiled with a few warnings, but so far, so good!  
Disposition is now correct on every scenario tested to date. I'll need a month or so to see how it runs in production before giving it the official thumbs up.

The warnings encountered during compile were:
cdr.c:532: warning: no previous prototype for `ast_cdr_noanswer'

I think that warning popped up about 3 or 4 times

But it works fine so who cares ;-)
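
For reference, GCC prints "no previous prototype" (via -Wmissing-prototypes) when a non-static function is defined without an earlier declaration in scope, which usually means the new function was not yet declared in a header. A minimal sketch of the usual remedy follows; the struct and the _sketch names are hypothetical stand-ins, and the body is only an assumption about what such a function might do, not the patch's code.

```c
#define AST_CDR_NOANSWER  (1 << 2)

/* Hypothetical stand-in for the real CDR structure. */
struct ast_cdr_sketch {
    int disposition;
};

/* Prototype, normally placed in a header: with this declaration visible
 * before the definition, -Wmissing-prototypes has nothing to report. */
void ast_cdr_noanswer_sketch(struct ast_cdr_sketch *cdr);

/* Definition: raise the disposition to NOANSWER only if it is currently lower
 * (assumed behaviour, following the priority scheme discussed in this ticket). */
void ast_cdr_noanswer_sketch(struct ast_cdr_sketch *cdr)
{
    if (cdr != 0 && cdr->disposition < AST_CDR_NOANSWER)
        cdr->disposition = AST_CDR_NOANSWER;
}
```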

By: Steve Murphy (murf) 2007-05-18 17:41:42

OK, thanks for helping with the testing. I'll be travelling, so now's a good time to commit this. If there's any probs, reopen this and I'll fix them.

In 1.2, this fix via v.65172
In 1.4, v.65200
In trunk, v. 65202


lost cdr.c fix in 1.2->1.4, fixed in 1.4 via v.65202, trunk v.65203