Summary:ASTERISK-17826: [patch] 3 examples of loss of CDR data
Reporter:Steve Davies (one47)Labels:
Date Opened:2011-05-10 05:23:03Date Closed:2017-12-19 06:52:49.000-0600
Versions: 10.0.0 Frequency of
Environment:Attachments:( 0) cdr_xfer_cdr_cleanup.v10.part2.patch
( 1) cdr_xfer_cdr_cleanup.v10.patch
( 2) cdr-patch1-cleanup
( 3) cdr-patch2-explicit_goto
Description:Discovered in, but looking at 1.8 and trunk, it probably still applies, as this code appears to be largely unchanged.

If transferring (ast_do_masquerade) a bridged channel into an unbridged channel, CDR data is lost. I have found 3 scenarios so far where CDR data or records are lost.


Scenario 1. SIP handset transfer to a ringing call.

A bridged to B
A' ringing C (not answered)
A SIP Redirects to B to C. ast_do_masquerade() swaps the CDR records between A and A' - Overwriting the A CDR is harmless because it is in a bridge, and ast_bridge_cdr() has a copy of its CDR stored in 'bridge_cdr'. Overwriting the A' CDR permanently destroys the only copy of that data (The copy on A is destroyed after the hangup). In this case, 2 CDR records are written, but the "A' to C" leg of the call contains data from the "A to B" leg in error.

Scenario 2. SIP Transfer IVR to a bridged caller.
Scenario 3. SIP Transfer bridged caller to IVR.

These 2 scenarios are actually equivalent because scenario 2 is transmuted into scenario 3 by chan_sip.c before being executed. It pans out very similarly to the calls above, but only 1 CDR is ever written, the CDR describing the IVR leg of the call is never saved - I assume that being copied into a bridge causes it to be ignored in favour of the 'ast_bridge_cdr' copy.
Comments:By: Steve Davies (one47) 2011-05-10 05:25:35

Sadly I do not yet have a patch for this, but can think of 2 possible solutions. One is to flag the CDRs that are going to be overwritten by a bridge so that they are not swapped by ast_do_masquerade(). The other is that CDRs are simply never swapped where one channel is bridged and the other unbridged.

IMHO This needs the attention of a CDR expert.

By: Leif Madsen (lmadsen) 2011-05-10 08:20:43

I'm going to mark this as Acknowledged, but I know anyone who works on CDRs is very hesitant about making changes to the CDR code because every time that happens, 2-3 separate issues pop up because of a change in functionality somewhere.

By: Leif Madsen (lmadsen) 2011-05-10 08:21:10

In fact, I'd suggest you bring this up on the asterisk-dev mailing list so this can be discussed by the community to determine what, if anything, should be done here.

By: Steve Davies (one47) 2011-05-10 09:29:47

This is definitely "non-trivial in the extreme" - I have found the 1.6.0 CDR update commit from 2008 that brings things to more-or-less their current position.

Fundamentally, the 3 cases above ought to be treated like attended transfers of bridged calls - I will do some research, and take the conversation to the -dev list as suggested.

By: Steve Davies (one47) 2011-05-12 07:03:24

It transpires that Scenarios 2 and 3 fail differently depending on which party is transferred out of the bridge. If it is the originating channel, which has a chan->pbx object, then it can use ast_explicit_goto, otherwise it uses ast_async_goto involving a masquerade.

I am uploading 2 patches for comment.

- cdr-patch1-cleanup: removes some code which I believe to be redundant. This is just changes I stumbled on while understanding the code.

- cdr-patch2-explicit_goto: If a call returns to the pbx loop due to ast_async_goto, and has a CDR, and is not about to hang-up, and the channel is currently AST_STATE_UP, then any time that elapses is "billable". Unfortunately the specialised CDR reset will have made it an AST_CDR_NULL record, so we need to re-answer the CDR.

Without the 2nd patch, you get different CDR data for the following 2 cases:

exten => *600,1,Playback(demo-echotest)
same  =>      n,Hangup

exten => *601,1,Answer
same  =>      n,Playback(demo-echotest)
same  =>      n,Hangup

With the patch, you get the same result.

The only obvious difference this patch will cause is that if you blind-transfer to a dialplan with a Dial() command, then the ring-duration for that Dial() command cannot be recorded in the CDR. I believe this is correct, because Ringing being played on an UP channel is still billable seconds.

By: Leif Madsen (lmadsen) 2011-05-12 09:04:02

I'm going to upgrade this to Ready for Testing. Thanks for taking the time to analyse all these. I know CDRs are a bit of black magic.

By: Steve Davies (one47) 2012-01-03 10:34:46.452-0600

patch updated for v10

By: Steve Davies (one47) 2012-01-03 10:40:49.672-0600

I have just attached a version 10 update of this patch. I have not tested it yet as it is a forward-port of a patch I've been running in production for quite a while.

Basically, the meaning of this patch is - If AST_SOFTHANGUP_ASYNCGOTO (Blind Xfer) has just returned a channel to a PBX execution state, then:

a) ast_clear_flag(c,AST_FLAG_BRIDGE_HANGUP_DONT);

This removes any outstanding AST_FLAG_BRIDGE_HANGUP_DONT flag. This occurs when a ringing channel is "Redirect"ed back to a PBX - There is no bridge, so the natural place for this flag to be cleared at the end of the bridge never occurs.

b) if( c->cdr && c->_state == AST_STATE_UP && !ast_check_hangup(c) ) ast_cdr_answer(c->cdr);

If an answered channel is "Redirect"ed into the PBX, in a way that causes a new PBX and CDR to be created, the new CDR channel defaults to unanswered, resulting in incorrect CDR data. Answering it here cleans that up.

As mentioned, I'm running the equivalent of this code here, with some success, and I'm interested in some feedback from a CDR expert.

By: Steve Davies (one47) 2012-01-03 11:17:37.488-0600

Just found an issue with the above patch, will follow-up when I've pinned it down.

By: Steve Davies (one47) 2012-01-04 04:43:14.993-0600

Issue was elsewhere. Raised separate issue#19164.

By: Richard Mudgett (rmudgett) 2012-01-04 12:28:47.207-0600

FYI Please use the full identifier: ASTERISK-19164
The issue#19164 is a throwback to Mantis which is not valid in JIRA.

By: Vitezslav Novy (vnovy) 2012-03-13 05:56:32.952-0500

I have applied cdr_xfer_cdr_cleanup.v10.patch on ver. 1.8.8 but still, scenario
A calls B, B answers
A is redirected by AMI Redirect to C, C doesn't answer
end of call

Produces 1 CDR record
A->C, NO ANSWER, but bilsec>0 and covers both parts of call

BTW why AST_FLAG_BRIDGE_HANGUP_DONT implies discarding bridge CDR
(in ast_bridge_call)
I tried to write it down instead and it looks better, but I did'n test other scenarios.

By: Steve Davies (one47) 2012-03-13 06:50:18.649-0500

Explaining the discarded CDR when AST_FLAG_BRIDGE_HANGUP_DONT is set is hard, so I won't bother, but be warned that it is correct that the code removes it, and if you do not remove it, you will get odd duplicate CDR records in some cases.

I will upload a small patch that might fix your specific issue. Please let me know if it works because I might use it myself if it does :)

By: Vitezslav Novy (vnovy) 2012-03-21 07:57:55.031-0500

I have applied cdr_xfer_cdr_cleanup.v10.part2.patch but behavior is same.

By: Joshua C. Colp (jcolp) 2017-12-19 06:52:49.103-0600

CDR support has been rewritten in Asterisk 13 and due to the way it is now defined and works I don't believe this is still applicable.

If I'm mistaken please feel free to reopen by commenting and pointing out where in the specific things are going wrong.