[Home]

Summary:ASTERISK-18024: Incorrect data in CDRs (billsec >> durration)
Reporter:gicc (gicc)Labels:
Date Opened:2011-06-15 12:51:05Date Closed:2011-07-27 08:10:53
Priority:MajorRegression?
Status:Closed/CompleteComponents:CDR/General
Versions:1.8.4 Frequency of
Occurrence
Frequent
Related
Issues:
Environment:Attachments:( 0) 20110726__ASTERISK-18024.diff
( 1) ASTERISK-18024.txt
Description:We've discovered, that Asterisk sometimes produces incorrect billing records.

We see lot of CDR's with billsec higher than durration. Today we found approx 200 CDR's that had durration <30sec and billsec >5000sec. This is obviously impossible.
CDR start field: 2011-06-13 16:11:52
CDR end field: 2011-06-13 16:12:18
Billsec: 7048
Durration: 26
Disposition: Answered

Logs from upstream loadbalancer indicates, that call had indeed 26sec (there was clock shift 17sec between loadbalancer & asterisk)

Jun 13 16:11:35 V0P034-VoIP-LB /usr/local/sbin/opensips[6129]: New request - M=INVITE RURI=sip:123456789@lb-gw.local F=sip:154054@192.168.85.134 T=sip:123456789@lb-gw.local IP=192.168.85.134 ID=2577f30c3704069702c668e333a991cd@192.168.85.134:5060
Jun 13 16:12:01 V0P034-VoIP-LB /usr/local/sbin/opensips[6141]: New request - M=ACK RURI=sip:123456789@192.168.85.135:5060 F=sip:154054@192.168.85.134 T=sip:123456789@lb-gw.local IP=192.168.85.134 ID=2577f30c3704069702c668e333a991cd@192.168.85.134:5060
Jun 13 16:12:01 V0P034-VoIP-LB /usr/local/sbin/opensips[6134]: New request - M=INVITE RURI=sip:123456789@192.168.85.135:5060 F=sip:154054@192.168.85.134 T=sip:123456789@lb-gw.local IP=192.168.85.134 ID=2577f30c3704069702c668e333a991cd@192.168.85.134:5060
Jun 13 16:12:01 V0P034-VoIP-LB /usr/local/sbin/opensips[6135]: New request - M=ACK RURI=sip:123456789@192.168.85.135:5060 F=sip:154054@192.168.85.134 T=sip:123456789@lb-gw.local IP=192.168.85.134 ID=2577f30c3704069702c668e333a991cd@192.168.85.134:5060
Jun 13 16:12:01 V0P034-VoIP-LB /usr/local/sbin/opensips[6135]: New request - M=INVITE RURI=sip:123456789@192.168.85.135:5060 F=sip:154054@192.168.85.134 T=sip:123456789@lb-gw.local IP=192.168.85.134 ID=2577f30c3704069702c668e333a991cd@192.168.85.134:5060
Jun 13 16:12:01 V0P034-VoIP-LB /usr/local/sbin/opensips[6134]: New request - M=ACK RURI=sip:123456789@192.168.85.135:5060 F=sip:154054@192.168.85.134 T=sip:123456789@lb-gw.local IP=192.168.85.134 ID=2577f30c3704069702c668e333a991cd@192.168.85.134:5060
Jun 13 16:12:01 V0P034-VoIP-LB /usr/local/sbin/opensips[6131]: New request - M=INVITE RURI=sip:123456789@192.168.85.135:5060 F=sip:154054@192.168.85.134 T=sip:123456789@lb-gw.local IP=192.168.85.134 ID=2577f30c3704069702c668e333a991cd@192.168.85.134:5060
Jun 13 16:12:01 V0P034-VoIP-LB /usr/local/sbin/opensips[6129]: New request - M=ACK RURI=sip:123456789@192.168.85.135:5060 F=sip:154054@192.168.85.134 T=sip:123456789@lb-gw.local IP=192.168.85.134 ID=2577f30c3704069702c668e333a991cd@192.168.85.134:5060
Jun 13 16:12:01 V0P034-VoIP-LB /usr/local/sbin/opensips[6133]: New request - M=BYE RURI=sip:123456789@192.168.85.135:5060 F=sip:154054@192.168.85.134 T=sip:123456789@lb-gw.local IP=192.168.85.134 ID=2577f30c3704069702c668e333a991cd@192.168.85.134:5060


Another example we found out, Disposition: NO ANSWER, billsec: 9728, durration: 28.

We saw this behaviour in 1.6 as well.

cdr.conf
[general]
enable=yes
batch=yes
size=100
time=600
scheduleronly=no
safeshutdown=yes
unanswered =no
endbeforehexten=yes

[csv]
usegmtime=no
loguniqueid=yes
loguserfield=yes
Comments:By: gicc (gicc) 2011-06-16 07:29:26.715-0500

I was able to track back verbose log of such 'broken' call (ASTERISK-18024.txt added to this case).
CDR Start: 2011-06-16 08:14:25
CDR end: 2011-06-16 08:14:53
duration: 26
billsec: 216

By: gicc (gicc) 2011-06-20 09:55:43.175-0500

Seems issue is on cdr mysql backend, CDR's written to CSV has correct billsec.
Couple of examples from today:
MySQL:
Start: 2011-06-20 08:24:17
Stop: 2011-06-20 08:24:34
billsec: 41
duration: 17
disposition: ANSWERED

CSV:
Start: 2011-06-20 08:24:17
Stop: 2011-06-20 08:24:34
billsec: 0
duration: 17
dispostion: ANSWERED

another one:


MySQL:
Start: 2011-06-20 14:54:38
Stop: 2011-06-20 14:55:06
billsec: 183
duration : 28
disposition: Answered

CSV
Start: 2011-06-20 14:54:38
Stop: 2011-06-20 14:55:06
billsec: 0
duration : 28
disposition: Answered

By: gicc (gicc) 2011-06-20 15:56:28.895-0500

I've checked ~50 CDR's with incorrect billsec.
All of them should have as billsec = 0 (checked with up&down-stream load balancers as well).

So clearly bug is somewhere in cdr_mysql.so
We were not able to reproduce such broken CDR in lab.

By: gicc (gicc) 2011-06-21 05:17:57.497-0500

After next investigaions we found out that issue occurs only if batch=yes is set.
When we changed it batch=no we can't see corrupted CDR's anymore.

I hoppe that this will help to develop resoultion.

By: gicc (gicc) 2011-06-22 18:08:36.980-0500

Now I can confirm for sure, that corruption gets in place when batch=yes.
Number of broken CDR's by day (where billsec > duration) :

2011-06-01 237
2011-06-02 278
2011-06-03 266
2011-06-04 181
2011-06-05 23
2011-06-06 264
2011-06-07 313
2011-06-08 281
2011-06-09 291
2011-06-10 284
2011-06-11 179
2011-06-12 11
2011-06-13 636
2011-06-14 319
2011-06-15 229
2011-06-16 245
2011-06-17 246
2011-06-18 196
2011-06-19 11
2011-06-20 265
2011-06-21 42
2011-06-22 0

On 2011-06-21 we disabled batch cdr processing

By: Stuart Elvish (stuarte) 2011-07-13 11:20:59.588-0500

Have a similar problem with ODBC CDR logging. No changes made to the default configuration files (except un-commenting batch=no in cdr.conf in case there was a bug preventing the default from being applied). Exactly the same symptoms. Billing seconds is populated with the duration of the call and duration of the call has billing seconds.

This issue is affecting Asterisk versions 1.8.4.x (I think .3 and .4 at least) and 1.8.5.x.

By: Stuart Elvish (stuarte) 2011-07-25 04:31:10.572-0500

Some additional information which may help trace the problem.

Using 1.8.5.0, an SQL statement with ${CDR(billsec)} inserts the correct number of billing seconds (the figure matches what is printed in the duration column of the ODBC insert statement for CDR's) when run from the "h" extension. The same call logged to Master.csv shows the correct order (duration, billsec).

I think it is safe to say this issue is isolated to the incorrect generation of the CDR insert statement rather than incorrect "population" of the variable within Asterisk.

I am not a programmer but would really like this problem resolved ASAP. Can anyone give me some pointers where to start looking at this issue? It seems like it should be a fairly simple resolution.

By: Tilghman Lesher (tilghman) 2011-07-26 15:23:02.828-0500

Patch uploaded.  Problem seems to have occurred with the introduction of high resolution time.

By: gicc (gicc) 2011-07-27 04:04:48.727-0500

Does patch resolve issue in cdr_mysql.c as welll? As I saw it's only for cdr_odbc.c