Summary: ASTERISK-06850: [patch] Asterisk 1.2.7.1 dying in alawtolin_framein at codec_alaw.c
Reporter: Anton Vazir (vazir)
Labels:
Date Opened: 2006-04-26 11:53:53
Date Closed: 2011-06-07 14:03:01
Priority: Major
Regression?: No
Status: Closed/Complete
Components: Core/CodecInterface
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: ( 0) translate.patch
Description: A gdb backtrace is given below. Load on the system is 20000-30000 calls per 24 hours, with approximately 3000 minutes of conversation per 24 hours. The box serves a single E1 from the PSTN and connects it over IAX2 to another Asterisk box (gateway), which then uses SIP to interconnect to the Internet. The IAX2 link between the two Asterisk boxes is set up to use the jitterbuffer and PLC.



****** ADDITIONAL INFORMATION ******

#0  alawtolin_framein (pvt=0x825e120, f=0xb7a2c804) at codec_alaw.c:173
173             tmp->outbuf[tmp->tail + x] = AST_ALAW(b[x]);
(gdb) backtrace
#0  alawtolin_framein (pvt=0x825e120, f=0xb7a2c804) at codec_alaw.c:173
#1  0x08069e74 in ast_translate (path=0x8130458, f=0xb7a2c804, consume=1) at translate.c:162
#2  0x080619a5 in ast_read (chan=0xb5e9c710) at channel.c:1933
#3  0xb732c7f0 in wait_for_answer (in=0xb5e9c710, outgoing=0x8153c10, to=0xb67dec10, peerflags=0xb67df568,
   sentringing=0xb67dec14, status=0xb67dedb0 "NOANSWER", statussize=256, busystart=0, nochanstart=0,
   congestionstart=0, priority_jump=0, result=0xb7a2c804) at app_dial.c:660
#4  0xb7329f40 in dial_exec_full (chan=0xb5e9c710, data=0x8153c10, peerflags=0xb67df568) at app_dial.c:1180
#5  0xb7328dc5 in dial_exec (chan=0xb7a2c804, data=0xb7a2c804) at app_dial.c:1619
#6  0x0808e445 in pbx_extension_helper (c=0xb5e9c710, con=0xb7a2c804, context=0xb5e9c860 "signalling",
   exten=0xb5e9c954 "005481099412450049", priority=1, label=0x0, callerid=0x0, action=0) at pbx.c:553
#7  0x0808efea in __ast_pbx_run (c=0xb5e9c710) at pbx.c:2227
#8  0x0808fcdf in pbx_thread (data=0xb7a2c804) at pbx.c:2514
#9  0xb7f15b63 in start_thread () from /lib/tls/libpthread.so.0
#10 0xb7e1018a in clone () from /lib/tls/libc.so.6
(gdb)  

Comments:

By: Andrey S Pankov (casper) 2006-04-27 08:55:33

Are there any log entries before it crashes?

By: Andrey S Pankov (casper) 2006-04-27 09:17:32

Can you do in gdb:
print tmp->tail
print x
print b[x]

Is it "generic PLC" related? Is it reproducible without PLC?

By: Anton Vazir (vazir) 2006-04-30 03:12:19

Sorry for the delay.
Here is the requested output:

Program terminated with signal 11, Segmentation fault.
...
#0  alawtolin_framein (pvt=0x82989a8, f=0x817978c) at codec_alaw.c:173
173             tmp->outbuf[tmp->tail + x] = AST_ALAW(b[x]);
(gdb) print tmp->tail
Attempt to extract a component of a value that is not a structure pointer.
(gdb) print x
$1 = 0
(gdb) print b[x]
Cannot access memory at address 0x0
(gdb)

Regarding the log:
I have the "messages" log, and there are no unusual entries in it. It just cuts off at the time of the segfault, and after that there are only startup messages.

I'll disable PLC for a few days to see. But the receiving Asterisk (also with PLC) does not die.



By: Denis Smirnov (mithraen) 2006-04-30 08:11:19

This patch should fix the segfault.

But why does the frame have data == NULL?

By: Andrey S Pankov (casper) 2006-04-30 08:30:36

That's not the way to fix it, Denis...

By: Anton Vazir (vazir) 2006-04-30 09:46:34

So, what would be the right way? As suggested by Denis, I've put the following check in codec_alaw:

/* Reset ssindex and signal to frame's specified values */
 b = f->data;
 if( !b ) {
  ast_log(LOG_ERROR, "BAD FRAME RECEIVED. SEGFAULT HERE!\n");
  return 0;
 }
We'll see whether it segfaults again.
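
For readers following along, the idea behind the check can be sketched in isolation. This is only an illustration under stated assumptions, not the actual codec_alaw.c code: `demo_frame`, `demo_pvt`, `demo_framein` and `demo_alaw_to_linear` are hypothetical stand-ins for the real frame, translator state and AST_ALAW lookup, and the A-law expansion is the textbook G.711 formula rather than Asterisk's table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define DEMO_BUF_SAMPLES 8000

/* Hypothetical stand-ins for the real frame and translator state. */
struct demo_frame {
    const unsigned char *data; /* payload; NULL is the condition seen in gdb */
    int datalen;               /* payload length in bytes */
};

struct demo_pvt {
    short outbuf[DEMO_BUF_SAMPLES];
    int tail;                  /* number of samples already in outbuf */
};

/* Textbook G.711 A-law expansion, used here instead of a lookup table. */
static short demo_alaw_to_linear(unsigned char a)
{
    int t, seg;

    a ^= 0x55;
    t = (a & 0x0f) << 4;
    seg = (a & 0x70) >> 4;
    if (seg == 0) {
        t += 8;
    } else {
        t += 0x108;
        t <<= seg - 1;
    }
    return (short)((a & 0x80) ? t : -t);
}

/* Returns 0 on success or on a skipped frame, -1 if the buffer is full. */
int demo_framein(struct demo_pvt *pvt, const struct demo_frame *f)
{
    int x;

    assert(pvt != NULL && f != NULL);

    /* Guard: never dereference a missing payload. */
    if (f->data == NULL) {
        fprintf(stderr, "frame has no payload, skipping\n");
        return 0;
    }
    /* Guard: never write past the end of the output buffer. */
    if (f->datalen < 0 || f->datalen > DEMO_BUF_SAMPLES - pvt->tail) {
        fprintf(stderr, "out of buffer space\n");
        return -1;
    }
    for (x = 0; x < f->datalen; x++)
        pvt->outbuf[pvt->tail + x] = demo_alaw_to_linear(f->data[x]);
    pvt->tail += f->datalen;
    return 0;
}
```

Note that a guard like this only hides the symptom: the frame is dropped and the call continues, but whatever produced a frame without data is still unexplained.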

By: Denis Smirnov (mithraen) 2006-04-30 09:55:37

The patch proposed in translate.patch is needed to fix this bug in all codecs, not only alaw.

But it seems to break PLC.

By: Andrey S Pankov (casper) 2006-04-30 10:32:17

It would be nice to capture the RTP stream that triggers the segfault.
Then it would be easier to fix the bug.
I gave some suggestions to Denis, and I hope he is on the right track now ;)

By: Denis Smirnov (mithraen) 2006-05-02 12:26:32

Vazir, can you add f->datalen and f->samples to your error logging?

If you have not deleted your old backtrace, please show f->datalen and f->samples.
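
A rough sketch of what such a diagnostic line could look like follows. `demo_frame_info` and `demo_describe_frame` are hypothetical names invented for this illustration. In the real codec the resulting text would presumably be passed to ast_log rather than built into a buffer, but that wiring is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical subset of the frame fields asked about above. */
struct demo_frame_info {
    const void *data;
    int datalen;
    int samples;
};

/*
 * Formats the fields into buf and returns what snprintf returns:
 * the length the full message needs, excluding the terminating NUL.
 */
int demo_describe_frame(char *buf, size_t len, const struct demo_frame_info *f)
{
    assert(buf != NULL && f != NULL);
    return snprintf(buf, len, "data=%s datalen=%d samples=%d",
                    f->data ? "set" : "NULL", f->datalen, f->samples);
}
```

Logging one such line per bad frame keeps disk usage small compared with dumping every frame.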

By: Serge Vecher (serge-v) 2006-05-10 11:18:03

Casper, Mithraen:

are you guys working on this issue?

By: Andrey S Pankov (casper) 2006-05-10 13:30:30

vechers: why do you think I'm working on it? ;)

By: Anton Vazir (vazir) 2006-05-10 13:40:03

I've put a check for an empty (b) in codec_alaw.c and now it does not crash.
I'm thinking of adding a frame dump to the procedure, but the example code uses ast_verbose, and I can't log verbose traffic because HDD space on that PC is too low. So when time permits I'll do different logging, or maybe someone has another idea? The bug is a nasty one.



By: Serge Vecher (serge-v) 2006-05-19 16:41:29

vazir: trying to get to the original problem here is almost impossible without further debugging information. Have you done any code modifications? Can you increase hard-drive space so that some logging can be done? You should not have low resources on a system handling 30K calls per day.

By: Kevin P. Fleming (kpfleming) 2006-05-19 20:08:17

A significant memory-allocation problem in chan_iax2 was just fixed. Please update your system to the latest SVN branch-1.2 code and try to reproduce the problem again. Thanks!

By: Anton Vazir (vazir) 2006-05-20 10:33:16

Thanks Kevin, will try that way!

2 vechers: I will upgrade the hardware on that PC and enable full debug for a few months. We'll see what happens. It's a production PC, though, so it will take a little time to get to a low-usage time window.

By: Serge Vecher (serge-v) 2006-05-26 09:58:15

vazir: if this is still an issue after trying the 1.2 branch code (rev > 30000) and upgrading hardware, please feel free to reopen with the debugging information provided.

Thank you.