
Summary: ASTERISK-06751: segfault in malloc and calloc
Reporter: Roy Sigurd Karlsbakk (rkarlsba)
Labels:
Date Opened: 2006-04-11 05:51:15
Date Closed: 2006-06-23 14:49:17
Priority: Critical
Regression?: No
Status: Closed/Complete
Components: Core/General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: (0) bt-bt-full.txt
Description: Asterisk just crashed during SIP INVITE, or so it seems. See attached backtrace for debug info

Comments:

By: Andrey S Pankov (casper) 2006-04-11 06:00:38

It fails in malloc. This could be a hardware or system installation/configuration issue.

By: Andrey S Pankov (casper) 2006-04-11 06:04:33

The only suspicious log entries are:
sched.c: Attempted to delete nonexistent schedule entry...

By: Roy Sigurd Karlsbakk (rkarlsba) 2006-04-11 07:13:19

it's not installation/config and i seriously doubt it's hardware. the box has been running with this installation for months and has only been restarted on crashes and for upgrades.

By: Olle Johansson (oej) 2006-04-11 10:59:50

Roy, didn't you have a scheduler error before?

By: Roy Sigurd Karlsbakk (rkarlsba) 2006-04-12 17:04:04

Olle, What do you mean?

By: wILMAR cAMPOS (willcampos) 2006-05-13 10:28:44

i also have a crash for the same reason.

#0  0x4017fffe in malloc_consolidate () from /lib/libc.so.6
No symbol table info available.
#1  0x4017f703 in _int_malloc () from /lib/libc.so.6
No symbol table info available.
#2  0x4017f080 in calloc () from /lib/libc.so.6
No symbol table info available.
#3  0x406c2f12 in sip_alloc (callid=0xbe1febb8 "d80dffffc8feffff@192.168.40.100", sin=0xbe1fe7a4, useglobal_nat=1, intended_method=2) at chan_sip.c:3038
       p = (struct sip_pvt *) 0xbe1fe7a4
#4  0x406d73b9 in find_call (req=0xbe1fe7b4, sin=0x812e018, intended_method=135454744) at chan_sip.c:3195
       found = 1081035376
       p = (struct sip_pvt *) 0xbe1febb8
       callid = 0xbe1febb8 "d80dffffc8feffff@192.168.40.100"
       tag = 0x406ea331 ""
       totag = '\0' <repeats 56 times>, "ûÓ\002@ô_#@\234æ\037¾¸Ó\002@¼æ\037¾ÏV\030@pÞ\002@ûÓ\002@ô_#\000$Do@\030\000\000\000\002\000\000\000àû\037¾ô/\003@àû\037¾\210Jo@ìæ\037¾­ª\002@"
       fromtag = '\0' <repeats 127 times>
#5  0x406d5aa7 in sipsock_read (id=0x81703e8, fd=18, events=1, ignore=0x0) at chan_sip.c:11151
       req = {rlPart1 = 0xbe1fe9cc "REGISTER", rlPart2 = 0xbe1fe9d5 "sip:sipserver", len = 705, headers = 13, method = 2, header = {0xbe1fe9cc "REGISTER",
   0xbe1fe9f4 "Via: SIP/2.0/UDP 192.168.40.100:5062;branch=z9hG4bK2af9ffff5deeffff",
   0xbe1fea39 "From: \"Luis FXO House\" <sip:17000000002@sipserver;user=phone>;tag=0ca300000ccfffff", 0xbe1fea95 "To: <sip:17000000002@sipserver;user=phone>",
   0xbe1feac9 "Contact: <sip:17000000002@192.168.40.100:5062;user=phone>",
   0xbe1feb04 "Authorization: Digest username=\"17000000002\", realm=\"asterisk\", algorithm=MD5, uri=\"sip:sipserver\", nonce=\"4f9c19f9\", response=\"1464111d85216bc20148389bd69df4b5\"", 0xbe1febaf "Call-ID: d80dffffc8feffff@192.168.40.100", 0xbe1febd9 "CSeq: 3888 REGISTER", 0xbe1febee "Expires: 180",
   0xbe1febfc "User-Agent: Grandstream HT488 1.0.2.16", 0xbe1fec24 "Max-Forwards: 70", 0xbe1fec36 "Allow: INVITE,ACK,CANCEL,BYE,NOTIFY,REFER,OPTIONS,INFO,SUBSCRIBE",
   0xbe1fec78 "Content-Length: 0", 0xbe1fec8b "", 0x0 <repeats 50 times>}, lines = 0, line = {0xbe1fec8d "", 0x0 <repeats 63 times>},
 data = "REGISTER\000sip:sipserver\000SIP/2.0\000\000Via: SIP/2.0/UDP 192.168.40.100:5062;branch=z9hG4bK2af9ffff5deeffff\000\000From: \"Luis FXO House\" <sip:17000000002@sipserver;user=phone>;tag=0ca300000ccfffff\000"..., debug = 0, flags = 0}
       sin = {sin_family = 2, sin_port = 50707, sin_addr = {s_addr = 61597976}, sin_zero = "\000\000\000\000\000\000\000"}
       p = (struct sip_pvt *) 0xbe1fe7b4
       res = 2
       len = 16
       nounlock = 0
       recount = 0
       iabuf = '\0' <repeats 15 times>
#6  0x080558cd in ast_io_wait (ioc=0x816b0b8, howlong=135454744) at io.c:284
       res = 1
       x = 0

By: Andrey S Pankov (casper) 2006-05-15 16:05:00

roy: please contact me directly...

Could you give some details on distro/platform/gcc version please?
I assume that should be slack 10.1/10.2 with gcc 3.3.4/3.3.6 :)

By: wILMAR cAMPOS (willcampos) 2006-05-15 21:41:08

Reading specs from /usr/lib/gcc-lib/i486-slackware-linux/3.3.4/specs
Configured with: ../gcc-3.3.4/configure --prefix=/usr --enable-shared --enable-threads=posix --enable-__cxa_atexit --disable-checking --with-gnu-ld --verbose --target=i486-slackware-linux --host=i486-slackware-linux
Thread model: posix
gcc version 3.3.4


Asterisk 1.2.7.1

Slackware 10.1 kernel 2.4

By: Roy Sigurd Karlsbakk (rkarlsba) 2006-05-16 03:26:19

I'm on debian sarge, hand-compiled kernel 2.6.16.4

gcc (GCC) 3.4.4 20050314 (prerelease) (Debian 3.4.3-13)

By: Olle Johansson (oej) 2006-05-16 03:39:34

One fails in malloc, another in calloc... Bad. Not really a SIP problem though. We need other people to look into this.

By: Andrey S Pankov (casper) 2006-05-16 08:15:25

That's a bug in Debian and Slackware, and several other systems are affected.

Since this is a support issue IMO and nobody has a solution for now,
I'd ask roy and willcampos to contact me directly (my email is available
by searching the dev@ mailing list). Sorry...

By: wILMAR cAMPOS (willcampos) 2006-05-16 10:03:06

I am sorry casper, i am very new at lists, and don't know how to get your address.

wilmar.campos@gmail.com is my direct email.

thanks.

By: Kevin P. Fleming (kpfleming) 2006-05-18 16:32:14

How is this a 'support issue' and they should contact casper directly? I'm confused :-)

Clearly something is badly corrupting memory here. The stack trace from willcampos shows a completely invalid value for the third argument to find_call(), among other things.

Do you two have _any_ idea what sequence of events is leading to this?
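
One quick way to look at a suspicious integer argument is to print it in hex and compare it with the pointer values in the same backtrace. The sketch below does that; the helper name `format_as_pointer` is made up for illustration and is not part of Asterisk or gdb. For the value shown in frame #4, 135454744 formats as 0x812e018, which is the same value as the `sin` argument in that frame. That might point at gdb reading a stale register in an optimized build rather than at a corrupted argument, though it does not rule out corruption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format an integer frame argument the way gdb prints
 * pointers, so it can be compared by eye with other values in the trace. */
void format_as_pointer(unsigned int value, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);
    snprintf(buf, buflen, "0x%x", value);
}
```

Calling `format_as_pointer(135454744u, buf, sizeof(buf))` with a 32-byte buffer leaves the string `0x812e018` in `buf`.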

By: wILMAR cAMPOS (willcampos) 2006-05-18 17:04:54

From what I can see, this is a problem in chan_sip. I notice that SVN trunk now uses wrappers like ast_calloc to handle these memory allocations.
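
As a rough illustration of what an allocation wrapper of that kind can look like, here is a minimal sketch. It is not Asterisk's actual `ast_calloc`; the names `checked_calloc` and `CHECKED_CALLOC` are assumptions made up for this example. It only reports a failed allocation together with the call site, so it would help with out-of-memory conditions but would not by itself prevent a crash caused by heap corruption.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper (not the real ast_calloc): zero-allocate num * len
 * bytes and log the caller's file and line if the allocation fails. */
void *checked_calloc(size_t num, size_t len, const char *file, int line)
{
    void *p = calloc(num, len);

    /* calloc may legitimately return NULL for a zero-sized request,
     * so only report a failure when some memory was actually asked for. */
    if (!p && num != 0 && len != 0) {
        fprintf(stderr, "Memory allocation failure (%zu x %zu bytes) at %s:%d\n",
                num, len, file, line);
    }
    return p;
}

/* Convenience macro that records the call site automatically. */
#define CHECKED_CALLOC(num, len) checked_calloc((num), (len), __FILE__, __LINE__)
```

A caller would still have to check the returned pointer for NULL before using it, for example `p = CHECKED_CALLOC(1, sizeof(*p)); if (!p) return NULL;`.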

Anyway, my Asterisk is crashing every 2 days, and unfortunately I don't have the money to pay for the support casper is asking for.

I would love to help solve this problem, but I simply don't know how.

By: Serge Vecher (serge-v) 2006-05-19 11:29:53

willcampos: can you please attach a backtrace from a non-optimized build of Asterisk after the crash? Also, if possible, turn on SIP debug and attach that here as well.

Thanks



By: Denis Smirnov (mithraen) 2006-05-20 13:01:43

malloc/calloc mostly segfaults when malloc's internal bookkeeping data is corrupted.

That means something is writing before or after the real data.

It's not a glibc bug; it _can_ be a bug in the gcc optimizer (which is rare), or it's a bug in the code.
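
One common way to catch a write before or after the real data is to surround each allocation with guard bytes and check them later. The following is a minimal sketch of that idea, not anything taken from Asterisk or glibc; the names `guarded_alloc`, `guarded_check` and `guarded_free`, the guard length and the guard byte value are all assumptions chosen for the example.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN  8     /* guard bytes on each side of the user data */
#define HEADER_LEN 16    /* 8 bytes for the stored size + 8 front guard bytes
                          * (assumes sizeof(size_t) <= 8) */
#define GUARD_BYTE 0xA5  /* arbitrary pattern unlikely to appear by accident */

/* Block layout: [size][front guard][user data][rear guard] */
void *guarded_alloc(size_t size)
{
    unsigned char *base;

    if (size > SIZE_MAX - HEADER_LEN - GUARD_LEN)
        return NULL;                      /* total size would overflow */
    base = calloc(1, HEADER_LEN + size + GUARD_LEN);
    if (!base)
        return NULL;
    memcpy(base, &size, sizeof(size));    /* remember the user size */
    memset(base + HEADER_LEN - GUARD_LEN, GUARD_BYTE, GUARD_LEN);
    memset(base + HEADER_LEN + size, GUARD_BYTE, GUARD_LEN);
    return base + HEADER_LEN;
}

/* Returns 1 if both guard regions are intact, 0 if either was overwritten. */
int guarded_check(const void *ptr)
{
    const unsigned char *user = ptr;
    const unsigned char *base = user - HEADER_LEN;
    size_t size, i;

    memcpy(&size, base, sizeof(size));
    for (i = 0; i < GUARD_LEN; i++) {
        if (base[HEADER_LEN - GUARD_LEN + i] != GUARD_BYTE)
            return 0;                     /* write before the data */
        if (user[size + i] != GUARD_BYTE)
            return 0;                     /* write after the data */
    }
    return 1;
}

void guarded_free(void *ptr)
{
    if (ptr)
        free((unsigned char *)ptr - HEADER_LEN);
}
```

Calling `guarded_check()` at suspicious points, or just before `guarded_free()`, narrows down which buffer is being overrun. A write that lands beyond the guard region is not detected by this sketch.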

By: wILMAR cAMPOS (willcampos) 2006-05-20 13:05:41

I will update with the backtrace soon...

I am waiting for the segfault to happen again...

By: Roy Sigurd Karlsbakk (rkarlsba) 2006-05-23 03:49:47

after unsuccessfully having tried to contact casper off the list to get some info about these compiler problems, I still look for a solution here....

thanks

roy

By: Serge Vecher (serge-v) 2006-05-23 09:02:34

rkarlsba: does this still occur in latest _unmodified_ 1.2 branch code?

By: Andrey S Pankov (casper) 2006-05-23 19:09:59

>after unsuccessfully having tried to contact casper off the list to get some info
>about these compiler problems, I still look for a solution here....

roy: sorry for that. <censored part goes here...>

By: Roy Sigurd Karlsbakk (rkarlsba) 2006-05-23 19:15:39

casper,
If you ask people to contact you off the bugtracker, you should perhaps follow up.
If you do /not/ want people to contact you about bugs, perhaps you'd better refrain from asking people to do so.

roy

By: Serge Vecher (serge-v) 2006-05-30 15:17:56

willcampos: does your patch in 6831 fix this issue also?

For the future, always post backtraces and logs as attachments, please. Thank you.

By: wILMAR cAMPOS (willcampos) 2006-05-30 21:31:46

Unfortunately no, Asterisk continues crashing on kernel 2.4.29.

I have upgraded another system from 2.4.29 to 2.6.16.18 and it has been very stable for 1 week, with no crashes.

By: Serge Vecher (serge-v) 2006-06-08 15:46:48

willcampos: if the kernel upgrade resolved the issue, then I don't believe Asterisk is at fault.

again, is this reproducible in 1.2.9.1 with unpatched sources?

By: Roy Sigurd Karlsbakk (rkarlsba) 2006-06-09 08:51:38

I have this problem on recent kernels

By: Serge Vecher (serge-v) 2006-06-09 08:53:41

>I have this problem on recent kernels
with unpatched 1.2.9.1?

By: Roy Sigurd Karlsbakk (rkarlsba) 2006-06-09 09:02:27

don't know yet.
currently running 1.2.6
i'm upgrading this weekend

By: wILMAR cAMPOS (willcampos) 2006-06-09 12:22:09

For me the problem has stopped...

I am using 1.2.8 and it is working smoothly.

One thing I noticed is that with the new kernel the amount of free memory increased, so I don't know if this happens when memory is low?

The truth is that it is not happening on my system anymore. I have the 2.6.16.18 kernel now.

By: Serge Vecher (serge-v) 2006-06-19 13:18:32

problem no more with 1.2.9.1? Ready to close?

By: Roy Sigurd Karlsbakk (rkarlsba) 2006-06-21 16:39:47

Please wait a week or so

By: Roy Sigurd Karlsbakk (rkarlsba) 2006-06-23 06:17:41

well, asterisk has been running stably for almost a week now, so I guess this is fixed. I can reopen this if it crashes again

roy

By: Serge Vecher (serge-v) 2006-06-23 09:01:07

rkarlsba: what revision is this running stable on and which revision do you think has fixed it?

By: Roy Sigurd Karlsbakk (rkarlsba) 2006-06-23 13:32:52

running 1.2.9.1 now

By: Serge Vecher (serge-v) 2006-06-23 14:49:15

Well, since both royk and willcampos no longer observe crashes, closing the bug for now. If the exact same problem reoccurs, please feel free to reopen the bug with a new backtrace.