Summary: | ASTERISK-06735: Asterisk randomly segfaults - Appears to be chan_iax2 | ||
Reporter: | Trevor Hammonds (trevmeister) | Labels: | |
Date Opened: | 2006-04-08 09:11:39 | Date Closed: | 2006-05-11 04:38:59 |
Priority: | Critical | Regression? | No |
Status: | Closed/Complete | Components: | Core/General |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) output.txt ( 1) output2.txt ( 2) output2-server2.txt ( 3) output-server2.txt | |
Description: | Not sure why this is happening, but it has been happening very often since updating two days ago. Server is a Dell PowerEdge 2850, dual Xeon 3GHz, 2GB RAM, CentOS 4.3, Kernel 2.6.9-34.ELsmp. Let me know if there is any more information I can provide that would help... | ||
Comments: | By: Andrey S Pankov (casper) 2006-04-08 16:21:13
Maybe some logs...

By: Andrey S Pankov (casper) 2006-04-08 16:24:28
print iaxs[callno]->sockfd
print iaxs[callno]->addr
or something more appropriate (at line 1629) in gdb

By: Trevor Hammonds (trevmeister) 2006-04-08 22:22:39
FYI. I downgraded through several revisions until I found a stable one. SVN rev 16006 has not crashed for several hours (all day), whereas 16306 was the last rev that crashed regularly with the same condition.

By: Trevor Hammonds (trevmeister) 2006-04-08 22:27:14
Casper, sorry, I have no idea how to do what you are asking. I will revert to the most current revision and get the info you need, if you let me know what I should do. As to logs, they do not indicate anything before the crash.

By: BJ Weschke (bweschke) 2006-04-10 19:57:48
fixed in r16386 of /trunk

By: Trevor Hammonds (trevmeister) 2006-04-11 04:36:21
Sorry to re-open this, but this issue was not fixed in r16386. I tried r16759, r16745, r16671, r16558, and r16386. All had the same random segfault. IAX calls were not usually in progress when the crashes happened.

By: Andrey S Pankov (casper) 2006-04-11 04:46:47
Any logs with 'set verbose 4', 'set debug 4', 'iax2 debug' and log output enabled for warning,notice,error,verbose,debug? Can you do in gdb:
(gdb) bt <press Enter>
(gdb) print iaxs[callno]->sockfd <press Enter>
(gdb) print iaxs[callno]->addr <press Enter>
It seems like sockfd or addr is null there...

By: BJ Weschke (bweschke) 2006-04-11 07:00:43
Trevmeister - we're up to 19000+ with the commits at this point, and chan_iax2 has had more fixes/improvements since 16386, which was a pretty significant bug fix. If you can, please test on the most current /trunk and post a complete bt to this bug and we'll get the right folks to take a look at it. Thanks.

By: Trevor Hammonds (trevmeister) 2006-04-11 09:27:14
Updated to latest SVN trunk. Asterisk died within a couple of hours.
I have reverted to the last known stable revision (16006) for my setup. I have attached the latest gdb information. The first line of the file, however, is what appeared on the console:
*** glibc detected *** double free or corruption (!prev): 0x0000002a97c61490 ***
I don't see anything mentioning IAX in this backtrace, so I suspect this has nothing to do with chan_iax. Perhaps someone can make a correlation between the two backtraces? Thanks again, guys.

By: Trevor Hammonds (trevmeister) 2006-04-11 09:47:53
The "output-server2.txt" file is from an identically-configured server, sans the Sangoma A104D. It is running SVN-trunk-r19160, and I will leave it at that revision, as it is a less-critical server. Please note that this backtrace is nearly identical to the original... Is it possible that this is related to libpthread or libc from CentOS 4.3 x86_64?

By: BJ Weschke (bweschke) 2006-04-11 09:53:05
joshnet: this one is pretty odd. Can you take a look at the chan_iax2 bt's?

By: BJ Weschke (bweschke) 2006-04-11 10:04:26
Trevmeister - can you attach the relevant sections of extensions.conf that are in use when these crashes happen? There are a few of us now scratching our heads on this one.

By: BJ Weschke (bweschke) 2006-04-11 10:07:46
moving to core since we don't have a real good idea what's causing various parts of the platform to dump.

By: Joshua C. Colp (jcolp) 2006-04-11 10:12:51
Would it be possible to get access to one of the boxes that are exhibiting this problem so that I can put in some extra debug information so we can see what's causing the segfault?

By: Trevor Hammonds (trevmeister) 2006-04-11 22:20:09
joshnet: Certainly. Contact me directly at <address removed>.
bweschke: The "server2" crashes are happening when the server is just sitting idle. Do you want the entire extensions.conf?
Here is the message from /var/log/messages from the most recent crash on "server 2":
Apr 11 20:02:11 XXXXXXXX kernel: asterisk[31669]: segfault at 0000000000000000 rip 0000002a96b5d005 rsp 0000000040595130 error 4
Backtrace from this crash has been posted as output2-server2.txt.

By: Serge Vecher (serge-v) 2006-05-05 15:28:40
trevmeister: any chance of an update here to see if commit in r24422 fixes the issue?

By: Mark Spencer (markster) 2006-05-11 03:55:53
Trevmeister: please confirm whether the issue has now been fixed in latest trunk. Thanks!

By: Trevor Hammonds (trevmeister) 2006-05-11 04:30:39
The problem appears to have been corrected, though I have not tried Trunk on a heavily-loaded server, yet. I will re-open the bug if it is still an issue in the future. Thanks for all your great work, guys.

By: Joshua C. Colp (jcolp) 2006-05-11 04:38:59
Issue has not reappeared on reporter's machine. If it does, don't hesitate to reopen. Have a great day! |