Summary: | ASTERISK-17688: [patch] segfault res_musiconhold.so when called party puts call on hold
Reporter: | Michael Rack (rcrack2k) | Labels: |
Date Opened: | 2011-04-13 03:46:55 | Date Closed: | 2012-05-29 10:23:26
Priority: | Critical | Regression? | No
Status: | Closed/Complete | Components: | Resources/res_musiconhold
Versions: | Frequency of Occurrence |
Related Issues: |
Environment: |
Attachments: |
( 0) asterisk_crash_havlasm.txt
( 1) backtrace.20110509.txt
( 2) backtrace.txt
( 3) call_map.png
( 4) check_asterisk
( 5) haurein.log
( 6) haurein.striped.log
( 7) havlasm_backtrace_2011-11-02.txt
( 8) res_musiconhold.asterisk-r314015.segfault.ast_strlen_zero.patch
Description: |
Dear Digium,

We used Asterisk 1.6.0 for a long time. Now we have switched over to Asterisk 1.8.3. The existing configuration worked without any modifications.

The problem: when a called party puts our line on hold to transfer the call, Asterisk quits with a segfault in res_musiconhold.so. We can hear our MOH class for a random time (2 - 20 seconds, which might be the time needed to transfer), but we should be hearing the MOH from the other party, not the MOH from our Asterisk server!

The big problem in tracking the bug down is that it is not always reproducible. Sometimes Asterisk crashes, sometimes not. But Asterisk always crashes when we can hear our MOH. Asterisk does not crash when we can hear the MOH from the other party.

Our peer is an IAX2 peer (pbx-network.de). We observed this problem on our backup IAX2 peer (xlink.at) as well.

Currently we run a trunk version of Asterisk at revision 306540, but this problem is still not fixed.

This is the syslog message when Asterisk quits:

Apr 1 09:34:37 voip-01 kernel: [10366699.967654] asterisk[11071]: segfault at 6c003630 ip 00007f598962ba6f sp 00007f59700bfec0 error 4 in res_musiconhold.so[7f5989625000+a000]
Apr 6 11:22:35 voip-01 kernel: [10805178.329682] asterisk[26102]: segfault at 68003a00 ip 00007f758594da6f sp 00007f756c2c2ec0 error 4 in res_musiconhold.so[7f7585947000+a000]
Apr 7 10:13:52 voip-01 kernel: [10887455.598158] asterisk[26676]: segfault at c4004ba0 ip 00007f94e41aaa6f sp 00007f94caf26ea0 error 4 in res_musiconhold.so[7f94e41a4000+a000]

Asterisk is currently running in live production, so we could not put it into a debug state.

****** ADDITIONAL INFORMATION ******
Asterisk Trunk rev306540
Gentoo Base System release 1.12.13
Linux voip-01 2.6.35.7-amd64-xen-2.6.35.7 #1 SMP Tue Oct 12 10:35:31 Local time zone must be set--see zic x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 5200+ AuthenticAMD GNU/Linux
Running on XEN DOM-U with 768 MB RAM, 1 CPU core
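The kernel lines in the description already carry some information even without a core dump. The following is a minimal sketch in C, not part of the original report, of how such a line can be read: the offset of the faulting instruction inside the module is the `ip` value minus the mapping base shown in `[base+size]`, and the number after `error` is the x86 page-fault error code. The helper names are hypothetical; the bit meanings are the standard x86 ones.

```c
#include <stdint.h>

/* Offset of the faulting instruction inside the mapped module:
 * instruction pointer minus the mapping base from "[base+size]". */
static uint64_t fault_offset(uint64_t ip, uint64_t map_base)
{
    return ip - map_base;
}

/* x86 page-fault error code, as printed after "error" in the kernel line.
 * Hypothetical helper names; each one tests a single bit of the code. */
static int fault_page_present(unsigned code)
{
    return (code & 0x1) != 0;   /* 0 means the page was not mapped */
}

static int fault_is_write(unsigned code)
{
    return (code & 0x2) != 0;   /* 0 means the access was a read */
}

static int fault_from_user(unsigned code)
{
    return (code & 0x4) != 0;   /* 1 means the fault came from user-mode code */
}
```

Applied to the three syslog lines in the description, `fault_offset` yields 0x6a6f each time, which suggests that the same instruction faulted in all three crashes, and `error 4` decodes as a user-mode read of an unmapped page. Assuming a res_musiconhold.so built with debug symbols from the same revision is available, that offset could then be resolved to a source line with a tool such as addr2line.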
Comments: |
By: Leif Madsen (lmadsen) 2011-04-13 08:54:10
The only way to move this issue forward is to provide debugging information. This would include DEBUG level logging from the console leading up to the crash, along with a backtrace as described here:
* https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information
* https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace

By: Leif Madsen (lmadsen) 2011-04-14 09:11:08
OK, so now you just need to provide the backtrace. Thanks!

By: Michael Rack (rcrack2k) 2011-04-18 03:49:04
I know. Now I've got a core dump. The problem is not always reproducible, so I had to wait for a crash.
Apr 18 10:37:34 voip-01 kernel: [11839276.928530] asterisk[14579]: segfault at 9c000c30 ip 00007f20bf9c14b1 sp 00007f20a3294260 error 4 in res_musiconhold.so[7f20bf9ba000+c000]
This was the problem reported by the kernel. The core dump file is attached.

By: Michael Rack (rcrack2k) 2011-04-18 03:53:58
The file could not be uploaded via the "UPLOAD FILE" method, because the file size of 2.15 MB is too big. The file is uploaded on my server: http://www.michaelrack.de/public/download/core-dump.asterisk.bz2

By: Alec Davis (alecdavis) 2011-04-18 03:59:22
Try ASTERISK-17378; I think I've seen this before. The patch is bug18781.diff3.txt. If it's the same, this was fixed in trunk at r310288.

By: Michael Rack (rcrack2k) 2011-04-18 05:02:10
Sorry, I currently run trunk rev311466... The crash is based on 311466. I think that my problem is not the same as the one fixed in r310288. I will now try the latest trunk version. My segfault is generated in res_musiconhold, so the problem does not seem to be the same as in issue report 0018781.

By: Alec Davis (alecdavis) 2011-04-18 05:11:23
RcRaCk2k: with your core dump, you first need to follow the backtrace info in ~133704, then upload the gdb.txt file that is created from that.

By: Michael Rack (rcrack2k) 2011-04-18 06:26:59
OK, so gdb requires the original executable of Asterisk? Damn.
I installed the latest trunk before knowing that this would make the core dump useless. Now I have to wait for a new crash and core dump. Sorry guys.

By: Michael Rack (rcrack2k) 2011-04-20 08:55:46
So, now I've got the right backtrace. return (!s || (*s == '\0')); throws a segmentation fault at include/asterisk/strings.h:65, called from res_musiconhold.c:1311 in function local_ast_moh_start. I hope that the problem can be located and fixed as soon as possible.

By: Michael Rack (rcrack2k) 2011-04-28 04:22:07
So guys, sorry for interrupting, but is anyone checking this problem? Is there a workaround, so that I can run Asterisk without it crashing? The system is used on a production server and in a production environment. Currently we have a crontab entry installed that restarts Asterisk after it has gone down. But this state is not optimal, because on a segfault all current calls are disconnected.

By: Michael Rack (rcrack2k) 2011-04-28 05:26:04
Hi, I patched the line in "res_musiconhold.c" that passes "mclass" (out of bounds / null) to the static function ast_strlen_zero in "include/strings.h". I hope that this patch will save my Asterisk from future crashes. I am not familiar with C / C++ and hope that my patch does not create other problems. I am a Java / PHP programmer, and I would think that the line "return (!s || (*s == '\0'));" in include/strings.h should not fail with a segmentation fault. !s should prevent Asterisk from checking '\0' at that address, but it did not. So I hope you can track the problem down further than I can.

By: mickeyratt (mickeyratt) 2011-05-04 07:18:43
Dear all, I encountered this problem with the latest 1.6 version of Asterisk (1.6.2.18) too. I have been using it on a production server for 1 month, so it is very frustrating. In kern.log:
May 4 13:44:08 digium kernel: [3472685.931049] asterisk[26715]: segfault at 880074a0 ip 00007f05acfec5db sp 00007f05945b4740 error 4 in res_musiconhold.so[7f05acfe6000+a000]
Dear RcRaCk2k, can you share your "crontab" solution?
Or may I use safe_asterisk "to eliminate" this crash problem?

By: Michael Rack (rcrack2k) 2011-05-04 08:47:44
I've attached the script that checks whether Asterisk is alive. PS: My patch did not work. I had a segfault again today, but the time until the crash was a little longer than before. Asterisk was not run with option -g, so I have to wait for a crash again, sorry.

By: mickeyratt (mickeyratt) 2011-05-04 09:16:19
RcRaCk2k, thank you for your check script!

By: Michael Rack (rcrack2k) 2011-05-09 07:06:09
So... my patch did not work... I could not check for the out-of-bounds case... I hope someone else can make a patch.

By: Martin Havlas (havlasm) 2011-10-27 02:26:00.979-0500
asterisk_crash_havlasm.txt = core dump from Linux (Debian 6.2.1 x64) - attached. Well, according to issue ASTERISK-18756, the problem described here still exists. The situation that causes the crash is attached as call_map.png.

By: Martin Havlas (havlasm) 2011-11-02 05:39:32.683-0500
Well... I unloaded res_musiconhold.so and hoped that this would solve the problem temporarily. Unfortunately it crashed again, even though MOH was not loaded. I have another core dump for you. Partial info: it crashes only when there is an IAX2 connection between Asterisk 1.8.x and 1.2.x (core dump attached: havlasm_backtrace_2011-11-02.txt).

By: Martin Havlas (havlasm) 2011-11-03 05:49:47.175-0500
I'm desperate. Not a day goes by when it does not crash. Again, the same problem. Again an IAX2 trunk between 1.8.x and 1.2.x, again MOH... I have to switch the peers that run an old Asterisk to a SIP trunk. Any idea whether the compile flag "IAX_OLD_FIND" can affect functionality?

By: Martin Havlas (havlasm) 2012-01-03 06:55:15.120-0600
Why do the developers ignore this issue? At least some statement would be great.

By: Paul Belanger (pabelanger) 2012-01-03 11:00:38.637-0600
Not ignoring; there are over 750 open issues on the tracker, and it takes time to triage them all.

By: Misha Slyusarev (misha.slyusarev) 2012-04-04 10:56:29.732-0500
Hi everyone! Is there any update to this issue?
I've got the same problem: ASTERISK-19636

By: Misha Slyusarev (misha.slyusarev) 2012-04-05 10:50:35.307-0500
OK, I've got the same issue, and it looks like the problem lies in the use of a 64-bit system. In my case I switched to 32-bit and everything works fine. Hope that will be helpful.

By: Michael L. Young (elguero) 2012-05-29 10:24:46.413-0500
ASTERISK-19597 should fix this issue.
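On the question raised in the 2011-04-28 comment, why `return (!s || (*s == '\0'));` can still end in a segmentation fault: the following is a minimal sketch in C under stated assumptions, not a diagnosis of this bug. The `!s` test only short-circuits when the pointer is NULL. A pointer that is non-NULL but refers to unmapped or already freed memory passes that test, and the `*s` read is then what faults. The function names below are hypothetical; only the return expression is taken from the thread. Whether `mclass` was actually such a stale pointer here is not established in this report.

```c
#include <stddef.h>
#include <stdint.h>

/* Same expression as the line quoted from include/asterisk/strings.h:65.
 * The function name is hypothetical. */
static int strlen_zero_sketch(const char *s)
{
    return (!s || (*s == '\0'));
}

/* Hypothetical helper: reports whether the `!s` guard alone would stop
 * evaluation, i.e. whether s is NULL. It never dereferences s. */
static int null_guard_stops(const char *s)
{
    return !s;
}

/* Hypothetical helper: turns a raw address, such as the one printed after
 * "segfault at" in a kernel log line, into a pointer without reading it. */
static const char *pointer_from_address(uintptr_t addr)
{
    return (const char *)addr;
}
```

With these helpers, `strlen_zero_sketch(NULL)` and `strlen_zero_sketch("")` both return 1 and `strlen_zero_sketch("default")` returns 0, so the check behaves as intended for valid input. For an address such as 0x6c003630, `null_guard_stops(pointer_from_address(0x6c003630))` returns 0: the guard does not stop evaluation, so the real function would go on to read from that address. This is why adding another NULL or empty-string check at the call site would not be expected to help if the pointer itself is invalid.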