Summary: ASTERISK-19597: Failure to pass NULL data pointer with AST_CONTROL_HOLD frame causes crash when MOH is started
Reporter: mgrobecker (mgrobecker)
Labels:
Date Opened: 2012-03-27 11:00:21
Date Closed: 2012-05-25 11:33:52
Priority: Major
Regression?
Status: Closed/Complete
Components: Channels/chan_iax2, Channels/chan_sip/General, Resources/res_musiconhold, Resources/res_realtime
Versions: 1.8.7.2, 1.8.10.1
Frequency of Occurrence: Occasional
Related Issues:
Environment: Debian Squeeze amd64
Attachments:
( 0) asterisk_backtrace.txt
( 1) backtrace-issue-asterisk-19597-1.txt
( 2) jira_asterisk_19597_v1.8.patch
( 3) seanbright_moh_test_cli_command.diff
Description:
We have some phones and Asterisk boxes connected to a central server. The setup looks like this:

Phone -> Asterisk Box -> Central Asterisk -> PSTN via IAX

When a phone parks a channel, the Asterisk Box seems to forward this "on hold" information to the central Asterisk, which then tries to play music on hold. When this occurs, I get messages like this on the CLI:

-- Music class ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒W_VYC@LIOJtNNBYYP▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒VP^D[DC[^X]RR▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒UUWVWVPQQQWWWUU▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒UTVVQQPSSQQQQVWUUU▒հ requested but no musiconhold loaded.
-- Music class requested but no musiconhold loaded.

There seems to be an uninitialized variable, or a pointer that points into program code in memory. Sometimes the central Asterisk crashes at this point (maybe because the OS kills the Asterisk process due to address violations?).

I have no backtrace at the moment because this is a production system and I need to recompile Asterisk in order to generate dumps and backtraces. If you really need them, I will try to generate one as soon as possible.

EDIT: These are the lines on the CLI around the music on hold lines:

[2012-03-27 17:10:29] DEBUG[11408]: channel.c:7372 ast_channel_bridge: Returning from native bridge, channels: IAX2/81.xxxxxxxx:4569-1924, IAX2/188.xxxxxxxx:4569-2158 (<-- call got parked here)
-- Music class ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒W_VYC@LIOJtNNBYYP▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒VP^D[DC[^X]RR▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒UUWVWVPQQQWWWUU▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒UTVVQQPSSQQQQVWUUU▒հ requested but no musiconhold loaded.
[2012-03-27 17:10:30] DEBUG[11408]: channel.c:7372 ast_channel_bridge: Returning from native bridge, channels: IAX2/81.xxxxxxx:4569-1924, IAX2/188.xxxxxxx:4569-2158
[2012-03-27 17:10:30] DEBUG[11408]: channel.c:7372 ast_channel_bridge: Returning from native bridge, channels: IAX2/81.xxxxxxx:4569-1924, IAX2/188.xxxxxxx:4569-2158
-- Music class requested but no musiconhold loaded.
[2012-03-27 17:10:34] DEBUG[11318]: chan_iax2.c:2396 peercnt_remove: ip callno count decremented to 5 for 188.xxxxxxx
[2012-03-27 17:10:34] DEBUG[11318]: chan_iax2.c:2396 peercnt_remove: ip callno count decremented to 5 for 81.xxxxxxx
....
[2012-03-27 17:10:50] DEBUG[11327]: chan_iax2.c:9429 log_jitterstats: JB STATS:IAX2/81.xxxxxxx:4569-16306 ping=6 ljitterms=-1 ljbdelayms=0 ltotlost=-1 lrecentlosspct=-1 ldropped=0 looo=-1 lrecvd=-1 rjitterms=0 rjbdelayms=40 rtotlost=0 rrecentlosspct=0 rdropped=0 rooo=0 rrecvd=1
[2012-03-27 17:10:50] DEBUG[11328]: chan_iax2.c:9429 log_jitterstats: JB STATS:IAX2/188.xxxxxxx:4569-2176 ping=3 ljitterms=-1 ljbdelayms=0 ltotlost=-1 lrecentlosspct=-1 ldropped=0 looo=-1 lrecvd=-1 rjitterms=0 rjbdelayms=40 rtotlost=0 rrecentlosspct=0 rdropped=0 rooo=0 rrecvd=1
[2012-03-27 17:10:57] DEBUG[11408]: channel.c:7372 ast_channel_bridge: Returning from native bridge, channels: IAX2/81.xxxxxxx:4569-1924, IAX2/188.xxxxxxx:4569-2158 (<-- call got parked again here)
(Asterisk died here)
Comments:

By: Matt Jordan (mjordan) 2012-03-27 12:49:01.403-0500
Thank you for your bug report. In order to move your issue forward, we require a backtrace[1] from the core file produced after the crash. Also, be sure you have DONT_OPTIMIZE enabled in menuselect within the Compiler Flags section, then:
make install
After enabling, reproduce the crash, and then execute the backtrace[1] instructions. When complete, attach that file to this issue report.
[1] https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace
Note that there was some work done recently in call parking to fix some uninitialized values. The backtrace should illustrate whether or not this is the same issue.

By: mgrobecker (mgrobecker) 2012-03-29 07:47:47.621-0500
Sorry for the late reply. I can try to create a backtrace over the weekend, since the system isn't used then.

By: mgrobecker (mgrobecker) 2012-04-02 00:32:22.648-0500
OK, I've done it ;-) We updated this morning to 1.8.10 (the latest stable release as of the weekend) because you mentioned that the bug might already be fixed. Since this is a production system, we cannot run any wild experiments ;-) So the attached backtrace was made with Asterisk 1.8.10.

By: mgrobecker (mgrobecker) 2012-04-02 00:43:23.139-0500
Just a note: I get the same issue in Asterisk 1.8.11 ;-)

By: Matt Jordan (mjordan) 2012-04-02 12:51:48.478-0500
The fix for uninitialized values was done in call parking. This crash occurs because a non-NULL junk data pointer is being passed with an AST_FRAME_CONTROL frame of subclass AST_CONTROL_HOLD, where that data pointer should specify the MOH class instead. Since that isn't call parking, we haven't addressed this issue.
From your logs, you've connected something to an IAX2 call. What is the IAX2 call being bridged with in this case?

By: mgrobecker (mgrobecker) 2012-04-03 06:42:11.472-0500
The call has been bridged with another IAX2 channel. The setup looks as follows:
IAX2 peer -> Server -> IAX2 PSTN gateway

By: Matt Jordan (mjordan) 2012-04-13 09:24:06.942-0500
I've attached a code contribution from seanbright that adds a CLI command that will hold/unhold an IAX2 channel. In discussions with him on #asterisk-dev, he found that repeated use of this command will often reproduce this issue's crash.

By: Sean Bright (seanbright) 2012-04-13 10:58:52.826-0500
*CLI> iax2 kill <IAX2 channel name>

By: Michael L. Young (elguero) 2012-04-13 15:16:11.132-0500
I saw Matt and Sean discussing this on IRC, which sparked my interest. On 10.3.0, I tried the following, and it seems like the same issue may be occurring:
{noformat}
IAX Client -> Asterisk A <-IAX2 trunk-> Asterisk B -> PRI
SIP Client -> Asterisk A <-IAX2 trunk-> Asterisk B -> PRI
{noformat}
On Asterisk A, I set mohinterpret=passthrough in the global section of iax.conf, leaving everything else at its default value on Asterisk A and B. Whenever the IAX client or the SIP client places the call on HOLD, Asterisk B segfaults. mohinterpret and mohsuggest on Asterisk B are set to their default values.
Hopefully this extra information helps to track down the issue.

By: Richard Mudgett (rmudgett) 2012-05-23 19:06:36.002-0500
[^jira_asterisk_19597_v1.8.patch] fixes the crash that occurs when IAX receives a HOLD without a suggested MOH class. It also fixes a memcpy size error when queueing signaling frames before connect.

By: Michael L. Young (elguero) 2012-05-25 09:52:45.150-0500
I can confirm that the patch fixes the crash with my setup on Asterisk 10.
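
Illustrative note: the failure mode described in the comments (a HOLD control frame whose data pointer is junk when no MOH class was suggested) can be shown with a small standalone C sketch. This is not the attached patch and does not use the real Asterisk types. `struct demo_frame`, `demo_hold_class`, and `DEMO_MAX_CLASS` are hypothetical names invented for this example. The sketch assumes that a frame carries a payload length next to its data pointer, and that a receiver should treat a zero length as "no class suggested" rather than dereference the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a control frame. NOT the real struct ast_frame. */
struct demo_frame {
    const void *data;   /* payload; may be NULL, or stale when datalen is 0 */
    size_t datalen;     /* number of valid bytes at data */
};

/* Hypothetical upper bound on a MOH class name, chosen for the example. */
#define DEMO_MAX_CLASS 80

/*
 * Copy the suggested MOH class out of a HOLD frame into out.
 * Returns out when a non-empty class was supplied, or NULL when the
 * frame carries no class, so the caller can fall back to its default.
 */
const char *demo_hold_class(const struct demo_frame *f, char *out, size_t out_size)
{
    size_t n;

    assert(f != NULL && out != NULL && out_size > 0);
    out[0] = '\0';

    /* Trust the length, not just the pointer: zero length means "no class". */
    if (f->data == NULL || f->datalen == 0) {
        return NULL;
    }

    /* Bound the copy by the destination size, leaving room for the NUL. */
    n = f->datalen < out_size - 1 ? f->datalen : out_size - 1;
    memcpy(out, f->data, n);
    out[n] = '\0';   /* the payload is not assumed to be NUL-terminated */

    return out[0] != '\0' ? out : NULL;
}
```

A caller would pass a buffer of `DEMO_MAX_CLASS` bytes and use its own default class whenever the function returns NULL. The copy is bounded by the destination size for the same reason: the payload length comes from the sender and should not be trusted to fit.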