Summary:ASTERISK-04823: musiconhold leaving zombie mpg123 processes behind
Reporter:Kiss Karoly (crash)Labels:
Date Opened:2005-08-12 09:24:42Date Closed:2008-01-15 15:48:25.000-0600
Versions:Frequency of
Environment:Attachments:( 0) asterisk.sigfix.patch
Description:res_musiconhold.c around line 503 has the following code:
class->srcfd = -1;
if (class->pid) {
       kill(class->pid, SIGKILL);
       class->pid = 0;
The problem with this code is that it sends SIGKILL to the audio player process, which in my case was mpg123 v0.59r on a Debian Sarge box.
If I have buffering enabled in mpg123, it forks a second thread for a decoder process and the main thread just writes data from the buffer to the output.
The mpg123 code is written fine because it waits for it's children, or at least tries because after receiving a SIGKILL it will probably fail.

But then again this is only my opinion ...


I saw this in many versions.
Comments:By: Michael Jerris (mikej) 2005-08-12 09:32:47

Please look at ASTERISK-4416 and attempt the mods to that patch discussed in the bug.  And send feedback.  This is not major becuase there are multiple workarounds to this not requiring mpg123.

By: Michael Jerris (mikej) 2005-08-23 07:17:29

we can not follow up on this bug without feedback.  Please re-open the bug if you are able to try the attached suggestions.

By: Kiss Karoly (crash) 2005-08-31 03:34:51

I have made some more digging ( tried the suggested patch, and also tried other applications like madplay for moh ) and came to the conclusion that asterisk has a good child handler, the only problem arises when someone uses some function of asterisk that calls ast_safe_system() function since this redirects the child handler fuction to a null function in order to be able to collect the exit status of the process executed using the ast_safe_system() function.
Since all these functions are in asterisk.c maybe there is a safe solution of somehow using the child handler function in the ast_safe_system() and not redirecting it to the null handler. I think this would be much safer and would also improve behaviour. I would do this but I am not certain I have the necessary knowledge of asterisk internals to do a good job.

By: Michael Jerris (mikej) 2005-09-09 21:51:14

why not give it a try, and them we will at least have somthing to review as it appears no one else is jumping on this.

By: Kiss Karoly (crash) 2005-09-12 14:29:22

Ok Mike, I took your advice and a shot at this. It seems to be working for me but it's only a workaround.
The real problem seems to be that the SIGCHLD handler is changed by the ast_safe_system to null_sig_handler and somehow not set back ( this is what I can't figure out how ). I have a hunch that it's a problem of two threads interacting somehow.
The patch only disables SIGCHLD handling for the time of the ast_safe_system run. This ofcourse has it's obvious problems, but we always have the same SIGCHLD handler and child processes are properly reaped.
Any thoughts ?

By: Kevin P. Fleming (kpfleming) 2005-09-14 18:43:04

ast_safe_system() had some thread-safety issues; these are now fixed in CVS HEAD. If there is still a problem on your system, please request a reopen of this issue.

By: Digium Subversion (svnbot) 2008-01-15 15:48:25.000-0600

Repository: asterisk
Revision: 6603

U   trunk/asterisk.c

r6603 | kpfleming | 2008-01-15 15:48:25 -0600 (Tue, 15 Jan 2008) | 2 lines

ensure that ast_safe_system() is thread-safe (issue ASTERISK-4823)