Summary: ASTERISK-20947: astcanary exits immediately because of wrong pid argument
Reporter: Jakob Hirsch (jhirsch)
Labels:
Date Opened: 2013-01-17 04:35:09.000-0600
Date Closed: 2013-01-18 17:34:20.000-0600
Versions: 10.12.0
Frequency of
is caused by ASTERISK-19463 Asterisk deadlocks during startup with mutex errors
is related to ASTERISK-20945 "Unable to connect to remote asterisk" message on service asterisk start, even though service is running
Environment: all that HAVE_WORKING_FORK
Attachments: ( 0) asterisk-10.12.0.astcanary_ppid.diff
Description: I wanted to update our machines to 10.12.0 (as an intermediate update
until we upgrade to 11) and noticed that astcanary is no longer running,
not even right after starting asterisk, which causes asterisk to
reduce its priority (with a rather lengthy and slightly annoying
message). This only happens when starting asterisk in the background.

With strace I found out that astcanary is being started with a wrong ppid:

> 29358 18:19:04.138884 execve("/usr/sbin/astcanary", ["astcanary", "/var/run/asterisk/alt.asterisk.c"..., "29356"], [/* 28 vars */] <unfinished ...>
> 29358 18:19:04.146543 setpriority(PRIO_PROCESS, 0, 0 <unfinished ...>
> 29358 18:19:04.146568 <... setpriority resumed> ) = 0
> 29358 18:19:04.146593 getppid( <unfinished ...>
> 29358 18:19:04.146633 <... getppid resumed> ) = 29357
> 29358 18:19:04.146663 exit_group(0)     = ?

Notice 29356 vs. 29357.

The reason is obviously a change made to main/asterisk.c. In 10.11.1,
ast_mainpid is updated right after daemon(); in 10.12.0 it is updated
_after_ astcanary is started, so astcanary gets the old pid (that of the
process that called daemon()) and therefore exits immediately, because it
checks the ppid argument against the return value of getppid().

The easy fix is to update ast_mainpid after daemon(), as in the attached patch.
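
To illustrate the ordering problem, here is a minimal standalone sketch. It is not the code from main/asterisk.c or the attached patch: canary_check_after_fork() and its refresh_pid flag are hypothetical names, a plain fork() stands in for the fork done inside daemon(), and a second fork() stands in for starting astcanary. The "canary" compares getppid() against the cached pid, as astcanary does with its ppid argument.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper for illustration only.
 * Returns 0 if the "canary" sees the expected parent pid,
 * 1 if it sees a mismatch (where astcanary would exit),
 * and -1 on error. */
int canary_check_after_fork(int refresh_pid)
{
    pid_t cached_pid = getpid();   /* plays the role of ast_mainpid */
    int status;

    pid_t daemon_child = fork();   /* stand-in for the fork inside daemon() */
    if (daemon_child < 0)
        return -1;

    if (daemon_child == 0) {
        /* We are now the daemonized process, with a new pid. */
        if (refresh_pid)
            cached_pid = getpid(); /* the fix: update before starting the canary */

        pid_t canary = fork();     /* stand-in for starting astcanary */
        if (canary < 0)
            _exit(2);
        if (canary == 0) {
            /* Same kind of check the canary performs on its ppid argument. */
            _exit(getppid() == cached_pid ? 0 : 1);
        }

        if (waitpid(canary, &status, 0) < 0 || !WIFEXITED(status))
            _exit(2);
        _exit(WEXITSTATUS(status)); /* pass the canary's verdict upwards */
    }

    if (waitpid(daemon_child, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) <= 1 ? WEXITSTATUS(status) : -1;
}
```

Calling canary_check_after_fork(0) models the 10.12.0 ordering (stale pid handed to the canary), and canary_check_after_fork(1) models refreshing the pid right after daemon(), as the patch proposes.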
Comments:
By: Jakob Hirsch (jhirsch) 2013-01-17 04:35:48.272-0600

patch to fix issue