Summary:ASTERISK-17072: [patch] Segfault when ExternalIVR() app. immediately sends S command due to race condition
Reporter:Stephen A. Brandli (stevebrandli)Labels:
Date Opened:2010-12-06 21:01:26.000-0600Date Closed:2011-01-03 08:09:31.000-0600
Versions:Frequency of
Environment:Attachments:( 0) backtrace.txt
( 1) externalivr.patch
Description:Segfault occurs if the ExternalIVR() application sends an S command faster than the sound generator can setup and "play silence."  The segfault occurs because the code believes it must send a T event but there are no sound files current playing.

Entire extensions.conf:

Entire python script referenced above:
import sys
import time

Just dial extension 70.  Notice the if the time.sleep(1) line in the python script is uncommented, there is no problem.  I assume it is possible that other setups will have trouble reproducing because application execution takes longer.

Please consider relating issue ASTERISK-16095.


The problem appears to be this: app_exec() (which is in app_externalivr.c) creates the ivr_localuser (the pointer to which is often referred to as u). It does not initialize the playing_silence field of that struct. It then calls ast_activate_generator(), passing in that struct to be passed back to the handlers such as gen_generate(). ast_activate_generator() does an ast_settimeout(), which, near as I can tell, causes generator_force() to be called on a different thread. generator_force() calls the above-mentioned gen_generate(), which ultimately calls gen_nextfile(). It is gen_nextfile() that sets or clears the u->playingsilence field. Meanwhile, app_exec() has called eivr_comm(). This function gets the first file to play from the Python script above. In processing that command--look for EIVR_CMD_SQUE--it checks both u->abort_current_sound and u->playing_silence. However, playing_silence is false not having yet been set by gen_nextfile(). So the code tries to send a T event to the Python script even though there is nothing being played. It chokes on the send_eivr_event() call because the entry->filename parameter is not valid.

Probably should just init playing_silence when abort_current_sound is initialized in app_exec().
Comments:By: David Ruggles (thedavidfactor) 2010-12-11 11:57:56.000-0600

It appears to definitely be a timing issue. I'm running asterisk in a VM and it only crashed every once in a while, but I was able to duplicate the problem. If I turned verbosity up to four or higher the I couldn't get it to crash at all. My only concern with the fix suggested was the possibility of problems when ExternalIVR was run on an unanswered channel, however I tested it and it didn't cause any problems. I am attaching a patch that initializes playing_silence to 1 when the struct is first initialized. Please test and confirm this fixes the problem in your environment.

By: Stephen A. Brandli (stevebrandli) 2010-12-11 12:53:51.000-0600

I tested the patch and it fixes my scenario.  My channel is answered when ExternalIVR is called, so I didn't test the unanswered channel scenario.

By: Digium Subversion (svnbot) 2011-01-03 08:09:30.000-0600

Repository: asterisk
Revision: 300121

U   trunk/apps/app_externalivr.c

r300121 | diruggles | 2011-01-03 08:09:30 -0600 (Mon, 03 Jan 2011) | 13 lines

initialize playing_silence in struct initialization

playing_silence was not initialized with the struct
was initialized, it was being set after the fact
which caused problems if something that relied on
playing_silence being set was called too quickly

(closes issue ASTERISK-17072)
Reported by: stevebrandli
     externalivr.patch uploaded by thedavidfactor (license 903)
Tested by: thedavidfactor, stevebrandli