Summary: | ASTERISK-15196: [patch] ExternalIVR confuses AGI by double closing FDs | ||
Reporter: | dillec (dillec) | Labels: | |
Date Opened: | 2009-11-22 10:38:24.000-0600 | Date Closed: | 2009-12-03 09:22:28.000-0600 |
Priority: | Major | Regression? | No |
Status: | Closed/Complete | Components: | Applications/app_externalivr |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) app_externalivr.c_proper-fd-closing_v2.patch.gz ( 1) app_externalivr.c_proper-fd-closing_v3.patch ( 2) trace_pipe_syscalls.stp.gz | |
Description: | By putting a asterisk system on higher load (15 Calls per Second) as normal curios effects are arrising when using AGI and ExternalIVR together. For getting into the issue I've added some extensive output to the code which mainly concentrates on when and which fd is opened, closed and used. Further our AGI/ExternalIVR scripts use timeouts while reading from STDIN. Their log messages are the reason why I've started investigating on it. 1.) AGI suddenly can't write to the FD of the pipe which is connected to the agi scripts STDIN (in parantheses after ast_agi_send you can see the fd which is used for writing) [2009-11-22 16:02:48.004] WARNING[18107] res_agi.c: 003_Agi opened pipes: toast 0=38 1=106 | fromast 0=199 1=363 | errno=0 [2009-11-22 16:02:48.007] WARNING[18107] res_agi.c: ast_agi_send(363) >> agi_request: init-call.php [2009-11-22 16:02:48.007] ERROR[18107] utils.c: write() returned error: Bad file descriptor [2009-11-22 16:02:48.007] WARNING[18107] res_agi.c: ast_agi_send(363) >> agi_channel: SIP/sipp-ae747388 [2009-11-22 16:02:48.007] ERROR[18107] utils.c: write() returned error: Bad file descriptor [2009-11-22 16:02:48.007] WARNING[18107] res_agi.c: ast_agi_send(363) >> agi_language: de [2009-11-22 16:02:48.007] ERROR[18107] utils.c: write() returned error: Bad file descriptor 2.) ExternalIVR script reads content from another AGI process (log messages of our externalivr script): 17:02:48 22.11.2009 | uid=1258905766.2546 chan=SIP/sipp-b00f88a8 | ExternalIVR: Command: S,/home/voice/cgm53stage2/6000/1/Ansage_RP1 17:02:48 22.11.2009 | uid=1258905766.2546 chan=SIP/sipp-b00f88a8 | ExternalIVR: 'T,1258905768' 17:02:50 22.11.2009 | uid=1258905766.2546 chan=SIP/sipp-b00f88a8 | ExternalIVR: '200 result=1' 17:02:50 22.11.2009 | uid=1258905766.2546 chan=SIP/sipp-b00f88a8 | ExternalIVR: DTMF Digit: 200 result=1 17:02:50 22.11.2009 | uid=1258905766.2546 chan=SIP/sipp-b00f88a8 | ExternalIVR: '200 result=1' 17:02:50 22.11.2009 | uid=1258905766.2546 chan=SIP/sipp-b00f88a8 | ExternalIVR: DTMF Digit: 200 result=1 3.) ExternalIVR can't write/read from the FDs of the pipes which are connected to the externalivr script: [2009-11-22 16:02:53.520] WARNING[18047] app_externalivr.c: ExternalIVR: sending command 'G,1258905773,sys_context=6000_1' chan_state=6 [2009-11-22 16:02:53.520] WARNING[18047] app_externalivr.c: ExternalIVR: fprintf on handle 0xae7a8738 failed with error Bad file descriptor (9), chan state=6 First it came to my mind, that the kernel may has a synchronization bug while handing out FDs for the pipes: AGI opened FDs: [2009-11-22 16:02:48.004] WARNING[18107] res_agi.c: 003_Agi opened pipes: toast 0=38 1=106 | fromast 0=199 1=363 | errno=0 ExternalIVR opened FDs [2009-11-22 16:02:48.026] WARNING[18047] app_externalivr.c: Pipes created child_stdin 0=199 1=363 | child_stdout 0=484 1=489 | child_stderr 0=491 1=49 I was able to negative this thoguht by using a systemtap script (see attachements) which shows that everytime a pipe is created it is always based on formerly closed fds - so kernel was right: Sun Nov 22 16:02:48 2009 (ms=4) process asterisk[30094][18107]: pipe opened: 0=[199, 363] Sun Nov 22 16:02:48 2009 (ms=4) process asterisk[30094][17731]: fd closed: 363 Sun Nov 22 16:02:48 2009 (ms=11) process asterisk[30094][17731]: pipe opened: 0=[363, 478] But, the log also showed that while one thread of a process gets the FD by a pipe, another thread (17731) of the process is closing it: Sun Nov 22 16:02:48 2009 (ms=14) process asterisk[30094][17731]: fd closed: 363 Sun Nov 22 16:02:48 2009 (ms=26) process asterisk[30094][18047]: pipe opened: 0=[199, 363] And thats it - thread 17731 was running ExternalIVR, which closes the FDs first in eivr_comm() by fclose (line 785) and a second time in app_exec() (line=495) by close(). The patch in the attachments passes the FDs by reference and assigns them 0 after fclose has finished. With that close() in app_exec() is not called. | ||
Comments: | By: David Ruggles (thedavidfactor) 2009-11-22 13:53:22.000-0600 Thank you for the detailed explanation, testing and patch, however when I view the patch it appears to be the whole app_externalivr.c program. Please upload a new unified patch that is limited to the specific change(s) you're proposing and I will try to get it in to testing and review as soon as possible. By: dillec (dillec) 2009-11-22 14:48:22.000-0600 ah - sorry, messed the diff command with relative pathnames and a wrong working dir. Please find the correct patch attached. By: David Ruggles (thedavidfactor) 2009-11-23 08:07:55.000-0600 thank you for the corrected patch. By: David Ruggles (thedavidfactor) 2009-11-23 10:13:33.000-0600 I see this problem and this patch seems to be completely valid for the parent/child instance of EIVR, however it exposed some issues with the client/server instance of EIVR. Please attach a patch without the modifications @ -441,7 and I will recommend it for testing and review. Also, if possible please don't gzip it as that makes it impossible to view without downloading. By: dillec (dillec) 2009-11-23 13:00:00.000-0600 ok, I was not sure about that and not easily able to test it. Since the function eivr_comm now expects references instead of integers and uses them with the dereferncing op * it is not possible to simply delete the modifications. I think the main issue on the client/server side is that ser->fd is invalidated in that case. The recent patch now copies ser->fd into the child_* array and then passes them to the function as a reference like the parent/child part does. By: David Ruggles (thedavidfactor) 2009-11-23 13:07:36.000-0600 Well, the rest of the client server code is broken so it would never get to that point anyway. I am working on patching that part of it and will address this issue there. However this bug should be regressed and the decision was made to only fix the client/server implementation in trunk. So therefore, I think that adding more broken code to already broken code is less of a problem then muddying the already muddy waters of the client/server issues. :-) By: David Ruggles (thedavidfactor) 2009-11-30 11:33:28.000-0600 Should have something soon By: David Ruggles (thedavidfactor) 2009-11-30 12:02:17.000-0600 This is pending review on the reviewboard https://reviewboard.asterisk.org/r/436 By: Digium Subversion (svnbot) 2009-12-02 16:24:58.000-0600 Repository: asterisk Revision: 232587 U trunk/apps/app_externalivr.c ------------------------------------------------------------------------ r232587 | diruggles | 2009-12-02 16:24:58 -0600 (Wed, 02 Dec 2009) | 12 lines Prevent double closing of FDs by EIVR This caused a problem when asterisk was under heavy load and running both AGI and EIVR applications. EIVR would close an FD at which point it would be considered freed and be used by a new AGI instance the second close would then close the FD now in use by AGI. (closes issue ASTERISK-15196) Reported by: diLLec Tested by: thedavidfactor, diLLec Review: https://reviewboard.asterisk.org/r/436/ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=232587 By: Digium Subversion (svnbot) 2009-12-03 08:55:37.000-0600 Repository: asterisk Revision: 232811 U branches/1.6.0/apps/app_externalivr.c ------------------------------------------------------------------------ r232811 | diruggles | 2009-12-03 08:55:36 -0600 (Thu, 03 Dec 2009) | 19 lines Merged revisions 232587 via svnmerge from https://origsvn.digium.com/svn/asterisk/trunk ........ r232587 | diruggles | 2009-12-02 17:17:22 -0500 (Wed, 02 Dec 2009) | 12 lines Prevent double closing of FDs by EIVR This caused a problem when asterisk was under heavy load and running both AGI and EIVR applications. EIVR would close an FD at which point it would be considered freed and be used by a new AGI instance the second close would then close the FD now in use by AGI. (closes issue ASTERISK-15196) Reported by: diLLec Tested by: thedavidfactor, diLLec Review: https://reviewboard.asterisk.org/r/436/ ........ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=232811 By: Digium Subversion (svnbot) 2009-12-03 09:11:00.000-0600 Repository: asterisk Revision: 232812 U branches/1.6.1/apps/app_externalivr.c ------------------------------------------------------------------------ r232812 | diruggles | 2009-12-03 09:11:00 -0600 (Thu, 03 Dec 2009) | 19 lines Merged revisions 232587 via svnmerge from https://origsvn.digium.com/svn/asterisk/trunk ........ r232587 | diruggles | 2009-12-02 17:17:22 -0500 (Wed, 02 Dec 2009) | 12 lines Prevent double closing of FDs by EIVR This caused a problem when asterisk was under heavy load and running both AGI and EIVR applications. EIVR would close an FD at which point it would be considered freed and be used by a new AGI instance the second close would then close the FD now in use by AGI. (closes issue ASTERISK-15196) Reported by: diLLec Tested by: thedavidfactor, diLLec Review: https://reviewboard.asterisk.org/r/436/ ........ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=232812 By: Digium Subversion (svnbot) 2009-12-03 09:22:27.000-0600 Repository: asterisk Revision: 232813 U branches/1.6.2/apps/app_externalivr.c ------------------------------------------------------------------------ r232813 | diruggles | 2009-12-03 09:22:27 -0600 (Thu, 03 Dec 2009) | 19 lines Merged revisions 232587 via svnmerge from https://origsvn.digium.com/svn/asterisk/trunk ........ r232587 | diruggles | 2009-12-02 17:17:22 -0500 (Wed, 02 Dec 2009) | 12 lines Prevent double closing of FDs by EIVR This caused a problem when asterisk was under heavy load and running both AGI and EIVR applications. EIVR would close an FD at which point it would be considered freed and be used by a new AGI instance the second close would then close the FD now in use by AGI. (closes issue ASTERISK-15196) Reported by: diLLec Tested by: thedavidfactor, diLLec Review: https://reviewboard.asterisk.org/r/436/ ........ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=232813 |