[Home]

Summary:ASTERISK-13491: AMI shows utils.c:1198 ast_careful_fwrite: fflush() returned error: Broken pipe
Reporter:Chris Maciejewski (chris-mac)Labels:
Date Opened:2009-01-31 16:50:06.000-0600Date Closed:2011-06-07 14:08:04
Priority:MinorRegression?No
Status:Closed/CompleteComponents:Core/ManagerInterface
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) 20090204__bug14379.diff.txt
( 1) AMI.class.php
( 2) ami-test.php
Description:When sending data to AMI socket, without reading (waiting for) response - console shows:

ERROR[5148]: utils.c:1198 ast_careful_fwrite: fflush() returned error: Broken pipe

How to reproduce:
0. You will need php5-cli installed
1. Download 'AMI.class.php' and 'ami-test.php' attached.
2. Edit AMI username/password in 'ami-test.php'.
3. Run ./ami-test.php
4. Watch Asterisk console

NOTE: To revert back to a 'normal' behaviour, change line 72 of 'AMI.class.php' to: 'public $sync = true;'

Comments:By: Michael Cargile (mcargile) 2009-02-04 10:12:09.000-0600

I saw this exact same error in Asterisk 1.4.23 before I downgraded back to 1.4.21.2. I do not see this error in 1.4.21.2.

By: Mark Michelson (mmichelson) 2009-02-04 14:33:54.000-0600

I believe I see what the problem is. The problem is that the responses sent by the manager for the actions executed are being sent to a remotely-closed socket. I was able to easily reproduce the error message by using your scripts.

I re-ran the test and changed the "sync" variable in AMI.class.php to be true instead of false. While it took noticeably longer for the script to complete, it meant that we waited for the responses to reach us before moving on to the next command, and we did not get any broken pipe messages that way either.

In your particular case, the "Broken pipe" message can be safely ignored because you have purposely closed your end of the socket. Removing the message may not be a great idea though, because in other cases, it can serve as a genuine indication to someone that there is a problem with their connection.



By: Mark Michelson (mmichelson) 2009-02-04 14:41:57.000-0600

I think one sensible way to handle this would be to downgrade the error message to a warning or notice message and make it more informative than just the strerror() interpretation of EPIPE. Something like "The remote end closed its connection" seems less like a real problem and makes more sense.

It also would make sense for us to close the AMI connection ourselves if we receive an EPIPE when trying to write to a socket. That way, you'd only get the message a single time instead of potentially seeing it all over the screen.

By: Chris Maciejewski (chris-mac) 2009-02-04 14:46:40.000-0600

Another question is why it takes quite a long time for Asterisk to reply?
This was the reason I introduced not very elegant "$sync = false" behaviour.

Ideally I would like to wait for AMI replies, but it should take milliseconds rather than seconds to arrive.



By: Tilghman Lesher (tilghman) 2009-02-04 14:51:24.000-0600

chris-mac: could be there's lock contention.  What commands are you waiting for?

By: Chris Maciejewski (chris-mac) 2009-02-04 14:59:16.000-0600

Corydon76: When my Asterisk is idle simple login/logoff as in "ami-test.php" attached takes about 3 seconds to complete (when "$sync" is set to true in "AMI.class.php")

By: Mark Michelson (mmichelson) 2009-02-04 15:48:14.000-0600

I hate to blame php for this, but if I connect to Asterisk using telnet and issue commands, I get responses immediately. Using your script, there is a delay when stream_get_contents is called. This could be due to delays in the buffering used by php for the stream functions you use. It may be worthwhile to try using socket-based stream functions instead of buffer-based ones. But this is neither here nor there. I'll work on getting a patch to make the "Broken Pipe" error not appear as much and give a better description.

By: Jose P. Espinal (khratos) 2009-02-04 17:12:06.000-0600

Maybe the problem is a little bigger for us (php users) than just a 'broken pipe' message, because it also means that changes intended to be done with 'UpdateConfig' command will NOT be written to the destination files after 'broken pipe' messages start to appear.

That is, if you make several changes to a the same file using updateconfig, after the first 'broken pipe' changes are not written.

I was testing with sockets created with fsockopen (socket based stream)
and socket_create(AF_INET, SOCK_STREAM, SOL_TCP) also called 'endpoint of communication' and both are broken.

By: Tilghman Lesher (tilghman) 2009-02-04 18:22:12.000-0600

khratos:  correct.  If you (incorrectly) close the socket prematurely, then the commands which are still in the pipe will not be acted upon.  This is clearly a problem on the PHP client side, since, as previously stated, if you run the same commands with telnet, there's NO problem (because there's no output buffering).

By: Tilghman Lesher (tilghman) 2009-02-04 18:45:31.000-0600

This patch might allow the operation to proceed, if the pipe still has data remaining.

By: Jose P. Espinal (khratos) 2009-02-05 07:54:38.000-0600

I tested the patch.

I'm using the UdateConfig command not only for modifying the existent config files, but also in order to create new ones that are to be included on existent config files.

a. Reading files from a directory
b. making changes
c. saving them on destination directories.

After the first 'broken pipe' message I confirm that some files are written, but the same behavior was confirmed on an unpatched installation (another box) with Asterisk 1.4.23.1

I also confirmed that the problem get worst when PHP is acting as a client for a remote Asterisk Server (even on the same LAN).

IMO, as this is obviously a PHP buggy way of managing sockets (as you can not turn off response buffering), maybe the solution is to find a workaround on client side. Unfortunately, when sockets are general purpose socket ( socket_create ), not 'stream socket' ( fsockopen ), the problem is even worst; because you can not (even if they say so) control the stream timeout (stream_set_timeout) on them.

By: Michael Cargile (mcargile) 2009-02-05 08:47:24.000-0600

Our application is written in Perl and is connecting via the Net::Telnet directly into the AMI. We have a script that spawns child processes. The children connect to the AMI, issue a command (mostly originate), wait 4 seconds for a response, and then disconnect and exit. If the children get a response they puts them into a log table in our database. We are seeing these errors appear in the CLI on 1.4.23.1 with only about 12 calls. In 1.4.21.2 we do not see this issue at all, and we place 120 calls via this method regularly. Something must be slowing Asteirsk's response to our commands down or blocking them internally.

By: Mark Michelson (mmichelson) 2009-02-05 10:51:37.000-0600

mcargile: The biggest difference I'm seeing between 1.4.21.2 and 1.4.23 is that the ast_carefulwrite function was rewritten. Most of the changes had to do with reporting errors.

One change that was made was to poll on the file descriptor we are going to write to prior to writing to it. I can't really see how this would be a problem, though. If poll(2) blocks, then that would also mean that write(2) would, so I can't see how this would cause a delay.

mcargile, I may have read your comment incorrectly, so if what I suggest doesn't make sense, I apologize in advance. Could you run identical tests of 120 calls on 1.4.21.2 and 1.4.23.1 and see if the log table that is written to by the child processes is significantly different? I'm just thinking that the same error existed in 1.4.21.2 but there was no error message printed.

By: Michael Cargile (mcargile) 2009-02-05 11:11:54.000-0600

I will do my best to run this test for you soon. It might be the weekend before I can get to this though.

I have been meaning to rerun our test on 1.4.23.1 anyways as there were some other ERRORs in the CLI that I have not had a chance to isolate. They appeared far less frequently which is why I focused on this one. I hope these ERRORs do not sway the results.

By: Mark Michelson (mmichelson) 2009-02-05 13:26:37.000-0600

I'm assigning this to Corydon76 since he has contributed a patch for testing.

By: Jose P. Espinal (khratos) 2009-02-05 13:42:48.000-0600

I already tested the patch, see comment above.

By: Tilghman Lesher (tilghman) 2009-02-16 17:06:35.000-0600

As stated before, this is clearly a bug on the client side, not the server side.