Summary:ASTERISK-13846: [patch] Improvements/fixes for app_fax
Reporter:Andrew Lindh (andrew)Labels:
Date Opened:2009-03-26 15:14:53Date Closed:2009-08-06 11:08:02
Versions:Frequency of
Environment:Attachments:( 0) app_fax.log
( 1) app_fax-20090326.diff
( 2) app_fax-20090406.diff
( 3) v1-14769.patch
I was working on a backport of app_fax to asterisk 1.4 branch.

It works great now, but the original code was causing errors to be logged (a new issue as of 1.4.24, see bug id ASTERISK-13565). As I looked into the code more I found what I think a problems/bugs. Here is a patch (for 1.6 trunk) for my "improvements/fixes".

I'm using SPANDSP 0.0.6pre7 and testing with SIP using an external PSTN gateway (AS5300). I can send a fax from the system to the same system (using the PSTN), so it seems happy. I also tested with a normal fax machine. RX works way better than TX for me.


What and why did I change?...

Using SPANDSP_EXPOSE_INTERNAL_STRUCTURES seems cleaner than an ifdef for SPANDSP includes.

MAX_SAMPLES seems to be better as a larger frame, but works when tested at 160, 240 or 320....

The "bug" I found has to do with the loops using ast_waitfor. The original code only waits for 20ms which is not enough (if the system is fast and the sended is slow, you always have some timeouts at 20ms). It needs to wait longer so asterisk can wait for the next frame rather then return with a timeout. 100ms seems to be more then enough but not too long, maybe it should be shoter (25ms resolved most errors)? The real problem that causes errors to be logged is the return value from ast_waitfor was not correctly checked. If it's 0 then it's a timeout and it should not try to read a frame, which is the next step in the original code. I added the longer timeout which mostly prevents the problem. But I also added a check for a timout from ast_waitfor and continue the loop. Because the loop can go forever with timouts I moved the watchdog section to the top of the loop so it's always included. This should save on CPU time as my test system was trying to read 3-4 frames for every 2 that arrived.

I changed the code to set FAXPAGES to a default of 0 (zero) rather than NULL. This allows me to check the FAXPAGES value in the dialplan without having a NULL variable error. I check the number of pages in a fax even if app_fax returns with an error status. While this is not the default that it was before, it seems to still be a safe choice.
Comments:By: Frederic Van Espen (freh) 2009-03-27 07:43:52

The patch has fixed the problem for me. No more error messages being displayed in the console. I applied the patch to asterisk1.6.0.7-rc2.

By: deti (deti) 2009-04-02 15:55:01

This patch also works for me in SIP only applications and with mISDN based PSTN access.

By: caspy (caspy) 2009-04-03 09:06:03

tested upon asterisk and spandsp 0.0.6pre6.
compiled ok, no ast_read() errors in log, faxes are received ok.

tested with audio faxing, not t38.

By: Amilcar S Silvestre (amilcar) 2009-04-03 10:55:54

andrew, is there any place that we can have this app_fax.c 1.4 backport? ;-)

By: Andrew Lindh (andrew) 2009-04-03 15:21:55

You can look at my asterisk stuff page for the 1.4 version I am using....


By: Dmitry Andrianov (dimas) 2009-04-03 19:28:44

1. I believe that at the time I made app_fax, the ast_read used too return AST_FRAME_NULL without producing any warnings even when wait_for ended with timeout. This is why the code was organized the way it was orgenized.

2. Anyway, my understanding was that I HAD TO call t38_terminal_send_timeout periodically even if no frames were received. I thought that it is important for SpanDSP to be called periodically in order to update some internal state and probably call t38_tx_packet_handler callback which sends frames to the remote side.

I still think it is true. Because of this I do not think your fix for the T38 part is correct.

By: Andrew Lindh (andrew) 2009-04-03 20:11:16

1... yes, it's a new error. It may return NULL, but there's also no reason to waste the CPU time with new errors and NULL frames, right?

2... If T.38 needs the frames all the time my code is wrong, I have not tested T.38. It would be easy to move the "continue" below t38_terminal_send_timeout, above ast_read. Is once per 100ms enough?

By: Dmitry Andrianov (dimas) 2009-04-03 20:25:58

Honestly, I have no idea. I never dig deep into SpanDSP code, so only its author (Steve Underwood) may shed some light on it.
I would just leave 20 ms for T38 code as it was before and would put

 if (!res) continue; else res = 0;

check after the t38_terminal_send_timeout/

However it is bad idea to improve/change _anything_ in the T38 part if you can not test your changes.

By: Andrew Lindh (andrew) 2009-04-04 23:23:42

Well, I guess I had better get T.38 setup on my test system...

By: Santiago Gimeno (sgimeno) 2009-04-06 04:39:59

I have tested the patch upon asterisk and spandsp 0.0.6pre7 with T.38 faxing and the messages don't appear anymore but it doesn't work correctly.

My scenario is as follows:

Asterisk tries to send messages to a PSTN fax through a Cisco MGW. Asterisk and the Cisco MGW try to negotiate T.38 but Asterisk rejects it with a 488 "Not acceptable here", so Cisco downgrades the call to audio, and asterisk tries to send the fax in audio mode.

Moreover, these traces appear while negotiating the media:

[packet-per.c 1225] Not decoded yet in packet : 2057  [unknown choice extension]
[packet-per.c 1225] Not decoded yet in packet : 2059  [unknown choice extension]
[packet-per.c 1225] Not decoded yet in packet : 2066  [unknown choice extension]
[packet-per.c 1225] Not decoded yet in packet : 2071  [unknown choice extension]
[packet-per.c 1225] Not decoded yet in packet : 2073  [unknown choice extension]
[packet-per.c 1225] Not decoded yet in packet : 2079  [unknown choice extension]
[packet-per.c 255] Not decoded yet in packet : 2081  []
[packet-per.c 826] Not decoded yet in packet : 2083  [too long integer]
[packet-per.c 255] Not decoded yet in packet : 2086  []
[packet-per.c 826] Not decoded yet in packet : 2088  [too long integer]
[packet-per.c 255] Not decoded yet in packet : 2091  []

* This scenario works "more or less" correctly with Asterisk and spandsp 0.0.5pre4.

By: Dmitry Andrianov (dimas) 2009-04-06 10:27:08

I you can wait, I will be able to make patch which fixes messages but does not break anything in T38 on weekend.

By: Andrew Lindh (andrew) 2009-04-06 15:39:55

I got T.38 working for inbound to asterisk, outbound still picks audio. How can I make asterisk offer T.38 to the cisco gateway for outbound calls?

I don't know where all the complicated code came from. I just set the waitfor timeout to 20, and put "t38_terminal_send_timeout(&t38, 160);" inside the loop. It seems to work fine. It does spin a little because 20 is a bit too low, but it seems to keep T.38 happy. The timeout in the t38 function looks like samples not MS.

Steve would be the best one to answer why the existing values were selected.

By: Santiago Gimeno (sgimeno) 2009-04-07 03:05:18

I don't think Asterisk should offer Cisco T.38 if the call was initiated in audio mode, but the other way around. It's the callee (Cisco) that should offer T.38 to the caller (Asterisk).

The call flow should be:

     Asterisk         Cisco        
         |              |              
         |              |              
         |              |              
         |Invite (Audio)|              
         |200 OK (Audio)|              
         |ACK           |              
         |              |Establishes the call with the PSTN Fax
         |              |              
         |re-INVITE (T.38)            
         |200 OK (T.38) |              
         |ACK           |              
         |              |              
         |              |              

The problem I'm having is that Asterisk answers the T.38 re-INVITE with a "488 Not acceptable here", so the Cisco downgrades the call to audio again.

*Sorry about the call flow I don't know why it's not displayed correctly.

By: Andrew Lindh (andrew) 2009-04-07 10:43:51

I uploaded a new patch. The audio fax part is about the same. I set the timeout to 50ms, which is more than enough to wait for a frame in a normal case since it's not tested at over 320 bytes anyway (40ms). I reordered the IF statement to deal with the most common case first.

I made less of a mess of the T.38 section. It continues to call t38_terminal_send_timeout inside the loop and the timeout is still 20ms (which assumes that it's sending data based on 20ms/160 samples).

I tested send/receive using audio and tested receive using T.38, but unable to send using T.38 due to codec negotiation issues... See bug id ASTERISK-1468849

By: Dmitry Andrianov (dimas) 2009-04-07 10:50:03

The if is redundant.
After the t38_terminal_send_timeout you may only have res equal to 0 or above because you already handled the below zero condition. So

if (!res)
   res = 0;

should be enough. That is what I was suggesting you - http://bugs.digium.com/view.php?id=14769#102731

By: Andrew Lindh (andrew) 2009-04-07 10:59:32

I left the code so it would be clear as to what was happening and so both sections looked alike (audio and T.38).

It could just be:
if (!res) continue;

By: deti (deti) 2009-04-08 05:10:53

Enabled T.38 on Cisco gateway (voice service voip / fax protocol t38), patched app_fax.c with  app_fax-20090406.diff.

A fax could not be received due to some errors (see log file 'app_fax.log' attached). Hope that helps.

Sending faxes does not switch over to T.38 - there is no re-invite from the the Cisco gateway.

By: Dmitry Andrianov (dimas) 2009-04-08 05:24:27

was it working BEFORE the patch?
If not - it is unrelated problem and you should fill separae issue.

By: Santiago Gimeno (sgimeno) 2009-04-08 06:44:46

The latest patch app_fax-20090406.diff is definetely not working in order to send fax in T38 mode. Asterisk declines the T.38 offer with a 488 response and the call is downgraded to audio. This was working with

I have not been able to test receiving fax as I am not able to configure asterisk correctly. It always receives in audio mode and never switches to T38. Any pointers on how to accomplish that?.

*After going back to I have been able to receive fax in T.38 mode. Using the same configuration in the patched version of, asterisk never switches to T.38. By looking at the code, apparently it doesn't detect the tone to change to T.38 mode.

By: deti (deti) 2009-04-08 06:47:03

It never worked before - with none of my T.38 capable gateways (Cisco 2821, 5350, Innovaphone ip400, ip800). So maybe your current efforts to make T.38 work with app_fax may fail because of some other (indirect) problem.

By: Dmitry Andrianov (dimas) 2009-04-08 07:46:40

again, have you tested it without the patch? Did it work?

The patch solves very specific issue and it is a bad idea to put all T38 relaetd problems onto this page..

By: Andrew Lindh (andrew) 2009-04-08 18:28:45

Once again, please look at bug id ASTERISK-1468849 for T.38 negotiation problems. This patch does not directly fix or break T.38 negotiation. With the patch from ASTERISK-1468849, T.38 works.

By: dovid (dovid) 2009-06-11 17:48:39

I came here because of 14723 (I was getting channel.c:2043 __ast_read: ast_read() called with no recorded file descriptor). No sure I understand the exact issue but after applying this patch the errors no longer come up.

I am running

By: Kevin P. Fleming (kpfleming) 2009-07-10 11:05:12

The loop restructuring and watchdog timeout parts of this patch have been committed to trunk and 1.6.2 (although I didn't realize they were in your patch so I wrote them independently). They won't be committed to 1.6.0 or 1.6.1 because they don't fix any actual bugs, just cosmetic issues with LOG_DEBUG messages.

By: Andrew Lindh (andrew) 2009-07-10 13:20:51

I see the changes in trunk from ASTERISK-1468849 and ASTERISK-1531480

The loop changes were mostly for the waitfor bugs, plus it is a better design. The current trunk still generates some RTP Read errors with T38 (for sending), but trunk works for me with both audio and T38 (on cisco AS5300).

There are still other improvements in this patch that I think are worth adding to trunk.

By: Kevin P. Fleming (kpfleming) 2009-07-27 11:19:40

Can you update the patch with just the parts that you'd like to consider for merging now that trunk has had many other changes made? That would be very helpful.

By: Dmitry Andrianov (dimas) 2009-07-27 12:16:18

I created the patch with all andrew changes.

1. the main bugfix (do not attempt to read frame whn ast_waitfor returned 0) was already in the code so i just re-structured it for cleaner code.

2. I have to say that there is too many changes in the fax application in the tunk so someone needs to test it. I'm not running trunk in production.

By: Andrew Lindh (andrew) 2009-07-27 13:29:02

I'll take a look. The main thing was the waitfor reading after a 0 return and moving the timeout checking up to account for never reading. Trunk has addresses most of the issues and the T.38 problems. I'll review my code and update it for trunk, but the same problems/issues apply to older versions too.

By: Kevin P. Fleming (kpfleming) 2009-08-06 10:41:49

For branches other than trunk, we'll only commit changes to fix actual problems, not just to improve the layout/structure/etc. of the code. That's why I haven't backported the app_fax changes to 1.6.0 and 1.6.1, because at this point the poor code structure is not actually causing failures for anyone as far as we can tell. Yes, it generates excessive console messages, but only as DEBUG messages, which doesn't affect normal users.

It doesn't appear that there are many fixes left here to apply to trunk; there's no need to provide an updated patch if you can just describe what they are, I'm happy to just make the changes and get this issue closed.

By: Dmitry Andrianov (dimas) 2009-08-06 10:47:07

can v1-14769.patch go to trunk?

By: Digium Subversion (svnbot) 2009-08-06 11:07:33

Repository: asterisk
Revision: 210777

U   trunk/apps/app_fax.c
U   trunk/configure
U   trunk/configure.ac
U   trunk/include/asterisk/autoconfig.h.in

r210777 | kpfleming | 2009-08-06 11:07:33 -0500 (Thu, 06 Aug 2009) | 12 lines

Minor improvements to app_fax.

This patch makes some small changes to handle watchdog timeouts in a better way,
and also uses a 'cleaner' method of including the spandsp header files.

(closes issue ASTERISK-13846)
Reported by: andrew
     app_fax-20090406.diff uploaded by andrew (license 240)
     v1-14769.patch uploaded by dimas (license 88)
Tested by: freh, deti, caspy, dimas, sgimeno, Dovid



By: Digium Subversion (svnbot) 2009-08-06 11:07:59

Repository: asterisk
Revision: 210778

_U  branches/1.6.2/

r210778 | kpfleming | 2009-08-06 11:07:59 -0500 (Thu, 06 Aug 2009) | 18 lines

Blocked revisions 210777 via svnmerge

 r210777 | kpfleming | 2009-08-06 11:07:15 -0500 (Thu, 06 Aug 2009) | 12 lines
 Minor improvements to app_fax.
 This patch makes some small changes to handle watchdog timeouts in a better way,
 and also uses a 'cleaner' method of including the spandsp header files.
 (closes issue ASTERISK-13846)
 Reported by: andrew
       app_fax-20090406.diff uploaded by andrew (license 240)
       v1-14769.patch uploaded by dimas (license 88)
 Tested by: freh, deti, caspy, dimas, sgimeno, Dovid