Summary:ASTERISK-21242: Segfault when T.38 re-invite retransmission receives 200 OK
Reporter:Ashley Winters (awinters)
Labels:
Date Opened:2013-03-13 16:27:37
Date Closed:2013-12-08 21:18:01.000-0600
Status:Closed/Complete
Components:Resources/res_fax Resources/res_fax_spandsp
Versions:11.2.1
Frequency of
Environment:CentOS 6.2 x86_64
Attachments:( 0) A_PARTY.xml
( 1) always-init-t38.patch
( 2) gdb-t38-retransmission-segfault.txt
( 3) t38_uac_no_dcn.dump
( 4) t38-retransmission-segfault-debug.log
Description:If res_fax falls back to audio after timing out on the T.38 upgrade, chan_sip continues retrying the re-INVITE. If the remote end then responds to one of those retransmissions with a 200 OK (after res_fax has already timed out), a T.38 frame is delivered to the res_fax_spandsp driver without the T.38 subsystem having been initialized.

This results in a segfault.
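
The sequence of events can be sketched as a small state model. This is only an illustration of the ordering problem described above, not Asterisk code: the event names, `struct toy_state`, and `toy_apply` are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event names; a toy model, not Asterisk code. */
enum fax_event {
    EV_T38_NEGOTIATION_TIMEOUT, /* res_fax gives up and falls back to audio */
    EV_REINVITE_RETRANSMIT,     /* chan_sip retries the T.38 re-INVITE */
    EV_REMOTE_200_OK,           /* remote end accepts a retransmission */
    EV_T38_FRAME                /* a T.38 frame reaches the fax driver */
};

/* Hypothetical session state for the model. */
struct toy_state {
    bool audio_fallback;  /* session was started as audio after the timeout */
    bool t38_initialized; /* T.38 state was set up for this session */
    bool switched_to_t38; /* a late 200 OK switched the session to T.38 */
};

/* Apply one event. Returns false when the event would touch
 * uninitialized T.38 state (the crash case in this report). */
bool toy_apply(struct toy_state *st, enum fax_event ev)
{
    assert(st != NULL);

    switch (ev) {
    case EV_T38_NEGOTIATION_TIMEOUT:
        st->audio_fallback = true;   /* audio session: T.38 is never set up */
        st->t38_initialized = false;
        return true;
    case EV_REINVITE_RETRANSMIT:
        return true;                 /* retries continue despite the fallback */
    case EV_REMOTE_200_OK:
        st->switched_to_t38 = true;  /* late answer flips the session to T.38 */
        return true;
    case EV_T38_FRAME:
        /* Safe only if the session never switched, or T.38 was set up. */
        return !st->switched_to_t38 || st->t38_initialized;
    }
    return true;
}
```

Feeding the model a timeout, a retransmission, a 200 OK, and then a T.38 frame makes the last call return false, which corresponds to the point where the real driver dereferences the uninitialized structure.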
Comments:By: Ashley Winters (awinters) 2013-03-13 16:31:27.913-0500

GDB trace of the segfault, showing the uninitialized t38 structure

By: Ashley Winters (awinters) 2013-03-13 17:31:52.330-0500

Near the top of the log, you see "timed-out during the T.38 negotiation". On a later re-INVITE retry, you see a 200 OK followed by "switched to T.38 FAX session '1115'". The segfault follows immediately afterwards.

By: Ashley Winters (awinters) 2013-03-13 18:15:29.946-0500

Either res_fax_spandsp needs to run {{t38_terminal_init}} unconditionally if T.38 is possible, or {{generic_fax_exec}} must not run {{switch_to_t38}} after {{new_session}} has been called without AST_FAX_TECH_T38.
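
The two options can be compared in a minimal sketch. It assumes a simplified session with three flags. `struct model_session`, `model_start`, `model_switch_to_t38`, and `model_handle_t38_frame` are hypothetical names for illustration and do not correspond to the real res_fax or res_fax_spandsp functions or structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a fax session; not the real structures. */
struct model_session {
    bool t38_possible; /* the channel may still negotiate T.38 */
    bool t38_ready;    /* T.38 state has been initialized */
    bool ist38;        /* session has switched to T.38 */
};

/* Option A: when always_init is set, prepare T.38 state at start
 * whenever T.38 is possible, even if the session begins as audio. */
void model_start(struct model_session *s, bool t38_possible,
                 bool start_as_t38, bool always_init)
{
    assert(s != NULL);
    memset(s, 0, sizeof(*s));
    s->t38_possible = t38_possible;
    s->ist38 = start_as_t38;
    if (start_as_t38 || (always_init && t38_possible)) {
        s->t38_ready = true;
    }
}

/* Option B: refuse a late switch when T.38 state was never set up.
 * Returns 0 on success, -1 if the switch is rejected. */
int model_switch_to_t38(struct model_session *s)
{
    if (!s->t38_ready) {
        return -1;
    }
    s->ist38 = true;
    return 0;
}

/* Returns 0 if a T.38 frame can be handled safely, -1 otherwise. */
int model_handle_t38_frame(const struct model_session *s)
{
    return (s->ist38 && s->t38_ready) ? 0 : -1;
}
```

With `always_init` set, a late switch succeeds and frames are handled. Without it, the guard in `model_switch_to_t38` rejects the switch instead of letting a frame reach unprepared state. Either path avoids the crash in this model; which one suits the real code is the design question raised in the comment above.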

By: Ashley Winters (awinters) 2013-03-13 18:40:49.888-0500

This is the simplest patch that could possibly work. I'm going to give it a spin.

By: Ashley Winters (awinters) 2013-03-25 15:29:52.797-0500

I haven't seen this bizarre SIP flow happen again, but the patch itself has been harmless.

By: Torrey Searle (tsearle) 2013-11-12 08:18:48.601-0600

Here is a SIPp scenario that can be used to trigger a crash in Asterisk. It requires rtpplay to be installed.

By: Matt Jordan (mjordan) 2013-11-12 12:34:37.872-0600

I took a look at spandsp, and couldn't see any reason why initializing the T.38 core would have a significant performance impact. It does initialize the T.38 context object; however, we won't do anything with it unless {{p->ist38}} is true. That won't be the case unless we actually start sending/receiving a T.38 fax. The same is true of {{p->fax_state}}.

By: Matt Jordan (mjordan) 2013-11-25 09:16:08.831-0600

Torrey - do you have a dialplan snippet that you used in conjunction with the SIPp scenario attached?

By: Torrey Searle (tsearle) 2013-11-25 10:36:47.311-0600

A standard T.38 fax call flow like this should be sufficient:

exten => s,n,Set(FAXOPT(maxrate)=9600)
exten => s,n,Set(FAXOPT(modem)=V27,V29)
exten => s,n,Set(FAXOPT(ecm)=yes)
exten => s,n,ReceiveFAX(/tmp/fax.tiff,f)

I've found that the SIPp scenario only crashes Asterisk about half the time. Moving the rtpplay step to before sending the ACK might increase the crash rate a bit.

By: Torrey Searle (tsearle) 2013-11-25 10:38:43.472-0600

Also note that the recorded fax provided isn't good enough to result in a successful fax delivery, so it's only good for reproducing the crash.