Summary: ASTERISK-18951: [regression] T.38 pass through produce 100% CPU usage spike
Reporter: Kristijan Vrban (vrban)
Labels:
Date Opened: 2011-12-01 09:01:01.000-0600
Date Closed: 2012-02-02 16:35:01.000-0600
Versions: SVN Frequency of
must be completed before resolving ASTERISK-19128 Asterisk 1.8.10 Blockers
must be completed before resolving ASTERISK-19129 Asterisk 10.2.0 Blockers
is caused by ASTERISK-18400 RTCP Receiver Reports are sent for idle RTP sessions
Environment:
Attachments:
( 0) backtrace-threads.txt
( 1) no_read_loop.diff
Description: T.38 pass-through produces a 100% CPU usage spike and Asterisk becomes unresponsive with the current 1.8 branch.

This issue does not happen with latest So it must be related to one of the latest changes in the 1.8 branch after branching tag
Comments:

By: Kristijan Vrban (vrban) 2011-12-01 09:19:19.828-0600

It's changeset 340970. After reverting (de-merging) this change from the 1.8 branch, it is OK again.

By: Kinsey Moore (kmoore) 2011-12-01 14:40:26.382-0600

Could you find out where Asterisk is getting stuck in the loop using gdb?

By: Kristijan Vrban (vrban) 2011-12-02 10:12:12.606-0600

>Could you find out where Asterisk is getting stuck in the loop using gdb?
At first look, it's a loop around the function ast_rtcp_read in res_rtp_asterisk.c.

By: Kinsey Moore (kmoore) 2011-12-02 11:20:15.903-0600

I'm thinking that RTCP needs to be prodded before it can work properly after being reenabled in some cases.  Could you give the attached patch a try?  If that doesn't fix it for you, could you give me the full stack trace from gdb?

By: Kristijan Vrban (vrban) 2011-12-02 17:43:56.355-0600

The patch does not help; it crashes Asterisk.

How can I provide a stack trace when Asterisk does not crash (without your patch)?
I know how to get a stack trace from gdb when Asterisk crashes and dumps a core file.
But this issue does not crash Asterisk. It just uses 100% CPU until the T.38 pass-through is over.

By: Richard Mudgett (rmudgett) 2011-12-02 18:13:02.865-0600

You can attach gdb at any time to a running process.  The running process does not need to be specially compiled to do this.

See the "Getting Information For A Deadlock" section of https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace for an example gdb command.  You should run gdb while Asterisk is using 100% CPU.

By: Kristijan Vrban (vrban) 2011-12-03 07:23:14.968-0600

Here is the output of the following command, run while Asterisk was using 100% CPU:
gdb -ex "thread apply all bt" --batch /usr/sbin/asterisk `pidof asterisk` > /tmp/backtrace-threads.txt

By: Kinsey Moore (kmoore) 2012-01-23 08:13:29.051-0600

Hi Kristijan, could you capture a pcap of this occurring with tcpdump or wireshark?  That would be very helpful in reproducing this issue.  Configuration information would also help (specifically sip.conf).

By: Kinsey Moore (kmoore) 2012-01-25 15:39:28.598-0600

I've been looking at this further and I think I have a fix for the issue.  It seems that Asterisk is still polling the RTCP file descriptor after RTCP is shut down and removed.  If the descriptor happens to have data ready when the removal occurs, then Asterisk will go into an infinite loop trying to read data that it can never actually access.  The attached patch disables the audio RTCP file descriptor for the duration of the T.38 transaction. Please test the patch at your convenience.  
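The failure mode described above can be reproduced outside Asterisk with a small sketch. This is not the attached no_read_loop.diff and uses no Asterisk APIs; it is only an illustration, assuming a plain pipe stands in for the RTCP socket. The helper names count_ready_wakeups and make_pending_pipe are hypothetical. It relies on the POSIX rule that poll() ignores entries whose fd is negative, which is one common way to disable a descriptor without closing it.

```c
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Poll `fd` up to `rounds` times with a `timeout_ms` timeout and return how
 * many rounds reported the descriptor as readable. The pending data is
 * deliberately never read, mimicking a reader whose RTCP instance has
 * already been removed (hypothetical helper, not Asterisk code).
 */
static int count_ready_wakeups(int fd, int rounds, int timeout_ms)
{
    int ready = 0;

    for (int i = 0; i < rounds; i++) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
        int res = poll(&pfd, 1, timeout_ms);

        if (res > 0 && (pfd.revents & POLLIN)) {
            /* Woke up immediately, but nothing consumes the data. */
            ready++;
        }
    }
    return ready;
}

/* Create a pipe with one unread byte pending; returns 0 on success. */
static int make_pending_pipe(int fds[2])
{
    if (pipe(fds) != 0) {
        return -1;
    }
    if (write(fds[1], "x", 1) != 1) {
        return -1;
    }
    return 0;
}

#ifdef POLL_DEMO_MAIN
int main(void)
{
    int fds[2];

    if (make_pending_pipe(fds) != 0) {
        perror("pipe setup");
        return 1;
    }
    /* Descriptor still in the poll set: every round returns at once. */
    printf("data pending, fd polled: %d of 5 rounds woke up\n",
           count_ready_wakeups(fds[0], 5, 10));
    /* Descriptor disabled with -1: poll() ignores it and times out. */
    printf("fd disabled (-1):        %d of 5 rounds woke up\n",
           count_ready_wakeups(-1, 5, 10));
    close(fds[0]);
    close(fds[1]);
    return 0;
}
#endif
```

Built with -DPOLL_DEMO_MAIN, the first call wakes up on every round because the unread byte keeps the descriptor readable, which in an unbounded loop becomes the 100% CPU spin. The second call never wakes up because the descriptor has been taken out of the poll set.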

By: Kristijan Vrban (vrban) 2012-01-31 09:39:11.748-0600

I just tested the patch. The issue is fixed!