Summary: ASTERISK-18951: [regression] T.38 pass through produce 100% CPU usage spike
Reporter: Kristijan Vrban (vrban)
Labels:
Date Opened: 2011-12-01 09:01:01.000-0600
Date Closed: 2012-02-02 16:35:01.000-0600
Priority: Blocker
Regression? Yes
Status: Closed/Complete
Components: Channels/chan_sip/T.38
Versions: SVN
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: (0) backtrace-threads.txt (1) no_read_loop.diff
Description: T.38 pass-through produces a 100% CPU usage spike and Asterisk becomes unresponsive with the current 1.8 branch. This issue does not happen with the latest 1.8.8.0-rc4, so it must be related to one of the latest changes made in the 1.8 branch after the 1.8.8.0 tag was branched.
Comments:

By: Kristijan Vrban (vrban) 2011-12-01 09:19:19.828-0600
It's changeset 340970. After (de)merging this change from the 1.8 branch, it is OK again.

By: Kinsey Moore (kmoore) 2011-12-01 14:40:26.382-0600
Could you find out where Asterisk is getting stuck in the loop using gdb?

By: Kristijan Vrban (vrban) 2011-12-02 10:12:12.606-0600
>Could you find out where Asterisk is getting stuck in the loop using gdb?
At first look, it's a loop around the function ast_rtcp_read in res_rtp_asterisk.c.

By: Kinsey Moore (kmoore) 2011-12-02 11:20:15.903-0600
I'm thinking that RTCP needs to be prodded before it can work properly after being reenabled in some cases. Could you give the attached patch a try? If that doesn't fix it for you, could you give me the full stack trace from gdb?

By: Kristijan Vrban (vrban) 2011-12-02 17:43:56.355-0600
The patch does not help; it crashes Asterisk. How can I offer a stack trace when Asterisk does not crash (without your patch)? I know how to get the stack trace from gdb when Asterisk crashes and dumps a core file. But this issue does not crash Asterisk. It just uses 100% CPU until the T.38 pass-through is finished.

By: Richard Mudgett (rmudgett) 2011-12-02 18:13:02.865-0600
You can attach gdb at any time to a running process. The running process does not need to be specially compiled to do this. See the https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace section on "Getting Information For A Deadlock" for an example gdb command. You should run gdb when Asterisk is using 100% CPU.

By: Kristijan Vrban (vrban) 2011-12-03 07:23:14.968-0600
Here is the output of
gdb -ex "thread apply all bt" --batch /usr/sbin/asterisk `pidof asterisk` > /tmp/backtrace-threads.txt
taken while Asterisk was eating 100% CPU.

By: Kinsey Moore (kmoore) 2012-01-23 08:13:29.051-0600
Hi Kristijan, could you capture a pcap of this occurring with tcpdump or wireshark? That would be very helpful in reproducing this issue. Configuration information would also help (specifically sip.conf).

By: Kinsey Moore (kmoore) 2012-01-25 15:39:28.598-0600
I've been looking at this further and I think I have a fix for the issue. It seems that Asterisk is still polling the RTCP file descriptor after RTCP is shut down and removed. If the descriptor happens to have data ready when the removal occurs, then Asterisk will go into an infinite loop trying to read data that it can never actually access. The attached patch disables the audio RTCP file descriptor for the duration of the T.38 transaction. Please test the patch at your convenience.

By: Kristijan Vrban (vrban) 2012-01-31 09:39:11.748-0600
I just tested the patch. The issue is fixed!
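
For readers unfamiliar with this failure mode, here is a minimal standalone sketch of the mechanism described in the 2012-01-25 comment. It is not Asterisk code and not the attached patch. The helper names count_wakeups and demo_stale_fd are hypothetical, and a Unix socket pair stands in for the RTCP socket. It shows that a descriptor with unread data makes poll() report it ready on every pass (a busy loop), while setting the descriptor to -1 in the poll set, which POSIX poll() ignores, stops the wakeups.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: poll `fd` `iterations` times with a zero timeout,
 * never reading from it, and count how often it is reported readable. */
static int count_wakeups(int fd, int iterations)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int wakeups = 0;

    assert(iterations >= 0);
    for (int i = 0; i < iterations; i++) {
        pfd.revents = 0;
        if (poll(&pfd, 1, 0) > 0 && (pfd.revents & POLLIN)) {
            wakeups++; /* data is pending, but nobody consumes it */
        }
    }
    return wakeups;
}

/* Hypothetical demo: queue one datagram that is never read, then poll
 * either the real descriptor (disable_fd == 0) or -1 (disable_fd != 0).
 * Returns the number of wakeups out of 1000 polls, or -1 on setup error. */
static int demo_stale_fd(int disable_fd)
{
    int sv[2];
    int wakeups;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0) {
        return -1;
    }
    if (write(sv[1], "x", 1) != 1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* A negative fd in a pollfd entry is skipped by poll(). */
    wakeups = count_wakeups(disable_fd ? -1 : sv[0], 1000);

    close(sv[0]);
    close(sv[1]);
    return wakeups;
}
```

Calling demo_stale_fd(0) models the stuck state, where every poll wakes up immediately. Calling demo_stale_fd(1) models the general idea of disabling the descriptor for the duration of the transaction. How the actual patch disables the audio RTCP descriptor inside Asterisk is not shown in this report.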