[Home]

Summary:ASTERISK-08183: Asterisk process utilizing 100% of CPU time after a P2P'd sip call is hung up
Reporter:knight112 (knight112)Labels:
Date Opened:2006-11-24 15:20:34.000-0600Date Closed:2007-01-15 12:38:05.000-0600
Priority:MajorRegression?No
Status:Closed/CompleteComponents:Core/General
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) 48074.log
( 1) 48168.txt
( 2) ast.log
( 3) trunk-48018.log
Description:Randomly asterisk process goes to 100% and stays there. During this time DTMF tones stops working. But playback like application works when the call comes in.

Usually this happens after a call is completed.

Only way to fix it is to stop and start asterisk.

Asterisk is running on FC5 32 bit OS. Mysql is the DB used for realtime. Happens 2 to 3 times a day sometimes depending on the call volume.



****** ADDITIONAL INFORMATION ******

"Info thread" list contains this

 21 Thread 4668320 (LWP 7838)  0x00710402 in __kernel_vsyscall ()
 20 Thread 11140000 (LWP 7839)  __ast_pthread_mutex_unlock (
   filename=0x814d357 "sched.c", lineno=162, func=0x814d655 "ast_sched_wait",
   mutex_name=0x814d395 "&con->lock", t=0x8491618)
   at /usr/src/asterisk-1.4.0-beta3/include/asterisk/lock.h:343

It doesn't look right, but I am not too sure. Attached is the complete bt output.
Comments:By: jmls (jmls) 2006-11-25 01:56:46.000-0600

there have been a lot of updates to the 1.4 codebase since beta3. Would you be able to run 1.4 svn to see if you can reproduce this bug ? Thanks

By: knight112 (knight112) 2006-11-27 11:48:30.000-0600

I just installed SVN-trunk-r48018. I will upload the new bt output if the problem exists in this version.

By: knight112 (knight112) 2006-11-27 21:46:48.000-0600

Asterisk process was at 100% CPU with Trunk 48018. russellb told me to upgrade it to 48074 in irc.
Upgrading it to 48074. Will post back if the problem occurs in 48074.
Attached is the bt output for trunk 48018 when the asterisk process was at 100%.

By: benjamin (benjamin) 2006-11-29 12:38:56.000-0600

I have a similar issue on FC5 64bit. Upgraded today to version 48105 and the issue is still there : after incoming calls (SIP), asterisk process jumps to 100% processor use.

By: Serge Vecher (serge-v) 2006-11-30 11:19:33.000-0600

benjamin: as indicated by knight112, please try 1.4 branch rev > 48074

By: benjamin (benjamin) 2006-12-01 03:01:21.000-0600

Upgraded again this morning, the problem is still there.
core show version: Asterisk SVN-branch-1.4-r48168M built by root @ vintage on a x86_64 running Linux on 2006-12-01 07:07:05 UTC

The problem is very consistent, it happens after every incoming call and my only solution for the moment is to restart the service. I'm not sure what I can do to help debug this issue. I suspect it could be somehow related to my sip provider (french "wengo").

By: Serge Vecher (serge-v) 2006-12-01 09:24:08.000-0600

what modifications have you done to the source?

By: benjamin (benjamin) 2006-12-01 10:00:39.000-0600

One file did not compile. I added "#include <linux/compiler.h> just before #include<ixjuser.h>" in chan_phone.c to make it work after some googling. Will try a fresh svn again.

By: knight112 (knight112) 2006-12-01 10:06:13.000-0600

I can reproduce the problem with 48074 as well. I couldn't get the bt output as the asterisk process crashed and restarted automatically. Is there any other information that you need?

By: Serge Vecher (serge-v) 2006-12-01 10:13:31.000-0600

start asterisk with -g option so the core is dumped. Then you can analyze the core file with bt. Make sure that "DONT OPTIMIZE" is selected in menuselect under compiler options.

By: knight112 (knight112) 2006-12-04 09:31:49.000-0600

I have uploaded bt output (48074.log). Please let me know if you need more information.

By: benjamin (benjamin) 2006-12-05 04:15:56.000-0600

Added my gdb info for rev 48168

By: Joshua C. Colp (jcolp) 2006-12-22 11:36:23.000-0600

Maybe we are going about this the wrong way... the backtraces show nothing strange in them, so maybe they aren't catching something that is executing fast and consistently that is causing CPU usage to be driven up. If you turn on core debug (logger.conf, add debug to the appropriate place you want it to appear) do you see something over and over? It's a long shot but the backtraces don't appear to be too helpful.

By: benjamin (benjamin) 2007-01-14 05:31:37.000-0600

Did more testing. Problem is very consistent after the end of incoming or outgoing SIP calls to specific provider (wengo). During the call, CPU is Ok. IAX calls seem to be Ok too. A call that does not complete (no answer) doesn't exhibit the problem.

I turned on "full" debugging in logger.conf here is the trace during outgoing call :

[Jan 14 12:13:56] VERBOSE[18761] logger.c:     -- Executing [0344654321@sip:1] Goto("SIP/handytone-007db9f0", "notenum|0344654321|1") in new stack
[Jan 14 12:13:56] VERBOSE[18761] logger.c:     -- Goto (notenum,0344654321,1)
[Jan 14 12:13:56] VERBOSE[18761] logger.c:     -- Executing [0344654321@notenum:1] Dial("SIP/handytone-007db9f0", "SIP/0344654321@wengo|40") in new stack
[Jan 14 12:13:56] DEBUG[18761] acl.c: ##### Testing 213.91.9.206 with 192.168.0.0
[Jan 14 12:13:56] DEBUG[18761] acl.c: ##### Testing 192.168.0.2 with 192.168.0.0
[Jan 14 12:13:56] VERBOSE[18761] logger.c:     -- Called 0344654321@wengo
[Jan 14 12:13:57] VERBOSE[18761] logger.c:     -- SIP/wengo-007c51f0 is ringing
[Jan 14 12:13:57] VERBOSE[18761] logger.c:     -- SIP/wengo-007c51f0 is making progress passing it to SIP/handytone-007db9f0
[Jan 14 12:14:27] VERBOSE[18761] logger.c:     -- SIP/wengo-007c51f0 answered SIP/handytone-007db9f0
[Jan 14 12:14:27] VERBOSE[18761] logger.c:     -- Packet2Packet bridging SIP/handytone-007db9f0 and SIP/wengo-007c51f0
[Jan 14 12:14:35] DEBUG[18308] acl.c: ##### Testing 213.91.9.206 with 192.168.0.0
[Jan 14 12:14:35] DEBUG[18308] acl.c: ##### Testing 192.168.0.2 with 192.168.0.0

=====> STILL 0% CPU HERE

[Jan 14 12:15:40] DEBUG[18761] channel.c: Returning from native bridge, channels: SIP/handytone-007db9f0, SIP/wengo-007c51f0
[Jan 14 12:15:40] VERBOSE[18761] logger.c:   == Spawn extension (notenum, 0344654321, 1) exited non-zero on 'SIP/handytone-007db9f0'
[Jan 14 12:15:40] DEBUG[18761] pbx.c: Function result is '"Benjamin Denoziere" <811>'
[Jan 14 12:15:40] DEBUG[18761] pbx.c: Function result is '811'
[Jan 14 12:15:40] DEBUG[18761] pbx.c: Function result is '0344654321'
[Jan 14 12:15:40] DEBUG[18761] pbx.c: Function result is 'notenum'
[Jan 14 12:15:40] DEBUG[18761] pbx.c: Function result is 'SIP/handytone-007db9f0'
[Jan 14 12:15:40] DEBUG[18761] pbx.c: Function result is 'SIP/wengo-007c51f0'
[Jan 14 12:15:40] DEBUG[18761] pbx.c: Function result is 'Dial'
[Jan 14 12:15:40] DEBUG[18761] pbx.c: Function result is 'SIP/0344654321@wengo|40'
[Jan 14 12:15:40] DEBUG[18761] pbx.c: Function result is '2007-01-14 12:13:56'
[Jan 14 12:15:40] DEBUG[18761] pbx.c: Function result is '2007-01-14 12:14:27'
[Jan 14 12:15:40] DEBUG[18761] pbx.c: Function result is '2007-01-14 12:15:40'
[Jan 14 12:15:40] DEBUG[18761] pbx.c: Function result is '104'
[Jan 14 12:15:40] DEBUG[18761] pbx.c: Function result is '73'
[Jan 14 12:15:40] DEBUG[18761] pbx.c: Function result is 'ANSWERED'
[Jan 14 12:15:40] DEBUG[18761] pbx.c: Function result is 'DOCUMENTATION'
[Jan 14 12:15:40] DEBUG[18761] pbx.c: Function result is ''
[Jan 14 12:15:40] DEBUG[18761] pbx.c: Function result is '1168773236.9'
[Jan 14 12:15:40] DEBUG[18761] pbx.c: Function result is ''

=====> NOW 100%

Hope this can help ? Let me know if I should enable specific debug options.

--

Update: Also tried with "core set debug" and "sip set debug" options, and log is a lot more verbose, but keeps quiet after the call is hanged up, and CPU goes to 100%



By: Serge Vecher (serge-v) 2007-01-15 10:43:13.000-0600

benjamin: is this with the latest 1.4 (r > 50800)? If not, please retry and always specify revision you are working with. Also, please enable sip debug and rtp debug and let's see if that shows anything interesting ...

By: benjamin (benjamin) 2007-01-15 12:21:20.000-0600

Just upgraded to SVN r50895 and I can no longer reproduce the problem.
Will be doing more testing but it seems the problem is fixed.
Thanks a lot for you support !

By: Joshua C. Colp (jcolp) 2007-01-15 12:38:04.000-0600

Fixed in 1.4 as of revision 50032 and trunk as of revision 50033. If this does turn out to still be an issue, please reopen. Peace!