
Summary: ASTERISK-16996: ooh323 crashes with segmentation fault every 2-3 days
Reporter: brett (celtic6969)
Labels:
Date Opened: 2010-11-21 22:14:18.000-0600
Date Closed: 2010-12-01 13:33:45.000-0600
Priority: Critical
Regression? No
Status: Closed/Complete
Components: Addons/chan_ooh323
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) log_full.txt
( 1) log_h323_log.txt
( 2) ooh323.conf
( 3) show_locks.txt
( 4) trace_gdb.txt
Description: The server runs Asterisk 1.6.2.14 and asterisk-addons 1.6.2.2 (both tarball releases). It receives inbound ooh323 calls (20 calls per minute), which are terminated by the caller before the call is answered. After 2 to 3 days, Asterisk dumps core with a segmentation fault.

****** ADDITIONAL INFORMATION ******

OS: CentOS 5.4
kernel: 2.6.18-194.26.1.el5
asterisk: 1.6.2.14 (DEBUG_THREADS & DONT_OPTIMIZE)
addons: 1.6.2.2

Will attach gdb trace and logs.
Comments:

By: brett (celtic6969) 2010-11-21 22:38:03.000-0600

Restarted Asterisk today and enabled ooh323 and RTP debug. OOH323 appears to have died quietly a few hours later: the Asterisk console is still accessible, but the server is not processing any H.323 calls.

By: BrettH (zeero) 2010-11-22 09:12:40.000-0600

My login name has changed from celtic69 -> zeero. Following my previous note, a "core show locks" was attached to this case after ooh323 had died but the console was still responding.

By: BrettH (zeero) 2010-11-24 06:23:42.000-0600

I tried 1.4 SVN with the same result :(

Connected to Asterisk SVN-branch-1.4-r295906 currently running on receiver01 (pid = 16743)
Verbosity is at least 6
Core debug is at least 6
receiver01*CLI> core show locks

=======================================================================
=== Currently Held Locks ==============================================
=======================================================================
===
=== <file> <line num> <function> <lock name> <lock addr> (times locked)
===
=== Thread ID: -1208968304 (ooh323c_stack_thread started at [   43] ooh323cDriver.c ooh323c_start_stack_thread())
=== ---> Lock #0 (chan_ooh323.c): MUTEX 1517 onCallCleared &p->lock 0x9d034f0 (1)
=== ---> Waiting for Lock #1 (channel.c): MUTEX 920 __ast_queue_frame (channel lock) 0x9d01ee0 (1)
=== --- ---> Locked Here: channel.c line 1652 (ast_hangup)
=== -------------------------------------------------------------------
===
=== Thread ID: -1209459824 (pbx_thread           started at [ 2705] pbx.c ast_pbx_start())
=== ---> Lock #0 (channel.c): MUTEX 1652 ast_hangup (channel lock) 0x9d01ee0 (1)
=== ---> Waiting for Lock #1 (chan_ooh323.c): MUTEX 881 ooh323_hangup &p->lock 0x9d034f0 (1)
=== --- ---> Locked Here: chan_ooh323.c line 1517 (onCallCleared)
=== -------------------------------------------------------------------
===
=======================================================================


Can anyone please assist or provide a recommendation?
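
The "core show locks" output above shows a lock-order inversion: ooh323c_stack_thread holds &p->lock (taken in onCallCleared) and waits for the channel lock, while pbx_thread holds the channel lock (taken in ast_hangup) and waits for &p->lock in ooh323_hangup. A minimal standalone sketch of that pattern, and of one common way to avoid it (try the second lock and back off on failure), might look like the following. It uses plain pthreads rather than Asterisk's own lock API; the names chan_lock, pvt_lock, stack_thread, pbx_thread, and run_demo are hypothetical, and this is not the fix that was applied to chan_ooh323.

```c
#include <pthread.h>
#include <sched.h>

/* Hypothetical stand-ins for the two locks named in the dump. */
static pthread_mutex_t chan_lock = PTHREAD_MUTEX_INITIALIZER; /* "channel lock" */
static pthread_mutex_t pvt_lock  = PTHREAD_MUTEX_INITIALIZER; /* "&p->lock"     */
static int frames_queued = 0; /* work done by the stack-thread side */
static int hangups_done  = 0; /* work done by the PBX-thread side   */

/* Stack-thread side: takes pvt_lock first, then needs chan_lock.
 * Blocking on chan_lock here would recreate the inversion, so we only
 * try it and, on failure, release pvt_lock so the other thread can run. */
static void *stack_thread(void *arg)
{
    int n = *(int *)arg;
    for (int i = 0; i < n; i++) {
        pthread_mutex_lock(&pvt_lock);
        while (pthread_mutex_trylock(&chan_lock) != 0) {
            pthread_mutex_unlock(&pvt_lock); /* back off */
            sched_yield();
            pthread_mutex_lock(&pvt_lock);
        }
        frames_queued++;
        pthread_mutex_unlock(&chan_lock);
        pthread_mutex_unlock(&pvt_lock);
    }
    return NULL;
}

/* PBX-thread side: takes chan_lock first, then blocks on pvt_lock. */
static void *pbx_thread(void *arg)
{
    int n = *(int *)arg;
    for (int i = 0; i < n; i++) {
        pthread_mutex_lock(&chan_lock);
        pthread_mutex_lock(&pvt_lock);
        hangups_done++;
        pthread_mutex_unlock(&pvt_lock);
        pthread_mutex_unlock(&chan_lock);
    }
    return NULL;
}

/* Runs both threads for n iterations each.
 * Returns 0 if both completed all iterations, non-zero otherwise. */
int run_demo(int n)
{
    pthread_t a, b;
    frames_queued = 0;
    hangups_done = 0;
    if (pthread_create(&a, NULL, stack_thread, &n) != 0)
        return -1;
    if (pthread_create(&b, NULL, pbx_thread, &n) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return (frames_queued == n && hangups_done == n) ? 0 : 1;
}
```

Calling run_demo(1000) from a small main() (built with -pthread) should return 0 once both threads finish. Replacing the trylock loop with a blocking pthread_mutex_lock(&chan_lock) reintroduces the inversion, and the program can then hang in the same way the two threads in the dump did.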



By: Alexander Anikin (may213) 2010-11-24 10:46:01.000-0600

Hi,

First, I recommend upgrading to version 1.8, because the original ooh323 code contains many errors and many of them are fixed in 1.8.
I think the 1.4/1.6 addons version of chan_ooh323 is not usable in production, especially in high-load environments, because of its single-threaded model and the many bugs in the code.

By: Alexander Anikin (may213) 2010-11-24 11:25:34.000-0600

To try to solve this problem in addons-1.6.2.2, please attach the gdb "bt full" output here. I think the main cause is heap memory corruption in the ooh323 stack.

By: BrettH (zeero) 2010-11-24 23:15:36.000-0600

Thanks for responding May. I will install 1.8 SVN version and compile with the relevant debug flags. I will provide an update soon.

By: Alexander Anikin (may213) 2010-11-24 23:38:43.000-0600

OK, I don't think this problem will occur with 1.8 or trunk.

By: BrettH (zeero) 2010-11-25 07:54:06.000-0600

Hi May,

I have installed "Asterisk SVN-branch-1.8-r296230" and ooh323 is currently processing calls. I will need to monitor for at least 3 days to measure stability.

I have 2 concerns with version 1.8:

1) ooh323 would not accept any calls (chan_ooh323.c:1797 ooh323_onReceivedSetup: Unacceptable ip 10.8.6.252) until I configured a "user profile" and IP address for every Cisco gateway in the network. Is there a way to change this behaviour? It will be an issue for customers with many H.323 gateways and no gatekeeper.

2) The asterisk process was initially running at 98.9% cpu utilisation for at least 1 hour but was still processing calls. I then restarted the asterisk process and it is currently running at 0.2%. I will monitor and see if the high cpu reoccurs.



By: Alexander Anikin (may213) 2010-11-25 17:42:13.000-0600

Hi,

1. Yes, this is normal: you must define a profile for every known peer, user, or gateway; otherwise you leave a hole for unauthorized calls through your system. I don't know of a way to allow guest access without opening this hole. By the way, for SIP you must also create a definition for every endpoint.

2. I suspect that CPU usage of 100% or above is not an H.323 problem; possibly some other module in Asterisk is responsible.

By: BrettH (zeero) 2010-11-30 18:58:02.000-0600

Hi May,

Thanks for your help, asterisk 1.8 has been stable for 5 days. I think we can close this issue now and if the problem reoccurs I will create a new issue.

Thanks again.

By: Alexander Anikin (may213) 2010-12-01 13:32:39.000-0600

Hi,

Thank you for testing. I'll close this issue; feel free to reopen it or create a new issue if needed.