Summary: ASTERISK-05339: zap channels not closing after hangup
Reporter: Jennifer Hales (jennifer hales)
Labels:
Date Opened: 2005-10-20 20:26:24
Date Closed: 2011-06-07 14:00:22
Versions:
Frequency of
Environment:
Attachments: ( 0) 081output.txt
( 1) 082output.txt
( 2) 083output.txt
( 3) died.rtf
( 4) modules.txt
( 5) output.txt
( 6) output1.txt
( 7) output3.txt
Description: We are running CVS Head 2005-10-10 21:37:47 UTC. We are a call centre and all our calls come into queues. We have an intermittent problem where our 30 PRI lines show on the CLI screen as full and allocated to queues, yet no agents are taking calls. It appears that these channels are not being disconnected from Asterisk once a caller has hung up. Calls continue to come in, causing Asterisk to show WARNING[3762]: chan_zap.c:8107 pri_dchannel: Ring requested on channel 0/8 already in use on span 1. Hanging up owner.
Comments:

By: Jennifer Hales (jennifer hales) 2005-10-20 20:32:27

I have tried to attach my core file, but it is 67,000k in size. What should I do with it?

By: Clod Patry (junky) 2005-10-20 21:15:23

Read the file README.backtrace and provide just that information, not the whole core dump.

Do you get any core dump at all, or does it just block?

By: Jennifer Hales (jennifer hales) 2005-10-21 04:54:34

One of the symptoms that also occurs is that Music on hold stops functioning.

By: Jennifer Hales (jennifer hales) 2005-10-25 23:20:59

Do I need to supply more information?  This is an urgent issue for us and we do not have the resources to debug/resolve the issue ourselves.  I am happy to provide any information that is required.

By: Olle Johansson (oej) 2005-10-26 01:02:53

mattf, can you please take a look at this?

By: Matthew Fredrickson (mattf) 2005-10-26 15:46:21

I can't really see anything useful at all in this backtrace.  I know that Kevin has made a few changes in the last couple of weeks on possibly related code so make sure that this works on up to date code.  Also make sure that you compile without optimizations.  If you get another deadlock situation, post the backtrace again and maybe we can get more information about what might be causing the problem.

By: Mark Spencer (markster) 2005-10-31 23:25:40.000-0600

Can you please do a "show modules" and paste the result in here?

By: Jennifer Hales (jennifer hales) 2005-11-01 03:31:42.000-0600

I have just updated to the latest cvs head and compiled without optimizations.  Will let you know how it goes.  This can take two weeks to occur.

By: Jennifer Hales (jennifer hales) 2005-11-01 03:51:40.000-0600

After updating to the latest CVS head I ran "show modules"; the output is now attached.

By: Mark Spencer (markster) 2005-11-02 21:42:55.000-0600

Do not load cdr_addon_mysql and let me know if that fixes the problem.

By: Jennifer Hales (jennifer hales) 2005-11-02 22:17:20.000-0600

We have had our problem occur again today.  I have attached the output and a copy of the CLI screen. I would appreciate it if you could help with this as soon as possible, as this is our live system and the problem is impacting our business operations.

By: Jennifer Hales (jennifer hales) 2005-11-02 22:48:16.000-0600

We are running 64-bit CentOS on a Dell PowerEdge 2850 with the onboard network cards disabled.

By: Mark Spencer (markster) 2005-11-02 22:52:17.000-0600

If this is time critical, I suggest you use Digium technical support, not the Asterisk bug tracker.  Also, as I suggested, please do *not* load the MySQL CDR module and see if the problem persists.

By: Jennifer Hales (jennifer hales) 2005-11-06 17:26:12.000-0600

I loaded Asterisk without cdr_addon_mysql on 5-11-2005.  On 6-11-2005 at around 3:00pm our problem occurred again; however, I was unable to capture the information on this occasion. I will do so when it happens again.

By: Jennifer Hales (jennifer hales) 2005-11-08 17:24:42.000-0600

Things went wrong again last night and this morning. The guy who was here last night captured things at the beginning, in the middle and at the end. These files are 081output to 083.  I only captured the end result this morning.  I hope this all helps.

By: Paul Hales (paulh) 2005-11-19 05:19:50.000-0600

We spoke to Digium technical support this morning - with some luck we can close this bug soon.
The time difference means that we only have a short time each day to contact Digium technical support, but that's life.

By: Matthew Fredrickson (mattf) 2005-12-19 06:32:22.000-0600

This should be handled through support.

By: Jennifer Hales (jennifer hales) 2005-12-19 20:16:52.000-0600

This problem has not yet been resolved.  We have implemented a workaround that allows us to function in a production environment.  We did this by changing agentcallback login to addqueuemember login; however, the condition is still reproducible.

By: dalabera (dalabera) 2005-12-20 08:21:37.000-0600

We have experienced the same problem. Since upgrading to 1.2.x we seem to have alleviated the issue. We are using an old T400P quad-span Digium card with phones connected to a channel bank.
What I have seen is that when the T1 gets a lot of calls and the agents start hanging up and picking up calls too fast, a deadlock occurs, and after that you start receiving the annoying message.
We did contact Digium Support; however, they couldn't find the cause of the problem and I wasn't able to reproduce it. By the time the problem occurred, their office was closed.
Good Luck

By: Matthew Fredrickson (mattf) 2005-12-27 07:35:15.000-0600

dalabera: Can you provide a better backtrace on the matter?  If not, there is not anything we can really do about it right now.  Every backtrace from Jennifer Hales' system has threads with corrupted stacks, rendering them essentially useless for the purpose of debugging this problem.

By: Jennifer Hales (jennifer hales) 2006-01-03 15:29:17.000-0600

Re: Mattf's comment about threads with corrupted stacks.  Does this mean there are problems or instabilities with my system?  Is this rectifiable, so that I can send you uncorrupted backtraces?

By: Matthew Fredrickson (mattf) 2006-01-04 07:37:20.000-0600

It appears that whatever is causing your problem is also causing the stack corruption.  There doesn't seem to be a way to reproduce the one without the other.

By: Ian Sherman (stitchtech) 2006-01-17 02:55:19.000-0600

This is possibly related to http://bugs.digium.com/view.php?id=6196.  We were having the same symptoms as above, occurring once or twice a day, i.e. deadlocked zap channels from the queue app.  We removed the "weight" option from queues.conf and are hoping this solves things.  Will report back at the end of the week as to status.

By: Ian Sherman (stitchtech) 2006-01-26 01:36:12.000-0600

I can confirm that removing the weight option has fixed the problem.

By: Matthew Fredrickson (mattf) 2006-04-21 13:10:42

This bugnote is not doing anything productive.  If you have a specific problem with chan_agent lock-ups, open a new bug describing it, and it will be handled on an individual basis.