|Summary:||ASTERISK-07870: Agent logoff not working|
|Reporter:||Matt King, M.A. Oxon. (kebl0155)||Labels:|
|Date Opened:||2006-10-04 07:45:30||Date Closed:||2007-06-30 09:20:03|
|Environment:||Attachments:||( 0) agentbt.txt|
( 1) agentbtthread.txt
( 2) agentbugbt.txt
|Description:||This is an intermittent fault. We are using Zap channels to distribute calls to agents.|
Agents log in over Manager, or over AgentCallbackLogin, and are able to take calls until the fault occurs. After that, the affected agent cannot receive calls, and cannot log out and log back in.
****** ADDITIONAL INFORMATION ******
When this happens, the agent is shown in show agents as:
4108 (Amanda Bowld) logged in on Zap/42-1 is idle (musiconhold is 'agents')
This looks like appropriate output for AgentLogin (i.e. persistent connection), however the agent is using AgentCallbackLogin.
The agent is then unable to receive calls. The agent is unable to log out over manager, or through AgentCallbackLogin(||l), which says 'this agent is already logged in. Please enter another agent code'.
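For reference, a logout over Manager is sent as an AgentLogoff action. The following is a minimal Python sketch of how such a request might be framed. The helper name build_agent_logoff is hypothetical, and the sketch assumes an already authenticated Manager session (the login and socket handling are not shown).

```python
def build_agent_logoff(agent, soft=False):
    """Frame an AMI AgentLogoff action as CRLF-terminated header lines.

    `agent` is the agent ID (for example "4108"). `soft=True` adds the
    optional Soft header, which asks Asterisk not to hang up existing calls.
    """
    lines = ["Action: AgentLogoff", "Agent: {}".format(agent)]
    if soft:
        lines.append("Soft: true")
    # An AMI message ends with an empty line, hence the double CRLF.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

The returned bytes would be written to the Manager socket. In the fault described here, the action is accepted but the agent remains logged in.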
Doing 'show channels' gives:
Zap/42-1 s@orderlyq-inbound:1 Up (None)
even though there is no actual call.
Running the logoff from the CLI appears to succeed:
agent logoff Agent/4108
Logging out 4108
However, 'show agents' then gives the same output as before. The agent is still logged in.
Trying to shut the zap channel from the CLI also fails.
In the end I have to stop Asterisk. 'stop now' does not work, and I have to run a killall. Asterisk then core dumps (attached, compiled with make valgrind).
I have full debug level logs available on request, but these are rather large files.
|Comments:||By: Matt King, M.A. Oxon. (kebl0155) 2006-10-04 07:54:15|
The core upload failed with APPLICATION ERROR ASTERISK-398.
By: Serge Vecher (serge-v) 2006-10-04 08:39:13
core file is not necessary. Please produce the bt from the core file as per following page http://www.voip-info.org/tiki-index.php?page=Asterisk%20debugging
By: Matt King, M.A. Oxon. (kebl0155) 2006-10-04 09:12:58
OK did a bt and a bt full - not much there though...
By: Serge Vecher (serge-v) 2006-10-04 09:28:17
matt: can you also review the deadlock section of the above link and produce a backtrace when asterisk is in a deadlocked state (before you try to kill it)?
By: Matt King, M.A. Oxon. (kebl0155) 2006-10-04 09:45:33
OK uploaded with the gdb thread debugging output.
I'll get a bt before killing next time this happens. Asterisk doesn't seem to hang for everything - it's just that the agent can't log out, and it can't be killed with 'stop now'.
Hope this helps,
By: jmls (jmls) 2006-11-01 12:19:27.000-0600
kebl0155, were you able to get hold of that bt?
By: Matt King, M.A. Oxon. (kebl0155) 2006-11-01 12:29:33.000-0600
Hello, we haven't seen this since my last post - but this does seem to be an intermittent fault.
I will send a bt as requested next time this happens.
By: Matt King, M.A. Oxon. (kebl0155) 2006-11-27 15:36:29.000-0600
This happened twice more today (after not happening for quite some time). The call centre is now on 1.4 beta 3. The second time, I was able to persuade the call centre manager to leave the agent logged in until after closing time so I could debug properly.
Attached are the outputs of bt, bt full, info threads and thread apply all bt from the running Asterisk, as requested.
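For anyone collecting the same information, the gdb commands listed above can be run against the live process in a single batch invocation. This is a rough Python sketch that only builds the command line. The function name gdb_argv is hypothetical, and the PID of the running Asterisk process has to be supplied by the caller.

```python
# gdb commands requested in this thread, in the order they were attached.
GDB_COMMANDS = ["bt", "bt full", "info threads", "thread apply all bt"]

def gdb_argv(pid):
    """Build a gdb command line that attaches to `pid`, runs each command, then exits."""
    argv = ["gdb", "-batch", "-p", str(pid)]
    for command in GDB_COMMANDS:
        # Each -ex option runs one gdb command after the attach has completed.
        argv += ["-ex", command]
    return argv
```

The resulting list can be passed to subprocess.run(argv, capture_output=True, text=True) and the captured output saved to a file for attaching to the report. Attaching to a live process usually requires root or the same user as Asterisk.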
The agent login that had the problem was Claire Bunclark, who shows up in 'show agents' as:
4110 (Claire Bunclark) logged in on Local/6310@orderlyq-inbound-d7ea,1 is idle (musiconhold is 'agents')
When I turned on debug logging (using logger rotate), I got a whole load of messages like this:
[Nov 27 21:19:00] DEBUG: channel.c:758 ast_queue_frame: Dropping voice to exceptionally long queue on Local/6310@orderlyq-inbound-d7ea,1
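When the debug log is large, it can help to tally these messages per channel to see which channel is stuck. A small Python sketch, assuming the log lines have exactly the shape shown above (the function name dropped_voice_by_channel is hypothetical):

```python
import re
from collections import Counter

# Matches the ast_queue_frame message quoted above and captures the channel name.
DROP_RE = re.compile(
    r"ast_queue_frame: Dropping voice to exceptionally long queue on (\S+)"
)

def dropped_voice_by_channel(log_text):
    """Return a Counter mapping channel name to number of dropped-voice messages."""
    return Counter(match.group(1) for match in DROP_RE.finditer(log_text))
```

Calling dropped_voice_by_channel(open(path).read()).most_common(5) on a log file would list the channels with the most dropped frames.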
As before, Claire cannot log out over Manager or from the command line. When I ran 'show channels', I got:
Zap/55-1 s@orderlyq-inbound:1 Up (None)
Local/6310@orderlyq- s@orderlyq-inbound:1 Down (None)
Zap/57-1 s@orderlyq-inbound:1 Up (None)
Zap/37-1 s@orderlyq-inbound:1 Up (None)
Zap/52-1 s@orderlyq-inbound:1 Up (None)
Zap/48-1 s@orderlyq-inbound:1 Up (None)
Local/6304@orderlyq- s@orderlyq-inbound:1 Down (None)
Zap/53-1 s@orderlyq-inbound:1 Up (None)
Zap/41-1 s@orderlyq-inbound:1 Up (None)
Zap/33-1 s@orderlyq-inbound:1 Up (None)
Zap/43-1 s@orderlyq-inbound:1 Up (None)
Zap/42-1 s@orderlyq-inbound:1 Up (None)
Zap/40-1 s@orderlyq-inbound:1 Up (None)
Zap/32-1 s@orderlyq-inbound:1 Up (None)
Zap/38-1 s@orderlyq-inbound:1 Up (None)
Zap/39-1 s@orderlyq-inbound:1 Up (None)
Zap/36-1 s@orderlyq-inbound:1 Up (None)
Zap/35-1 s@orderlyq-inbound:1 Up (None)
Zap/34-1 s@orderlyq-inbound:1 Up (None)
even though there were no active calls at the time.
I'm thinking Asterisk isn't closing channels down for some reason, and that this is why Claire couldn't log out - Asterisk thinks she's still on a call that has in fact ended.
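One way to check for leftover channels of this kind is to tally the 'show channels' listing by state and compare it with the number of calls actually in progress. A minimal Python sketch, assuming whitespace-separated rows of the form shown above (channel, location, single-word state, application); the function name count_channel_states is hypothetical:

```python
from collections import Counter

def count_channel_states(listing):
    """Count channels per state in 'show channels' output of the shape above."""
    states = Counter()
    for line in listing.splitlines():
        fields = line.split()
        # Channel rows start with a Tech/resource name such as Zap/55-1.
        if len(fields) >= 3 and "/" in fields[0]:
            states[fields[2]] += 1
    return states

sample = """\
Zap/55-1 s@orderlyq-inbound:1 Up (None)
Local/6310@orderlyq- s@orderlyq-inbound:1 Down (None)
Zap/57-1 s@orderlyq-inbound:1 Up (None)
"""
print(count_channel_states(sample))
```

A non-zero 'Up' count while no calls are in progress would be consistent with the channels not being torn down.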
Please let me know if there's any further information you need.
By: Joshua C. Colp (jcolp) 2006-12-04 14:26:44.000-0600
Is a core dump available that you can open in gdb and examine for this? I have an idea based on the backtrace you provided. Thanks.
By: Matt King, M.A. Oxon. (kebl0155) 2006-12-04 14:55:14.000-0600
Thanks again for your help. Yes we still have the original core.
Unfortunately it's too large to be uploaded here - I can email it to you though. My email is m AT orderlyq DOT com
Hope this helps,
By: Adolfo R. Brandes (arbrandes) 2006-12-08 13:46:57.000-0600
This is easily reproducible in 1.2, 1.4, and trunk. You just have to log in an agent with AgentCallbackLogin(), have it accept a call, and then run "agent logoff" from the CLI while the call is still up. One of the legs of the call (in my experience, the agent's leg) will invariably hang. "agent logoff XXX soft" is no good either.
By: Joshua C. Colp (jcolp) 2007-02-20 16:49:59.000-0600
Fixed in 1.2 as of revision 55669, 1.4 as of revision 55670, and trunk as of revision 55671.