Summary: ASTERISK-06768: Segfault on show channels after Agent is removed from Meetme
Reporter: Frank Waller (explidous)
Labels:
Date Opened: 2006-04-12 22:01:27
Date Closed: 2011-06-07 14:01:08
Versions:
Frequency of
Environment:
Attachments:( 0) AMI.log
( 1) bt.txt
( 2) btfull.txt
( 3) console.txt
( 4) gdb.txt
( 5) new_crash_after_update_to_r21002M.txt
Description:We are currently working on an in-house predictive dialer solution. Between 1.2.4 and 1.2.6/HEAD 19466 we noticed a number of critical changes:
An Agent that is removed from a Meetme does not get any more calls (we use a redirect to take him out). Asterisk becomes unstable immediately: show agents hangs the CLI, show channels segfaults, and new manager connections are either denied or do not respond. I am able to reproduce the issue with HEAD 19466 as of today. We tried various other ways to remove the Agent from the conference, but all of them end with the agent not becoming ready to accept the next call.


Control and dialing are done through AMI; I can provide a complete log of the communication as well as the Asterisk console output.
Outbound calling:
- Agentlogin through Dial from AMI using a Local/ extension (a workaround to use channels would be major work and unfeasible because we have to have users connected to multiple servers)
- AMI program dials from a channel (e.g. SIP/test/xxxxxxxxxxx or ZAP/g1/xxxxxxxxxx) and the call is directed to the exten for the queue.
- Agent gets a beep and is on the call
- AMI program redirects the Agent and the external channel into Meetme
- AMI program dials a second Agent queue into Meetme
- AMI program transfers the first Agent out to a dialplan extension which plays a wav and hangs up

---> Agent is now reported to be ready to take new calls (Up) and is sitting in app Agentlogin, but calls stay in the queue and the system becomes unstable...
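
For readers who want to replay the sequence, the two redirect steps can be sketched as raw AMI actions. This is only an illustration, not the reporter's dialer code: the agent channel name, the contexts, and the extensions below are hypothetical placeholders, and the sketch only builds the text that would be written to the manager socket.

```python
def ami_action(name, **fields):
    """Serialize one AMI action as 'Key: Value' lines ended by a blank line."""
    lines = ["Action: %s" % name]
    lines += ["%s: %s" % (key, value) for key, value in fields.items()]
    return "\r\n".join(lines) + "\r\n\r\n"


# Placeholders (assumptions): the real channel names come from the live system.
AGENT_CHANNEL = "Agent/1001"
CUSTOMER_CHANNEL = "SIP/test/xxxxxxxxxxx"

# Step: move the agent and the external channel into the conference together.
# "dialer-meetme" / "8000" are hypothetical dialplan targets that run MeetMe.
move_both_into_meetme = ami_action(
    "Redirect",
    Channel=AGENT_CHANNEL,
    ExtraChannel=CUSTOMER_CHANNEL,
    Context="dialer-meetme",
    Exten="8000",
    Priority=1,
)

# Step: take the first agent out to an extension that plays a wav and hangs up.
# "dialer-goodbye" / "s" are hypothetical as well.
take_agent_out = ami_action(
    "Redirect",
    Channel=AGENT_CHANNEL,
    Context="dialer-goodbye",
    Exten="s",
    Priority=1,
)

if __name__ == "__main__":
    print(move_both_into_meetme + take_agent_out, end="")
```

It is after the second Redirect that the agent is shown as Up in Agentlogin while calls remain queued.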
Comments:
By: Frank Waller (explidous) 2006-04-12 22:06:37

Hardware is a dual dual-core Opteron with 4 GB RAM running SuSE 10.0 (modified),
all running in 64-bit mode.

By: Frank Waller (explidous) 2006-04-17 14:49:13

After updating svn to the current version the problem is still there, and in addition we see
Apr 17 15:45:02 WARNING[18161]: channel.c:1642 ast_waitfor_nandfds: Thread 1082526048 Blocking 'Zap/pseudo-1228926242', already blocked by thread 1080928608 in procedure ast_waitfor_nandfds
Apr 17 15:46:16 WARNING[18161]: chan_zap.c:4607 zt_read: Thread 1082526048 Blocking 'Zap/pseudo-1228926242', already blocked by thread 1082526048 in procedure ast_waitfor_nandfds
Apr 17 15:48:18 WARNING[18162]: channel.c:1642 ast_waitfor_nandfds: Thread 1080928608 Blocking 'Zap/pseudo-1228926242', already blocked by thread 1082526048 in procedure ast_waitfor_nandfds
Apr 17 15:49:05 WARNING[18162]: chan_zap.c:4607 zt_read: Thread 1080928608 Blocking 'Zap/pseudo-1228926242', already blocked by thread 1080928608 in procedure ast_waitfor_nandfds

Warnings on the console.
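
To see at a glance which threads contend for the same channel, lines like these can be grouped with a small script. This is a rough sketch: the regular expression is inferred from the four lines quoted above and is an assumption, not a format defined by Asterisk, and the function name is made up for illustration.

```python
import re
from collections import defaultdict

# Pattern inferred from the quoted warnings (assumption, not an official format).
BLOCK_RE = re.compile(
    r"WARNING\[(?P<id>\d+)\]: (?P<src>\S+):(?P<line>\d+) (?P<func>\w+): "
    r"Thread (?P<waiter>\d+) Blocking '(?P<channel>[^']+)', "
    r"already blocked by thread (?P<holder>\d+) in procedure (?P<proc>\w+)"
)


def blocking_conflicts(log_text):
    """Map each channel to a list of (waiting thread, holding thread, reporting function)."""
    conflicts = defaultdict(list)
    for match in BLOCK_RE.finditer(log_text):
        conflicts[match.group("channel")].append(
            (int(match.group("waiter")), int(match.group("holder")), match.group("func"))
        )
    return dict(conflicts)


# One of the lines from this report, used as sample input.
SAMPLE = (
    "Apr 17 15:45:02 WARNING[18161]: channel.c:1642 ast_waitfor_nandfds: "
    "Thread 1082526048 Blocking 'Zap/pseudo-1228926242', already blocked by "
    "thread 1080928608 in procedure ast_waitfor_nandfds"
)

if __name__ == "__main__":
    for channel, entries in blocking_conflicts(SAMPLE).items():
        print(channel, entries)
```

Entries where the waiting and holding thread IDs are identical, as in the two zt_read lines, stand out immediately in the grouped output.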

Any help would be greatly appreciated.

Thanks in Advance


By: Serge Vecher (serge-v) 2006-04-17 14:58:57

explidous: can you please elaborate on the nature of your modifications to the Asterisk sources? Is it feasible to test with an unpatched SVN copy?

By: Frank Waller (explidous) 2006-04-17 17:27:13

Until about an hour ago I was running unpatched svn. All the postings here are from unpatched svn versions.
I just patched it with the patch from 0006975, which seems to stop my box from crashing on every "show channel" after an Agent is taken out of the meetme, but it still does not work right.
The Agent is not available to take new calls after being removed from Meetme...

By: Serge Vecher (serge-v) 2006-05-02 16:23:40

explidous: there was an important fix committed earlier to file.c in both 1.2 and trunk.

Can you please test with the latest unpatched trunk or 1.2, built with 'make dont-optimize', and produce a bt if the problem is still there? Thanks.

By: Frank Waller (explidous) 2006-05-04 16:47:28

I most likely cannot retest before Monday, but as soon as our new development box is up I will update to svn-HEAD again and test it.


By: Serge Vecher (serge-v) 2006-05-12 12:24:38

Please feel free to reopen if the problem still occurs in trunk with rev > 26000. Don't forget to attach a backtrace from a non-optimized build.