Summary: | ASTERISK-03510: Crash on unknown situation | ||
Reporter: | laserfox (laserfox) | Labels: | |
Date Opened: | 2005-02-14 10:10:14.000-0600 | Date Closed: | 2008-01-15 15:28:15.000-0600 |
Priority: | Critical | Regression? | No |
Status: | Closed/Complete | Components: | Core/General |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) agents.conf ( 1) backtrace.txt ( 2) backtrace-20052202.txt ( 3) cli-beforesegfault.txt ( 4) extensions.conf ( 5) newbacktrace.txt ( 6) queues.conf | |
Description: | I can receive PSTN calls perfectly and use all the applications that I need. After some time, an unexpected segmentation fault occurs and nothing is reported in the Asterisk CLI. ****** ADDITIONAL INFORMATION ****** Running Fedora Core 2 with a Digium TE405P, all 4 E1 ports in use. | ||
Comments: | By: nick (nick) 2005-02-14 10:29:19.000-0600 Backtrace makes it look like a SIP bug.
By: nick (nick) 2005-02-14 10:30:40.000-0600 Also, your backtrace might be more useful if you did a make clean && make valgrind && make install.
By: Mark Spencer (markster) 2005-02-14 13:36:23.000-0600 This is a technical support issue; please pursue it through support@digium.com
By: laserfox (laserfox) 2005-02-17 17:08:36.000-0600 I've talked with Kevin in #asterisk and sent him the backtrace, and he recommended that I reopen the bug. I'm sending a new backtrace (Asterisk recompiled with valgrind). edited on: 02-17-05 17:10
By: Mark Spencer (markster) 2005-02-17 17:11:51.000-0600 This backtrace is just filled with garbage. It doesn't contain any useful information. Is Kevin going to debug this?
By: laserfox (laserfox) 2005-02-17 17:35:42.000-0600 So why does my Asterisk keep crashing? Here are today's "crash" logs:
  Restart -> Thu Feb 17 03:04:24 BRST 2005
  Restart -> Thu Feb 17 04:54:52 BRST 2005
  Restart -> Thu Feb 17 05:59:41 BRST 2005
  Restart -> Thu Feb 17 10:01:59 BRST 2005
  Restart -> Thu Feb 17 12:06:02 BRST 2005
  Restart -> Thu Feb 17 14:30:38 BRST 2005
  Restart -> Thu Feb 17 15:28:00 BRST 2005
  Restart -> Thu Feb 17 16:03:26 BRST 2005
  Restart -> Thu Feb 17 16:06:52 BRST 2005
  Restart -> Thu Feb 17 16:29:44 BRST 2005
  Restart -> Thu Feb 17 16:48:24 BRST 2005
  Restart -> Thu Feb 17 17:31:27 BRST 2005
  Restart -> Thu Feb 17 19:11:48 BRST 2005
  Restart -> Thu Feb 17 19:30:39 BRST 2005
  =[
By: Mark Spencer (markster) 2005-02-17 19:12:50.000-0600 How are your agents logging in? Can you confirm this is totally unpatched CVS Asterisk?
By: Mark Spencer (markster) 2005-02-17 19:13:37.000-0600 Also, post your agents.conf, extensions.conf, and queues.conf. Is there any particular event that causes this?
By: laserfox (laserfox) 2005-02-18 05:28:55.000-0600 They log in using exten => *11,1,AgentLogin. Yes, this CVS is not patched. No, I can't reproduce the problem...
I'll post the files.
By: laserfox (laserfox) 2005-02-18 12:33:56.000-0600 After some time in the lab, I was able to reproduce the problem sometimes (not always). I log in a Grandstream BT100 with AgentLogin, call the phone a few times, then press the phone's Transfer key, and Asterisk crashes with a segfault.
By: Anthony Minessale (anthm) 2005-02-21 14:58:31.000-0600 chanfix.diff should take care of it. Disclaimer on file.
By: laserfox (laserfox) 2005-02-21 20:24:34.000-0600 This patch resolved one of my problems (Asterisk segfaulting when pressing the BT100 Transfer button after making a # transfer), but I'm still having segfaults.
By: Anthony Minessale (anthm) 2005-02-24 11:55:57.000-0600 The issue is in chan_agent, that much I know. The patch I posted just hides the problem, so forget it. You can reproduce the problem by having a customer call a queue so that it bridges to an agent who is logged in via SIP, then # transferring the caller back into the same queue. Once any agent gets the call again, the SIP channel ends up with a corrupted _bridge pointer that explodes when you do anything that looks at it, like pressing the SIP transfer button. The exact steps one user followed were like this:
  agent 1000
  agent 1001
  queue 1000 (contains agent 1000)
  queue 1001 (contains agent 1001)
  queue 2000 (contains both agents)
  log into both agents on a SIP channel
  call in as a customer on Zap to queue 2000 (possibly any channel)
  whichever agent gets the call, # blind transfer it to the opposite agent's private queue (e.g. 1001 xfers to an ext leading to queue 1000)
  once the other agent gets the call, attempt a SIP transfer and boom
Again, this is not really related to SIP xfer; it's the fact that at the last step above the SIP channel has a corrupted ->_bridge pointer. Using an older chan_agent.c eliminated this, so the issue must be in that file. edited on: 02-24-05 11:56
By: twisted (twisted) 2005-03-08 15:26:10.000-0600 anthm, how would you propose we fix this? Any ideas?
This one is admittedly over my head :P
By: Fernando Romo (el_pop) 2005-03-13 23:14:53.000-0600 anthm: which version of chan_agent.c is working? Checking the CVS log, I presume you are using rev. 1.120, before the pvt changes. Did you revert only chan_agent.c, or res_musiconhold.c too?
By: Mark Spencer (markster) 2005-03-14 00:04:56.000-0600 Are you using the SIP transfer button to do the transfer the first two times?
By: damin (damin) 2005-03-17 21:33:26.000-0600 laserfox: Your last update to this bug was almost a month ago. Are you still having the issues with current CVS? If not, can we close this out? If so, can you provide some more debugging information to help us pinpoint it with the new code? anthm: You have pinpointed that this is a corrupted bridge pointer and that an earlier version of chan_agent.c doesn't exhibit the same problem. Do we know, or need to know, between which versions the problem occurs? markster: You mentioned on the Dev conference that this should not be an issue in Stable, and that you believed it had been fixed in current CVS.
By: laserfox (laserfox) 2005-03-18 08:16:58.000-0600 damin, I've updated to current CVS (yesterday) and I can still reproduce the problem. I'll try to debug this with Mark today.
By: Mark Spencer (markster) 2005-03-20 02:10:09.000-0600 Still need you to find me to work on this if it's still an issue.
By: damin (damin) 2005-03-20 11:36:43.000-0600 Perhaps we can set up a specific time/day/location to work on this issue? I think that if we can get a couple of people to reproduce and backtrace it on different systems, we might be better able to find the source of the problem and fix it. I'm not sure what country laserfox is in, but I'm located in the US EST timezone, and I hang out on #asterisk, as does Kram.
By: Mark Spencer (markster) 2005-03-22 13:23:41.000-0600 Fixed in CVS head.
By: Russell Bryant (russell) 2005-03-31 21:01:42.000-0600 Since someone said that an older version worked, I'm going to assume this is not an issue in 1.0.
By: Digium Subversion (svnbot) 2008-01-15 15:28:15.000-0600
  Repository: asterisk
  Revision: 5229
  U trunk/channels/chan_agent.c
  ------------------------------------------------------------------------
  r5229 | markster | 2008-01-15 15:28:15 -0600 (Tue, 15 Jan 2008) | 2 lines
  Fix chan_agent segfault (bug ASTERISK-3510)
  ------------------------------------------------------------------------
  http://svn.digium.com/view/asterisk?view=rev&revision=5229 |
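For anyone trying to follow the reproduction steps anthm lists in the comments, a minimal configuration along those lines might look like the sketch below. This is not the reporter's attached agents.conf, queues.conf, or extensions.conf; the agent passwords, agent names, the [default] context, and the use of the Queue `t` option to let the agent transfer the caller are all assumptions made for illustration. The `|` argument separator is the one used by Asterisk releases of that period.

```ini
; ---- agents.conf (sketch; passwords and names are hypothetical) ----
[agents]
agent => 1000,1234,Agent One
agent => 1001,1234,Agent Two

; ---- queues.conf (sketch) ----
; Private queue for agent 1000
[1000]
member => Agent/1000

; Private queue for agent 1001
[1001]
member => Agent/1001

; Shared queue containing both agents
[2000]
member => Agent/1000
member => Agent/1001

; ---- extensions.conf (sketch; the context name is an assumption) ----
[default]
; Agents log in from their SIP phones, as in the reporter's setup
exten => *11,1,AgentLogin
; "t" is assumed here so the answering agent can transfer the caller
exten => 1000,1,Queue(1000|t)
exten => 1001,1,Queue(1001|t)
exten => 2000,1,Queue(2000|t)
```

With something like this in place, the sequence from the thread would be: both agents log in via *11 on SIP phones, a customer calls 2000, the answering agent # transfers the call to the other agent's extension, and the second agent then presses the phone's transfer button.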