
Summary: ASTERISK-08008: Segmentation Fault when a call is transfered from a queue
Reporter: dean bath (dean bath)
Labels:
Date Opened: 2006-10-25 15:31:02
Date Closed: 2007-06-19 10:46:39
Priority: Critical
Regression?: No
Status: Closed/Complete
Components: Channels/chan_sip/Transfers
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) bt_full.txt
( 1) btfull-20070501.txt
( 2) btgdb.txt
( 3) full.txt
( 4) gdb.txt
( 5) gdbcoreseg.txt
( 6) gdblog.txt
( 7) threadall1.zip
( 8) threadall2.zip
( 9) thread_apply_all_bt_full.txt
(10) transfer.btfull.05112007.txt
(11) verbose2cut.txt
(12) verbose2cut.zip
(13) verbosedebug.zip
Description: I found that when I try to transfer an external SIP call from a queue, it crashes Asterisk with a "2985 Segmentation fault" message. A SIP call made directly to a phone can be transferred fine. I'm using a Polycom 501. The attached file contains the full segfault output and a gdb trace of the core dump. I can reproduce this if more logs are required.

****** ADDITIONAL INFORMATION ******

Asterisk 1.4.0-beta3 running on Debian 2.6.17.3 machine with zaptel and libpri 1.4.0-beta1
Comments:

By: Serge Vecher (serge-v) 2006-10-26 10:23:38

Hmm, it looks like it actually crashes in chan_sip. Please:

1. Enable DONT_OPTIMIZE flag in menuselect under Compiler Options.
2. Rebuild Asterisk
3. Perform 'bt'. Then 'thread apply all bt full'

Let's see the new output. Thanks.

By: dean bath (dean bath) 2006-10-27 04:49:56

As requested, changed flag, rebuilt asterisk, tested. Asterisk did crash again, logs uploaded.
Thanks, Dean.

btgdb.txt
threadall1.zip and threadall2.zip - split log file as it was too large to upload



By: Joshua C. Colp (jcolp) 2006-11-16 15:12:09.000-0600

Can you please give the latest 1.4 branch a try? I put in a change as of revision 47764 which may help with this. Thanks!

By: Olle Johansson (oej) 2006-12-01 03:57:49.000-0600

We need feedback!

/O

By: Zlatko Ignjatovic (klaja) 2006-12-18 04:30:57.000-0600

I don't know if this issue is really the same one I'm experiencing, please move if not:

1. I call an extension on asterisk (which will just loop one message)
2. I do an attended transfer from the caller's phone (REFER w/ Replaces header)
3. Asterisk crashes in chan_unlock I think - because the new call is still not there, and current.chan2 is a null pointer

I will post my full log, and gdb output

I tested with the latest 1.4 svn trunk, updated this morning.
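
The failure mode described in step 3 (unlocking a bridged channel pointer that is still NULL) can be illustrated with a minimal sketch. Everything here is hypothetical: `demo_channel`, `demo_dual`, `demo_channel_unlock` and `demo_release_pair` are stand-ins invented for this example, not Asterisk's types or functions, and the sketch does not claim to match the fix that was eventually committed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are NOT Asterisk's types. */
struct demo_channel {
    int locked;                   /* 1 while the channel is held */
};

struct demo_dual {
    struct demo_channel *chan1;   /* the transferer's channel */
    struct demo_channel *chan2;   /* the bridged peer; may still be NULL */
};

/* Unlock a channel, refusing a NULL pointer instead of dereferencing it. */
int demo_channel_unlock(struct demo_channel *chan)
{
    if (chan == NULL)
        return -1;
    chan->locked = 0;
    return 0;
}

/* Release both legs and report how many were actually unlocked. */
int demo_release_pair(struct demo_dual *pair)
{
    int released = 0;

    assert(pair != NULL);
    if (demo_channel_unlock(pair->chan1) == 0)
        released++;
    if (demo_channel_unlock(pair->chan2) == 0)
        released++;
    return released;
}
```

With `chan2` left as NULL, `demo_release_pair` unlocks only `chan1` and returns 1 rather than crashing.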

By: Olle Johansson (oej) 2007-02-15 12:25:58.000-0600

Any test results? Ping!

By: Daniel D (danield) 2007-02-24 00:33:58.000-0600

Hi, I'm experiencing this issue with Asterisk 1.4.0, both the release and the more recent svn branch r56569. Transferring from one phone directly to another is fine, but transferring a call in a queue to another phone crashes Asterisk and freezes the phone. The problem was intermittent when DONT_OPTIMIZE and DEBUG_THREAD were unselected, but it has happened every time since I recompiled with those options selected.

The attached file is a gdb trace of bt and thread apply all bt full. The OS is CentOS 4.4; always reproducible.



By: Serge Vecher (serge-v) 2007-02-26 09:59:47.000-0600

danield, since you can reproduce this at will, please attach the console log prior to the crash as per the following (use code from 1.4 for testing):
1) Prepare test environment (reduce the amount of unrelated traffic on the server);
2) Make sure your logger.conf has the following line:
  console => notice,warning,error,debug
3) restart Asterisk with the following command:
  'asterisk -Tvvvvvdddddngc | tee /tmp/verbosedebug.txt'
4) Enable SIP transaction logging with the following CLI commands:
  set debug 4
  set verbose 4
  sip debug
5) Reproduce the problem
6) Trim startup information and attach verbosedebug.txt to the issue.


By: Daniel D (danield) 2007-02-27 05:49:08.000-0600

Log attached, trimmed to start from the point where I entered the sip debug command; I can provide the full log if needed.

used the same revision as previously reported

By: Joshua C. Colp (jcolp) 2007-03-05 13:56:03.000-0600

Can you please try latest 1.4? I believe I have tracked down the cause.

By: Daniel D (danield) 2007-03-06 05:27:29.000-0600

Tried again with r57914, problem still exists, providing another gdb log (bt and thread apply all bt full) and another verbose log as explained by serge-v

By: Serge Vecher (serge-v) 2007-03-06 09:46:45.000-0600

looks like a basic transfer, the crash is strange. Can you please blow out the asterisk sources, download a 1.4.1 tarball and do a fresh install?

By: Daniel D (danield) 2007-03-07 06:59:11.000-0600

Couldn't immediately reproduce the problem with the 1.4.1 tarball (unoptimized build), for the first time since lodging the bug. I have put an optimized build in place (as it's a production system) and will monitor for further segfaults.

By: Serge Vecher (serge-v) 2007-03-07 08:39:05.000-0600

stranger things will happen, especially if you put on patches and then run svn update on top of such repo. In either case, please keep us posted here.

By: alex (alex) 2007-03-26 04:25:31

I really think that ast_channel_unlock(current.chan2); in handle_request_refer is not needed.

By: Serge Vecher (serge-v) 2007-03-26 10:15:32

why do you think so?

By: alex (alex) 2007-03-26 14:02:11

current.chan2 = ast_bridged_channel(current.chan1); but chan2 is never locked anywhere in this function. Or am I mistaken? ast_bridged_channel does not lock the channel it finds, so why do we unlock it?

But the problem is not here. Look at local_attended_transfer:

/* Perform the transfer */
res = attempt_transfer(current, &target);
ast_mutex_unlock(&targetcall_pvt->lock);
if (res) {
    /* Failed transfer */
    /* Could find better message, but they will get the point */
    transmit_notify_with_sipfrag(transferer, seqno, "486 Busy", TRUE);
    append_history(transferer, "Xfer", "Refer failed");
    if (targetcall_pvt->owner)
        ast_channel_unlock(targetcall_pvt->owner);
    /* Right now, we have to hangup, sorry. Bridge is destroyed */
    ast_hangup(transferer->owner);
} else {

ast_hangup(transferer->owner); will destroy everything (the owner and the SIP private structure), so could we get a SIGSEGV after returning? ast_hangup here has access to all the semaphores, plus the RECURSIVE semaphore logic.
We should try changing ast_hangup(transferer->owner); to ast_softhangup_nolock(transferer->owner, AST_SOFTHANGUP_DEV); .
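
The difference between destroying the owner immediately and only requesting a hangup can be sketched as follows. This is a minimal illustration under stated assumptions, not chan_sip code: `demo_channel`, `demo_pvt`, `demo_hard_hangup`, `demo_soft_hangup` and `demo_reap` are hypothetical names, and the sketch does not model what ast_hangup or ast_softhangup_nolock actually do internally.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; NOT Asterisk's structures. */
struct demo_channel {
    int hangup_requested;   /* set by a soft hangup, acted on later */
};

struct demo_pvt {
    struct demo_channel *owner;
};

/* Hard hangup: frees the owner immediately and clears the back-pointer.
 * Any caller still holding the old owner pointer must not touch it. */
void demo_hard_hangup(struct demo_pvt *pvt)
{
    assert(pvt != NULL);
    free(pvt->owner);
    pvt->owner = NULL;
}

/* Soft hangup: only flags the channel; nothing is freed here. */
void demo_soft_hangup(struct demo_pvt *pvt)
{
    assert(pvt != NULL);
    if (pvt->owner != NULL)
        pvt->owner->hangup_requested = 1;
}

/* Runs later, once no caller is still using the channel.
 * Returns 1 if a pending hangup was carried out, 0 otherwise. */
int demo_reap(struct demo_pvt *pvt)
{
    assert(pvt != NULL);
    if (pvt->owner != NULL && pvt->owner->hangup_requested) {
        demo_hard_hangup(pvt);
        return 1;
    }
    return 0;
}
```

In this sketch the owner stays valid after `demo_soft_hangup` and is released only when `demo_reap` runs, which is the property the suggestion above appears to be after.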

By: alex (alex) 2007-03-26 14:50:09

^^^^^ True only if the AST_FLAG_ZOMBIE flag is set for this channel. We do not call sip_hangup but destroy the owner. I am not sure about this; it needs checking :)

By: alex (alex) 2007-03-28 07:09:27

static int local_attended_transfer(struct sip_pvt *transferer, struct sip_dual *current, struct sip_request *req, int seqno)

if (error) { /* Cancel transfer */
    transmit_notify_with_sipfrag(transferer, seqno, "503 Service Unavailable", TRUE);
    append_history(transferer, "Xfer", "Refer failed");
    ast_clear_flag(&transferer->flags[0], SIP_GOTREFER);
    transferer->refer->status = REFER_FAILED;
    ast_mutex_unlock(&targetcall_pvt->lock);
    ast_channel_unlock(current->chan1); <---- This is not needed; after returning to sipsock_read it may cause a segmentation fault
    ast_channel_unlock(target.chan1);
    return -1;
}
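
If, as the comment above suggests, the caller also releases the channel lock after the function returns, the extra unlock in the error path leaves the lock unbalanced. A minimal single-threaded sketch of that bookkeeping follows. All names (`demo_lock`, `demo_transfer_fail`, `demo_caller`) are hypothetical and chosen for this example only; the counter merely models lock depth and is not a real mutex.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock-depth tracker for illustration; NOT a real mutex. */
struct demo_lock {
    int depth;         /* current lock depth */
    int bad_unlocks;   /* unlock calls made while depth was already 0 */
};

void demo_lock_acquire(struct demo_lock *l)
{
    assert(l != NULL);
    l->depth++;
}

void demo_lock_release(struct demo_lock *l)
{
    assert(l != NULL);
    if (l->depth == 0) {
        l->bad_unlocks++;   /* unbalanced unlock: record it, do not go negative */
        return;
    }
    l->depth--;
}

/* Callee error path. A nonzero unlock_in_callee mimics the flagged line. */
int demo_transfer_fail(struct demo_lock *chan_lock, int unlock_in_callee)
{
    if (unlock_in_callee)
        demo_lock_release(chan_lock);
    return -1;
}

/* Caller that owns the lock, in the role the comment gives to sipsock_read.
 * Returns the number of unbalanced unlocks observed. */
int demo_caller(struct demo_lock *chan_lock, int unlock_in_callee)
{
    demo_lock_acquire(chan_lock);
    (void)demo_transfer_fail(chan_lock, unlock_in_callee);
    demo_lock_release(chan_lock);
    return chan_lock->bad_unlocks;
}
```

When the callee leaves the lock alone, the caller's single unlock balances its lock; when the callee also unlocks, the caller's unlock is the one that gets counted as unbalanced.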

By: jmls (jmls) 2007-05-01 16:35:05

We had an attended transfer segfault today, running on SVN-branch-1.4-r59289.

btfull-20070501.txt attached

By: jmls (jmls) 2007-05-02 00:57:20

Is it possible that ASTERISK-9066 is related to this?

By: Olle Johansson (oej) 2007-05-02 01:20:00

Added a small fix to chan_sip for the latest backtrace from jmls. Please try again and see if there's still any open issues in this bug report.

By: jmls (jmls) 2007-05-03 16:22:09

Had no problems on the test system. I will try to find a window to patch the production system and see if the error happens again.

Trouble is, it only occurs when I say "look how stable my 1.4 production server is. XX days of uptime."

Oh sh*t. I've done it again :)

By: callguy (callguy) 2007-05-11 09:18:48

we're seeing the same issue in 1.4.4, and I've attached a bt/btfull (transfer.btfull.05112007). In our case it's a regular attended transfer, no queue involved.

I don't see any diff files in the bug report - are the patches mentioned committed to svn trunk for 1.4 or are they somewhere else?

By: David J Craigon (superdjc) 2007-06-11 09:00:46

Happens to me on rev 68595 of the 1.4 branch.

By: Joshua C. Colp (jcolp) 2007-06-19 10:46:37

This issue has already been fixed in 1.4+ in subversion while I was looking at some personal transfer issues. If it is still an issue with the latest please reopen. Thanks!