[Home]

Summary:ASTERISK-07189: Cannot put a call on hold and resume
Reporter:jmls (jmls)Labels:
Date Opened:2006-06-16 12:23:30Date Closed:2006-07-02 14:58:21
Priority:MajorRegression?No
Status:Closed/CompleteComponents:Core/General
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) console20060620-1600.txt
( 1) console20060620-1823-r32229.txt
( 2) console20060620-1823-r33869.txt
( 3) console20060621-1612.txt
( 4) console20060622-2305.txt
Description:Using svn trunk 34498, Cisco 7960 phone SIP

I cannot make a call from the Cisco, put the call on hold and resume. The call sticks with the music on hold and generates a notice on the console

I have tried calling a mobile, an extension that plays music and an extension that plays the demo-congrats, each with the same result.

I am using the random native music on hold.

****** ADDITIONAL INFORMATION ******

The console says

Executing [705@from-sip:1] Answer("SIP/706-7309", "") in new stack
   -- Executing [705@from-sip:2] MusicOnHold("SIP/706-7309", "") in new stack
   -- Started music on hold, class 'default', on SIP/706-7309

put the call on hold here.

try to resume the call here.

RTCP SR transmission error, rtcp halted No such file or directory
Jun 16 18:11:52 NOTICE[1746]: sched.c:287 ast_sched_del: Attempted to delete nonexistent schedule entry 16!
Comments:By: Serge Vecher (serge-v) 2006-06-16 12:52:19

I cannot reproduce this issue. Please enable debug log in logger.conf, reload asterisk, set debug 4, sip debug and capture what's happening. Thanks.

By: jmls (jmls) 2006-06-16 13:02:17

I did what you asked for the debugging, and did the following:

1) started asterisk.
2) placed a call from 706 (my sip extension) to 705 (my moh extension)
3) music started to play
4) placed the call on hold (button on the cisco)
5) waited a little bit (pretending I was doing something)
6) tried to resume the call (didn't get my music back)
7) had to hang up the call

I also tried this with a playback of a file, and calling a mobile (but not with debug)

The debug file is 80k. Do you want me to upload this or parse the relevant bits (what are they?) ?

By: Serge Vecher (serge-v) 2006-06-16 13:08:19

jmls: upload the whole file as attachment.

By: jmls (jmls) 2006-06-16 13:27:14

debug uploaded as requested

By: jmls (jmls) 2006-06-18 01:47:30

I've tried a couple of asterisk versions
31163 works fine. Put call on hold, no errors, resumes
33376 works a little. Get console notices

(Jun 18 07:45:22 NOTICE[8303]: sched.c:287 ast_sched_del: Attempted to delete nonexistent schedule entry 19!)

but it resumes after holding

By: Serge Vecher (serge-v) 2006-06-19 09:49:04

sorry, I missed one debug command the first time around. The complete set of commands for debugging is:

set debug 4
set verbose 4
sip debug

Please redo the log and post here. Thanks.

By: jmls (jmls) 2006-06-19 09:57:03

uploaded new file as requested. Don't understand why it's half the size.
did you see the notes about the versions ?

By: Serge Vecher (serge-v) 2006-06-19 09:57:17

I don't see any of [verbose] log in the latest file, did you forget set verbose 4?

By: jmls (jmls) 2006-06-19 10:04:03

I was sure that I had, but just to make sure, did it again:

*CLI> set verbose 4
Verbosity is at least 4
*CLI> set debug 4
Core debug was 0 and is now 4
*CLI> sip debug
SIP Debugging enabled
*CLI>

i performed the same tasks, but the debug file is roughly the same length (36375) and there is no verbose in that either.
I have updated to SVN-trunk-r34762M today

By: jmls (jmls) 2006-06-19 10:17:19

I just tried with verbose set to 50 but it didn't change anything. I stop asterisk, rm /var/log/asterisk/debug, asterisk -vvvvgc, set the options as mentioned, do my tasks, stop asterisk. send the debug file.

By: Serge Vecher (serge-v) 2006-06-19 11:38:46

I see what's going on -- I thought you were capturing from the console. In your logger.conf, the debug=> line must include all items as in:

debug => notice,warning,error,debug

By: jmls (jmls) 2006-06-19 11:49:34

Sorry, my misunderstanding. Added latest debug

By: jmls (jmls) 2006-06-19 13:26:35

some more information: I've been checking out various builds to try and narrow this down for you.

r32740 works perfectly
r32919 gets the message "sched.c:287 ast_sched_del: Attempted to delete nonexistent schedule entry 16!" HOWEVER, you can resume the call after putting it on hold

this is the case up to r34162 where you get the messages, AND cannot resume the call.

hope this helps.

if there's a particular release you want me to test I'm more than willing to do so

By: Serge Vecher (serge-v) 2006-06-19 13:48:09

hmm: very strange -- I've looked at all the changes in that range and don't see an obvious culprit. I think we'll have to narrow it down one by one starting from 32740. Take a look at  http://lists.digium.com/pipermail/svn-commits/2006-June/date.html and see if you can test every trunk commit in that range to narrow down the problem to specific revision. Thanks!

By: jmls (jmls) 2006-06-19 16:59:24

ok. Part 1. This is the appearance of the "sched.c:287 ast_sched_del: Attempted to delete nonexistent schedule entry 16!" message on the console

r32229 is ok
r32230 starts displaying this message

I am working on the versions where the hold fails completely

By: jmls (jmls) 2006-06-19 17:36:34

part 2. Place a call, put on hold, and resume is working in r33869 but broken in r33890 (using a Cisco 7960 SIP 7.4)

Sorry for the misleading info in 48100 I think I misread my notes :(

By: Serge Vecher (serge-v) 2006-06-19 19:03:08

ok, well r33890 was t.38 passthrouh commit -- that couldn't have broken anything ;)

Just to be clear -- in your tests a Cisco IP phone calls another Cisco IP phone directly, right? (in your latest debug log you've got queues and all kinds of stuff involved). Also, can you please capture the console output -- for some reason this debug log does not tell me anything useful. Thanks

By: jmls (jmls) 2006-06-20 00:33:44

This happens no matter what I call. I have tested this calling (a) a moh extension, (b) another cisco phone (c) a mobile phone using our ISDN pri line

I was not using queues in the debugs I gave, but just a moh extension

exten => 705,1,Answer()
exten => 705,2,MusicOnHold()

I will do a set verbose 4, set debug 4 and sip debug and capture the console.

By: jmls (jmls) 2006-06-20 00:44:46

console captured and uploaded as requested.

By: Serge Vecher (serge-v) 2006-06-20 09:23:09

1. Hmm: I don't unverstand why the phone is INVITE'ing 705@90.0.0.62? Does your asterisk server have 2 IP addresses, one public one private?

2. Also, you console log didn't have any [debug] information. Please make sure the entry for console => in logger.conf is like this:

console => notice,warning,error,debug

By: jmls (jmls) 2006-06-20 10:05:04

The phone server itself has two networks, one for the phones, the other for data etc. The data network is 90.0.0.x and the phone network is 192.168.6.x

I work mainly from home, connected through a permanent vpn bridge. My home network is 192.168.0.x. My cisco phone is on this network.  This has worked fine since June 94 until release r33890

console info uploaded as requested.

By: Serge Vecher (serge-v) 2006-06-20 11:06:52

hmm, weird. * OK the invite, the phone ACK's, but then comes back with another INVITE. Can you please revert to the working revision and capture the call again, so we can compare what's going and what goes wrong. Thanks for your prompt responses!

By: jmls (jmls) 2006-06-20 12:16:03

Just as a matter of interest, I tried the same sequence of actions on a cisco directly attached to the same network as the phone server and had exactly the same problem. Put a call on hold, cannot resume

By: jmls (jmls) 2006-06-20 12:18:34

do you want me to revert to r32229 (the last time it worked completely) or r33869 where the call resumes but there are those notice messages on the console (NOTICE[1746]: sched.c:287 ast_sched_del: Attempted to delete nonexistent schedule entry 16!) ?

By: jmls (jmls) 2006-06-20 12:26:46

not to worry - uploaded the console files from both releases.

By: Serge Vecher (serge-v) 2006-06-20 15:17:45

hmm, very strange stuff, the debug from r32229 also shows the phone hammering * with invites, but we don't see those avoided deadlock errors.

Just to eliminate all potential unknown variables, can you upload the debug from the Cisco connected directly on the network. Also, make sure that that you put the called channel on hold, so we can see what is happening when you press the hold button on Cisco. Thanks.

By: jmls (jmls) 2006-06-20 16:46:32

I won't be able to get the console from the phone on the network until tomorrow when someone is back in the office.

"Also, make sure that that you put the called channel on hold, so we can see what is happening when you press the hold button on Cisco"

Sorry, not sure what you mean : I am calling from my cisco to either a moh extension or a mobile - neither of these can put the call on hold.

Would it be useful to call the phone on the network and get them to put me on hold ?

By: jmls (jmls) 2006-06-21 10:13:43

latest console output uploaded. This was the phone directly attached to the network, but has same problems as mine.

By: Serge Vecher (serge-v) 2006-06-21 10:47:38

jmls: do you have any patches applied to this system?

By: jmls (jmls) 2006-06-21 10:55:54

The only patch I have is in the trunk revision where I put some diagnostics into the odbc to find out why I wasn't connecting (turned out I had the wrong config option).

However, i have checked out revisions

r32183
r32229
r32230
r32255
r32596
r32597
r32759
r32779
r32798
r32817
r32819
r32845
r32846
r33595
r33614
r33676
r33690
r33694
r33706
r33869
r33890
r34065
r34190

into their own directory and made clean and installed each one at a time and th bug exists in each of these revisions except where noted in the notes above.

I have included the diff below. As you can see, only cdr_odbc has been "patched"

[root@foxtrot asterisk]# svn diff
Index: cdr/cdr_odbc.c
===================================================================
--- cdr/cdr_odbc.c      (revision 35167)
+++ cdr/cdr_odbc.c      (working copy)
@@ -254,6 +254,7 @@
       struct ast_variable *var;
       char *tmp;

+       ast_log(LOG_NOTICE, "cdr_odbc: starting odbc load\n");
       ast_mutex_lock(&odbc_lock);

       cfg = ast_config_load(config);
@@ -262,17 +263,20 @@
               goto out;
       }

+       ast_log(LOG_NOTICE, "cdr_odbc: #1\n");
       var = ast_variable_browse(cfg, "global");
       if (!var) {
               /* nothing configured */
               goto out;
       }

+       ast_log(LOG_NOTICE, "cdr_odbc: #2\n");
       tmp = ast_variable_retrieve(cfg,"global","dsn");
       if (tmp == NULL) {
               ast_log(LOG_WARNING,"cdr_odbc: dsn not specified.  Assuming asteriskdb\n");
               tmp = "asteriskdb";
       }
+       ast_log(LOG_NOTICE, "cdr_odbc: #3\n");
       dsn = strdup(tmp);
       if (dsn == NULL) {
               ast_log(LOG_ERROR,"cdr_odbc: Out of memory error.\n");
@@ -280,6 +284,7 @@
               goto out;
       }

+       ast_log(LOG_NOTICE, "cdr_odbc: #4\n");
       tmp = ast_variable_retrieve(cfg,"global","dispositionstring");
       if (tmp) {
               dispositionstring = ast_true(tmp);
@@ -287,6 +292,7 @@
               dispositionstring = 0;
       }

+       ast_log(LOG_NOTICE, "cdr_odbc: ASTERISK-1\n");
       tmp = ast_variable_retrieve(cfg,"global","username");
       if (tmp) {
               username = strdup(tmp);
@@ -296,7 +302,8 @@
                       goto out;
               }
       }
-
+
+       ast_log(LOG_NOTICE, "cdr_odbc: ASTERISK-2\n");
       tmp = ast_variable_retrieve(cfg,"global","password");
       if (tmp) {
               password = strdup(tmp);
@@ -307,6 +314,7 @@
               }
       }

+       ast_log(LOG_NOTICE, "cdr_odbc: ASTERISK-3\n");
       tmp = ast_variable_retrieve(cfg,"global","loguniqueid");
       if (tmp) {
               loguniqueid = ast_true(tmp);
@@ -320,6 +328,7 @@
               loguniqueid = 0;
       }

+       ast_log(LOG_NOTICE, "cdr_odbc: ASTERISK-4\n");
       tmp = ast_variable_retrieve(cfg,"global","usegmtime");
       if (tmp) {
               usegmtime = ast_true(tmp);
@@ -333,6 +342,7 @@
               usegmtime = 0;
       }

+       ast_log(LOG_NOTICE, "cdr_odbc: ASTERISK-5\n");
       tmp = ast_variable_retrieve(cfg,"global","table");
       if (tmp == NULL) {
               ast_log(LOG_WARNING,"cdr_odbc: table not specified.  Assuming cdr\n");
@@ -345,6 +355,7 @@
               goto out;
       }

+       ast_log(LOG_NOTICE, "cdr_odbc: ASTERISK-6\n");
       ast_config_destroy(cfg);
       if (option_verbose > 2) {
               ast_verbose( VERBOSE_PREFIX_3 "cdr_odbc: dsn is %s\n",dsn);
@@ -359,6 +370,8 @@
       }

       res = odbc_init();
+
+       ast_log(LOG_NOTICE, "cdr_odbc: ASTERISK-7\n");
       if (res < 0) {
               ast_log(LOG_ERROR, "cdr_odbc: Unable to connect to datasource: %s\n", dsn);
               if (option_verbose > 2) {
@@ -366,6 +379,8 @@
               }
       }
       res = ast_cdr_register(name, desc, odbc_log);
+
+       ast_log(LOG_NOTICE, "cdr_odbc: ASTERISK-8\n");
       if (res) {
               ast_log(LOG_ERROR, "cdr_odbc: Unable to register ODBC CDR handling\n");

By: jmls (jmls) 2006-06-21 11:05:35

Sorry, I perhaps ought to clarify that all the releases that fail (except the trunk (with the patched odbc)) are NOT patched in any way.

By: jmls (jmls) 2006-06-21 11:06:08

this bug also affects call transfers as well

By: jmls (jmls) 2006-06-21 13:12:09

is there anything else I can do or provide in order to move this issue forward ?

By: Serge Vecher (serge-v) 2006-06-21 13:21:50

jmls: thanks, I think there is plenty of info here now. This is being looked at.

By: Joshua C. Colp (jcolp) 2006-06-21 13:22:39

I can't replicate this using my Polycom IP600 using the latest trunk, could it be something specific to the Cisco 7960? can someone test?

By: jmls (jmls) 2006-06-21 13:37:42

just installed an x-lite softphone on my pc, and have exactly the same problem

By: Serge Vecher (serge-v) 2006-06-21 13:43:03

file: I do testing with Cisco IP phones and I couldn't replicate it either.

By: Serge Vecher (serge-v) 2006-06-21 14:07:28

jmls: can you please upload a trace of one phone calling another one directly and setting the call on hold using the button on the phone instead of MusicOnHold() stuff?

By: jmls (jmls) 2006-06-21 14:34:12

I will have to wait until tomorrow until there is someone in the office to take my call :)

Is there any other application I could use to pick up the call instead of moh ?

I did try with playback but got the same results. I also tried dialling a zap number with the same results

By: jmls (jmls) 2006-06-21 14:36:15

I just retried with a zap number, got the same results. When I press "hold" on the cisco, the MOH starts on my mobile. I press "resume" on the cisco, it seems to go back to the call, but the MOH keeps on playing on the mobile.

Would that be a useful trace for you ?

By: Olle Johansson (oej) 2006-06-21 14:37:31

Ok, I can repeat this. Working on it. Tried to fix the scheduler errors, now looking at hold/unhold.

By: Olle Johansson (oej) 2006-06-21 14:58:49

Fix committed to svn trunk 35336. Thanks to bug marshal vecher for coming back to this and reminding me. Thanks to the reporter for your very detailed research into these two matters!

By: jmls (jmls) 2006-06-22 13:13:45

There are still some issues with this:

I am still getting the message

Jun 22 19:03:45 NOTICE[15389]: sched.c:287 ast_sched_del: Attempted to delete nonexistent schedule entry 458! (This is the message that appears in rr32230

when I put a call between 2 SIP phones on hold, BUT it does resume now, so some of the patch made things work again since R33869.

However, my users just informed me that the transfer is still not working :

a) Call comes in
b) User A answers
c) user A presses transfer, call goes on hold - caller hears music on hold
d) User A dials User B
e) User B talks to User A
f) User A completes transfer.
g) User A gets message that transfer failed
h) User B hears nothing
i) User A and B call is canceled
j) Caller still has music on hold

By: jmls (jmls) 2006-06-22 13:15:02

I can reproduce the "Attempted to delete nonexistent schedule entry" message at will, but cannot give any detailed debug info on the call transfer failure until I am back in the office tomorrow.

By: Serge Vecher (serge-v) 2006-06-22 15:40:21

jmls: are you using "Trnsfer" or "Blndxfer" buttons on the Cisco phone or the *1 method?

By: jmls (jmls) 2006-06-22 15:59:18

we use the Transfer button on the phone itself

By: Serge Vecher (serge-v) 2006-06-22 16:07:00

jmls: please open a new issue with a SIP debug for the transfer problem. Also, please produce a new debug for the "on hold" problem with the latest revision. Thanks.

By: jmls (jmls) 2006-06-22 17:13:04

debug output uploaded as requested. This was run against r35555.

I have two phones, a cisco 7960 7.4 SIP and an eyebeam (xlite) softphone. The xlite (7066) calls the cisco (706). cisco pushes the hold button, MOH comes through on the xlite, but the message appears in the console. Push resume on the cisco, all's ok (conversation resumes)

Push hold on the xlite, get MOH on the cisco. Push resume, cisco seemed to drop the call (xlite still thought that it was on hold) had to press the end call on the xlite.

I'll open a new call for the transfer problem tomorrow when I can attach a decent debug

By: Olle Johansson (oej) 2006-06-23 03:07:54

vechers: Transfers often involve hold, so it may be the same issue.

By: jmls (jmls) 2006-06-23 12:37:17

raised issue ASTERISK-7227 as requested for SIP transfer problem

By: Olle Johansson (oej) 2006-06-26 14:18:36

New issue moved to separate issue report.

By: jmls (jmls) 2006-06-30 06:21:22

This is still not working in trunk. However, have a look at the sip-patch.txt upload on ASTERISK-7227. This patch fixed both the on hold and transfer issues I've been having.

By: Olle Johansson (oej) 2006-07-02 14:58:04

Fix committed in svn trunk rev 36679