[Home]

Summary:ASTERISK-07238: [branch] Devicestate not working correctly
Reporter:Curt Moore (jcmoore)Labels:
Date Opened:2006-06-26 18:21:12Date Closed:2007-06-30 09:19:57
Priority:MinorRegression?No
Status:Closed/CompleteComponents:Core/General
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) chan_sip.patch.txt
( 1) console_output_with_patch10_with_limitpeersonly.txt
( 2) console_output_with_patch10_without_limitpeersonly.txt
( 3) console_output_with_patch4.txt
( 4) console_output_with_patch5.txt
( 5) console_output_with_patch6.txt
( 6) console_output_with_patch7.txt
( 7) console_output_with_patch8.txt
( 8) console_with_devicestate.patch2.txt
( 9) devicestate.patch2
(10) devstate.log
(11) devstate-patch10
(12) devstate-testing2-patch10log
(13) diff_debug_outputs_with_appqueue.txt
(14) diff_debug_outputs.txt
(15) diff.chan_sip_devstate1.2debug
(16) diff.chan_sip_devstate1.4debug
(17) incoming-devicestate-call-limit0-patch9
(18) limitpeersonly=yes-incoming-devicestate-call-limit0-patch9
(19) limitpeersonly=yes-originate-devicestate-call-limit0-patch9
(20) originate-devicestate-call-limit0-patch9
(21) originate-devstate-testing2-patch10log
(22) patch_sip_10
(23) patch_sip_11
(24) patch_sip_11-annotated
(25) patch_sip_3
(26) patch_sip_4
(27) patch_sip_5
(28) patch_sip_6
(29) patch_sip_8
(30) patch_sip_9
(31) patch_sip_9b
(32) patch7
(33) putty.log
(34) sample.log
(35) sample2.log
(36) sample-onhold.log
(37) snom360-5.3
Description:It appears that the changes made to sip_devicestate in trunk since the 1.2 branch have somehow broken SIP devicestate notifications.  I came upon this when using SIP queue members with app_queue.  Once a queue member answered a queue call, their devicestate would never be set to "In use" from "Ringing" or "Not in use".  This was bad since app_queue would not recognize that the SIP queue member had already received and was on a queue call.  As a result, app_queue would continue to send other calls to the same SIP queue member while that member was still on the original queue call.

****** ADDITIONAL INFORMATION ******

Latest code located in team/group/bug7433 on the Digium SVN server

I'm not sure of the intent of the current logic in the trunk version of sip_devicestate but if I revert to the logic in the 1.2 branch, things seem to work as they should.  This is to say that once a queue member answers a call, app_queue does not send any further calls until that queue member has finished the previous call and that queue member's wrapuptime has expired.

I'm not currently using any other Asterisk features which set/check the devicestate so I'm not sure if using the older logic is good or bad for other Asterisk apps or functionality.

This could instead be a bug in app_queue but from my analasis of the 1.2 code vs the trunk code, it appears to be in chan_sip instead of app_queue.

The attached patch reverts sip_devicestate to the 1.2 logic.  This older logic is likely not what is indended moving forward in trunk but it seems to fix this particular issue.
Comments:By: Olle Johansson (oej) 2006-06-27 11:24:14

There is certainly a conflict between the way app_queue parses states and the way SIP devices parses the various states and makes lamp blink. The changes was done with blinking lamps in focus.

We need to discuss how to solve this, since your patch is a big step backwards, not forward.

By: Olle Johansson (oej) 2006-06-27 11:25:03

Do you have hints for these devices? Can you show me one device configuration in sip.conf?

By: Curt Moore (jcmoore) 2006-06-27 11:36:08

oej: As I mentioned, I figured that my logic was bad moving forward but I wasn't quite sure of the overall picture with devicestate, thanks for setting me straight. :-)

I use realtime for all of my sip configuration and I don't currently have any hints set in the dialplan.  If I should be using hints to make this work correctly, I'm open to doing that.

If there needs to be discussion on how to make app_queue play better with devicestate, I can just run my changes here internally until we can redesign the way app_queue handles device states.

I'm open to whatever, just let me know if you need anything from me.

By: Serge Vecher (serge-v) 2006-07-07 15:57:20

what do we do with this?

By: BJ Weschke (bweschke) 2006-07-08 07:13:14

I think I understand the issue here, and oej is right. This is really more of an app_queue issue and it needs to be corrected in /trunk such that it now understands that the new "RINGING" state is also an unavailable state and it shouldn't be considered as an available member in the ring_entry routine.

By: Olle Johansson (oej) 2006-08-08 09:54:33

We need to watch out both for ringing and the new hold state in app_queue.

By: Serge Vecher (serge-v) 2006-09-01 15:02:48

ok, so who is writing the patch here?

By: BJ Weschke (bweschke) 2006-09-02 13:37:59

r41810 has been committed to /trunk to try and address this in app_queue there. Please test and let me know if you have any additional and/or new issues from this.

Thanks!

By: BJ Weschke (bweschke) 2006-09-25 10:25:06

re-opening. there's an issue in chan_sip where inUse on the peer is the value getting checked on device state, but inUse on the user is the one getting incremented on type=friend. This causes all type=friend's to not have an inUse value which confirms this issue.

By: Amilcar S Silvestre (amilcar) 2006-10-02 14:43:47

I'm trying latest 1.4 branch here and SIP devicestate for app_queue is broken.

When it's ringing (a call is been offered to the channel by app_queue), the state don't changes (Not in use). When the call is answered, the same (Not in use). But if a put this call on hold, the status of the channel changes to (On Hold), and the state never changes back (only restarting asterisk), neighter after the call hangup. It keeps the state (On Hold), preventing app_queue to send other calls to this channel.

I'm using dynamic members, SIP channels.

Example:

Members: I>
 SIP/snom (dynamic) (Not in use) has taken no calls yet

Members: I>
 SIP/snom (dynamic) (On Hold) has taken 1 calls (last was 406 secs ago)

By: jmls (jmls) 2006-10-03 12:31:11

this is not a problem for local channels (with the /n).

By: Amilcar S Silvestre (amilcar) 2006-10-03 13:34:06

jmls, using local channels with /n there's no "ringing" or "on hold" states at all.

The devicestate works, but only with "Not in use" and "In use" (Ringing is considered In use, On hold is considered In use).

By: Amilcar S Silvestre (amilcar) 2006-10-03 13:39:16

And local channels with /n never makes the state "unavailable" for the device. If a sip channel is offline, the state of the local channel remains "Not in use".

By: Olle Johansson (oej) 2006-10-29 14:47:54.000-0600

Just documenting discussion earlier on:

First test by *only* using a peer. If that works, I can create an option in sip.conf to only apply states to the peer part of a friend.

By: Amilcar S Silvestre (amilcar) 2006-11-04 09:59:04.000-0600

The devicestate with IAX2 channels works 100% in queues. This problem only affects SIP clients.

app_queue with ringinuse=no is unusable with SIP channels as members. (as the members get stucked in 'On Hold' state and never changes until asterisk restart).

By: Amilcar S Silvestre (amilcar) 2006-11-04 10:05:22.000-0600

The same behaviour with 'peer' or 'friend'.

By: Steve Murphy (murf) 2006-11-10 09:51:34.000-0600

OK, Anthony suggested I give this bug a spin. So, I've carefully mapped out the
state differences between sip_devicestate in 1.2 and 1.4, and head swimming,
I resolved on a strategem to narrow down the possibilities.

I have attached two patches to this bug. Both are aimed at the current 1.4 chan_sip.c. One simply adds some debugging to the current 1.4 implementation,
to reveal what paths are being taken thru the code when devicestate is called.

The other takes the latest version of sip_devicestate from the 1.2 branch, replaces the 1.4 version with it in the 1.4 chan_sip.c. I then added the same sort of debug to it.

So, testers, if you want to help kill this bug, bring me up to speed. What do your logs say when there's success, and what do your logs say when there is failure? Run the 1.4 stuff and chart out what the logs say when you have problems. Then, recompile with the 1.2 patch applied, and tell me what the logs say when there is success. I have the debugs as LOG_NOTICE, so you shouldn't have to play with the logger config. But you might plan on reverting back to the original chan_sip, as these notifications might take up some space.

Let's solve this problem! If 1.2 works because it's returning a lot of UNKOWN states, that would be a clue... we'll see.



By: Amilcar S Silvestre (amilcar) 2006-11-10 13:25:17.000-0600

Ok, debug uploaded.

With diff.chan_sip_devstate1.4debug we're always in NOT_IN_USE state, except for ON_HOLD. And when it reaches ON_HOLD, it stays that way forever.

With diff.chan_sip_devstate1.2debug we're always in UNKNOWN state.

In this config, device "xlite" is type=friend and device "snom" is type=peer. And they don't have call-limit.

By: Anthony LaMantia (alamantia) 2006-11-10 13:52:39.000-0600

Hi,
if it's not a huge trouble is there anyway you can test these out while using queues?

By: Amilcar S Silvestre (amilcar) 2006-11-10 14:35:24.000-0600

Uploaded a log with the use of app_queue using both patches.

With diff.chan_sip_devstate1.4debug, the state never changes in show queues.

With diff.chan_sip_devstate1.2debug, the state is correct in show queues, despite the fact that in the DEBUG messages, it shows "UNKNOWN".

By: Steve Murphy (murf) 2006-11-10 22:49:07.000-0600

There's actually a few different bugs operating here, with some overlap, methinks.

First of all, ast_parse_device() gets called when the 1.2 stuff returns AST_DEVICE_UNKNOWN. It's not real choosy. It checks the channel->_state variable, and returns AST_DEVICE_RINGING if _state is AST_STATE_RINGING. All other cases return AST_DEVICE_INUSE.  I think this isn't right. It looks like AST_STATE_BUSY should return AST_DEVICE_BUSY, and maybe AST_STATE_DOWN should return AST_DEVICE_NOT_INUSE. The other channel states, like OFFHOOK, DIALING, RING, UP, DIALING_OFFHOOK, PRERING, I guess would be OK to map to INUSE.

I kinda think that returning AST_DEVICE_UNKNOWN from sip_devicestate() when peer's call_limit was 0, was not the best thing to do in 1.2, but because it did, you got some extra life out of the device state stuff, which resorted to the channel for it's RING/INUSE information.

And sip_devicestate is broken, I think. It uses several peer vars to determine state, and I think it's messing up a little. I've rewritten the logic slightly and I'll leave one of the call_limit tests so INUSE info is correct, whether the call_limit is set or not... they seem orthogonal.
And lastly, it looks like the code to take the sip devicestate out of ONHOLD is not getting run-- OR-- not as many times as the code to put it IN hold... We'll have to see why! It's all in that SDP stuff.

Hang on a minute, and upload another patch for you to test...

By: Steve Murphy (murf) 2006-11-10 23:00:59.000-0600

OK, revert your channels/chan_sip.c file, and then apply this patch. It will also add some code to devicestate.c, which hopefully won't be called anyway, but, hey, what the heck.

Put it thru the normal paces, and report back on what the logs say. I have some debugs in the sip_peer_hold routine, to let us know if it's getting called from the process_sdp routine, and what the onHold field is getting set to. The easy thing to fix will be if sip_peer_hold is getting called more times to set onHold, than to unset it. If this isn't the case, then we'll have to figure out what's going on in process_sdp.

By: Amilcar S Silvestre (amilcar) 2006-11-11 07:28:43.000-0600

Ok, uploaded console output with devicestate.patch2.

The same behavior yet. Always NOT_IN_USE, andget stucked in ONHOLD. I also tried with call-limit=(any number here) for the peers, but didn't change anything.

About the ONHOLD issue:
Entering...
[Nov 11 10:08:36] NOTICE[20944]: chan_sip.c:8028 sip_peer_hold: onHold now: 1;  hold arg was: 1
Leaving...
[Nov 11 10:08:42] NOTICE[20944]: chan_sip.c:8028 sip_peer_hold: onHold now: 1;  hold arg was: 0

By: Steve Murphy (murf) 2006-11-11 09:35:34.000-0600

Bingo. the  patch_sip_3  should put the onHold prob to rest (I hope). Revert and apply this patch; send me the logs as previous.

By: Ronald Chan (loloski) 2006-11-11 10:56:44.000-0600

Murf:

After applying your patch it immedietly segfault upon asterisk restart or start
rebooting the machine doesn't solve the issue, i will attach bt from non optimize built if needed ?

Ronald

By: Ronald Chan (loloski) 2006-11-11 10:57:27.000-0600

i forgot :( this is from SVN-branch-1.4-r47490M

By: Ronald Chan (loloski) 2006-11-11 11:18:32.000-0600

murf: please see bt from non optimize build after applying your patch

Path: .
URL: http://svn.digium.com/svn/asterisk/branches/1.4
Repository UUID: 65c4cc65-6c06-0410-ace0-fbb531ad65f3
Revision: 47495
Node Kind: directory
Schedule: normal
Last Changed Author: russell
Last Changed Rev: 47494
Last Changed Date: 2006-11-11 23:31:08 +0800 (Sat, 11 Nov 2006)
Properties Last Updated: 2006-11-12 01:10:30 +0800 (Sun, 12 Nov 2006)

By: Steve Murphy (murf) 2006-11-11 11:40:44.000-0600

Doh! Sorry about that. This one shouldn't bork if the peer pointer (p) is null.
Revert and apply the attached patch_sip_4 patch.

By: Ronald Chan (loloski) 2006-11-11 13:20:40.000-0600

Murf: when i set the peer setting with call-limit=1 i think it's working now :) thanks

when incoming call's the devicestate change into Busy after you hangup the call it came back to Not in Use


Best reagards,

Ronald

By: Ronald Chan (loloski) 2006-11-11 13:29:21.000-0600

murf: please check sample.log and sample2.log for details when i set the call-limit=2 it's more clearer now that it's been fixed. many thanks



Ronald

By: Ronald Chan (loloski) 2006-11-11 14:10:52.000-0600

oh wait, it's seems like the devicestate was correct if there's an incoming call houw about outgoing call from member of the queues?

[operator]

member => Sip/201
member => Sip/202

does the state of device will reflect also if he/she made an outgoing call via sip/zap channels? if yes, there's something left to clear on this because the state isn't chaning if i made an outgoing call

i hope i made my clear on this one.


Ronald

By: Steve Murphy (murf) 2006-11-11 14:16:30.000-0600

Hmmm. Can you put the sip call on hold and see if the state comes out of onHold OK? I provided a patch_sip_5 to get rid of the -101 values (I hope!!).

Can you try all this with call_limit == 0?

Many thanks!!!

By: Ronald Chan (loloski) 2006-11-11 14:21:03.000-0600

as per your request here's the log

By: Ronald Chan (loloski) 2006-11-11 14:22:26.000-0600

murf: when i set the call on hold then the state goes on hold and when i came back it's in use, then i hangup does it okay ?

By: Amilcar S Silvestre (amilcar) 2006-11-11 14:28:36.000-0600

Ok, uploaded console with patch4.

You definitely solved the ONHOLD bug.

ONHOLD is now working as it should, but not with SJPhone. I tested with xlite and sjphone. xlite is 100% OK, as you can see in the logs.

SJPhone stays in ONHOLD state. Seems like it don't reach sip_peer_hold when leaving ONHOLD status, only when it's entering ONHOLD, so it keeps rising the "onHold now" value.

Still "Not in use" for all other states.

By: Amilcar S Silvestre (amilcar) 2006-11-11 15:07:08.000-0600

Uploaded console output with patch5.

To be sure, in this log i tested ONHOLD state change with a snom phone too. 100% correct, like xlite. I think ONHOLD is fixed! You've got it! :-)

The issue with SJPhone seems to be another bug (SDP related, maybe).

By: Steve Murphy (murf) 2006-11-11 15:19:24.000-0600

Hmmm. Try using the ASTERISK-1 patch, which hopefully will list out the variables that determine the states. For those extensions not showing their state, are they "reinviting"?

By: Amilcar S Silvestre (amilcar) 2006-11-11 15:41:20.000-0600

murf,

The last file i've uploaded (console_output_with_patch5.txt) is using the ASTERISK-1 patch. All the three sip devices in this log are canreinvite=no.

By: Steve Murphy (murf) 2006-11-11 15:54:40.000-0600

Neither can I explain the prob with the sjphone right now. That should probably get filed as a new bug; And extension originated activity not being reflected in the state is bothersome. I will begin investigating that.

By: Amilcar S Silvestre (amilcar) 2006-11-11 20:02:39.000-0600

I think that the sip activity not being reflected in the state (except for onhold now, that's correctly reflected) is the main bug here. (acording to the description at least) :-)

As i see, 1.2 only works (for queues, at least) not because sip_devicestate works right, but because it always returned UNKNOWN, and ast_parse_device_state "corrected" things like you described. In 1.4, sip_devicestate doesn't return UNKNOWN anymore, so the state remains NOT_IN_USE. (please forgive me if i'm wrong)

By: Steve Murphy (murf) 2006-11-11 20:59:21.000-0600

Well, sort of.

It's your turn to tell me if **I** am wrong.

1. In 1.2, when call_limit is 0, sip_devicestate always returned UNKNOWN, and the parse_state routine would return either INUSE or RING. With call_limit non-zero, you'd get more state info.

2. In 1.4, and my fixes, you get state accurately reported for all states IFF the call was TO the SIP device. onHold is accurately reflected, it looks like, in either direction, as an exception. This was demonstrated in one or two of the logs attached.

3. If a sip device initiates a call thru asterisk, the whole chain goes without state updates; I don't know why yet, but looking at the code for app_dial, I don't see any devicestate updating going on. So, it could just be that the devicestate stuff isn't fully implemented yet. I'll discuss this with those more intimately familiar with it on Monday. Maybe it is fully there, but a bug or two somewhere might be keeping those updates from getting thru.

By: Olle Johansson (oej) 2006-11-12 03:29:43.000-0600

murf - look at my change to trunk too, which I did for call queues. There's a new option which will make sure that all call limit and usage changes only happen to the peer part of a friend. It will make all of this much easier, since the subscription only is on the peer and not on the user.

If that patch makes subscriptions better, we might consider it for 1.4 too.

By: Steve Murphy (murf) 2006-11-13 07:41:08.000-0600

Just for kicks, I added some code to sip_devicestate to consult the sip_user
info, if the peer is null. it's in the patch_sip_6 attached. See if that helps.
My guess is that if they are BOTH available, and either one could have active state information, then this patch will not work well.

OEJ-- many thanks, I will look for the changes you introduced, and see if they make life better in 1.4.

By: Ronald Chan (loloski) 2006-11-13 09:00:29.000-0600

murf: with your patch6 still the same issue with call-limit=0 if the call originate from asterisk to sip/pstn the device state for the member of the queue still not in use, if there's an incoming call, setting the call on hold will reflect the device state to On-Hold as we have tested on the previous patch.


Ronald

By: Ronald Chan (loloski) 2006-11-13 09:01:29.000-0600

please check devstate.txt for info. many thanks !!!!





Ronald

By: Amilcar S Silvestre (amilcar) 2006-11-13 14:50:07.000-0600

murf,

1. You're totally right in this one, despite the fact that i never used call_limit > 0 in 1.2, therefore i don't know if it changes things;

2. Right. But if i started a call FROM a sip device TO another sip device, the TO part doesn't have state updates eighter.

3. That's exactly what's going on. I also have looked in app_dial to find where is the update state code, but haven't find one.

By: BJ Weschke (bweschke) 2006-11-13 15:10:30.000-0600

ya - there isn't an "update state code" clause anywhere because it isn't actually implemented that way.

Basically, when a channel tech registers with core it registers a "devicestate" callback function which is a pointer to a function in the channel code that will produce "on demand" what state the channel is in. The calling application polls for this information via ast_device_state(..) via int ast_device_state(const char *device). app_queue makes use of this information when a new interface is first registering with app_queue.

On the flip side, there exists a rather kludgy way to have a function callback of your choice registered to receive calls to the callback when ANY channel on the entire asterisk system changes state. This is also in main/devicestate.c and is ast_devstate_add(ast_devstate_cb_type callback, void *data). app_queue uses this guy to register to receive the whole enchilada of updates. One of the changes we made during the 1.4 dev cycle was to add logic to only traverse through the logic of processing these changes when the interface itself we were receiving a change for matched the interface of a queue member. Otherwise, we returned immediately from the callback and never locked/unlocked mutexes,etc.

When initally troubleshooting this one with amilcar we found that type=friend indeed caused both a peer and a user to be present, and devicestate would, by default, look at the user (which was going to be blank) for the device state info which was always going to present the channel as "available" even though it sometimes was not.

By: Steve Murphy (murf) 2006-11-13 15:17:01.000-0600

OK, revert and try patch7. After you update & compile, but before you start asterisk, add the following to the sip.conf file, in the [general] section:

limitpeersonly=yes

This should fold inUse and call_limit info from users to peers.

By: Amilcar S Silvestre (amilcar) 2006-11-13 15:20:56.000-0600

uploaded console output with patch6. Like with loloski, no changes here.

By: Amilcar S Silvestre (amilcar) 2006-11-13 15:55:05.000-0600

murf, same behaviour with patch7 and limitpeersonly=yes. (uploaded console output)

bweschke, i understand now the way it is implemented, but i don't think that the problem is with the user/peer or both been present or not.

Like i said, the behaviour is the same setting the device friend or peer. And i can't get the sip devicestate even without using a queue (as you can see in the notice events produced by murf's patchs).

By: BJ Weschke (bweschke) 2006-11-13 16:02:13.000-0600

amilcar - sure. it sounds like there's "something more going on" here beyond what we originally thought/confirmed to be going on. I started getting that impression when I saw the notes about the hold states.
unfortunately, I've been really, really buried with other stuff to put some good time into it, and am real grateful murf has been able to put some good efforts into it. I just wanted to document what I knew about the situation so far so someone wasn't spending the same effort to get where I was with this already. :-/

By: Steve Murphy (murf) 2006-11-13 16:51:55.000-0600

Worry not. I don't concede failure; I'm still learning this stuff, and I'll keep diving into it until I reach the bottom of the problem. The steps I've taken are educational, and gives me feel what's going on underneath, and what's not going on underneath. More to come.

By: Amilcar S Silvestre (amilcar) 2006-11-13 17:22:23.000-0600

i'll glad to help in any way! :-)

By: Steve Murphy (murf) 2006-11-13 17:24:07.000-0600

Revert again. I left oej's stuff alone, but this time, I want more data on
what's going with sip_peer and sip_user. If they are both existing, I want to see what the values are inside them at the time the devstate func is called. It's obvious the sip_peer is NOT getting the state info in this scenario.

It's also obvious that sip_devicestate is being called at all the important change-of-state times, like ring, hangup, etc. It's just not seeing the state info in the peer.

So, show me what happens with patch_sip_8 in effect.

By: Amilcar S Silvestre (amilcar) 2006-11-13 18:05:05.000-0600

Uploaded console output with patch_sip_8.

The same output for both using and don't using "limitpeersonly=yes". All devices NOT using "call-limit" settings.

By: Steve Murphy (murf) 2006-11-13 18:35:13.000-0600

OK, we can't find the user from sip_devicestate(). Why not? What are the names of the users? We'll list them out if find_user returns 0. Try patch_sip_9 after reverting, and let's see what's in there. Hmmm. What if there aren't any?

By: Ronald Chan (loloski) 2006-11-13 20:55:11.000-0600

murf: please check this out :) patch9 won't patch cleanly on this release

[root@stealth asterisk]# patch -p0 < ../patch/devstate.diff --dry-run
patching file channels/chan_sip.c
Hunk ASTERISK-2 FAILED at 10534.
Hunk ASTERISK-3 succeeded at 14914 (offset 34 lines).
Hunk ASTERISK-5 succeeded at 15880 (offset 34 lines).
1 out of 10 hunks FAILED -- saving rejects to file channels/chan_sip.c.rej
patching file main/devicestate.c
patching file configs/sip.conf.sample

Path: .
URL: http://svn.digium.com/svn/asterisk/branches/1.4
Repository UUID: 65c4cc65-6c06-0410-ace0-fbb531ad65f3
Revision: 47587
Node Kind: directory
Schedule: normal
Last Changed Author: file
Last Changed Rev: 47584
Last Changed Date: 2006-11-14 05:28:57 +0800 (Tue, 14 Nov 2006)
Properties Last Updated: 2006-11-14 10:50:03 +0800 (Tue, 14 Nov 2006)

By: Steve Murphy (murf) 2006-11-13 21:45:32.000-0600

OOps, sorry, somebody committed updates to chan_sip.c; I've uploaded patch_sip_9b and that is against the current 1.4 tree.

(And, the hunk that failed isn't critical, the file should compile anyway).

Try this, and show me the logs...

By: Ronald Chan (loloski) 2006-11-14 05:54:09.000-0600

murf: this is the result of patch9 call-limit = 0; unset limitpeersonly=no. incoming call from another * via zap.

By: Ronald Chan (loloski) 2006-11-14 06:01:04.000-0600

murf: another side note, if there's an incoming call regardless from what technology via sip/zap. with call-limit > 0. devstate (In-Use) for e.g Sip/201 is stuck forever until next asterisk reboot. if you want that log, please tell me i'll be happy to produce that also. but On-Hold state is working as expected.

incoming-devicestate-call-limit0-patch9 [^]
originate-devicestate-call-limit0-patch9 [^]

By: Steve Murphy (murf) 2006-11-14 09:38:17.000-0600

Thank you, gentlemen, for your patience and hard work.

I neglected to notice that Olle's code was doing a check to see if call_count was non-zero. If call_count was 0, then no devicestate updates are made. I erased this test. This should show some devicestate updating now.

revert, and apply the attached patch_sip_10... then, in logger.conf, make sure your console log includes "debug", and then do a "core debug 3" to catch the logging outputs from update_call_counter().

Let's then see why we get locked in the INUSE state, and see if we can put this bug out of our misery.

By: Ronald Chan (loloski) 2006-11-14 10:07:22.000-0600

murf: in patch 10 if there's an incoming call the with call-limit=0 the state change to ringing first, as soon as you answer the call in turn's to In-Use which is good, once you hold the call in turns to On Hold, which is good also but when you hangup the call the state still _IN_USE_ :).

originate call from asterisk to sip/zap still has no effect :)

please check devstate-patch10 for details with core set debug 3


Ronald

By: Olle Johansson (oej) 2006-11-14 10:24:31.000-0600

murf, you need to test this with multiple clients using various formats before you change too much. Use an Eyebeam to check the rich format.

By: Ronald Chan (loloski) 2006-11-14 10:46:15.000-0600

As per OEJ suggestion here is the debug trace from snom 3.60 firmware 5.3 still the same thing happens on LinkSys PAP2 1.3.3 LS ATA adapter. with core set debug 3 Thanks!!



Ronald

By: Steve Murphy (murf) 2006-11-14 11:26:17.000-0600

Ronald--

Try running with limitpeersonly=yes and see if you still get stuck in inUse. What I see is update_call_counter incrementing and decrementing inUse on the sip_user struct, but devicestate still sees it in the peer.

All I have to do is figure out how inUse got set on the peer.

By: Steve Murphy (murf) 2006-11-14 12:16:04.000-0600

Ah-ha-- as to getting stuck in "In Use"... I see that inUse is incremented on the peer structure....  and then decremented on the user structure! That leaves the peer marked as "in use", and that's why it gets stuck. A little more study. Maybe the call to update_call_counter needs to be moved up, as the pvt struct may be losing the peer info before it's called. I shall investigate.

Oej--- fear not; I shall commit nothing before you give it a review. A thorough review, perhaps. You may not approve of some of my changes. You may like others, tho. We shall see. I am trying to be as absolutely minimalistic in my changes as I can.

By: Ronald Chan (loloski) 2006-11-14 12:33:01.000-0600

murf: you got it, it's working now please check the recent logs....

By: Ronald Chan (loloski) 2006-11-14 12:37:57.000-0600

murf: yes, if oej will approve your changes we can put this bug to end :) originating call from asterisk to sip/zap is working also now :)

By: Steve Murphy (murf) 2006-11-14 16:46:34.000-0600

OEJ: near 13066 (in my version of chan_sip.c) there occurs the code:

  } else {        /* Re-invite on existing call */
           ast_clear_flag(&p->flags[0], SIP_OUTGOING);     /* This is now an inbound dialog */

It is the only place in the file, where an outgoing call would be switched to an incoming call. This is also the only place (so far) where things will go haywire if we did not specify limitpeersonly=yes. We need to do something here, limitpeersonly=yes or no, to make sure the state in sip_user is accurate at this point in time. We could copy the state fields, or we can swap them, or whatever.

It would be really nice if the sip_devicestate() routine could have access to the call's sip_pvt struct, and determine the direction (incoming, outgoing) and therefore the proper source of the device state info...!!!

By: Steve Murphy (murf) 2006-11-14 17:17:09.000-0600

amilcar--

aren't you going to test out the patch_sip_10 and let me know how it flies?

Make sure to include debug in the logger.conf for console (or logs).

Make sure to core debug 3.

Make sure to set limitpeersonly=yes in the [general] section of the
sip.conf file.

and...   Profit!

By: Steve Murphy (murf) 2006-11-14 19:20:14.000-0600

OEJ-- please review patch_sip_11, which is in (hopefully) final, or close to final format.

the patch_sip_11-annotated is the same file, but I've edited to add comments
to every hunk of changes. My changes are really quite restricted and minor.

Testers-- the patch_sip_11 is the final version. Please try it out and make sure it still works. No debugs. No log notify messages. Don't forget to set
limitpeersonly=yes in the [general] section of the sip.conf file.

I'll try to keep it up to date up until it's merged, if it indeed is merged.

By: Steve Murphy (murf) 2006-11-14 19:32:09.000-0600

oej--

If you get a chance, please review the changes I made for 7433.
I have two files, patch_sip_11 and patch_sip_11-annotated, where
the second is the first, with explanations for each change inserted.

many thanks!

murf


By: Olle Johansson (oej) 2006-11-15 02:38:05.000-0600

Ok, we need to coordinate this with BJ as well to make sure that we don't mess up the queue system that is also depending on device states.

By: Amilcar S Silvestre (amilcar) 2006-11-15 08:36:08.000-0600

murf, sorry i'm late on this! :-)

And CONGRATULATIONS, man!! You fixed it! :-) Now, all the states are working like a charm! I've uploaded two logs using patch 10 (with and without the use of limitpeersonly).

Like you said, the TO part of the call is always correct now. The FROM part only get the states right if i set limitpeersonly. I totally agree with you (comment 0054634) about treating sip_user state in sip_devicestate! For me, the "limitpeersonly" thing seems a little like a hack (i may be completely wrong on this, oej! sorry if i am) :-)

And i think that the changes in the code that you did so far are minimal and very unobtrusive. The main two, the check for hold > 0 on onhold state and the remove of the test for SIP_CALL_LIMIT set are (again, for me!) clearly implementation bugs, not changes in the implementation form.

Many thanks again! SIP device state is a VERY important feature!!!!

By: Steve Murphy (murf) 2006-11-15 09:29:01.000-0600

amilcar-- many thanks! good work!

oej, BJ-- if it speaks peace to your hearts, let me note that both amilcar and loloski both tested my fixes in the context of app_queue. Indeed, if the devicestate stuff isn't doing its thing in sip, then app_queue isn't going to work right with sip phones.  The reason I depended on them so heavily is that I'm not geared up to play with queues on my test machine, and I'd have spent a good deal of time (possibly) setting up and debugging the queue mechanism, and complicating the issue, perhaps. Plus, I have only one sip phone (x10) available to me locally. Life is tough, isn't it? So, many thanks to amilcar and loloski for seeing it thru to the bitter end! Please make sure they get an extra helping of karma when this is all resolved.

It's your show now!

By: BJ Weschke (bweschke) 2006-11-16 14:16:20.000-0600

patch_11 looks good to me. +1 to commit it from me. Thanks everyone for jumping in to get this one solved. The team effort here is a real tribute to open source and how it can work to everyone's advantage to solve common problems.

By: Olle Johansson (oej) 2006-11-16 14:50:33.000-0600

Well, some of the changes I don't agree with. You're re-enabling un-necessary database lookups for realtime users by disabling the flag in the start of the counter function.

I'll check the branch, fix it and commit.

By: Steve Murphy (murf) 2006-11-16 18:37:48.000-0600

OEJ-- If you re-instate that short-circuit, you will be turning off all the state updates nec. for this bug to be solved. Best to dream up some other conditions that would allow silence instead; it looks like, at some point in history, the call_limit was the only reason for keeping states like in_use, etc. Now, tho, call_limit is just one of the factors in calculating state. You can't just pop that back in without wrecking all the state tracking that app_q and others need.

By: Olle Johansson (oej) 2006-11-17 07:42:53.000-0600

I don't follow you. We have to be very careful about not hurting realtime too much. I will look into this and try to understand your point.

We've always had a limited call state handling if there's no call limit set on the peer and more detailed, better one if there is a call limit. If there's no call limit set, we should not run all those lookups and don't bother with database lookups. If you need states in the queues, enable call limits and all will be fine. Removing that option will hurt more than forcing people with queues to enable limits.

By: Steve Murphy (murf) 2006-11-17 07:49:32.000-0600

OEJ-- many thanks, you have added to my knowledge. I have but one suggestion: I added a little explanation to the queue doc about what to do if SIP devices are in use; perhaps a few words about setting call_limit, also, should be added there, so as to keep queue users from filing bug reports and wasting time trying to figure out what's going on...?

By: BJ Weschke (bweschke) 2006-11-17 07:56:05.000-0600

I'm going to agree with oej on this. We shouldn't code channel state around app_queue. So long as there's settings we can make in the channel that will "accomodate" it. Actually, the way you guys have it now, I think it even kind of deprecates the need for ringinuse that exists in app_queue today.

By: Jared Smith (jsmith) 2006-11-17 08:48:25.000-0600

OEJ: While I understand your concern about realtime performance, my personal opinion is that we ought to get call states working *correctly* first, and then worry about performance.  

Also, having to have call limits in place for call states to work either ought to well documented, or completely changed.  I think I can safely say that this feature (if you can call it that) has caused more headaches to many users (and even consultants such as myself) than any other part of chan_sip.  Personally, I'd rather see a different flag that says "trackcallstate=yes/no" for the people who would like to disable call states for performance reasons.  But tying it to call limits without *any* documentation is just plain wrong.

Now, that being said, I'm not sure if adding a new switch in sip.conf to the 1.2 series is the right thing to do either (although it too appears to have been done lately for other things.)  Someone with more authority than me will have to make that decision -- I'm just stating my opinion.

By: Amilcar S Silvestre (amilcar) 2006-11-17 11:31:25.000-0600

OEJ, i think you missed the point of this bug. We must have correct device state information for sip devices, as we have for IAX devices, zap devices and so on.

What you are saying is that correct device state is un-necessary unless one uses call_limit? This could be true at some point in the past, but now SIP must have this information concisely, or don't have it at all. Other applications uses device states, as app_queue.

I don't have problems if I run a queue of iax devices or zap devices, but with sip, the state is always wrong (if we don't want to give this information for performance reasons, we must return UNKNOWN, and not NOT_IN_USE, as this is wrong!).

I agree with you that performance is very important. But i think we must have, at first, working sip device state tracking, and then, we may have options to disabled it if we need more performance or something like that.

The check for the flag SIP_CALL_LIMIT in the start of the counter function is like conditioning the return of correct results from one "feature" by the use of another (and all this undocumented).

By: Olle Johansson (oej) 2006-11-18 03:47:25.000-0600

We should never send wrong information, I agree with that. But there are only a few situations when you need Asterisk to produce complete information - if you have subscriptions and if you have queues (or possibly your own manager application). This does not apply to all Asterisk servers.

Remember that the SIP channel delivers (unless we have a bug) more information than other channels and we also must handle both subscriptions and queue demands - and the problem is that they're different.

By: Ronald Chan (loloski) 2006-11-19 21:30:41.000-0600

Guys, where are we with this? since according to oej the implementation of murf's fix has a significant impact on realtime user's, does it sounds like another patch needs to be pickup???. IMHO amilcar suggestion was not bad at all, i might be wrong too but it's just my opinion just like with jsmith :)



Ronald

By: Amilcar S Silvestre (amilcar) 2006-11-23 15:13:55.000-0600

Friends,

Please, let's not permit that all the effort that we've put in this go to nowhere! Subscriptions/hints, app_queue, ChanIsAvail, call_limit and others, all this depends on consistent devicestate tracking (like we have in other channels)!

oej: If one have problems with performance, we may have an option like "devicestatetrack=no" to disable it (and return UNKNOWN, the correct info if we are not going to track state). That is better than forcing anyone using any applications/commands/features/manager events that need devicestate to limit the number of the calls for every sip device connected.

murf (and the rest of us) put almost an entire week studying and debuging this. With the patch, all the states works ok, despite the fact that i don't like the way that "limitpeersonly=yes" option solves some problems, and i think that we must make sure that the state in sip_user is accurate too, one way or another. The suggestion of murf about accessing sip_pvt inside sip_devicestate() to determine the direction and therefore set the proper source of device state is very good IMHO.

Right now, i'm feeling like i waste my time in vain.

By: Steve Murphy (murf) 2006-11-23 18:25:36.000-0600

amilcar--

Patience. Rome was not built in a day. It's safe to say that no-one knows
the sip code better than OEJ. He's a good man. He'll do the right thing. Our
labor will not go in vain.

oej--

You've got a ton of stuff in your inbox. I am here to serve. If you need anything from me, or I can help in any way, [uh, except money ;)], let me know.

By: Manuel Merelles (manuel) 2006-12-04 10:46:13.000-0600

Murf:
You have merged my ASTERISK-8043 case with this one, however my problem was on asterisk 1.2.13 version and the patch 11 seems to be focused on 1.4.

I tried to apply it, but it fails. So, how can i solve my devicestates issue on 1.2.13 ?


console output
[root@atk asterisk-1.2.13]# patch -p0 < ./chansip.patch --dry-run
patching file channels/chan_sip.c
Hunk #1 FAILED at 513.
Hunk #2 FAILED at 2942.
Hunk #3 FAILED at 2951.
Hunk #4 FAILED at 8033.
Hunk ASTERISK-1 FAILED at 10129.
Hunk ASTERISK-2 FAILED at 13066.
Hunk ASTERISK-3 succeeded at 11772 (offset -3163 lines).
Hunk ASTERISK-4 FAILED at 11782.
Hunk ASTERISK-5 succeeded at 14960 (offset -6 lines).
Hunk ASTERISK-6 FAILED at 15853.
Hunk ASTERISK-7 succeeded at 12719 with fuzz 1 (offset -3264 lines).
8 out of 11 hunks FAILED -- saving rejects to file channels/chan_sip.c.rej
patching file doc/queues-with-callback-members.txt
Hunk #1 FAILED at 522.
1 out of 1 hunk FAILED -- saving rejects to file doc/queues-with-callback-members.txt.rej
can't find file to patch at input line 164
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|Index: main/devicestate.c
|===================================================================
|--- main/devicestate.c (.../branches/1.4)      (revision 47644)
|+++ main/devicestate.c (.../team/murf/bug7433) (revision 47644)

By: Steve Murphy (murf) 2006-12-04 18:15:52.000-0600

Manuel--

Don't try to apply the patch to 1.2. When this bug actually gets resolved, and we are sure of the solution, we'll attack your bug number as if it were a separate issue. There are some issues in 1.4 that didn't exist in 1.2; the problems and solutions could end up being different. So, please, be patient, and we will see that your issue gets resolved also.



By: Manuel Merelles (manuel) 2006-12-04 18:25:46.000-0600

ok Murf, thanks for your prompt response. We'll be waiting. And if you need support for testing, we'll be pleased doing that.

By: Olle Johansson (oej) 2007-02-01 15:38:41.000-0600

Please test with latest 1.4 from subversion. Thanks.

By: Olle Johansson (oej) 2007-02-01 15:40:47.000-0600

Also, please check the new option I've added to svn trunk, called busy-limit.

Busy-limit sets the number of calls that are required to report BUSY to status subscribers, like manager, SIP and queue. Call-limit sets the total number of calls that are allowed. By separating these limits, we can report busy but still allow the phone to transfer calls without hitting the call limit for the new call leg.

If this works properly for queue, we might consider backporting this to 1.4 - but we need tests and feedback from you users out there in the wilderness.

Thanks for testing!

By: Curt Moore (jcmoore) 2007-02-02 09:34:00.000-0600

oej: I'm still seeing the same behavior with 1.4-r53108.  When using dynamic queue members, they are not marked as "In Use" when they are on a call.

By: Olle Johansson (oej) 2007-02-02 09:49:46.000-0600

tgrman: I  am missing something here, must re-read. If they're in a call, the phone most certainly is in use. What am I missing?

By: Curt Moore (jcmoore) 2007-02-02 12:25:46.000-0600

oej: I agree, if the queue member is on a call, they _should_ be marked as "In use" but presently, they are not.  Each queue member is instead marked as "Not in use" I can give you access to the server and you can login if it will help.  However, the code which is currently running on that server uses the logic from 1.2 in sip_devicestate() as this is my production queue server and this has to work correctly there.

The _only_ way I can get this to work correctly in 1.4 is to use the logic in the patch I originally submitted on 2006-06-26, which is the logic from 1.2.

By: BJ Weschke (bweschke) 2007-02-02 15:24:49.000-0600

With the debugging assistance and knowledge of chan_sip from oej this afternoon, I think we finally figured this one out. Please take a look at the recent commit to UPGRADE.txt and if you're still getting LOG_WARNINGs about a queue member's state being 'Not in Use' when they're about to be bridged with a caller, read it again.

I'm going to mark this one in feedback - and then we can close when ready.

BJ

By: BJ Weschke (bweschke) 2007-02-02 15:28:04.000-0600

Please test the latest 1.4 branch and double check configs to make sure everything is set the way it should be, and then we can close this out if no further negative feedback. BJ

By: Olle Johansson (oej) 2007-02-02 22:05:27.000-0600

Also check http://svn.digium.com/view/asterisk/trunk/doc/queue.txt?view=markup
and please help us add information to that document so we'll get good documentation on all this.

By: Serge Vecher (serge-v) 2007-02-05 15:37:48.000-0600

keep those test reports coming!

By: Amilcar S Silvestre (amilcar) 2007-02-07 08:18:03.000-0600

Ok, i've tested with SVN-branch-1.4-r53324, and here is the results:

* With limitonpeers=yes in General and call-limit on each device, it reports the status ok, both in "sip show inuse" and queues. Thanks! :-)

* The devicestate "OnHold" doesn't show anymore in any conditions.

I've tested with the following clients: XLite 3.0, SJPhone 1.65.366d and a snom 190 hardphone.

By: BJ Weschke (bweschke) 2007-02-07 08:58:38.000-0600

Ok. So we have good feedback from one person. Let's leave this open until Friday 2/9 for more feedback, and provided we get no negative feedback, we'll close. Thanks!

By: Ronald Chan (loloski) 2007-02-07 09:53:23.000-0600

oej,bweschke:

same result like with amilcar you can safe close this now, Good work guys.
tested with PAP2-NA, Counterpath Eyebeam, Kapanga Softphone.

Connected to Asterisk SVN-branch-1.4-r53355

Thanks

Ronald

By: Amilcar S Silvestre (amilcar) 2007-02-07 11:16:07.000-0600

oej, what about onHold state?

By: Carl Thorner (cthorner) 2007-02-08 02:53:05.000-0600

Hi guys,

Good news:
If I set "Call limit   : 1" on the peer and "Call limit peers only:  Yes" under general, I detect the phone as InUse both under "sip show inuse" and "show hints" when I call the device and pick it up.

Bad news:
If I make a call from the SIP device(i.e., outbound) it still shows as Idle even after I pick up the call on the other end.

To summarize: it works for inbound but not for outbound calls.

By: Serge Vecher (serge-v) 2007-02-08 09:33:46.000-0600

do you have a verbose/debug trace for "bad" news?

By: Serge Vecher (serge-v) 2007-02-08 15:27:34.000-0600

amilcar: please enable debug logging and reproduce the problem while putting a device on hold. Post the console log here.

By: Carl Thorner (cthorner) 2007-02-09 04:16:24.000-0600

serge-v,

My bad. I tried a new install and got it working fine for both inbound and outbound. A "sip reload" fixed the old system, not sure why but maybe I missed a call-limit setting for one of the phones or something stupid like that. Nice to have this feature working again.

By: BJ Weschke (bweschke) 2007-02-12 10:16:41.000-0600

Didn't receive any negative feedback. Putting this one to bed. Finally!