Summary: ASTERISK-01823: Periodic reset of idle PRI B-channels affects non-idle channels as well
Reporter: petersv (petersv)
Labels:
Date Opened: 2004-06-15 07:16:32
Date Closed: 2011-06-07 14:05:13
Versions:
Frequency of
Environment:
Attachments:
( 0) asterisk.prireset.diff
( 1) log.bug.e1.restart.dropped.call.txt
Description: We have noticed that the periodic reset of idle B channels on the E1 PRI connected to our PBX also causes the loss of the non-idle calls. This does not happen on our PSTN connection (also E1 PRI) when those channels are reset. This is 100% reproducible. Within a split second of the hourly reset kicking in, the calls were lost.

Based on the "Successfully reset" output, only idle B channels were ordered to reset. I guess something in the reset messages does not agree with our Panasonic switch.

I have attached a "pri intense debug" trace of the PBX PRI line for such a reset. I dialed 00462882270, which comes back as a meetme conference from the PSTN (that is not important, it happens for all calls). There is an unrelated disconnect in there as well. Then first the PSTN PRI (Zap/1) resets without a problem, and then the internal PRI (Zap/2) resets and the call drops.

Finally I try to dial out, fail and try again successfully.

I have changed the reset interval to 100 days and we have had no further drops. I hope to load new software in less than 100 days. :-)

****** STEPS TO REPRODUCE ******

Make a call shortly before a whole number of hours has passed since the last startup. Watch the reset printout and listen to the beep-beep-beep of the PBX being disconnected.
Comments:
By: petersv (petersv) 2004-06-15 09:38:50

Just to clarify: I am not able to analyse the logs well enough to see if Asterisk does something funny. If it is just our PBX that is broken, then we will just disable resetting the B channels and live with it.

By: Mark Spencer (markster) 2004-06-15 09:57:16

You're going to need to provide some sort of annotation here, I can't tell from your debug what it is you're saying is going wrong.

By: petersv (petersv) 2004-06-15 10:50:16

I can't find anything wrong with the log either. That is the problem. For some reason, after the whole "RESTART - RESTART ACK" the conversation that came on zap span 2 channel 31 was lost by the pbx attached there. Asterisk still thinks it was up and there was no element from the pbx to suggest why it thought that channel went down.

Asterisk apparently still thought channel 31 was up, since the PBX's next call tried to SETUP on channel 31 and Asterisk rejected it.

Unless something is wrong with the data in the RESET element, I think it is a bug in the Panasonic PBX. If you cannot see anything wrong with the reset message, then I will try to add an option to the zap interface to disable the automatic resets per span. I have very little hope of getting Panasonic to change anything in the PBX firmware.

By: Paul Cadach (pcadach) 2004-06-15 10:54:41

What I see in the log:
1) There is no CALL PROCEEDING message after the full number is received;
2) Looks like a restart bug in petersv's PBX - there was no restart for channel #31 on span 2, where the PBX sent a SETUP packet after the sequential reset of idle channels.

Also, it could be a side effect of the missing CALL PROCEEDING....

petersv, look at your log in more detail. There were 5 channels in use at the time you issued "show channels". Then channels Zap/19-1 (channel #19 on span 1) and Zap/33-1 (channel #2 on span 2) were hung up. So that left the following channels in use:
1) Zap/1-1 (channel #1 on span 1);
2) Zap/22-1 (channel #22 on span 1);
3) Zap/62-1 (channel #31 on span 2).
Next, look at the B-channel restarts on span 1 - there WAS NOT anything shown for channels 1 and 22, because they are busy. Also, just before the SETUP which drops the call on channel #31 on span 2, you can see there was a RESTART on span 2 for channel #30, not #31, so your PBX misplaced a call on channel #31 after the RESTARTs... Ask your PBX supplier for fixes.
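
The Zap channel names above map onto span and B-channel numbers by simple arithmetic. The following is a minimal sketch of that mapping, assuming each E1 span contributes 31 consecutively numbered Zap channels (which is what the channel list in this comment implies). The function names are hypothetical and are not part of chan_zap.

```c
/* Assumption: every E1 span contributes 31 Zap channels, numbered
 * consecutively across spans (Zap/1..Zap/31 on span 1, Zap/32..Zap/62
 * on span 2), as the channel list in this ticket implies. */
#define CHANNELS_PER_E1_SPAN 31

/* Hypothetical helper: 1-based span number for a 1-based Zap channel. */
int zap_to_span(int zap_channel)
{
    return (zap_channel - 1) / CHANNELS_PER_E1_SPAN + 1;
}

/* Hypothetical helper: 1-based channel number within its span. */
int zap_to_span_channel(int zap_channel)
{
    return (zap_channel - 1) % CHANNELS_PER_E1_SPAN + 1;
}
```

With these helpers, Zap/62 resolves to span 2, channel 31, and Zap/33 resolves to span 2, channel 2, matching the list above.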

My conclusion - not Asterisk's bug, but petersv's PBX...

By: Andrew Kohlsmith (akohlsmith) 2004-06-15 10:54:49

I have had the B channel reset affect an active line as well, but only once.  I'm trying to recreate the problem to give a better bug report.  I'm on a Bell Canada PRI, nothing out of the ordinary that I can see anyway.

By: Paul Cadach (pcadach) 2004-06-15 11:12:33

IMHO, to solve petersv's situation there should be a configurable option for the chan_zap driver to schedule idle channel restarts at a specified time interval (in seconds) or at a specified time of day (or a list of such times). Channel-hungry installations need a short interval so that "broken" channels are returned to the idle state quickly. In other cases the restart interval could be much longer, or restarts could be disabled entirely or done only at specified times, when they will not affect working calls.

By: Mark Spencer (markster) 2004-06-15 12:22:37

Ugh, PROCEEDING makes things complicated now because of support for incoming and outbound (and thus pass-through) overlap dial.  Originally we sent PROCEEDING on progress, but that's not really quite right.  The real problem is trying to determine a reasonable way in which to know we need to send PROCEEDING when there isn't an overlap dial situation on the outbound side.  Otherwise we're waiting for RINGING.  I don't really have a method to communicate back to zt_request that we are going to manually send PROCEEDING.

By: Mark Spencer (markster) 2004-06-15 12:25:21

I don't have my Q.931 spec handy, but is there any reason to suspect we have to have PROCEEDING before CONNECT?  It could be added in libpri fairly easily.

By: Paul Cadach (pcadach) 2004-06-15 13:14:14

I consulted Q.931 - there is no problem with a missing CALL PROCEEDING, except for an incorrect overlap dialing implementation somewhere. When the remote side sends ALERTING or CONNECT without a preceding PROCEEDING message, no PROCEEDING shall be sent to the calling party (i.e. ALERTING/CONNECT is passed on directly as received, with no PROCEEDING).

By: trmckee3 (trmckee3) 2004-06-15 15:39:30

One of the symptoms, a PRI channel being marked as in use in Asterisk but free at the other end, is the same as the problem I encountered in bug 1809.  Something is causing the other end to go down and the state machine in Asterisk is not recognising the drop.

I'm trying to schedule a time when I can try to reproduce bug 1809.  The root problem is different, but one of the end problems is the same.

By: Paul Cadach (pcadach) 2004-06-15 21:28:11

trmckee3, this ticket is definitely caused by a bug in the Panasonic PBX firmware, not in Asterisk. petersv is just asking for idle channel restarts to be made configurable.

I don't see any relation between this ticket and 1809.

By: trmckee3 (trmckee3) 2004-06-15 21:44:23

This statement: "the conversation that came on zap span 2 channel 31 was lost by the pbx attached there. Asterisk still thinks it was up and there was no element from the pbx to suggest why it thought that channel went down." matches what I see, except that there is a DMS500 on the other end of my PRI.
It might be a firmware bug in the Panasonic triggering the state mismatch in his case, but there is a similar problem on my end, which seems to be related to a call coming in to an undefined DID number from a DMS500. Once I defined all possible DIDs to go to a voice menu, the problem disappeared.

By: Mark Spencer (markster) 2004-06-15 22:18:09

Agreed this is definitely related to 1809, and the duplicate ring portion of the debug (which is not what this bug's main title is about) is also definitely an Asterisk issue, but I need to see how the channel is being "stuck" in this case.  I will need to be able to attach with gdb to a system stuck in that state.

By: Paul Cadach (pcadach) 2004-06-15 22:39:12

As I pointed out before, there was NOT ANY message related to span 2 channel 31, so the call must be in the UP state. Possibly the Panasonic PBX doesn't support single-channel restarts and treats such cases as span restarts.

The attached patch allows the restart interval to be specified on a per-span basis (it falls back to the default value if this option is not specified explicitly), or restarts to be disabled entirely with "resetinterval=0".
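
As an illustration only, the behaviour described for the patch (a per-span interval, with 0 disabling restarts) could be checked roughly as below. This is a sketch, not the contents of asterisk.prireset.diff: the struct, the field names, and the function are hypothetical, and the 3600-second default is an assumption based on the hourly reset described in this ticket.

```c
#include <stdbool.h>
#include <time.h>

/* Assumed default: the hourly reset described in this ticket. */
#define DEFAULT_RESET_INTERVAL 3600L

/* Hypothetical per-span settings; not the actual chan_zap structure. */
struct span_reset_cfg {
    long reset_interval;  /* seconds between idle-channel restarts; 0 disables them */
    time_t last_reset;    /* when this span's idle channels were last restarted */
};

/* Returns true when a periodic idle-channel restart is due on this span. */
bool span_reset_due(const struct span_reset_cfg *cfg, time_t now)
{
    if (cfg->reset_interval <= 0)
        return false;  /* "resetinterval=0": never restart periodically */
    return difftime(now, cfg->last_reset) >= (double)cfg->reset_interval;
}
```

The reporter's workaround of one reset every 100 days corresponds to an interval of 100 * 24 * 3600 = 8,640,000 seconds in this scheme.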

By: petersv (petersv) 2004-06-16 01:05:27

I would like to test restarting just one channel to see whether the problem is only on channel 31, whether it is an off-by-one in the Panasonic, or whether it just restarts all channels when it gets a restart message for one channel.

I will not be able to do this for a few days.

By: Paul Cadach (pcadach) 2004-06-16 18:38:27

I'll try to provide a patch this weekend.

By: Mark Spencer (markster) 2004-06-17 14:34:39

This should now be fixed in CVS.  Please try updating and let's see if I really got it.

By: Mark Spencer (markster) 2004-06-18 14:52:34

Please confirm this is fixed at your earliest convenience.

By: petersv (petersv) 2004-06-19 09:32:35

The whole E1 to the PBX is reset when the idle channels are reset. I guess our Panasonic PBX is allergic to the resets. I have disabled the periodic restarts for now in our installation (or rather, the reset now happens once every 100 days).

It is a bug in our PBX. A nice workaround would be to only do the reset if all channels are idle.
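
The workaround suggested here amounts to gating the periodic restart on the whole span being idle. A minimal sketch of such a gate follows, assuming a simple per-channel "in use" flag; the struct and function are hypothetical and do not reflect how chan_zap actually tracks channel state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-B-channel state; chan_zap's real bookkeeping differs. */
struct bchan {
    bool in_use;  /* true while a call occupies this B channel */
};

/* Returns true only if no B channel on the span carries a call, so a
 * PBX that treats any restart as a whole-span restart drops nothing. */
bool span_all_idle(const struct bchan *chans, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (chans[i].in_use)
            return false;
    }
    return true;
}
```

A caller would skip the periodic restart whenever this returns false and try again later. The trade-off is that a busy span may go a long time without any restart.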