[Home]

Summary:ASTERISK-03205: [patch] Logic bug causes PRI to block on first channel
Reporter:Tilghman Lesher (tilghman)Labels:
Date Opened:2005-01-07 14:17:27.000-0600Date Closed:2008-01-15 15:20:37.000-0600
Priority:MinorRegression?No
Status:Closed/CompleteComponents:Core/General
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) 2004017__fix_pri_badchannel.diff.txt
Description:Jan  7 04:51:31 WARNING[12191]: Channel 73 already has a Real call
Jan  7 04:51:31 WARNING[12191]: Unable to start PBX on channel 0/1, span 4
Jan  7 04:51:31 WARNING[12191]: Call specified, but not found?
Jan  7 04:51:31 WARNING[12191]: Hangup on bad channel 0/1 on span 4
Jan  7 04:51:35 WARNING[12191]: Call specified, but not found?
Jan  7 04:51:35 WARNING[12191]: Hangup on bad channel 0/1 on span 4
Jan  7 04:51:36 WARNING[12568]: Channel 73 already has a Real call
Jan  7 04:51:36 NOTICE[12568]: Unable to create channel of type 'Zap' (cause 0)
Jan  7 04:51:39 WARNING[12191]: Call specified, but not found?
Jan  7 04:51:46 WARNING[12568]: Timeout, but no rule 't' in context 'span3pri'
Jan  7 04:52:11 WARNING[12573]: Channel 73 already has a Real call
Jan  7 04:52:11 NOTICE[12573]: Unable to create channel of type 'Zap' (cause 0)
Jan  7 04:52:21 WARNING[12573]: Timeout, but no rule 't' in context 'span3pri'
Jan  7 04:53:01 WARNING[12584]: Channel 73 already has a Real call
Jan  7 04:53:01 NOTICE[12584]: Unable to create channel of type 'Zap' (cause 0)
Jan  7 04:53:11 WARNING[12584]: Timeout, but no rule 't' in context 'span3pri'

The logic problem that this patch solves is this: available() can return a channel whose owner is set, but zt_request() initially checks if owner is set and refuses to continue if that is true.  Since zt_request() occurs outside of the loop where other channels are checked if available(), no retry is made for other (possibly free) channels.

****** ADDITIONAL INFORMATION ******

I'm not sure if this fix obscures another, more serious problem (in that channel owners are not getting freed properly); however, fixing this basic logic error fixes the immediate problem.

Disclaimer on file.
Comments:By: Mark Spencer (markster) 2005-01-07 14:38:38.000-0600

This code presumably makes no difference because:

/* If it's not an FXO, forget about call wait */
if ((p->sig != SIG_FXOKS) && (p->sig != SIG_FXOLS) && (p->sig != SIG_FXOGS))
return 0;


would catch a PRI.

By: Clod Patry (junky) 2005-01-07 14:59:00.000-0600

I'm working on this since more then 1 month.
Today i saw this:
gate1*CLI> show channels
       Channel  (Context    Extension    Pri )   State Appl.         Data
0 active channel(s)
   -- Accepting call from '154' to '5540' on channel 0/7, span 1
   -- Executing SetCDRUserField("Zap/7-1", "machine=gate1") in new stack
   -- Executing Dial("Zap/7-1", "Zap/G3/5540") in new stack
Jan  7 11:47:50 WARNING[10304]: chan_zap.c:4500 zt_new: Channel 71 already has a Real call
Jan  7 11:47:50 NOTICE[10304]: app_dial.c:828 dial_exec: Unable to create channel of type 'Zap' (cause 0)
 == Everyone is busy/congested at this time (1:0/0/1)
   -- Channel 0/7, span 1 got hangup
   -- Hungup 'Zap/7-1'
gate1*CLI> show channels
       Channel  (Context    Extension    Pri )   State Appl.         Data
0 active channel(s)
gate1*CLI>

which is really odd.

By: Mark Spencer (markster) 2005-01-07 15:14:07.000-0600

Fixed in CVS head, reopen if this isn't it.

By: Russell Bryant (russell) 2005-01-08 16:07:41.000-0600

fixed in 1.0

By: Clod Patry (junky) 2005-01-10 08:48:03.000-0600

I just got this with CVS-HEAD-01/10/05-08:56:26 .
gate1*CLI> show channels
       Channel  (Context    Extension    Pri )   State Appl.         Data
      Zap/49-1  (from_dialogic s            1   )      Up Bridged Call  Zap/16-1
      Zap/16-1  (from_bell  5353         2   )      Up Dial          Zap/g3/5353
      Zap/49-1  (from_dialogic s            1   )      Up Bridged Call  Zap/19-1
      Zap/19-1  (from_bell  2711         2   )      Up Dial          Zap/g3/2711
      Zap/49-1  (from_dialogic s            1   )      Up Bridged Call  Zap/5-1
       Zap/5-1  (from_bell  5634         2   )      Up Dial          Zap/g3/5634
      Zap/55-1  (from_dialogic s            1   )      Up Bridged Call  Zap/17-1
      Zap/17-1  (from_bell  7111         2   )      Up Dial          Zap/g3/7111
8 active channel(s)
   -- Executing SetCDRUserField("Zap/18-1", "machine=gate1") in new stack
   -- Executing Dial("Zap/18-1", "Zap/g3/5540") in new stack
   -- Called g3/5540
   -- Accepting call from '154' to '5540' on channel 0/18, span 1
   -- Moving call from channel 49 to channel 70
   -- Zap/49-1 answered Zap/18-1
   -- Attempting native bridge of Zap/18-1 and Zap/49-1
   -- Channel 0/18, span 1 got hangup
   -- Hungup 'Zap/49-1'
 == Spawn extension (from_bell, 5540, 2) exited non-zero on 'Zap/18-1'
   -- Hungup 'Zap/18-1'
Jan 10 09:44:31 WARNING[2911]: chan_zap.c:7182 pri_fixup_principle: Call specified, but not found?
Jan 10 09:44:31 WARNING[2911]: chan_zap.c:8026 pri_dchannel: Hangup on bad channel 0/22 on span 3
gate1*CLI> show channels
       Channel  (Context    Extension    Pri )   State Appl.         Data
      Zap/49-1  (from_dialogic s            1   )      Up Bridged Call  Zap/16-1
      Zap/16-1  (from_bell  5353         2   )      Up Dial          Zap/g3/5353
      Zap/49-1  (from_dialogic s            1   )      Up Bridged Call  Zap/19-1
      Zap/19-1  (from_bell  2711         2   )      Up Dial          Zap/g3/2711
      Zap/49-1  (from_dialogic s            1   )      Up Bridged Call  Zap/5-1
       Zap/5-1  (from_bell  5634         2   )      Up Dial          Zap/g3/5634
      Zap/55-1  (from_dialogic s            1   )      Up Bridged Call  Zap/17-1
      Zap/17-1  (from_bell  7111         2   )      Up Dial          Zap/g3/7111
8 active channel(s)
Jan 10 09:44:35 WARNING[2911]: chan_zap.c:7182 pri_fixup_principle: Call specified, but not found?
Jan 10 09:44:35 WARNING[2911]: chan_zap.c:8026 pri_dchannel: Hangup on bad channel 0/22 on span 3

2 Zap/49-1, how can this be possible exactly?

If you need pri debug, just ask, i've them.

edited on: 01-10-05 08:50

By: Mark Spencer (markster) 2005-01-10 08:53:31.000-0600

Most likely a simultaneous disconnect at both ends.  No worries.

By: Clod Patry (junky) 2005-01-10 10:51:23.000-0600

with all warnings, sometimes i see all channels displayed like that:
gate1*CLI> show channels
       Channel  (Context    Extension    Pri )   State Appl.         Data
      Zap/49-1  (from_dialogic s            1   )      Up Bridged Call  Zap/13-1
      Zap/13-1  (from_bell  9212         2   )      Up Dial          Zap/g3/9212
      Zap/49-1  (from_dialogic s            1   )      Up Bridged Call  Zap/9-1
       Zap/9-1  (from_bell  7111         2   )      Up Dial          Zap/g3/7111
      Zap/49-1  (from_dialogic s            1   )      Up Bridged Call  Zap/16-1
      Zap/16-1  (from_bell  7111         2   )      Up Dial          Zap/g3/7111
6 active channel(s)

It's like all channels are bridged to the 49, which is impossible.

Plus, i see that:
   -- Accepting call from '154' to '5540' on channel 0/7, span 1
   -- Executing SetCDRUserField("Zap/7-1", "machine=gate1") in new stack
   -- Executing Dial("Zap/7-1", "Zap/g3/5540") in new stack
   -- Called g3/5540
   -- Moving call from channel 49 to channel 70
   -- Zap/49-1 answered Zap/7-1
   -- Attempting native bridge of Zap/7-1 and Zap/49-1
gate1*CLI> show channels
       Channel  (Context    Extension    Pri )   State Appl.         Data
      Zap/49-1  (from_dialogic s            1   )      Up Bridged Call  Zap/7-1
       Zap/7-1  (from_bell  5540         2   )      Up Dial          Zap/g3/5540
      Zap/49-1  (from_dialogic s            1   )      Up Bridged Call  Zap/16-1
      Zap/16-1  (from_bell  7111         2   )      Up Dial          Zap/g3/7111
4 active channel(s)

But it works a MUCH better then before.

By: Clod Patry (junky) 2005-01-11 10:41:42.000-0600

any answers why i've a lot of Zap/49-1 ?

and still today i've output like:
gate1*CLI> show channels
       Channel  (Context    Extension    Pri )   State Appl.         Data
      Zap/52-1  (from_dialogic s            1   )      Up Bridged Call  Zap/12-1
      Zap/12-1  (from_bell  9212         2   )      Up Dial          Zap/g3/9212
      Zap/51-1  (from_dialogic s            1   )      Up Bridged Call  Zap/11-1
      Zap/11-1  (from_bell  9212         2   )      Up Dial          Zap/g3/9212
      Zap/50-1  (from_dialogic s            1   )      Up Bridged Call  Zap/15-1
      Zap/15-1  (from_bell  2485         2   )      Up Dial          Zap/g3/2485
      Zap/49-1  (from_dialogic s            1   )      Up Bridged Call  Zap/8-1
       Zap/8-1  (from_bell  9212         2   )      Up Dial          Zap/g3/9212
      Zap/49-1  (from_dialogic s            1   )      Up Bridged Call  Zap/20-1
      Zap/20-1  (from_bell  2507         2   )      Up Dial          Zap/g3/2507
      Zap/49-1  (from_dialogic s            1   )      Up Bridged Call  Zap/17-1
      Zap/17-1  (from_bell  9212         2   )      Up Dial          Zap/g3/9212
      Zap/49-1  (from_dialogic s            1   )      Up Bridged Call  Zap/4-1
       Zap/4-1  (from_bell  5540         2   )      Up Dial          Zap/g3/5540
14 active channel(s)

I suspect all this due to moving.

By: Mark Spencer (markster) 2005-01-11 10:59:01.000-0600

This is just an issue with the channel not getting renamed when the fixup takes place.  It's fixed in CVS head.

By: Clod Patry (junky) 2005-01-11 12:25:49.000-0600

I've CVS-HEAD-01/10/05-08:56:26 and still have this issue.

Did ya add something else after the:
pri->pvts[x]->subs[SUB_REAL].owner = NULL; (which was last friday), cause i've that line too, and still having the issue cited after 01-10-05 08:48.



Thanks

By: Mark Spencer (markster) 2005-01-11 12:57:10.000-0600

Yes, I added something today, now stop reopening this bug everytime you have a question!

By: Digium Subversion (svnbot) 2008-01-15 15:20:27.000-0600

Repository: asterisk
Revision: 4707

U   trunk/channels/chan_zap.c

------------------------------------------------------------------------
r4707 | markster | 2008-01-15 15:20:26 -0600 (Tue, 15 Jan 2008) | 2 lines

Make sure we clear out owner in SUB_REAL when moving a call (bug ASTERISK-3205)

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=4707

By: Digium Subversion (svnbot) 2008-01-15 15:20:37.000-0600

Repository: asterisk
Revision: 4719

U   branches/v1-0/channels/chan_zap.c

------------------------------------------------------------------------
r4719 | russell | 2008-01-15 15:20:37 -0600 (Tue, 15 Jan 2008) | 2 lines

Make sure we clear out owner in SUB_REAL when moving a call (bug ASTERISK-3205)

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=4719