Summary: ASTERISK-13175: mISDN layer stops working
Reporter: Thomas Omerzu (t-o)
Labels:
Date Opened: 2008-12-08 07:11:06.000-0600
Date Closed: 2009-06-05 07:39:39
Priority: Major
Regression?: No
Status: Closed/Complete
Components: Channels/chan_misdn
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: (0) mISDN_deadlock.patch
Description: After accepting a large number of calls, the mISDN layer becomes unusable; no more calls are accepted.

asterix*CLI> misdn show channels
Chan List: 0x83872c0
bc with pid:0 has no Ast Leg
P[ 0] received 1k Unhandled Bchannel Messages: prim 120282 len 128
from addr 52010101, dinfo ffffffff on this port.

A restart of the asterisk software clears the problem.

****** ADDITIONAL INFORMATION ******

mISDN version is 1.1.8, the ISDN hardware is a BeroNet BN4S0 which has currently 3 lines connected to a PBX.

The incoming calls are handled by an AGI script that leads the caller through a menu and eventually hangs up.
Comments:

By: André Lehmann (blacktux) 2008-12-08 10:49:35.000-0600

Same problem here.

Running "misdn restart port x" sometimes clears the problem without restarting Asterisk.



By: saghul (saghul) 2009-01-22 02:38:03.000-0600

Same problem here. Asterisk 1.4.21.1 and mISDN 1.1.7.2.

misdn restart port X does not solve the problem.

By: saghul (saghul) 2009-01-22 03:00:36.000-0600

Problem gets 'apparently' solved by restarting both mISDN and Asterisk.

By: Matteo Piscitelli (picciux) 2009-02-04 05:42:16.000-0600

Maybe it's the same as ASTERISK-12734? I have both symptoms: 'Requested Channel Already in Use...' (from ASTERISK-12734) and 'P[ 0] received 1k Unhandled Bchannel Messages: prim 120282 len 128 from addr 52010101, dinfo ffffffff on this port.' from this bug.

By: Thomas Omerzu (t-o) 2009-02-04 09:45:38.000-0600

Some notes:

1)
The version information above was wrong.
The problem in the form described occurs in 1.4.21.2.
In 1.4.22, the "no ast leg" problem also occurs from time to time, but
always seems to vanish by itself after a few seconds.
Nevertheless, misdn.log keeps getting messages like
 Wed Feb  4 07:39:25 2009: P[ 1]   --> !! lib: No free channel!
 Wed Feb  4 07:39:25 2009: P[ 1]   --> we have already send Release_complete
although there definitely would be free channels on other ports in the
same group.

2)
@picciuX: Yes, I agree that this looks quite similar!

By: saghul (saghul) 2009-02-27 02:56:56.000-0600

Any updates on this issue?

By: Richard Mudgett (rmudgett) 2009-02-27 17:30:54.000-0600

Asterisk 1.4 SVN -r168622 which is now in Asterisk 1.4.23 may have a bearing on this problem.

*  Fixed create_process() allocation of process ID values.
The allocated process IDs could overflow their respective
NT and TE fields.  Affects outgoing calls.

By: Thomas Omerzu (t-o) 2009-03-12 06:24:33

Also with 1.4.23.1, I continue getting

Thu Mar 12 06:42:16 2009: P[ 1]   --> !! lib: No free channel!
Thu Mar 12 06:42:16 2009: P[ 1]   --> we have already send Release_complete

messages in misdn.log, although there are definitely free channels available.

But as I said in my original report: This happens on INCOMING connections!
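
To see which ports are affected and how often, the "No free channel!" lines quoted in this thread can be counted per port. The following is a minimal sketch, not part of Asterisk or mISDN: the function name is hypothetical, and the regular expression is derived only from the two misdn.log excerpts shown in this issue, so it may need adjusting for other log formats.

```python
import re
from collections import Counter

# Pattern assumed from the misdn.log excerpts in this issue, e.g.
# "Thu Mar 12 06:42:16 2009: P[ 1]   --> !! lib: No free channel!"
NO_FREE = re.compile(r"P\[\s*(\d+)\]\s+--> !! lib: No free channel!")


def count_no_free_channel(lines):
    """Return a Counter mapping port number -> number of 'No free channel!' lines."""
    counts = Counter()
    for line in lines:
        match = NO_FREE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


sample = [
    "Thu Mar 12 06:42:16 2009: P[ 1]   --> !! lib: No free channel!",
    "Thu Mar 12 06:42:16 2009: P[ 1]   --> we have already send Release_complete",
]
per_port = count_no_free_channel(sample)
```

For a real log, the same function can be fed an open file object, since it only iterates over lines.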

By: Sven Hirschmueller (sodom) 2009-03-26 03:42:33

There is some kind of problem with the per-port channel counter. Sometimes the counter doesn't register that a channel on a port has become free again (mostly when some kind of low-level error frees the channel). You can't find any hint as to whether a freed channel is really added back to the free channel counter, since no channel or blocked port shows up in Asterisk (as far as I know).
On my side I can reset the channel counter by issuing:
misdn restart port <port>

This seems to reset the channel counter, and I can use the port again.

Sorry, but I still don't know which scenario leads to this "forgotten" channel.
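
The hypothesis in this comment can be illustrated with a toy model. This is only a sketch of the suspected failure mode, not the chan_misdn accounting code: the `Port` class, its methods, and `handle_call` are hypothetical, and the default of two channels per port simply reflects the two B-channels of an S0 line.

```python
class Port:
    """Toy per-port channel accounting (hypothetical, for illustration only)."""

    def __init__(self, channels=2):
        self.total = channels
        self.in_use = 0

    def allocate(self):
        if self.in_use >= self.total:
            raise RuntimeError("No free channel!")
        self.in_use += 1

    def release(self):
        self.in_use = max(0, self.in_use - 1)

    def restart(self):
        # A port restart is assumed to reset the counter.
        self.in_use = 0


def handle_call(port, low_level_error):
    port.allocate()
    if low_level_error:
        # Suspected bug: the error path returns without release(),
        # so the channel stays counted as busy forever.
        return
    port.release()


port = Port()
handle_call(port, low_level_error=True)
handle_call(port, low_level_error=True)
try:
    handle_call(port, low_level_error=False)
    exhausted = False
except RuntimeError:
    exhausted = True   # no call is active, yet the port reports no free channel

port.restart()
handle_call(port, low_level_error=False)   # works again after the restart
```

In this model the port looks idle from the outside while the counter says it is full, which would match a port that rejects calls until it is restarted.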

By: Richard Mudgett (rmudgett) 2009-04-13 12:29:21

The attached single line patch file (mISDN_deadlock.patch) applies to mISDN 1.1.8, 1.1.9.x, and should work for other 1.1.x versions of mISDN.  Please report if this resolves the issue.


Added handling of EV_RELEASE_CNF to ST_L3_LC_ESTAB_WAIT state to bring L2 back up.

The mISDN L2 and L3 state machines could get into a deadlocked condition
while L2 is going down just as the TE side attempts to initiate a call
or respond to an incoming call.

JIRA ABE-1816, issue 14030

Sequence of events between L2 and L3:
ST_L3_LC_ESTAB::EV_RELEASE_REQ -> ST_L3_LC_REL_DELAY
       Event for various reasons when L3 is no longer needed.
       i.e. RELEASE_COMPLETE is sent.
ST_L3_LC_REL_DELAY::TIMEOUT -> ST_L3_LC_REL_WAIT
       Event sent to L2 state machine to initiate bringing L2 down.
ST_L2_7::EV_L2_DL_RELEASE_REQ -> ST_L2_6
       L2 sends disconnect and waits for a reply.
ST_L3_LC_REL_WAIT::EV_ESTABLISH_REQ -> ST_L3_LC_ESTAB_WAIT
       L3 message (SETUP, SETUP_ACK, or PROCEEDING) attempts to bring
       L2 back up by sending event to L2 to establish link.
ST_L2_6::EV_L2_DL_ESTABLISH_REQ (Ignored)
       L2 is busy still waiting for a reply to the disconnect.
ST_L2_6::EV_L2_UA -> ST_L2_4
       L2 completes going down and sends event to L3.
ST_L3_LC_ESTAB_WAIT::EV_RELEASE_CNF (Ignored)
       L3 thinks L2 is trying to go up so does not know what
       to do when L2 confirms that it has completed going down.
ST_L3_LC_ESTAB_WAIT::EV_ESTABLISH_REQ (Ignored)
       Subsequent messages cannot go out because L3 is already
       waiting for L2 to go up.  Since L2 is never going
       to try to go up anymore, the state machines are
       deadlocked.
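
The sequence above can be replayed with a small simulation. This is a minimal sketch, not the mISDN source: the state names and most event names come from the description, while the `Link` class, the `patched` flag, and the `EV_ESTABLISH_CNF` event are assumptions made for illustration. It also assumes that the patch reacts to EV_RELEASE_CNF in ST_L3_LC_ESTAB_WAIT by re-issuing the establish request to L2, and that establishment from ST_L2_4 succeeds immediately.

```python
class Link:
    """Toy model of the L2/L3 link-control state machines (hypothetical)."""

    def __init__(self, patched):
        self.patched = patched
        self.l3 = "ST_L3_LC_ESTAB"
        self.l2 = "ST_L2_7"
        self.ignored = []   # (state, event) pairs that were dropped

    def l2_event(self, event):
        if self.l2 == "ST_L2_7" and event == "EV_L2_DL_RELEASE_REQ":
            self.l2 = "ST_L2_6"            # disconnect sent, awaiting reply
        elif self.l2 == "ST_L2_6" and event == "EV_L2_UA":
            self.l2 = "ST_L2_4"            # link is down, tell L3
            self.l3_event("EV_RELEASE_CNF")
        elif self.l2 == "ST_L2_4" and event == "EV_L2_DL_ESTABLISH_REQ":
            self.l2 = "ST_L2_7"            # simplification: link comes up at once
            self.l3_event("EV_ESTABLISH_CNF")
        else:
            self.ignored.append((self.l2, event))

    def l3_event(self, event):
        state = self.l3
        if state == "ST_L3_LC_ESTAB" and event == "EV_RELEASE_REQ":
            self.l3 = "ST_L3_LC_REL_DELAY"
        elif state == "ST_L3_LC_REL_DELAY" and event == "TIMEOUT":
            self.l3 = "ST_L3_LC_REL_WAIT"
            self.l2_event("EV_L2_DL_RELEASE_REQ")
        elif state == "ST_L3_LC_REL_WAIT" and event == "EV_ESTABLISH_REQ":
            self.l3 = "ST_L3_LC_ESTAB_WAIT"
            self.l2_event("EV_L2_DL_ESTABLISH_REQ")
        elif state == "ST_L3_LC_ESTAB_WAIT" and event == "EV_ESTABLISH_CNF":
            self.l3 = "ST_L3_LC_ESTAB"
        elif (state == "ST_L3_LC_ESTAB_WAIT" and event == "EV_RELEASE_CNF"
              and self.patched):
            # Assumed behaviour of the fix: ask L2 to come back up.
            self.l2_event("EV_L2_DL_ESTABLISH_REQ")
        else:
            self.ignored.append((state, event))


def replay(patched):
    """Replay the event order from the description; return final (L3, L2) states."""
    link = Link(patched)
    link.l3_event("EV_RELEASE_REQ")      # RELEASE_COMPLETE sent
    link.l3_event("TIMEOUT")             # L3 asks L2 to go down
    link.l3_event("EV_ESTABLISH_REQ")    # new SETUP arrives while L2 is in ST_L2_6
    link.l2_event("EV_L2_UA")            # peer acknowledges the disconnect
    link.l3_event("EV_ESTABLISH_REQ")    # a later message tries again
    return link.l3, link.l2


unpatched_result = replay(patched=False)
patched_result = replay(patched=True)
```

Without the extra transition, the model ends with L3 waiting in ST_L3_LC_ESTAB_WAIT while L2 sits idle in ST_L2_4; with it, both layers return to their established states.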

By: André Lehmann (blacktux) 2009-05-14 03:46:28

Hi,

I'm now using asterisk-1.4.21.1 and misdn-1_1_9.1 with the patch.

Today, after approximately one week of uptime, the same problem occurred.

:-(

By: Richard Mudgett (rmudgett) 2009-05-14 09:50:07

The fix for issue 0013488 also needs to be applied since it can cause the same kind of symptoms.  It is also a one line change.



By: André Lehmann (blacktux) 2009-05-14 10:37:32

I would test it, but I can't download the patch isdn_lib.patch.txt from https://issues.asterisk.org/view.php?id=13488 ?

There's no License...

:-(

By: Richard Mudgett (rmudgett) 2009-05-14 11:09:49

I extracted the patch from the 1.4 commit (-r185120) and put the patch back on issue 0013488 as defer_channel_selection.patch.txt.

By: André Lehmann (blacktux) 2009-05-14 13:30:32

Thank you!

Asterisk is patched and rebuilt.

I'll wait a few weeks again. ;-)

By: Thomas Omerzu (t-o) 2009-06-04 10:12:53

At least for me, this seems to have eliminated the problem!

By: André Lehmann (blacktux) 2009-06-05 02:13:42

Asterisk has an uptime of 3 weeks and 12 hours. No problems.

:-)

By: Russell Bryant (russell) 2009-06-05 07:39:38

Thanks for reporting back, guys!