
Summary: DAHLIN-00239: Red Alarm after many [PRI got event: HDLC Abort (6) on Primary D-channel of span 1]
Reporter: Giovanni Lovato (heruan)
Labels:
Date Opened: 2011-04-14 02:37:36
Date Closed: 2019-05-31 09:24:48
Priority: Blocker
Regression?: No
Status: Closed/Complete
Components: wcb4xxp
Versions: 2.4.1
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
Description: After many [PRI got event: HDLC Abort (6) on Primary D-channel of span 1] (every 30 seconds), span 1 goes down with Red Alarm.

*CLI> dahdi show status
Description                              Alarms  IRQ    bpviol CRC4   Fra Codi Options  LBO
B4XXP (PCI) Card 0 Span 1                OK      0      0      0      CCS AMI  YEL      0 db (CSU)/0-133 feet (DSX-1)
B4XXP (PCI) Card 0 Span 2                OK      0      0      0      CCS AMI  YEL      0 db (CSU)/0-133 feet (DSX-1)
[Apr 13 17:53:10] NOTICE[2301]: chan_dahdi.c:12696 pri_dchannel: PRI got event: HDLC Abort (6) on Primary D-channel of span 1
[Apr 13 17:53:41] NOTICE[2301]: chan_dahdi.c:12696 pri_dchannel: PRI got event: HDLC Abort (6) on Primary D-channel of span 1
[Apr 13 17:54:11] NOTICE[2301]: chan_dahdi.c:12696 pri_dchannel: PRI got event: HDLC Abort (6) on Primary D-channel of span 1
// same message 16 times, every 30 seconds
[Apr 13 18:03:17] NOTICE[2301]: chan_dahdi.c:12696 pri_dchannel: PRI got event: HDLC Abort (6) on Primary D-channel of span 1
[Apr 13 18:03:47] NOTICE[2301]: chan_dahdi.c:12696 pri_dchannel: PRI got event: HDLC Abort (6) on Primary D-channel of span 1
[Apr 13 18:04:17] NOTICE[2301]: chan_dahdi.c:12696 pri_dchannel: PRI got event: HDLC Abort (6) on Primary D-channel of span 1
[Apr 13 18:04:48] NOTICE[2301]: chan_dahdi.c:12696 pri_dchannel: PRI got event: Alarm (4) on Primary D-channel of span 1
[Apr 13 18:04:48] WARNING[2303]: chan_dahdi.c:5767 handle_alarms: Detected alarm on channel 1: Red Alarm
[Apr 13 18:04:48] WARNING[2303]: chan_dahdi.c:5767 handle_alarms: Detected alarm on channel 2: Red Alarm
*CLI> dahdi show status
Description                              Alarms  IRQ    bpviol CRC4   Fra Codi Options  LBO
B4XXP (PCI) Card 0 Span 1                RED    0      0      0      CCS AMI  YEL      0 db (CSU)/0-133 feet (DSX-1)
B4XXP (PCI) Card 0 Span 2                OK      0      0      0      CCS AMI  YEL      0 db (CSU)/0-133 feet (DSX-1)
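
The "every 30 seconds" spacing can be checked directly from the log rather than by eye. The following is a minimal sketch, assuming the chan_dahdi log format shown above; the function names are hypothetical, and the year has to be supplied because the log timestamps omit it.

```python
import re
from datetime import datetime

# Captures the timestamp and the event name from "PRI got event" lines.
EVENT_RE = re.compile(
    r"^\[(?P<ts>\w{3} +\d+ \d{2}:\d{2}:\d{2})\].*"
    r"PRI got event: (?P<event>.+?) \(\d+\) on "
)

def event_times(lines, event="HDLC Abort", year=2011):
    """Return the datetimes of all log lines reporting the given event."""
    times = []
    for line in lines:
        m = EVENT_RE.match(line)
        if m and m.group("event") == event:
            stamp = f"{year} {m.group('ts')}"
            times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return times

def intervals_seconds(times):
    """Return the gaps, in whole seconds, between consecutive events."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]

sample = [
    "[Apr 13 17:53:10] NOTICE[2301]: chan_dahdi.c:12696 pri_dchannel: "
    "PRI got event: HDLC Abort (6) on Primary D-channel of span 1",
    "[Apr 13 17:53:41] NOTICE[2301]: chan_dahdi.c:12696 pri_dchannel: "
    "PRI got event: HDLC Abort (6) on Primary D-channel of span 1",
    "[Apr 13 17:54:11] NOTICE[2301]: chan_dahdi.c:12696 pri_dchannel: "
    "PRI got event: HDLC Abort (6) on Primary D-channel of span 1",
]
print(intervals_seconds(event_times(sample)))  # [31, 30]
```

Passing event="Alarm" instead gives the time of the alarm event, so the delay between the last abort and the Red Alarm can be measured the same way.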

****** ADDITIONAL INFORMATION ******

ISDN controller: Cologne Chip Designs GmbH ISDN network Controller [HFC-4S] (rev 01)

Asterisk 1.6.2.9

DAHDI-linux: 2.4.1

$ cat /etc/dahdi/modules
dahdi
dahdi_transcode
wcb4xxp

$ cat /etc/dahdi/system.conf
# Span 1: B4/0/1 "B4XXP (PCI) Card 0 Span 1" (MASTER)
span=1,1,0,ccs,ami
# termtype: te
bchan=1-2
hardhdlc=3
echocanceller=oslec,1-2
# Span 2: B4/0/2 "B4XXP (PCI) Card 0 Span 2"
span=2,2,0,ccs,ami
# termtype: nt
bchan=4-5
hardhdlc=6
echocanceller=oslec,4-5
# Global data
loadzone = it
defaultzone = it
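
The channel numbers in this system.conf follow from each BRI span occupying three consecutive DAHDI channels (two B-channels and one D-channel), which matches the basechan and totchans values reported by dahdi_scan. A minimal sketch of that arithmetic, assuming every preceding span is also a BRI span; the helper name is hypothetical and the timing value is simply passed through as given:

```python
def bri_span_stanza(span, timing, echocan="oslec"):
    """Build the system.conf lines for one BRI span (2 B-channels + 1 D-channel)."""
    base = 3 * (span - 1) + 1            # first channel of this span
    b_first, b_last = base, base + 1     # the two B-channels
    d_chan = base + 2                    # the D-channel, handled by hardhdlc
    return "\n".join([
        f"span={span},{timing},0,ccs,ami",
        f"bchan={b_first}-{b_last}",
        f"hardhdlc={d_chan}",
        f"echocanceller={echocan},{b_first}-{b_last}",
    ])

print(bri_span_stanza(1, 1))
print(bri_span_stanza(2, 2))
```

For span 2 this yields bchan=4-5 and hardhdlc=6, the same values as in the file above.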

$ cat /etc/asterisk/dahdi-channels.conf
; Span 1: B4/0/1 "B4XXP (PCI) Card 0 Span 1" (MASTER)
group=0,11
context=from-pstn
switchtype=euroisdn
signalling=bri_cpe_ptmp
channel => 1-2
; Span 2: B4/0/2 "B4XXP (PCI) Card 0 Span 2"
group=0,12
context=from-isdn
switchtype=euroisdn
signalling=bri_net_ptmp
channel => 4-5
Comments:

By: Giovanni Lovato (heruan) 2011-04-15 01:49:06

$ sudo dahdi_scan
[1]
active=yes
alarms=RED
description=B4XXP (PCI) Card 0 Span 1
name=B4/0/1
manufacturer=Digium
devicetype=HFC-2S Junghanns.NET duoBRI PCI
location=PCI Bus 02 Slot 02
basechan=1
totchans=3
irq=17
type=digital-TE
syncsrc=0
lbo=0 db (CSU)/0-133 feet (DSX-1)
coding_opts=B8ZS,AMI,HDB3
framing_opts=ESF,D4,CCS,CRC4
coding=AMI
framing=CCS
[2]
active=yes
alarms=OK
description=B4XXP (PCI) Card 0 Span 2
name=B4/0/2
manufacturer=Digium
devicetype=HFC-2S Junghanns.NET duoBRI PCI
location=PCI Bus 02 Slot 02
basechan=4
totchans=3
irq=17
type=digital-NT
syncsrc=0
lbo=0 db (CSU)/0-133 feet (DSX-1)
coding_opts=B8ZS,AMI,HDB3
framing_opts=ESF,D4,CCS,CRC4
coding=AMI
framing=CCS
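
Since the dahdi_scan output above is laid out like an INI file (one numbered section per span, then key=value lines), the alarm state of each span can be extracted with a standard INI parser. A minimal sketch, assuming the output always has this shape; the function name is hypothetical:

```python
import configparser

def spans_in_alarm(scan_output):
    """Return {span_number: alarm_state} for every span whose alarms field is not OK."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(scan_output)
    return {
        int(section): parser[section]["alarms"]
        for section in parser.sections()
        if parser[section].get("alarms", "OK") != "OK"
    }

sample = """\
[1]
active=yes
alarms=RED
name=B4/0/1
type=digital-TE
[2]
active=yes
alarms=OK
name=B4/0/2
type=digital-NT
"""
print(spans_in_alarm(sample))  # {1: 'RED'}
```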



By: Giovanni Lovato (heruan) 2011-04-15 03:05:03

Cabling schema:

TELCO == NT1 Plus == duoBRI PCI == Asterisk*

I tried several different configurations of the NT1 Plus, like disabling analog ports and resetting.
What does [PRI got event: HDLC Abort (6) on Primary D-channel of span 1] mean precisely?
Could it be a configuration error? Missing/wrong modules loaded? Here's my lsmod:

$ lsmod
Module                  Size  Used by
dahdi_echocan_oslec    12570  4
echo                   13381  1 dahdi_echocan_oslec
wcb4xxp                48164  6
dahdi_transcode        13956  0
dahdi                 205492  15 dahdi_echocan_oslec,wcb4xxp,dahdi_transcode
usb_storage            43946  0
uas                    17676  0
veth                   13174  0
bridge                 75021  0
stp                    12811  1 bridge
snd_hda_codec_realtek   255820  1
i915                  450979  1
snd_hda_intel          24140  0
snd_hda_codec          90901  2 snd_hda_codec_realtek,snd_hda_intel
snd_hwdep              13274  1 snd_hda_codec
snd_pcm                80244  2 snd_hda_intel,snd_hda_codec
drm_kms_helper         40745  1 i915
drm                   184133  2 i915,drm_kms_helper
crc_ccitt              12595  1 dahdi
hfcmulti               78413  0
snd_timer              28659  1 snd_pcm
serio_raw              12990  0
mISDN_core             81562  1 hfcmulti
snd                    55295  6 snd_hda_codec_realtek,snd_hda_intel,snd_hda_codec,snd_hwdep,snd_pcm,snd_timer
i2c_algo_bit           13184  1 i915
shpchp                 32345  0
video                  18951  1 i915
soundcore              12600  1 snd
snd_page_alloc         14073  2 snd_hda_intel,snd_pcm
lp                     13349  0
parport                36746  1 lp
e1000e                138627  0

And my dahdi/modules:

$ cat /etc/dahdi/modules
dahdi
dahdi_transcode
wcb4xxp

$ cat /proc/interrupts
          CPU0       CPU1      
 0:      41583          0   IO-APIC-edge      timer
 1:          2          0   IO-APIC-edge      i8042
 8:          1          0   IO-APIC-edge      rtc0
 9:          0          0   IO-APIC-fasteoi   acpi
14:        474          0   IO-APIC-edge      ata_piix
15:       9750          0   IO-APIC-edge      ata_piix
16:        249          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb2, i915
17:     483656          0   IO-APIC-fasteoi   uhci_hcd:usb3, b4xxp
18:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
19:          0          0   IO-APIC-fasteoi   uhci_hcd:usb5
43:        733          0   PCI-MSI-edge      eth0
45:        151          0   PCI-MSI-edge      hda_intel

$ dmesg | tail
[  401.289801] dahdi: Telephony Interface Registered on major 196
[  401.289811] dahdi: Version: 2.4.1
[  401.322409] dahdi_transcode: Loaded.
[  401.334881] wcb4xxp 0000:02:01.0: probe called for b4xx...
[  401.334924] wcb4xxp 0000:02:01.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[  401.335572] wcb4xxp 0000:02:01.0: Identified HFC-2S Junghanns.NET duoBRI PCI (controller rev 1) at 0001bf00, IRQ 17
[  401.337540] wcb4xxp 0000:02:01.0: NOTE: hardware echo cancellation has been disabled
[  401.337747] wcb4xxp 0000:02:01.0: Port 1: TE mode
[  401.337869] wcb4xxp 0000:02:01.0: Port 2: NT mode
[  401.350594] wcb4xxp 0000:02:01.0: Did not do the highestorder stuff
[  401.450083] hfc_handle_state: 3 callbacks suppressed
[  401.450094] wcb4xxp 0000:02:01.0: new card sync source: port 4
[  401.550107] wcb4xxp 0000:02:01.0: new card sync source: port 1
[  401.630887] echo: module is from the staging directory, the quality is unknown, you have been warned.
[  401.642298] dahdi_echocan_oslec: Registered echo canceler 'OSLEC'
[  401.643765] dahdi: Registered tone zone 11 (Italy)

$ sudo lspci -vv -t  
-[0000:00]-+-00.0  Intel Corporation Mobile 945GME Express Memory Controller Hub
          +-02.0  Intel Corporation Mobile 945GME Express Integrated Graphics Controller
          +-02.1  Intel Corporation Mobile 945GM/GMS/GME, 943/940GML Express Integrated Graphics Controller
          +-1b.0  Intel Corporation N10/ICH 7 Family High Definition Audio Controller
          +-1c.0-[01-02]----00.0-[02]----01.0  Cologne Chip Designs GmbH ISDN network Controller [HFC-4S]
          +-1c.2-[03]----00.0  Intel Corporation 82574L Gigabit Network Connection
          +-1c.3-[04]----00.0  Intel Corporation 82574L Gigabit Network Connection
          +-1d.0  Intel Corporation N10/ICH 7 Family USB UHCI Controller #1
          +-1d.1  Intel Corporation N10/ICH 7 Family USB UHCI Controller #2
          +-1d.2  Intel Corporation N10/ICH 7 Family USB UHCI Controller #3
          +-1d.3  Intel Corporation N10/ICH 7 Family USB UHCI Controller #4
          +-1d.7  Intel Corporation N10/ICH 7 Family USB2 EHCI Controller
          +-1e.0-[05]--+-0e.0  Advantech Co. Ltd Device a102
          |            \-0e.1  Advantech Co. Ltd Device f100
          +-1f.0  Intel Corporation 82801GBM (ICH7-M) LPC Interface Bridge
          +-1f.2  Intel Corporation 82801GBM/GHM (ICH7 Family) SATA IDE Controller
          \-1f.3  Intel Corporation N10/ICH 7 Family SMBus Controller



By: Birger "WIMPy" Harzenetter (wimpy) 2011-04-15 05:26:59

Two drivers for the same hardware are certainly not a good thing.
Make sure that you only load _EITHER_ dahdi / wcb4xxp _OR_ mISDN / hfcmulti.
In theory one of them shouldn't do anything, but I'd make sure that is fixed.

And as you have one port in NT Mode you want to use a 1.8 Version of Asterisk.



By: Giovanni Lovato (heruan) 2011-04-15 05:50:12

I've already tried unloading mISDN/hfcmulti; the issue still holds.
I don't know why mISDN/hfcmulti are loaded, since they are not in /etc/dahdi/modules... Maybe some Debian setting?

Why do you say "as you have one port in NT Mode you want to use a 1.8 Version of Asterisk"? Just curious, did they make any improvements related to NT ports? BTW, my NT port has no issues.

By: Birger "WIMPy" Harzenetter (wimpy) 2011-04-15 08:18:23

Blacklist them or remove them.
There is no relation between mISDN and dahdi, except that they have the same purpose. Depending on your situation, choose one of them.

Yes, 1.8 supports NT ptmp. It also has other improvements for ISDN interoperability.
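
Whether both driver stacks are loaded at the same time can be checked from lsmod output. A minimal sketch, assuming the module names seen in the lsmod listing earlier in this issue (dahdi/wcb4xxp on one side, mISDN_core/hfcmulti on the other); the function names are hypothetical:

```python
# Module names taken from the lsmod listing in this issue.
DAHDI_STACK = {"dahdi", "wcb4xxp"}
MISDN_STACK = {"mISDN_core", "hfcmulti"}

def loaded_modules(lsmod_text):
    """Return the set of module names in lsmod output (first column, header skipped)."""
    names = set()
    for line in lsmod_text.splitlines()[1:]:
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names

def conflicting_stacks(lsmod_text):
    """Return the loaded modules of each stack as two sorted lists."""
    mods = loaded_modules(lsmod_text)
    return sorted(mods & DAHDI_STACK), sorted(mods & MISDN_STACK)

sample = """Module                  Size  Used by
wcb4xxp                48164  6
dahdi                 205492  15 dahdi_echocan_oslec,wcb4xxp,dahdi_transcode
hfcmulti               78413  0
mISDN_core             81562  1 hfcmulti
e1000e                138627  0
"""
dahdi_mods, misdn_mods = conflicting_stacks(sample)
if dahdi_mods and misdn_mods:
    print("both stacks loaded:", dahdi_mods, misdn_mods)
```

If the second list is non-empty after blacklisting, the blacklist entries are not taking effect.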

By: Giovanni Lovato (heruan) 2011-04-15 09:08:02

Yes, I blacklisted them.
On a fresh install of Asterisk 1.8, I have the same issue:

[Apr 15 14:02:25] NOTICE[8117]: chan_dahdi.c:2982 my_handle_dchan_exception: PRI got event: HDLC Abort (6) on D-channel of span 1
[Apr 15 14:02:55] NOTICE[8117]: chan_dahdi.c:2982 my_handle_dchan_exception: PRI got event: HDLC Abort (6) on D-channel of span 1
[Apr 15 14:03:25] NOTICE[8117]: chan_dahdi.c:2982 my_handle_dchan_exception: PRI got event: HDLC Abort (6) on D-channel of span 1
[Apr 15 14:03:56] NOTICE[8117]: chan_dahdi.c:2982 my_handle_dchan_exception: PRI got event: HDLC Abort (6) on D-channel of span 1
[Apr 15 14:04:26] NOTICE[8117]: chan_dahdi.c:2982 my_handle_dchan_exception: PRI got event: HDLC Abort (6) on D-channel of span 1
[Apr 15 14:04:56] NOTICE[8117]: chan_dahdi.c:2982 my_handle_dchan_exception: PRI got event: HDLC Abort (6) on D-channel of span 1

Could it be an IRQ issue? How can I reserve a whole IRQ for the ISDN interface PCI card?

By: Birger "WIMPy" Harzenetter (wimpy) 2011-04-15 09:11:07

If it's not receiving garbage for some (physical) reason, an IRQ issue seems likely.
Try to put the card in another slot.

By: Giovanni Lovato (heruan) 2011-04-15 09:38:56

Tried to put the card in another slot, same HDLC aborts - so, no IRQ issue.
Do you mean garbage from the TELCO NT? It's a NT1 Plus. Or maybe a cabling issue?

By: Birger "WIMPy" Harzenetter (wimpy) 2011-04-15 09:48:40

Garbage produced outside the PC, yes.
Like a faulty NT or cabling.

By: Giovanni Lovato (heruan) 2011-04-15 10:08:52

This may be interesting: to configure the NT1 Plus, I use a tone phone. While in configuration mode, I hear a "click" sound every 30 seconds, exactly when Asterisk logs the [PRI got event: HDLC Abort (6) on D-channel of span 1]. So it's something coming from the local NT or even from the TELCO. They say there are no issues on the line, so maybe it's the NT. The "click" is every 30 seconds, not at random. Should wcb4xxp handle this? Why is it going to Red Alarm? If I unload wcb4xxp, the NT stays up, so it's wcb4xxp that takes it down.

Any hint on how to debug that?
I tried to set debug level to 9, enabling console to show debug and verbose, but it shows me the same "PRI got event: HDLC Abort" message only.



By: Shaun Ruffell (sruffell) 2011-04-15 10:43:59

heruan: Just out of curiosity, if you change the timer_3_ms module parameter on the wcb4xxp driver to something like 20000, does that make the clicks happen on 20s intervals?



By: Giovanni Lovato (heruan) 2011-04-15 10:53:38

sruffell: no, the clicks still happen on 30s intervals...

By: Birger "WIMPy" Harzenetter (wimpy) 2011-04-15 10:57:26

Do you still hear them if you've got nothing else connected to the NT?

Is it really an NT or is it an IAD?

By: Shaun Ruffell (sruffell) 2011-04-15 10:59:16

heruan: If you have multiple cores, try forcing all the wcb4xxp interrupts onto one core by itself (not the first one).  The fact that your problem is so regular is/should be a major clue.  I've seen things like this when SMI interrupts run periodically.

I.e.: http://article.gmane.org/gmane.comp.telephony.pbx.asterisk.user/251300
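
On Linux, an IRQ is pinned to specific cores by writing a hexadecimal CPU bitmask to /proc/irq/<irq>/smp_affinity. A minimal sketch of computing that mask, assuming a machine with few enough cores that the mask fits in a single hex group; the helper names are hypothetical, and IRQ 17 is the b4xxp line from the /proc/interrupts output earlier in this issue:

```python
def affinity_mask(cpus):
    """Return the hex bitmask string selecting the given CPU indices."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu      # bit N selects CPU N
    return format(mask, "x")

def affinity_path(irq):
    """Return the procfs file that controls which CPUs may service this IRQ."""
    return f"/proc/irq/{irq}/smp_affinity"

# CPU1 is the second core, i.e. "not the first one".
print(affinity_path(17), affinity_mask([1]))  # /proc/irq/17/smp_affinity 2
```

Writing the value requires root, and a running irqbalance daemon may overwrite it. Note also that IRQ 17 is shared with uhci_hcd:usb3 on this machine, so pinning it moves both devices.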

By: Giovanni Lovato (heruan) 2011-04-15 11:03:43

wimpy: nothing else, not even the ISDN line from the TELCO? If I disconnect the ISDN cable from the TELCO, the channel goes down and I don't hear anything.

sruffell: I'll try immediately, thank you for the link!

By: Birger "WIMPy" Harzenetter (wimpy) 2011-04-15 11:06:34

Ok, that was probably not well verbalized.
I meant nothing but that phone and the telco line.

By: Pavel Odvárka (odvel) 2011-04-18 11:04:36

I think this may be related to 14301 and the telco power-save mode. See notes 0108344 and 0126897 there.
I have the same warnings every ten seconds, but it works fine. Before my upgrade, I remember there were "No D-channels available" warnings; they have changed to "HDLC Abort (6) on Primary D-channel" warnings now.
Wouldn't it be better to use span=2,0,0,ccs,ami for NT mode?

By: Birger "WIMPy" Harzenetter (wimpy) 2011-04-18 11:22:28

Quite possible, but:
If it really causes disturbances to other devices on the bus, there's something really bad going on.

By: Pavel Odvárka (odvel) 2011-04-19 02:53:53

wimpy: you are right, it is not good until wcb4xxp handles power-save mode (though maybe that is not the cause of the problem discussed here). I think that if there is no activity from the telco, the wcb4xxp driver restarts the ISDN bus, and that is the click you hear.

heruan: what other problems do you have (other than warnings and clicks)? Can you place any call?

By: Rafael Prado Rocchi (prado) 2011-04-19 16:57:03

What is the libpri version?

By: Giovanni Lovato (heruan) 2011-04-30 03:30:26

Sorry guys for not being available in the last few days...

prado: libpri version is 1.4.11.3

odvel: while the span is OK, I can place calls and I have no problems (only the "PRI got event: HDLC Abort (6) on D-channel" warning). Then, when the span goes RED, I can no longer place calls.

sruffell: I tried to use irqbalance to force all wcb4xxp interrupts onto one core, but it doesn't work (I see other modules on that core). I'm trying again now.

By: Giovanni Lovato (heruan) 2011-04-30 04:31:33

sruffell: I finally managed to force wcb4xxp interrupts onto one dedicated core, but I still get "PRI got event: HDLC Abort (6) on D-channel of span 1" every 30 seconds.

I discovered another interesting thing: while the wcb4xxp card is connected to the NT, I hear the "ticks" that generate the HDLC Abort every 30 seconds. If I then disconnect the cable between the wcb4xxp card and the NT, I still hear the "ticks" from the NT, *but* every 135 seconds (more or less). Definitely no longer every 30 seconds.

Question: those "ticks" generate the HDLC Abort warning, and after many of these warnings the span goes into RED alarm. Is the RED alarm a consequence of the HDLC Abort warnings? I thought so until now, but I want to be sure.

By: Pavel Odvárka (odvel) 2011-05-03 10:45:46

My case:
After a restart (and at some other times too) of an Asterisk server, I get these warnings every 10s on span 1. At other times (I have not seen both at once), I get them every 35s on span 2. Both lines go to the same telco operator. There are no warnings on spans 3 or 4, which are lines to an older PBX (the Asterisk server is in NT mode for spans 3 and 4).

After a call is placed on the line to the telco operator (the line that generates the warnings; the direction of the call does not matter), the warnings stop and all is OK. It may be after hours. There is no alarm at any time.

heruan, so I think these warnings are not necessarily the cause of your RED alarm.

To be sure: did you test span=2,0,0,ccs,ami for NT mode?

By: Giovanni Lovato (heruan) 2011-09-25 06:50:01.863-0500

Sorry for the long absence.
Pavel, I tried span=2,0,0,ccs,ami but I still get HDLC Aborts on Primary D-channel of span 1 every 30 seconds, along with the "ticks" from the NT box.
Do you think the HDLC Aborts are not related to the RED Alarm? If so, how can I debug those RED Alarms, which are a blocking issue?

By: Shaun Ruffell (sruffell) 2011-10-17 09:23:21.315-0500

I hadn't really thought about this issue until today, and I wanted to ping it to see if it is still a problem.

By: Sebastian Gutierrez (sum) 2011-11-03 14:37:31.142-0500

This is still an issue, at least with this config:

asterisk 1.6.2.20
dahdi 2.5.0.2
libpri 1.4.12