Summary: | DAHLIN-00239: Red Alarm after many [PRI got event: HDLC Abort (6) on Primary D-channel of span 1] | ||
Reporter: | Giovanni Lovato (heruan) | Labels: | |
Date Opened: | 2011-04-14 02:37:36 | Date Closed: | 2019-05-31 09:24:48 |
Priority: | Blocker | Regression? | No |
Status: | Closed/Complete | Components: | wcb4xxp |
Versions: | 2.4.1 | Frequency of Occurrence | |
Related Issues: | |||
Environment: | Attachments: | ||
Description: |
After many [PRI got event: HDLC Abort (6) on Primary D-channel of span 1] (every 30 seconds), span 1 goes down with Red Alarm.

*CLI> dahdi show status
Description                  Alarms  IRQ  bpviol  CRC4  Fra  Codi  Options  LBO
B4XXP (PCI) Card 0 Span 1    OK      0    0       0     CCS  AMI   YEL      0 db (CSU)/0-133 feet (DSX-1)
B4XXP (PCI) Card 0 Span 2    OK      0    0       0     CCS  AMI   YEL      0 db (CSU)/0-133 feet (DSX-1)

[Apr 13 17:53:10] NOTICE[2301]: chan_dahdi.c:12696 pri_dchannel: PRI got event: HDLC Abort (6) on Primary D-channel of span 1
[Apr 13 17:53:41] NOTICE[2301]: chan_dahdi.c:12696 pri_dchannel: PRI got event: HDLC Abort (6) on Primary D-channel of span 1
[Apr 13 17:54:11] NOTICE[2301]: chan_dahdi.c:12696 pri_dchannel: PRI got event: HDLC Abort (6) on Primary D-channel of span 1
// same message 16 times, every 30 seconds
[Apr 13 18:03:17] NOTICE[2301]: chan_dahdi.c:12696 pri_dchannel: PRI got event: HDLC Abort (6) on Primary D-channel of span 1
[Apr 13 18:03:47] NOTICE[2301]: chan_dahdi.c:12696 pri_dchannel: PRI got event: HDLC Abort (6) on Primary D-channel of span 1
[Apr 13 18:04:17] NOTICE[2301]: chan_dahdi.c:12696 pri_dchannel: PRI got event: HDLC Abort (6) on Primary D-channel of span 1
[Apr 13 18:04:48] NOTICE[2301]: chan_dahdi.c:12696 pri_dchannel: PRI got event: Alarm (4) on Primary D-channel of span 1
[Apr 13 18:04:48] WARNING[2303]: chan_dahdi.c:5767 handle_alarms: Detected alarm on channel 1: Red Alarm
[Apr 13 18:04:48] WARNING[2303]: chan_dahdi.c:5767 handle_alarms: Detected alarm on channel 2: Red Alarm

*CLI> dahdi show status
Description                  Alarms  IRQ  bpviol  CRC4  Fra  Codi  Options  LBO
B4XXP (PCI) Card 0 Span 1    RED     0    0       0     CCS  AMI   YEL      0 db (CSU)/0-133 feet (DSX-1)
B4XXP (PCI) Card 0 Span 2    OK      0    0       0     CCS  AMI   YEL      0 db (CSU)/0-133 feet (DSX-1)

****** ADDITIONAL INFORMATION ******

ISDN controller: Cologne Chip Designs GmbH ISDN network Controller [HFC-4S] (rev 01)
Asterisk 1.6.2.9
DAHDI-linux: 2.4.1

$ cat /etc/dahdi/modules:
dahdi
dahdi_transcode
wcb4xxp

$ cat /etc/dahdi/system.conf
# Span 1: B4/0/1 "B4XXP (PCI) Card 0 Span 1" (MASTER)
span=1,1,0,ccs,ami
# termtype: te
bchan=1-2
hardhdlc=3
echocanceller=oslec,1-2

# Span 2: B4/0/2 "B4XXP (PCI) Card 0 Span 2"
span=2,2,0,ccs,ami
# termtype: nt
bchan=4-5
hardhdlc=6
echocanceller=oslec,4-5

# Global data
loadzone = it
defaultzone = it

$ cat /etc/asterisk/dahdi-channels.conf
; Span 1: B4/0/1 "B4XXP (PCI) Card 0 Span 1" (MASTER)
group=0,11
context=from-pstn
switchtype=euroisdn
signalling=bri_cpe_ptmp
channel => 1-2

; Span 2: B4/0/2 "B4XXP (PCI) Card 0 Span 2"
group=0,12
context=from-isdn
switchtype=euroisdn
signalling=bri_net_ptmp
channel => 4-5
| ||
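Editorial note: the 30-second cadence reported above can be checked mechanically rather than by eye. The following is a minimal sketch, not part of the original report; the function names, the `year` default (the log timestamps carry no year) and the assumption of an English locale for month names are all illustrative choices. It pulls the "PRI got event" lines out of an Asterisk log and prints the gaps between consecutive HDLC Abort events on one span.

```python
import re
from datetime import datetime

# Matches the "PRI got event" lines quoted in the description.
EVENT_RE = re.compile(
    r"\[(?P<ts>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2})\].*?"
    r"PRI got event: (?P<event>.+?) \((?P<code>\d+)\) on .*?span (?P<span>\d+)"
)

def parse_events(lines, year=2011):
    """Return (timestamp, event name, span) tuples; `year` is an assumption."""
    events = []
    for line in lines:
        m = EVENT_RE.search(line)
        if not m:
            continue
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        events.append((ts, m["event"], int(m["span"])))
    return events

def intervals(events, name="HDLC Abort", span=1):
    """Seconds between consecutive `name` events on the given span."""
    times = [ts for ts, ev, sp in events if ev == name and sp == span]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]

# Lines copied from the description above.
SAMPLE = [
    "[Apr 13 17:53:10] NOTICE[2301]: chan_dahdi.c:12696 pri_dchannel: PRI got event: HDLC Abort (6) on Primary D-channel of span 1",
    "[Apr 13 17:53:41] NOTICE[2301]: chan_dahdi.c:12696 pri_dchannel: PRI got event: HDLC Abort (6) on Primary D-channel of span 1",
    "[Apr 13 17:54:11] NOTICE[2301]: chan_dahdi.c:12696 pri_dchannel: PRI got event: HDLC Abort (6) on Primary D-channel of span 1",
    "[Apr 13 18:04:48] NOTICE[2301]: chan_dahdi.c:12696 pri_dchannel: PRI got event: Alarm (4) on Primary D-channel of span 1",
]

if __name__ == "__main__":
    print(intervals(parse_events(SAMPLE)))  # prints [31, 30]
```

Run against a full log, a steady run of values near 30 would support the view that the aborts are periodic and not random line noise.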
Comments: |
By: Giovanni Lovato (heruan) 2011-04-15 01:49:06

$ sudo dahdi_scan
[1]
active=yes
alarms=RED
description=B4XXP (PCI) Card 0 Span 1
name=B4/0/1
manufacturer=Digium
devicetype=HFC-2S Junghanns.NET duoBRI PCI
location=PCI Bus 02 Slot 02
basechan=1
totchans=3
irq=17
type=digital-TE
syncsrc=0
lbo=0 db (CSU)/0-133 feet (DSX-1)
coding_opts=B8ZS,AMI,HDB3
framing_opts=ESF,D4,CCS,CRC4
coding=AMI
framing=CCS
[2]
active=yes
alarms=OK
description=B4XXP (PCI) Card 0 Span 2
name=B4/0/2
manufacturer=Digium
devicetype=HFC-2S Junghanns.NET duoBRI PCI
location=PCI Bus 02 Slot 02
basechan=4
totchans=3
irq=17
type=digital-NT
syncsrc=0
lbo=0 db (CSU)/0-133 feet (DSX-1)
coding_opts=B8ZS,AMI,HDB3
framing_opts=ESF,D4,CCS,CRC4
coding=AMI
framing=CCS

By: Giovanni Lovato (heruan) 2011-04-15 03:05:03

Cabling schema: TELCO == NT1 Plus == duoBRI PCI == Asterisk*

I tried several different configurations of the NT1 Plus, such as disabling the analog ports and resetting it. What does [PRI got event: HDLC Abort (6) on Primary D-channel of span 1] mean precisely? Could it be a configuration error? Missing/wrong modules loaded?
Here's my lsmod:

$ lsmod
Module                  Size  Used by
dahdi_echocan_oslec    12570  4
echo                   13381  1 dahdi_echocan_oslec
wcb4xxp                48164  6
dahdi_transcode        13956  0
dahdi                 205492  15 dahdi_echocan_oslec,wcb4xxp,dahdi_transcode
usb_storage            43946  0
uas                    17676  0
veth                   13174  0
bridge                 75021  0
stp                    12811  1 bridge
snd_hda_codec_realtek 255820  1
i915                  450979  1
snd_hda_intel          24140  0
snd_hda_codec          90901  2 snd_hda_codec_realtek,snd_hda_intel
snd_hwdep              13274  1 snd_hda_codec
snd_pcm                80244  2 snd_hda_intel,snd_hda_codec
drm_kms_helper         40745  1 i915
drm                   184133  2 i915,drm_kms_helper
crc_ccitt              12595  1 dahdi
hfcmulti               78413  0
snd_timer              28659  1 snd_pcm
serio_raw              12990  0
mISDN_core             81562  1 hfcmulti
snd                    55295  6 snd_hda_codec_realtek,snd_hda_intel,snd_hda_codec,snd_hwdep,snd_pcm,snd_timer
i2c_algo_bit           13184  1 i915
shpchp                 32345  0
video                  18951  1 i915
soundcore              12600  1 snd
snd_page_alloc         14073  2 snd_hda_intel,snd_pcm
lp                     13349  0
parport                36746  1 lp
e1000e                138627  0

And my dahdi/modules:

$ cat /etc/dahdi/modules
dahdi
dahdi_transcode
wcb4xxp

$ cat /proc/interrupts
       CPU0  CPU1
 0:   41583     0  IO-APIC-edge     timer
 1:       2     0  IO-APIC-edge     i8042
 8:       1     0  IO-APIC-edge     rtc0
 9:       0     0  IO-APIC-fasteoi  acpi
14:     474     0  IO-APIC-edge     ata_piix
15:    9750     0  IO-APIC-edge     ata_piix
16:     249     0  IO-APIC-fasteoi  ehci_hcd:usb1, uhci_hcd:usb2, i915
17:  483656     0  IO-APIC-fasteoi  uhci_hcd:usb3, b4xxp
18:       0     0  IO-APIC-fasteoi  uhci_hcd:usb4
19:       0     0  IO-APIC-fasteoi  uhci_hcd:usb5
43:     733     0  PCI-MSI-edge     eth0
45:     151     0  PCI-MSI-edge     hda_intel

$ dmesg | tail
[ 401.289801] dahdi: Telephony Interface Registered on major 196
[ 401.289811] dahdi: Version: 2.4.1
[ 401.322409] dahdi_transcode: Loaded.
[ 401.334881] wcb4xxp 0000:02:01.0: probe called for b4xx...
[ 401.334924] wcb4xxp 0000:02:01.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[ 401.335572] wcb4xxp 0000:02:01.0: Identified HFC-2S Junghanns.NET duoBRI PCI (controller rev 1) at 0001bf00, IRQ 17
[ 401.337540] wcb4xxp 0000:02:01.0: NOTE: hardware echo cancellation has been disabled
[ 401.337747] wcb4xxp 0000:02:01.0: Port 1: TE mode
[ 401.337869] wcb4xxp 0000:02:01.0: Port 2: NT mode
[ 401.350594] wcb4xxp 0000:02:01.0: Did not do the highestorder stuff
[ 401.450083] hfc_handle_state: 3 callbacks suppressed
[ 401.450094] wcb4xxp 0000:02:01.0: new card sync source: port 4
[ 401.550107] wcb4xxp 0000:02:01.0: new card sync source: port 1
[ 401.630887] echo: module is from the staging directory, the quality is unknown, you have been warned.
[ 401.642298] dahdi_echocan_oslec: Registered echo canceler 'OSLEC'
[ 401.643765] dahdi: Registered tone zone 11 (Italy)

$ sudo lspci -vv -t
-[0000:00]-+-00.0  Intel Corporation Mobile 945GME Express Memory Controller Hub
           +-02.0  Intel Corporation Mobile 945GME Express Integrated Graphics Controller
           +-02.1  Intel Corporation Mobile 945GM/GMS/GME, 943/940GML Express Integrated Graphics Controller
           +-1b.0  Intel Corporation N10/ICH 7 Family High Definition Audio Controller
           +-1c.0-[01-02]----00.0-[02]----01.0  Cologne Chip Designs GmbH ISDN network Controller [HFC-4S]
           +-1c.2-[03]----00.0  Intel Corporation 82574L Gigabit Network Connection
           +-1c.3-[04]----00.0  Intel Corporation 82574L Gigabit Network Connection
           +-1d.0  Intel Corporation N10/ICH 7 Family USB UHCI Controller #1
           +-1d.1  Intel Corporation N10/ICH 7 Family USB UHCI Controller #2
           +-1d.2  Intel Corporation N10/ICH 7 Family USB UHCI Controller #3
           +-1d.3  Intel Corporation N10/ICH 7 Family USB UHCI Controller #4
           +-1d.7  Intel Corporation N10/ICH 7 Family USB2 EHCI Controller
           +-1e.0-[05]--+-0e.0  Advantech Co. Ltd Device a102
           |            \-0e.1  Advantech Co. Ltd Device f100
           +-1f.0  Intel Corporation 82801GBM (ICH7-M) LPC Interface Bridge
           +-1f.2  Intel Corporation 82801GBM/GHM (ICH7 Family) SATA IDE Controller
           \-1f.3  Intel Corporation N10/ICH 7 Family SMBus Controller

By: Birger "WIMPy" Harzenetter (wimpy) 2011-04-15 05:26:59

Two drivers for the same hardware are certainly not a good thing. Make sure that you only load _EITHER_ dahdi / wcb4xxp _OR_ mISDN / hfcmulti. In theory one of them shouldn't do anything, but I'd make sure that is fixed. And as you have one port in NT mode, you want to use a 1.8 version of Asterisk.

By: Giovanni Lovato (heruan) 2011-04-15 05:50:12

I've already tried unloading mISDN/hfcmulti; the issue still holds. I don't know why mISDN/hfcmulti are loaded, since they are not in /etc/dahdi/modules... Maybe some Debian setting? Why do you say "as you have one port in NT Mode you want to use a 1.8 Version of Asterisk"? Just curious, did they make any improvements related to NT ports? BTW, my NT port has no issues.

By: Birger "WIMPy" Harzenetter (wimpy) 2011-04-15 08:18:23

Blacklist them or remove them. There is no relation between mISDN and dahdi, except that they have the same purpose. Depending on your situation, choose one of them. Yes, 1.8 supports NT ptmp. It also has other improvements for ISDN interoperability.

By: Giovanni Lovato (heruan) 2011-04-15 09:08:02

Yes, I blacklisted them.
On a fresh install of Asterisk 1.8, I have the same issue:

[Apr 15 14:02:25] NOTICE[8117]: chan_dahdi.c:2982 my_handle_dchan_exception: PRI got event: HDLC Abort (6) on D-channel of span 1
[Apr 15 14:02:55] NOTICE[8117]: chan_dahdi.c:2982 my_handle_dchan_exception: PRI got event: HDLC Abort (6) on D-channel of span 1
[Apr 15 14:03:25] NOTICE[8117]: chan_dahdi.c:2982 my_handle_dchan_exception: PRI got event: HDLC Abort (6) on D-channel of span 1
[Apr 15 14:03:56] NOTICE[8117]: chan_dahdi.c:2982 my_handle_dchan_exception: PRI got event: HDLC Abort (6) on D-channel of span 1
[Apr 15 14:04:26] NOTICE[8117]: chan_dahdi.c:2982 my_handle_dchan_exception: PRI got event: HDLC Abort (6) on D-channel of span 1
[Apr 15 14:04:56] NOTICE[8117]: chan_dahdi.c:2982 my_handle_dchan_exception: PRI got event: HDLC Abort (6) on D-channel of span 1

Could it be an IRQ issue? How can I reserve a whole IRQ for the ISDN interface PCI card?

By: Birger "WIMPy" Harzenetter (wimpy) 2011-04-15 09:11:07

If it's not receiving garbage for some (physical) reason, an IRQ issue seems likely. Try to put the card in another slot.

By: Giovanni Lovato (heruan) 2011-04-15 09:38:56

Tried to put the card in another slot, same HDLC aborts - so, no IRQ issue. Do you mean garbage from the TELCO NT? It's an NT1 Plus. Or maybe a cabling issue?

By: Birger "WIMPy" Harzenetter (wimpy) 2011-04-15 09:48:40

Garbage produced outside the PC, yes. Like a faulty NT or cabling.

By: Giovanni Lovato (heruan) 2011-04-15 10:08:52

This may be interesting: to configure the NT1 Plus, I use a tone phone. While in configuration mode, I hear a "click" sound every 30 seconds: exactly when Asterisk logs the [PRI got event: HDLC Abort (6) on D-channel of span 1]. So it's something coming from the local NT or even from the TELCO. They say there are no issues on the line, so maybe it's the NT. The "click" is every 30 seconds, not at random. Should wcb4xxp handle this? Why is it going to Red Alarm?
If I unload wcb4xxp, the NT stays up, so it is wcb4xxp that brings it down. Any hint on how to debug that? I tried to set the debug level to 9, enabling the console to show debug and verbose output, but it only shows me the same "PRI got event: HDLC Abort" message.

By: Shaun Ruffell (sruffell) 2011-04-15 10:43:59

heruan: Just out of curiosity, if you change the timer_3_ms module parameter on the wcb4xxp driver to something like 20000, does that make the clicks happen on 20s intervals?

By: Giovanni Lovato (heruan) 2011-04-15 10:53:38

sruffell: no, the clicks still happen on 30s intervals...

By: Birger "WIMPy" Harzenetter (wimpy) 2011-04-15 10:57:26

Do you still hear them if you've got nothing else connected to the NT? Is it really an NT or is it an IAD?

By: Shaun Ruffell (sruffell) 2011-04-15 10:59:16

heruan: If you have multiple cores, try forcing all the wcb4xxp interrupts onto one core by itself (not the first one). The fact that your problem is so regular is/should be a major clue. I've seen things like this when SMI interrupts run periodically. I.e.: http://article.gmane.org/gmane.comp.telephony.pbx.asterisk.user/251300

By: Giovanni Lovato (heruan) 2011-04-15 11:03:43

wimpy: nothing else, not even the ISDN line from the TELCO? If I disconnect the ISDN cable from the TELCO, the channel goes down and I don't hear anything.
sruffell: I'll try immediately, thank you for the link!

By: Birger "WIMPy" Harzenetter (wimpy) 2011-04-15 11:06:34

Ok, that was probably not well worded. I meant nothing but that phone and the telco line.

By: Pavel Odvárka (odvel) 2011-04-18 11:04:36

I think it could be related to 14301 and the telco power save mode. See notes 0108344 and 0126897 there. I get the same warnings every ten seconds, but it works fine. Before my upgrade, I remember there were "No D-channels available" warnings; they have now changed to "HDLC Abort (6) on Primary D-channel" warnings. Wouldn't it be better to use span=2,0,0,ccs,ami for NT mode?
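Editorial note on the span= suggestion above: as far as the DAHDI sample system.conf documents it, the fields are span number, timing source, line build-out, framing and coding. A timing value of 0 means the span is not used as a sync source, and a positive value gives its priority as one. The sketch below is illustrative only; the `Span` type and the function names are hypothetical and are not part of DAHDI. It parses such a line so that the difference between the reporter's span=2,2,0,ccs,ami and the suggested span=2,0,0,ccs,ami is explicit.

```python
from typing import NamedTuple

class Span(NamedTuple):
    number: int    # span number
    timing: int    # 0 = not a sync source, >0 = sync source priority
    lbo: int       # line build-out
    framing: str   # e.g. "ccs"
    coding: str    # e.g. "ami"

def parse_span(line):
    """Parse a system.conf 'span=' line into a Span (first five fields only)."""
    key, _, value = line.partition("=")
    if key.strip() != "span":
        raise ValueError(f"not a span line: {line!r}")
    fields = [f.strip() for f in value.split(",")]
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 fields: {line!r}")
    num, timing, lbo, framing, coding = fields[:5]
    return Span(int(num), int(timing), int(lbo), framing, coding)

def is_sync_source(span):
    """True if the span is configured as a candidate timing source."""
    return span.timing > 0

if __name__ == "__main__":
    for line in ("span=2,2,0,ccs,ami", "span=2,0,0,ccs,ami"):
        s = parse_span(line)
        print(line, "-> sync source" if is_sync_source(s) else "-> not a sync source")
```

With the reporter's configuration, the NT-mode span 2 is listed as a secondary timing source; the suggested variant removes it from the sync-source list.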
By: Birger "WIMPy" Harzenetter (wimpy) 2011-04-18 11:22:28

Quite possible, but: if it really causes disturbances to other devices on the bus, there's something really bad going on.

By: Pavel Odvárka (odvel) 2011-04-19 02:53:53

wimpy: you are right, it is not good until wcb4xxp handles power save mode (though maybe that is not the cause of this problem). I think if there is no activity from the telco, the wcb4xxp driver restarts the ISDN bus, and you hear the click.
heruan: what other problems do you have (other than warnings and clicks)? Can you place any call?

By: Rafael Prado Rocchi (prado) 2011-04-19 16:57:03

What is the libpri version?

By: Giovanni Lovato (heruan) 2011-04-30 03:30:26

Sorry guys for not being available in the last few days...
prado: libpri version is 1.4.11.3
odvel: while the span is OK, I can place calls and I have no problems (only the "PRI got event: HDLC Abort (6) on D-channel" warning). Then, when the span goes RED, I can no longer place calls.
sruffell: I tried to use irqbalance to force all wcb4xxp interrupts onto one core, but it doesn't work (I see other modules on that core). I'm trying again now.

By: Giovanni Lovato (heruan) 2011-04-30 04:31:33

sruffell: I finally managed to force the wcb4xxp interrupts onto one dedicated core, but I still get "PRI got event: HDLC Abort (6) on D-channel of span 1" every 30 seconds.

I discovered another interesting thing: while the wcb4xxp card is connected to the NT, I hear the "ticks" generating the HDLC Abort every 30 seconds. Then, if I disconnect the cable from the wcb4xxp to the NT, I still hear the "ticks" from the NT, *but* every 135 seconds (more or less). Definitely no longer every 30 seconds.

Question: those "ticks" generate the HDLC Abort warning, then after many of these warnings the span goes into RED alarm. Is the RED alarm a consequence of the HDLC Abort warnings? I thought so until now, but I want to be sure.
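Editorial note: for readers who want to repeat the "one dedicated core" experiment without irqbalance, Linux exposes a per-IRQ CPU bitmask at /proc/irq/<n>/smp_affinity. The sketch below is a hedged illustration, not what the reporter actually ran; the function names and the `dry_run` flag are hypothetical. IRQ 17 is taken from the /proc/interrupts output earlier in the thread, where it is shared with uhci_hcd:usb3, so pinning it moves that USB controller's interrupts as well. The simple hex mask assumes at most 32 CPUs, and a running irqbalance daemon may overwrite the setting.

```python
def cpu_mask(cpus):
    """Return the hex bitmask string for the given CPU indices (<= 32 CPUs assumed)."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")

def affinity_path(irq):
    """Path of the kernel's affinity file for one IRQ."""
    return f"/proc/irq/{irq}/smp_affinity"

def pin_irq(irq, cpus, dry_run=True):
    """Pin an IRQ to the given CPUs; with dry_run, only return the shell equivalent."""
    path, mask = affinity_path(irq), cpu_mask(cpus)
    if dry_run:
        return f"echo {mask} > {path}"
    with open(path, "w") as fh:  # requires root
        fh.write(mask + "\n")
    return f"wrote {mask} to {path}"

if __name__ == "__main__":
    # CPU1 only, i.e. "not the first one" as suggested in the thread.
    print(pin_irq(17, [1]))  # prints: echo 2 > /proc/irq/17/smp_affinity
```

After a real (non-dry-run) write, the CPU1 column for IRQ 17 in /proc/interrupts should start increasing while CPU0 stays flat.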
By: Pavel Odvárka (odvel) 2011-05-03 10:45:46

My case: after a restart (and at some other times too) of an Asterisk server, I get these warnings every 10s on span 1. At other times (I have not seen both at the same time), every 35s on span 2. Both lines go to the same telco operator. No warnings on span 3 or 4, which are lines to an older PBX (the Asterisk server is in NT mode for spans 3 and 4). After a call is placed on the line to the telco operator (the line that generates the warnings; the direction of the call does not matter), the warnings stop and all is OK. It may be after hours. There is no alarm at any time.
heruan, so I think these warnings are not necessarily the cause of your RED alarm. To be sure: did you test span=2,0,0,ccs,ami for NT mode?

By: Giovanni Lovato (heruan) 2011-09-25 06:50:01.863-0500

Sorry for the long absence. Pavel, I tried span=2,0,0,ccs,ami but I still get HDLC Aborts on the Primary D-channel of span 1 every 30 seconds, with the "ticks" from the NT box. Do you think the HDLC Aborts are not related to the RED Alarm? Then how can I debug those RED Alarms, which are a blocking issue?

By: Shaun Ruffell (sruffell) 2011-10-17 09:23:21.315-0500

I haven't really thought about this issue until today, and I wanted to ping it and see if it is still an issue.

By: Sebastian Gutierrez (sum) 2011-11-03 14:37:31.142-0500

This is still an issue, at least with this config:
asterisk 1.6.2.20
dahdi 2.5.0.2
libpri 1.4.12 |