Summary: | ASTERISK-00670: zaptel driver hard-locks kernel | ||
Reporter: | Andrew Kohlsmith (akohlsmith) | Labels: | |
Date Opened: | 2003-12-16 12:14:21.000-0600 | Date Closed: | 2008-06-07 10:33:47 |
Priority: | Critical | Regression? | No |
Status: | Closed/Complete | Components: | Core/General |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ||
Description: | This is the second time this has happened in a week. I have a Duron1300 in an ECS L7VMM2 motherboard. The only addon card in it is the T100P and it's connected to a Carrier Access Channel Bank 1 (12FXS/12FXO). The *only* unusual thing I can say about this system is that I do run UML on it for our intranet (the old intranet box died and we're slowly moving services over, running it in user-mode linux was the quickest fix).

Anyway I get a kernel panic now and again and ksymoops points to zaptel. Both crashes have _identical_ call traces, although the processor registers have been different.

When I reset the box and it boots up, my channel bank is in a very funky state -- channels are either very choppy and jittery or they are all unavailable (CB itself shows all idle, but * can't place any calls -- fast busies) -- I power-cycle the CB1 and everything's normal again.

The call trace is as follows:

>>EIP; c0115c80 <__wake_up+20/60>   <=====
>>esi; c2158c50 <_end+1d7faec/1fc3aefc>
>>ebp; c88a7a90 <_end+84ce92c/1fc3aefc>
>>esp; c88a7a7c <_end+84ce918/1fc3aefc>

Trace; e00548d8 <[zaptel]process_timers+38/50>
Trace; e004fe2c <[zaptel]zt_receive+6c/f10>
Trace; e007d820 <[wct1xxp]t1xxp_receiveprep+190/360>
Trace; e007c942 <[wct1xxp]t1xxp_interrupt+1b2/1f0>
Trace; c013aa40 <__block_prepare_write+1d0/330>
Trace; c011a12d <qm_refs+13d/190>
Trace; c010a2a8 <do_IRQ+68/b0>
Trace; c010ca48 <call_do_IRQ+5/d>
Trace; c012c8c2 <do_generic_file_write+252/3e0>
Trace; c012cd53 <generic_file_write+103/120>
Trace; c01610b2 <ext3_file_write+22/c0>

Code;  c0115c80 <__wake_up+20/60>
00000000 <_EIP>:
Code;  c0115c80 <__wake_up+20/60>   <=====
   0:   8b 53 fc      mov    0xfffffffc(%ebx),%edx   <=====
Code;  c0115c83 <__wake_up+23/60>
   3:   8b 02         mov    (%edx),%eax
Code;  c0115c85 <__wake_up+25/60>
   5:   85 c7         test   %eax,%edi
Code;  c0115c87 <__wake_up+27/60>
   7:   75 17         jne    20 <_EIP+0x20>
Code;  c0115c89 <__wake_up+29/60>
   9:   8b 16         mov    (%esi),%edx
Code;  c0115c8b <__wake_up+2b/60>
   b:   39 f3         cmp    %esi,%ebx
Code;  c0115c8d <__wake_up+2d/60>
   d:   75 f1         jne    0 <_EIP>
Code;  c0115c8f <__wake_up+2f/60>
   f:   ff 75 f0      pushl  0xfffffff0(%ebp)
Code;  c0115c92 <__wake_up+32/60>
  12:   9d            popf
Code;  c0115c93 <__wake_up+33/60>
  13:   8d 00         lea    (%eax),%eax

<0>Kernel panic: Aiee, killing interrupt handler! | ||
Comments: |
By: Andrew Kohlsmith (akohlsmith) 2003-12-16 12:16:13.000-0600
I forgot to mention: zaptel is CVS from Dec 11. Same for libpri and asterisk itself.
new-intra*CLI> show version
Asterisk CVS-11/05/03-02:13:03 built by andrew@new-intra on a i686 running Linux

By: zoa (zoa) 2003-12-16 12:27:03.000-0600
i've had it twice on 2 different servers in a period of 4 months, one with TE410p, the other server with an X100p. i use kernel 2.4.20 on both machines.

By: Brian West (bkw918) 2004-01-07 00:11:23.000-0600
Any other input on this?

By: Andrew Kohlsmith (akohlsmith) 2004-01-07 08:00:54.000-0600
Just a datapoint -- I have had this happen about once or twice a week since I originally reported it. Call traces are always the same. It's almost as if it'll die when it's been doing HDD activity and gets a zaptel interrupt. The IDE controller and the T100P are not on the same interrupt (the T100P is on its own entirely, as is the IDE controller). I hope to test this theory soon by creating metric buttloads of IDE activity and seeing if the crash occurs more often.

By: zoa (zoa) 2004-01-07 08:08:25.000-0600
i have no ide stuff in my server... happens anyway, so i don't think that can be the problem.

By: Brian West (bkw918) 2004-01-10 17:57:46.000-0600
Make sure you do not have MMX turned on when you compile zaptel and report those findings.

By: zoa (zoa) 2004-01-10 19:30:10.000-0600
there is no such thing as mmx in the zaptel Makefile

By: zoa (zoa) 2004-01-10 19:38:27.000-0600
so you mean zconfig.h because he is using duron cpu

By: Andrew Kohlsmith (akohlsmith) 2004-01-12 05:36:48.000-0600
I *do* have that set.

# Define if you want MMX optimizations in zaptel
#
KFLAGS+=-DCONFIG_ZAPTEL_MMX

however I was under the impression that my CPU knew what MMX was:

# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 8
model name : AMD Athlon(tm) XP 2000+
stepping : 1
cpu MHz : 1673.812
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow
bogomips : 3342.33

(you'll note the 'mmx' in the flags)

I will remove that though to see if it makes a difference.

By: jrollyson (jrollyson) 2004-01-14 03:14:33.000-0600
Did that define fix the issue?

By: James Golovich (jamesgolovich) 2004-01-15 11:00:00.000-0600
It's pretty much a rule (unwritten afaik) that if you have an athlon enabled kernel you should not enable MMX support in zaptel.

By: Malcolm Davenport (mdavenport) 2004-01-15 11:19:20.000-0600
In my experience, the problem seemed to only arise on Athlon chips that included SSE optimizations, i.e XP, and latter MPs, in the CPU and in the Kernel. I have an old Thunderbird-core Athlon Socket A, no SSE, that works just fine with MMX enabled in Zaptel and the Kernel compiled for Athlon cpu type. Not a hard rule, just sharing what I've seen.

By: James Golovich (jamesgolovich) 2004-01-15 12:48:10.000-0600
Does anyone know what the real cause of this is, or a way to test for it? If there was then it would be simple to put together a program that tests for the best build options (or allow them to be specified manually in case of cross-compiling) or perhaps just some general hard rules in the zapconf.h or wherever these things are defined now. Like checking if CONFIG_MK7 then don't set MMX

By: Mark Spencer (markster) 2004-02-06 22:25:32.000-0600
I did at least document the incompatibility but I don't know how to try to make it work better.
By: Digium Subversion (svnbot) 2008-06-07 10:33:47
Repository: dahdi
Revision: 310

U   trunk/zconfig.h

------------------------------------------------------------------------
r310 | markster | 2008-06-07 10:33:46 -0500 (Sat, 07 Jun 2008) | 2 lines

Warn of AMD incompatibility with MMX in zconfig.h (bug ASTERISK-670)
------------------------------------------------------------------------

http://svn.digium.com/view/dahdi?view=rev&revision=310 |
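James Golovich's comment in the thread floats the idea of a small program that picks safe build options, for example "if CONFIG_MK7 then don't set MMX". The following is only a rough sketch of what such a check might look like, not anything that shipped with zaptel. The function names are hypothetical, and the rule it applies (skip MMX whenever the CPU vendor is AMD or the kernel config contains `CONFIG_MK7=y`) is a deliberately conservative assumption drawn from the comments in this thread; the root cause was never identified here.

```python
# Hypothetical helper, not part of zaptel: decide whether it looks safe to
# build with -DCONFIG_ZAPTEL_MMX, based on the rule of thumb in this thread.

def parse_cpu_flags(cpuinfo_text):
    """Return (vendor_id, set_of_flags) for the first CPU in /proc/cpuinfo text."""
    vendor = None
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without a "key : value" pair
        key = key.strip()
        if key == "vendor_id" and vendor is None:
            vendor = value.strip()
        elif key == "flags" and not flags:
            flags = set(value.split())
    return vendor, flags


def kernel_built_for_k7(kernel_config_text):
    """True if the kernel .config text selects the Athlon/Duron (K7) CPU type."""
    return any(line.strip() == "CONFIG_MK7=y"
               for line in kernel_config_text.splitlines())


def recommend_zaptel_mmx(cpuinfo_text, kernel_config_text):
    """Conservative guess: enable MMX only on non-AMD CPUs with non-K7 kernels."""
    vendor, flags = parse_cpu_flags(cpuinfo_text)
    if "mmx" not in flags:
        return False  # CPU does not advertise MMX at all
    if vendor == "AuthenticAMD":
        return False  # assumption: treat every AMD CPU as affected
    if kernel_built_for_k7(kernel_config_text):
        return False  # "athlon enabled kernel" rule from the comments
    return True


if __name__ == "__main__":
    # Reads the live system; the .config path is an assumption and may differ.
    with open("/proc/cpuinfo") as f:
        cpuinfo = f.read()
    try:
        with open("/usr/src/linux/.config") as f:
            config = f.read()
    except OSError:
        config = ""
    print("enable CONFIG_ZAPTEL_MMX:", recommend_zaptel_mmx(cpuinfo, config))
```

Malcolm Davenport's observation suggests the rule could be narrowed to Athlons that also report `sse`, but since that was described as "not a hard rule", the sketch errs on the side of leaving MMX off.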