|Summary:||ASTERISK-00670: zaptel driver hard-locks kernel|
|Reporter:||Andrew Kohlsmith (akohlsmith)||Labels:|
|Date Opened:||2003-12-16 12:14:21.000-0600||Date Closed:||2008-06-07 10:33:47|
|Description:||This is the second time this has happened in a week. I have a Duron1300 in an ECS L7VMM2 motherboard. The only addon card in it is the T100P and it's connected to a Carrier Access Channel Bank 1 (12FXS/12FXO). |
The *only* unusual thing I can say about this system is that I do run UML on it for our intranet (the old intranet box died and we're slowly moving services over, running it in user-mode linux was the quickest fix).
Anyway I get a kernel panic now and again and ksymoops points to zaptel. Both crashes have _identical_ call traces, although the processor registers have been different.
When I reset the box and it boots up, my channel bank is in a very funky state -- channels are either very choppy and jittery or they are all unavailable (CB itself shows all idle, but * can't place any calls -- fast busies) -- I power-cycle the CB1 and everything's normal again.
The call trace is as follows:
>>EIP; c0115c80 <__wake_up+20/60> <=====
>>esi; c2158c50 <_end+1d7faec/1fc3aefc>
>>ebp; c88a7a90 <_end+84ce92c/1fc3aefc>
>>esp; c88a7a7c <_end+84ce918/1fc3aefc>
Trace; e00548d8 <[zaptel]process_timers+38/50>
Trace; e004fe2c <[zaptel]zt_receive+6c/f10>
Trace; e007d820 <[wct1xxp]t1xxp_receiveprep+190/360>
Trace; e007c942 <[wct1xxp]t1xxp_interrupt+1b2/1f0>
Trace; c013aa40 <__block_prepare_write+1d0/330>
Trace; c011a12d <qm_refs+13d/190>
Trace; c010a2a8 <do_IRQ+68/b0>
Trace; c010ca48 <call_do_IRQ+5/d>
Trace; c012c8c2 <do_generic_file_write+252/3e0>
Trace; c012cd53 <generic_file_write+103/120>
Trace; c01610b2 <ext3_file_write+22/c0>
Code; c0115c80 <__wake_up+20/60>
Code; c0115c80 <__wake_up+20/60> <=====
0: 8b 53 fc mov 0xfffffffc(%ebx),%edx <=====
Code; c0115c83 <__wake_up+23/60>
3: 8b 02 mov (%edx),%eax
Code; c0115c85 <__wake_up+25/60>
5: 85 c7 test %eax,%edi
Code; c0115c87 <__wake_up+27/60>
7: 75 17 jne 20 <_EIP+0x20>
Code; c0115c89 <__wake_up+29/60>
9: 8b 16 mov (%esi),%edx
Code; c0115c8b <__wake_up+2b/60>
b: 39 f3 cmp %esi,%ebx
Code; c0115c8d <__wake_up+2d/60>
d: 75 f1 jne 0 <_EIP>
Code; c0115c8f <__wake_up+2f/60>
f: ff 75 f0 pushl 0xfffffff0(%ebp)
Code; c0115c92 <__wake_up+32/60>
12: 9d popf
Code; c0115c93 <__wake_up+33/60>
13: 8d 00 lea (%eax),%eax
<0>Kernel panic: Aiee, killing interrupt handler!
|Comments:||By: Andrew Kohlsmith (akohlsmith) 2003-12-16 12:16:13.000-0600|
I forgot to mention: zaptel is CVS from Dec 11. Same for libpri and asterisk itself.
new-intra*CLI> show version
Asterisk CVS-11/05/03-02:13:03 built by andrew@new-intra on a i686 running Linux
By: zoa (zoa) 2003-12-16 12:27:03.000-0600
I've had it twice on two different servers over a period of four months: one with a TE410P, the other with an X100P.
I use kernel 2.4.20 on both machines.
By: Brian West (bkw918) 2004-01-07 00:11:23.000-0600
Any other input on this?
By: Andrew Kohlsmith (akohlsmith) 2004-01-07 08:00:54.000-0600
Just a datapoint -- I have had this happen about once or twice a week since I originally reported it. Call traces are always the same. It's almost as if it'll die when it's been doing HDD activity and gets a zaptel interrupt. The IDE controller and the T100P are not on the same interrupt (the T100P is on its own entirely, as is the IDE controller).
I hope to test this theory soon by creating metric buttloads of IDE activity and seeing if the crash occurs more often.
By: zoa (zoa) 2004-01-07 08:08:25.000-0600
I have no IDE devices in my server and it happens anyway, so I don't think that can be the problem.
By: Brian West (bkw918) 2004-01-10 17:57:46.000-0600
Make sure you do not have MMX turned on when you compile zaptel, and report your findings.
By: zoa (zoa) 2004-01-10 19:30:10.000-0600
There is no such thing as MMX in the zaptel Makefile.
By: zoa (zoa) 2004-01-10 19:38:27.000-0600
So you mean zconfig.h, because he is using a Duron CPU?
By: Andrew Kohlsmith (akohlsmith) 2004-01-12 05:36:48.000-0600
I *do* have that set.
# Define if you want MMX optimizations in zaptel
however I was under the impression that my CPU knew what MMX was:
# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 8
model name : AMD Athlon(tm) XP 2000+
stepping : 1
cpu MHz : 1673.812
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow
bogomips : 3342.33
(you'll note the 'mmx' in the flags)
I will remove that though to see if it makes a difference.
By: jrollyson (jrollyson) 2004-01-14 03:14:33.000-0600
Did that define fix the issue?
By: James Golovich (jamesgolovich) 2004-01-15 11:00:00.000-0600
It's pretty much a rule (unwritten afaik) that if you have an Athlon-enabled kernel you should not enable MMX support in zaptel.
By: Malcolm Davenport (mdavenport) 2004-01-15 11:19:20.000-0600
In my experience, the problem seemed to arise only on Athlon chips that included SSE optimizations, i.e., XP and later MPs, in the CPU and in the kernel. I have an old Thunderbird-core Athlon (Socket A, no SSE) that works just fine with MMX enabled in zaptel and the kernel compiled for the Athlon CPU type. Not a hard rule, just sharing what I've seen.
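The observation above (AMD CPU plus SSE seems to be the risky combination) could be turned into a small check against the `vendor_id` and `flags` lines of /proc/cpuinfo. This is only a sketch of that anecdotal heuristic, not a confirmed diagnosis: the function names `has_cpu_flag` and `zaptel_mmx_looks_risky` are hypothetical and do not exist in zaptel, and the caller is assumed to have already read the two strings out of /proc/cpuinfo.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if `flag` appears as a whole whitespace-separated token in
 * `flags`, so that "mmx" does not match inside "mmxext". */
bool has_cpu_flag(const char *flags, const char *flag)
{
    assert(flags != NULL && flag != NULL);

    size_t len = strlen(flag);
    const char *p = flags;

    if (len == 0)
        return false;

    while ((p = strstr(p, flag)) != NULL) {
        bool starts_token = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        bool ends_token = p[len] == '\0' || p[len] == ' ' ||
                          p[len] == '\t' || p[len] == '\n';
        if (starts_token && ends_token)
            return true;
        p += 1; /* keep scanning after a partial match */
    }
    return false;
}

/* Hypothetical helper encoding the rule of thumb from this thread only:
 * an AMD CPU that advertises SSE is treated as risky for zaptel MMX. */
bool zaptel_mmx_looks_risky(const char *vendor_id, const char *flags)
{
    return strcmp(vendor_id, "AuthenticAMD") == 0 &&
           has_cpu_flag(flags, "sse");
}
```

With the `flags` line posted earlier in this report, `zaptel_mmx_looks_risky("AuthenticAMD", flags)` would return true, while a flags line without `sse` would return false. Matching whole tokens matters here because `mmx` is a prefix of `mmxext`.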
By: James Golovich (jamesgolovich) 2004-01-15 12:48:10.000-0600
Does anyone know what the real cause of this is, or a way to test for it? If there were, it would be simple to put together a program that tests for the best build options (or allows them to be specified manually in case of cross-compiling).
Or perhaps just some general hard rules in zapconf.h, or wherever these things are defined now, such as checking for CONFIG_MK7 and not setting MMX if it is defined.
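The "hard rule" suggested above might look roughly like the following preprocessor guard. This is a sketch under assumptions, not the fix that was committed: `CONFIG_MK7` is the kernel option named in the comment, `CONFIG_ZAPTEL_MMX` is assumed to be the name of the zaptel MMX switch, and `ZAPTEL_MMX_DISABLED_FOR_K7` and `zaptel_mmx_enabled()` are hypothetical names added for illustration. Both config macros are defined locally so the example compiles on its own; in a real build they would come from the kernel configuration and zconfig.h.

```c
#include <stdbool.h>

/* Stand-ins for this sketch only. */
#define CONFIG_MK7 1        /* kernel built for the Athlon/Duron CPU type */
#define CONFIG_ZAPTEL_MMX 1 /* assumed name of the zaptel MMX switch */

/* The guard itself: if the kernel targets K7, drop the MMX option and
 * leave a marker so other code can tell why it was turned off. */
#if defined(CONFIG_MK7) && defined(CONFIG_ZAPTEL_MMX)
#undef CONFIG_ZAPTEL_MMX
#define ZAPTEL_MMX_DISABLED_FOR_K7 1 /* hypothetical marker macro */
#endif

/* Hypothetical helper reporting the option that survived the guard. */
bool zaptel_mmx_enabled(void)
{
#ifdef CONFIG_ZAPTEL_MMX
    return true;
#else
    return false;
#endif
}
```

A guard like this only keys off the kernel's CPU type, so it would also disable MMX on K7 parts without SSE, which reportedly work fine. It trades some performance on those machines for a simpler rule.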
By: Mark Spencer (markster) 2004-02-06 22:25:32.000-0600
I did at least document the incompatibility but I don't know how to try to make it work better.
By: Digium Subversion (svnbot) 2008-06-07 10:33:47
r310 | markster | 2008-06-07 10:33:46 -0500 (Sat, 07 Jun 2008) | 2 lines
Warn of AMD incompatibility with MMX in zconfig.h (bug ASTERISK-670)