Summary:ASTERISK-01180: pulse dial endless loop deadlock
Reporter:Matt Florell (mflorell)
Labels:
Date Opened:2004-03-09 14:07:27.000-0600
Date Closed:2004-09-25 02:54:38
Versions:Frequency of
Description:I started getting this message repeating millions of times until my machine eventually froze:

Mar  9 13:28:35 DEBUG[64274450]: Got event Event -1(-1) on channel 2 (index 0)
Mar  9 13:28:35 DEBUG[64274450]: Pulse dial 'ÿ'
Mar  9 13:28:35 DEBUG[64274450]: Short read (0/160), must be an event...
Mar  9 13:28:35 DEBUG[64274450]: Exception on 15, channel 2

A gdb backtrace (bt) didn't show anything, and everything was deadlocked. My messages log file stopped at over 2GB, most of it being the lines above repeated over and over.

This is on a machine with a TE410P and 3 T1s installed, running Red Hat 9.0 with Asterisk CVS 2004-03-02.

Any idea what causes a message like this?
Comments:
By: James Golovich (jamesgolovich) 2004-03-10 03:07:49.000-0600

Seems like we need to check the return value of zt_get_event to make sure it's not -1 in this case.  But there might be some underlying issue here that caused this.

Were you actually sending pulse digits?  Is this something you can reproduce easily?

By: Matt Florell (mflorell) 2004-03-10 10:14:25.000-0600

This machine has been up and running for about a month and I've never seen an error like this on it before. We have no pulse devices hooked up to it, and I have no idea how to attempt to duplicate this error.

What exactly causes the debug messages that I posted?

By: James Golovich (jamesgolovich) 2004-03-10 12:26:23.000-0600

I'm not sure what the ultimate cause is.  But that little piece of code that prints the debug messages you saw gets called when zt_get_event returns -1, and zt_get_event returns -1 when the ZT_GET_EVENT ioctl fails.  So I suspect this is just a side effect of some kind of serious failure.
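
To illustrate the point about checking the return value, here is a minimal sketch, not the actual chan_zap or zaptel code. The names get_event, drain_events, event_ioctl_fn, failing_ioctl and idle_ioctl are all hypothetical, and the driver call is replaced by a stub so the example compiles without the zaptel headers. It assumes the wrapper returns -1 only when the underlying ioctl fails, as described above, and that 0 means "no event pending".

```c
#include <stdio.h>

/* Hypothetical stand-in for the driver call: returns 0 and fills *event on
 * success, or -1 on failure, the way an ioctl() would. */
typedef int (*event_ioctl_fn)(int fd, int *event);

/* Sketch of a zt_get_event-style wrapper: a return of -1 means "the ioctl
 * failed", not "event number -1". */
int get_event(int fd, event_ioctl_fn do_ioctl)
{
    int event = 0;

    if (do_ioctl(fd, &event) == -1)
        return -1;
    return event;
}

/* Handle pending events, but stop at the first failure instead of treating
 * -1 as an event and looping forever. Returns the number of events handled,
 * or -1 if the event call failed. */
int drain_events(int fd, event_ioctl_fn do_ioctl, int max_events)
{
    int handled = 0;

    while (handled < max_events) {
        int ev = get_event(fd, do_ioctl);

        if (ev == -1) {
            fprintf(stderr, "event ioctl failed on fd %d, giving up\n", fd);
            return -1;
        }
        if (ev == 0)    /* assumption for this sketch: 0 = nothing pending */
            break;
        handled++;
    }
    return handled;
}

/* Stub that always fails, simulating a broken descriptor. */
int failing_ioctl(int fd, int *event)
{
    (void)fd;
    (void)event;
    return -1;
}

/* Stub that succeeds and reports that no event is pending. */
int idle_ioctl(int fd, int *event)
{
    (void)fd;
    *event = 0;
    return 0;
}
```

With the failing stub, drain_events(3, failing_ioctl, 10) returns -1 after a single attempt, rather than logging the same failure over and over.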

By: Mark Spencer (markster) 2004-03-11 02:28:46.000-0600

What is the channel configured as?  FXOKS?

By: Matt Florell (mflorell) 2004-03-11 07:36:51.000-0600

The channel is the second channel of a T1(B8ZS ESF) going into a TE410P

By: Mark Spencer (markster) 2004-03-14 16:15:23.000-0600

What's at the other end, I mean, and what signalling type is used on the channels of the span?

By: Matt Florell (mflorell) 2004-03-15 07:59:16.000-0600

This is a straight telco Wink E&M full T1, and we've never had any problems with it before. Could this be caused by noise on the circuit? We've had that problem with other T1s here in the building.

By: Mark Spencer (markster) 2004-03-15 11:31:05.000-0600

Is it possible that you ran ztcfg while Asterisk was still running?  There is no way for zt_get_event to return -1 unless something really bizarre has happened.  Even the zaptel code doesn't permit it.

By: Matt Florell (mflorell) 2004-03-15 11:52:37.000-0600

Nope, I wasn't running ztcfg; it just happened out of the blue. I hadn't run anything involving Asterisk from the command line on that machine within a week.

Is there any other way for this message to have been generated?
Mar 9 13:28:35 DEBUG[64274450]: Got event Event -1(-1) on channel 2 (index 0)

By: James Golovich (jamesgolovich) 2004-03-15 12:07:29.000-0600

zt_get_event will return -1 if the ioctl command returns -1, which shouldn't happen, but who knows.

By: Mark Spencer (markster) 2004-03-16 12:20:09.000-0600

But the ioctl can't return -1 from the zaptel code, meaning the only way the -1 can occur is if it's on a bad fd.  I'm going to resolve this one as user error (e.g. running ztcfg) unless it can be duplicated.
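
The bad-fd case can be seen in isolation with a small sketch. This is an illustration only: it uses the generic FIONREAD request on a closed pipe descriptor as a stand-in, since ZT_GET_EVENT needs the zaptel headers and hardware, and the function name ioctl_errno_on_closed_fd is hypothetical. It shows that the kernel rejects an ioctl on a descriptor that is no longer open with -1 and errno set to EBADF, before any driver code runs.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Issue an ioctl on a descriptor that has already been closed and report
 * the resulting errno. Returns 0 if the ioctl unexpectedly succeeds, or -1
 * if the pipe could not be created in the first place. */
int ioctl_errno_on_closed_fd(void)
{
    int fds[2];
    int pending = 0;

    if (pipe(fds) == -1)
        return -1;

    /* Close both ends so fds[0] now refers to a stale descriptor number. */
    close(fds[0]);
    close(fds[1]);

    errno = 0;
    if (ioctl(fds[0], FIONREAD, &pending) == -1)
        return errno;   /* EBADF is expected for a closed descriptor */
    return 0;
}
```

A caller that only looks at the -1 and not at errno cannot tell this situation apart from any other failure, which is consistent with the log lines reported above.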