ASTERISK-03334: Crash on an unknown situation related to caller-id

Summary: ASTERISK-03334: Crash on an unknown situation related to caller-id

Reporter: paradise (paradise) Labels:

Date Opened: 2005-01-23 01:23:23.000-0600 Date Closed: 2011-06-07 14:00:54

Priority: Critical Regression? No

Status: Closed/Complete Components: Core/General

Versions: Frequency of Occurrence:

Related Issues:

Environment: Attachments: ( 0) backtrace.txt
( 1) messages

Description: It seems this problem is related to caller-ids received from PSTN lines connected to Zap channels.

****** ADDITIONAL INFORMATION ******

I've attached both the bt and bt full output from the core dump file.

Comments: By: twisted (twisted) 2005-01-23 02:00:00.000-0600

Can you recreate this? Can you provide the steps necessary to recreate this?
By: paradise (paradise) 2005-01-23 02:53:10.000-0600

No, because the caller-id is received from the PSTN (callerid=asreceived) and this problem occurs completely at random. So far I have had 3 crashes related to this issue during the last 3 weeks.
Maybe the caller-id information received from the PSTN is sometimes malformed and not standard as expected, which causes * to crash.

edited on: 01-23-05 02:54
By: Mark Spencer (markster) 2005-01-23 03:03:35.000-0600

Do you have any strange messages in your logs regarding callerid?
By: paradise (paradise) 2005-01-23 03:08:19.000-0600

just these logs before the crash:

Jan 23 09:07:54 NOTICE[2438]: Got event 2 (Ring/Answered)...
Jan 23 09:07:55 ERROR[2438]: fsk_serie made mylen < 0 (-18)
Jan 23 09:07:55 WARNING[2438]: CallerID feed failed: Success
Jan 23 09:07:55 WARNING[2438]: CallerID returned with error on channel 'Zap/20-1'
Jan 23 09:07:55 NOTICE[2438]: Got event 2 (Ring/Answered)...
Jan 23 09:07:55 NOTICE[2438]: Got event 2 (Ring/Answered)...
Jan 23 09:07:55 WARNING[2438]: Ring/Off-hook in strange state 6 on channel 20
Jan 23 09:07:55 WARNING[2438]: CallerID returned with error on channel 'Zap/3-1'
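
For anyone triaging similar logs, here is a minimal Python sketch that counts the "CallerID returned with error" warnings per channel. It is not part of Asterisk. The regular expression and the function name `count_callerid_errors` are assumptions based only on the lines quoted above, so adjust them if your log format differs.

```python
import re
from collections import Counter

# Assumed line shape, inferred only from the excerpt above, e.g.:
#   Jan 23 09:07:55 WARNING[2438]: CallerID returned with error on channel 'Zap/20-1'
CALLERID_ERROR = re.compile(
    r"WARNING\[\d+\]: CallerID returned with error on channel '([^']+)'"
)


def count_callerid_errors(lines):
    """Return a Counter mapping channel name to number of CallerID errors."""
    counts = Counter()
    for line in lines:
        match = CALLERID_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample lines copied from the log excerpt above.
sample = [
    "Jan 23 09:07:55 ERROR[2438]: fsk_serie made mylen < 0 (-18)",
    "Jan 23 09:07:55 WARNING[2438]: CallerID feed failed: Success",
    "Jan 23 09:07:55 WARNING[2438]: CallerID returned with error on channel 'Zap/20-1'",
    "Jan 23 09:07:55 WARNING[2438]: CallerID returned with error on channel 'Zap/3-1'",
]

for channel, count in count_callerid_errors(sample).items():
    print(channel, count)
```

Running the same function over a full messages file (one line per list element) would show whether the errors cluster on particular channels.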

edited on: 01-25-05 12:49
By: paradise (paradise) 2005-01-24 15:34:20.000-0600

Is there any other information I should provide regarding this bug?
By: paradise (paradise) 2005-02-03 08:46:53.000-0600

Isn't this considered a bug?
By: Olle Johansson (oej) 2005-02-13 13:17:15.000-0600

It is considered a bug, but we need more information on why this happens in order to be able to fix it. You need to run your asterisk with a high level of debugging information and catch a crash, then upload the log files here.

If you need help with the debugging and gdb core analysis, find developers in the #asterisk-bugs or #asterisk-dev channel on IRC and you will get their attention.
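
As a rough illustration of the gdb step, the sketch below builds a non-interactive gdb command line that prints both `bt` and `bt full` from a core file. The helper name `gdb_backtrace_command` and the paths are hypothetical. The `--batch` and `-ex` options exist in current gdb releases, but check that your installed version supports them.

```python
import shlex


def gdb_backtrace_command(binary, core):
    """Build a gdb invocation that prints 'bt' and 'bt full', then exits."""
    return ["gdb", "--batch", "-ex", "bt", "-ex", "bt full", binary, core]


# Hypothetical paths: point these at your own asterisk binary and core file.
cmd = gdb_backtrace_command("/usr/sbin/asterisk", "core")

# Print a shell-quoted form that can be pasted into a terminal.
print(shlex.join(cmd))
```

Redirecting the output of that command to a file gives something that can be attached to the bug as a backtrace.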

/Housekeeping :-)
By: Mark Spencer (markster) 2005-02-28 00:32:58.000-0600

We need to figure out how fsk_serie is getting < 0. That means there's probably a logic error in one of the state machines allowing it to eat up too many characters.
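
To make the suspected failure mode concrete, here is a toy Python sketch of a "remaining length" counter that goes negative when a step consumes more input than is left. This is not the actual fsk_serie code. The function `consume` and all the numbers are made up for illustration, with the sizes picked so that the unguarded case lands on -18 like the log message above.

```python
def consume(remaining, step_sizes, guarded):
    """Subtract each step's sample count from `remaining`.

    With guarded=False, a step may consume more than is left, which drives
    the counter negative. With guarded=True, processing stops before any
    step that would overrun the buffer.
    """
    for size in step_sizes:
        if guarded and size > remaining:
            break  # not enough samples left; stop and wait for more input
        remaining -= size
    return remaining


# Hypothetical buffer of 160 samples and per-step consumption sizes.
steps = [50, 50, 50, 28]
unguarded = consume(160, steps, guarded=False)
guarded = consume(160, steps, guarded=True)
print(unguarded, guarded)
```

In a real decoder the fix would be wherever a state consumes input without first checking how much remains, which is the kind of logic error described above.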
By: twisted (twisted) 2005-03-08 15:28:46.000-0600

How does bt.txt relate to this? I'm guessing laserfox meant to post that to the other 'unknown situation' bug?
By: Christopher L. Wade (clwade) 2005-03-08 15:47:30.000-0600

I'm not meaning to 'me too' this but I just have to add the following...

Regarding Mark's comment on 2-28 about 'how fsk_serie is getting < 0': I get this regularly right now. It happens when one of two things has been done to the lines * is plugged into. First, when a device like a 'line-saver' (think multiple fax machines on the same line) has been used. Second, when my * server is sitting in parallel with our legacy pbx. While I cannot see how the first could affect fsk_serie[s] getting the CallerID blip, I do have an idea about how the second affects the function.

Whenever my * server and my legacy pbx are both in parallel on the same lines, if the call is answered by the legacy pbx before the CallerID blip, fsk_serie[s] will return < 0 ~99.999% of the time. If the call is answered by the legacy pbx during the CallerID blip, I get only the number portion of the blip, but no name about 75% of the time, nothing the other 25%. If the call is not answered until after the blip, * picks up the name and number just fine.

If it would help, I would be happy to attach any and all logs related to the issue. I will state that this has NEVER crashed my * server, unlike what the original reporter is stating. Finally, I understand this may be totally unrelated, I just wanted to throw a little extra information out there for the gurus to chew over.
By: Clod Patry (junky) 2005-03-08 22:34:40.000-0600

clwade: if you have any information that you suspect could help us, go ahead and attach it.
Specify anything related to version, hardware and configs too.
By: Brian West (bkw918) 2005-03-17 20:55:05.000-0600

If this is still a problem please find someone on #asterisk-bugs to reopen.

/b
By: Mark Spencer (markster) 2005-03-17 23:35:54.000-0600

I have no reason to suspect this has been fixed, and until it is, closing it won't help.
By: Clod Patry (junky) 2005-03-28 03:15:00.000-0600

paradise and clwade:
Can you give us more information about how to reproduce it?

paradise: which Digium card do you have, btw?

And if you can try it with a newer HEAD, that would be appreciated, because 01/23/05 is now rather old.

I'm deleting bt.txt, because I don't think it's appropriate here.

Thanks.

edited on: 03-28-05 03:16
By: Christopher L. Wade (clwade) 2005-03-28 08:57:42.000-0600

For me, there really is nothing special to reproduce this.

Simply plug in a separate phone system in parallel to your * box and answer the phone on the other pbx before the callerid blurb comes over the wire. Obviously this IS NOT standard procedure for using *, but my boss still requires our old system to be present on the wire... go figure :(

The other way to reproduce this is to plug a 'line-saver' into the line before * gets it; this will ALWAYS produce the 'fsk_serie made mylen <0' as per paradise's notes. Simply call the line plugged into the box and you'll get these messages EVERY time. In fact, Digium tech support -- and all the Matts at Digium -- troubleshot this for about two weeks before I realized what the 'line-saver' was doing. (Line-savers are those rj12 couplers that 'prevent' two devices from picking up the same line at the same time -- more accurately, they prevent a phone plugged into the same line as a fax/modem from picking up while the fax/modem is sending/receiving tones.)

As for debug information, I'll try and get that posted tonight. Regardless, for me this issue never results in a crash, in fact CVS-HEAD has never crashed on me except when I'm developing an app or doing something stupid through the manager.

PS, right now I always run head - cannot live without the 'n' priority!

edited on: 03-28-05 08:59
By: pupfuzz (pupfuzz) 2005-03-30 23:45:26.000-0600

In the case where one does not subscribe to callerID services, are 'fsk_serie made mylen <0' messages normal? I had assumed so until reading Mark's 2/28 comment. I see these all the time (remember, I do not subscribe to callerID). Asterisk CVS-HEAD-03/30/05-00:05:12 with TDM400P REV E/F 4 port FXO. Asterisk box connected directly to telephone company demarcation terminal block with no other equipment sharing the line.
By: Michael Jerris (mikej) 2005-05-14 22:08:31

paradise, clwade: we need debug and core info. Please post it or contact a bug marshal in #asterisk-bugs on IRC.
By: Christopher L. Wade (clwade) 2005-05-16 09:22:52

MikeJ, as I've stated previously, this NEVER crashes my box. I just get some of the same messages as paradise. I'll see about catching some output from my box later today and posting it. Please Hold....
By: Olle Johansson (oej) 2005-06-05 17:26:15

Any updates? We need to find this irritating bug.

/Housekeeping
By: Michael Jerris (mikej) 2005-06-19 09:05:27

paradise, we are awaiting your backtrace on this. If you need assistance, find someone on IRC in #asterisk-bugs.
By: Arcadiy Ivanov (arcivanov) 2005-07-01 11:39:21

I've been experiencing problems with caller ID since 1.0.8 (that's why I stayed with 1.0.7 for a long time). The setup is as follows: FXO demarc + IAX2 would be routed to FXS for incoming. When a call came in on FXO, the dialplan would invoke Zapateller (answer|nocallerid) and then dial the FXS. If the CallerID problem happened for whatever reason, the FXS would ring until taken off hook, but would then produce only a dialtone. The caller on FXO would never hear the line being picked up, would not be disconnected, and would essentially wait until they decided to hang up. I've attached my asterisk messages log.
In 1.0.9 I'm not so sure that the behavior still persists. While the logs indicate the problem still exists, I have not personally observed calls failing to connect.

Hardware - I use TDM400P with a single FXS + Generic FXO.

By: Kevin P. Fleming (kpfleming) 2005-07-05 21:56:57

OK, I see multiple things going on here: there are messages appearing in the logs, without a crash, when the CallerID feed is interrupted/lost and the channel is configured to expect it. That's normal, as Asterisk has been told to watch for it.

If someone else is having this issue and it's causing a crash, we'll need a backtrace from the crashing system. Since the original poster has not responded in quite some time, we'll have to close this until we have a participant who can reproduce the crash and provide a workable backtrace... otherwise this bug will just sit here and accomplish nothing.
By: Arcadiy Ivanov (arcivanov) 2005-07-11 17:38:34

After testing 1.0.9 for some time now I have found that, although it is rare, when "Caller*ID failed checksum" or "CallerID feed failed: Success" is followed by "CallerID returned with error on channel 'Zap#-#'", the call MIGHT NOT connect. This problem occurs erratically but ONLY when the messages above appear in the logs. This is not a crash, but definitely a critical problem.
By: Michael Jerris (mikej) 2005-07-22 23:46:57

arcivanov: That sounds like the connection gets interrupted while it is receiving the callerid. This could be the same problem causing the call not to connect. This is not the same as this bug. Please open a new bug report on that issue with full debug of the call that does this, etc., as required in the bug guidelines. Also, please test on head and see if you can reproduce it there before posting. Thanks.
By: Michael Jerris (mikej) 2005-07-22 23:48:37

I am suspending this bug due to lack of response. If you are able to produce an updated backtrace of this bug, please reopen. Thanks!