Summary: ASTERISK-01546: [CAPI] Asterisk Crashes on Incoming Call
Reporter: farnwomt (farnwomt)
Labels:
Date Opened: 2004-05-06 08:36:15
Date Closed: 2011-06-07 14:05:19
Priority: Major
Regression?: No
Status: Closed/Complete
Components: Core/General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
Description:

From time to time (roughly once a day) an incoming call crashes the Asterisk daemon.  It always appears to happen on an incoming CAPI call, but the segmentation fault occurs in ast_queue_frame.  The channel does not appear to have pvt set.

Here is the 'bt full' output from one of the crashes:

#0  0x08057fb5 in ast_queue_frame (chan=0x82412d0, fin=0x43938618) at channel.c:377
       f = (struct ast_frame *) 0x8159078
       prev = (struct ast_frame *) 0x0
       cur = (struct ast_frame *) 0x82412d0
       blah = 1
       qlen = 0
#1  0x0808da14 in ast_dsp_process (chan=0x82412d0, dsp=0x8252188, af=0x0) at dsp.c:1511
       silence = 136581840
       res = 0
       x = 0
       shortdata = (short unsigned int *) 0x43937d90
       odata = (
   unsigned char *) 0x43938258 "ÔPZCOOIutttvvvvttuIIOICFFFZZYY]]]P]", 'T' <repeats 40 times>, "ÔTTTTÔTÔTÔÔTÔTÔTTTTTTÔÔTÔÔÔÔÔTTÔÔÔTTTÔTTÔTTÔTTTÔTÔÔÔTTTTTTTTTÔÔÔÔTÔTÔÔÔÔÔTÔÔTÔÔÔÔTÔTÔ"
       len = 136581840
       writeback = 0
#2  0x418c30ba in pipe_frame (p=0x80eefa8, f=0x82412d0) at chan_capi.c:1187
       wfds = {fds_bits = {8388608, 0 <repeats 31 times>}}
       written = 136581840
       tv = {tv_sec = 0, tv_usec = 10}
#3  0x418c33ff in pipe_msg (PLCI=769, CMSG=0x44) at chan_capi.c:1847
       p = (struct capi_pipe *) 0x82412d0
       CMSG2 = {ApplId = 1, Command = 134 '\206', Subcommand = 131 '\203', Messagenumber = 40785, adr = {adrController = 721665, adrPLCI = 721665,
   adrNCCI = 721665}, AdditionalInfo = CAPI_COMPOSE, B1configuration = 0x0, B1protocol = 0, B2configuration = 0x0, B2protocol = 0, B3configuration = 0x0,
 B3protocol = 0, BC = 0x0, BChannelinformation = 0x0, BProtocol = CAPI_COMPOSE, CalledPartyNumber = 0x0, CalledPartySubaddress = 0x0, CallingPartyNumber = 0x0,
 CallingPartySubaddress = 0x0, CIPmask = 0, CIPmask2 = 0, CIPValue = 0, Class = 0, ConnectedNumber = 0x0, ConnectedSubaddress = 0x0, Data32 = 0, Data64 = 0,
 DataHandle = 0, DataLength = 0, FacilityConfirmationParameter = 0x0, Facilitydataarray = 0x0, FacilityIndicationParameter = 0x0, FacilityRequestParameter = 0x0,
 FacilityResponseParameters = 0x0, FacilitySelector = 0, Flags = 0, Function = 0, HLC = 0x0, Info = 0, InfoElement = 0x0, InfoMask = 0, InfoNumber = 0,
 Keypadfacility = 0x0, LLC = 0x0, ManuData = 0x0, ManuID = 0, NCPI = 0x0, Reason = 0, Reason_B3 = 0, Reject = 0, Useruserdata = 0x0, Data = 0x0, l = 14, p = 2,
 par = 0x418ef9f5 "\003\031\001", m = 0x418f47e0 "\016", buf = '\0' <repeats 179 times>}
       error = 136581840
       fr = {frametype = 2, subclass = 8, datalen = 160, samples = 160, mallocd = 0, offset = 64, src = 0x0, data = 0x43938258, delivery = {tv_sec = 1133741680,
   tv_usec = 1108157604}, prev = 0x38, next = 0x4206e4b6}
       b3buf = '\0' <repeats 24 times>, "UÎ\214A", '\0' <repeats 12 times>, "Úµ\214A\000\000\000\000\000\000\000\000\200\b³Vòåàh\036sÇoÔPZCOOIutttvvvvttuIIOICFFFZZYY]]]P]", 'T' <repeats 40 times>, "ÔTTTTÔTÔTÔÔTÔTÔTTTTTTÔÔTÔÔÔÔÔTTÔÔÔTTTÔTTÔTTÔTTTÔTÔÔÔTTTTTTTTTÔÔÔÔTÔTÔÔÔÔÔTÔÔTÔÔÔÔTÔTÔ", '\0' <repeats 156 times>, "\024ý\006B", '\0' <repeats 16 times>...
       j = 1
       b3len = 160
       dtmf = -48 'Ð'
       dtmflen = 0
       rxavg = 0
       txavg = 0
#4  0x418c807a in capi_handle_msg (CMSG=0x44) at chan_capi.c:2182
       i = (struct ast_capi_pvt *) 0x82412d0
       DNID = 0x418caf92 "s"
       msn = 0x0
       CMSG2 = {ApplId = 1, Command = 2 '\002', Subcommand = 131 '\203', Messagenumber = 7725, adr = {adrController = 513, adrPLCI = 513, adrNCCI = 513},
 AdditionalInfo = CAPI_COMPOSE, B1configuration = 0x0, B1protocol = 0, B2configuration = 0x0, B2protocol = 0, B3configuration = 0x0, B3protocol = 0, BC = 0x0,
 BChannelinformation = 0x0, BProtocol = CAPI_COMPOSE, CalledPartyNumber = 0x0, CalledPartySubaddress = 0x0, CallingPartyNumber = 0x0,
 CallingPartySubaddress = 0x0, CIPmask = 0, CIPmask2 = 0, CIPValue = 0, Class = 0, ConnectedNumber = 0x0, ConnectedSubaddress = 0x0, Data32 = 0, Data64 = 0,
 DataHandle = 0, DataLength = 0, FacilityConfirmationParameter = 0x0, Facilitydataarray = 0x0, FacilityIndicationParameter = 0x0, FacilityRequestParameter = 0x0,
 FacilityResponseParameters = 0x0, FacilitySelector = 0, Flags = 0, Function = 0, HLC = 0x0, Info = 0, InfoElement = 0x0, InfoMask = 0, InfoNumber = 0,
 Keypadfacility = 0x0, LLC = 0x0, ManuData = 0x0, ManuID = 0, NCPI = 0x0, Reason = 0, Reason_B3 = 0, Reject = 1, Useruserdata = 0x0, Data = 0x0, l = 32, p = 19,
 par = 0x418ef9d3 "\003/\r\006\b\n\005\a\t\001\026\027)\004\f(0\034\001\001", m = 0x418f47e0 "\016",
 buf = "\000\000\000\000\000\000\000\000Ñ\210\rB&ASTERISK-165;\211\223C@\212\223CÜ\210\rB\001\000\000\000\eÐ\216A\f\000\000\000&ASTERISK-165;\211\223C\000\000\000\000\000\000\000\000X\212\223C\f&\217A@\212\223C«Ï\216A\000\b\000\000\000\000\000\000\000\000\000\000%ê\216A\000\000\000\000\000\000\000\000{\000\004@\024\000\000\000\202\000\000\000 \215\223C\f&\217A\f&\217A8È\035\bà&\217A\020\212\223CÚì\216A8È\035\b\202\000\000\000\204\001\000\0004\212\223C", '\0' <repeats 12 times>, "\f&\217A\000\000\000\000X\212\223C@\212\223C\002î\216A8È\035\b"}
       PLCI = 769
       NCCI = 136581840
       NPLAN = 32
       fds = {22, 23}
       controller = 1
       buffer = "*", '\0' <repeats 78 times>
       p = (struct capi_pipe *) 0x0
       flags = 136581840
       deflect = 0
#5  0x418c793e in do_monitor (data=0x0) at chan_capi.c:2209
       monCMSG = (_cmsg *) 0x82412d0
#6  0x4003c484 in start_thread () from /lib/tls/libpthread.so.0
Comments:

By: zoa (zoa) 2004-05-06 09:21:36

What is your CVS version (date and branch)?

By: Brian West (bkw918) 2004-05-06 09:54:51

It doesn't matter what version; they need to contact the CAPI author. Since chan_capi isn't included with Asterisk, we can't do anything about it.

bkw

By: farnwomt (farnwomt) 2004-05-06 09:57:10

The files that are in use, i.e. channel.c and dsp.c, are as follows:

dsp.c    Working revision:    1.22
channel.c    Working revision:    1.93

I realise that these are a touch old (from about April 7th, 2004), but the particular functions that are being called at the time of the crash haven't changed.

dsp.c has had a single line change in a completely different function.

channel.c has had no changes in the problematic function and still dereferences pvt without checking if it is NULL or not.

I have added some defensive code to ast_queue_frame() in my channel.c in the hope that it will avert the crash, but it would be better to know how to stop the channel getting into this state in the first place.

This is what I have done:

       /* Build us a copy and free the original one */
       f = ast_frdup(fin);
       if (!f) {
               ast_log(LOG_WARNING, "Unable to duplicate frame\n");
               return -1;
       }
       ast_mutex_lock(&chan->lock);
       prev = NULL;
       /* Mtf's bug fix to stop the crashes */
       if ( chan->pvt == NULL )
       {
               ast_log(LOG_WARNING, "Mtf stopped a crash on NULL pvt\n");
               ast_mutex_unlock(&chan->lock);
               /* Free the duplicated frame so it is not leaked */
               ast_frfree(f);
               return -1;
       }
       /* End of Mtf's bug fix */
       cur = chan->pvt->readq;
       while(cur) {
               prev = cur;
               cur = cur->next;
               qlen++;
       }
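
For anyone who wants to try the guard outside of Asterisk, here is a minimal self-contained sketch of the same idea. The struct layouts and the names queue_frame, frame, chan_pvt and channel are simplified, hypothetical stand-ins and not the real Asterisk definitions; plain pthread mutexes are assumed in place of ast_mutex_lock. It also frees the duplicated frame on the early return, which a guard like this needs to do to avoid a leak.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the Asterisk structures. */
struct frame {
    int id;
    struct frame *next;
};

struct chan_pvt {
    struct frame *readq;    /* singly linked queue of pending frames */
};

struct channel {
    pthread_mutex_t lock;
    struct chan_pvt *pvt;   /* NULL once the private data has gone away */
};

/* Append a copy of fin to the channel's read queue.
 * Returns 0 on success, -1 if the copy fails or pvt is NULL. */
int queue_frame(struct channel *chan, const struct frame *fin)
{
    struct frame *f, *cur, *prev = NULL;

    /* Build a private copy, as ast_queue_frame does with ast_frdup() */
    f = malloc(sizeof(*f));
    if (!f)
        return -1;
    *f = *fin;
    f->next = NULL;

    pthread_mutex_lock(&chan->lock);
    if (chan->pvt == NULL) {
        /* Guard: refuse to touch the queue, and do not leak the copy */
        pthread_mutex_unlock(&chan->lock);
        free(f);
        return -1;
    }

    /* Walk to the tail of the queue and link the copy in */
    for (cur = chan->pvt->readq; cur; cur = cur->next)
        prev = cur;
    if (prev)
        prev->next = f;
    else
        chan->pvt->readq = f;
    pthread_mutex_unlock(&chan->lock);
    return 0;
}
```

Calling queue_frame() on a channel whose pvt is NULL returns -1 instead of dereferencing the pointer. Note that the check only helps if whoever clears pvt also holds chan->lock while doing so.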

By: Mark Spencer (markster) 2004-05-06 10:04:15

This is some sort of locking bug in chan_capi.  Unfortunately, since chan_capi is not part of the Asterisk distribution (it's not been disclaimed), there is nothing we can do here to assist you.  It may be helpful to try to contact the chan_capi author himself.

By: farnwomt (farnwomt) 2004-05-06 10:12:18

In answer to bkw918's comments about CAPI, I think it is important to note that Asterisk falls over in a function which is part of the Asterisk code rather than the CAPI code.

Even if the CAPI code is doing something it shouldn't (and I don't know if it is), the core Asterisk code should be written defensively enough that it doesn't bring the whole system down.

I realise that this is a difficult goal to achieve, but as and when problems arise it shouldn't be too difficult to make modest little patches which help provide long term stability.

This isn't to say that I don't believe that bugs higher up the chain should be fixed.

By: Mark Spencer (markster) 2004-05-06 10:25:19

This is no different from a driver inside the kernel being able to crash the kernel.  You don't say "gosh, the kernel should be architected differently so that if I do something stupid in my ethernet driver I can't crash the kernel", or if you do, you do it with a microkernel like Mach and not with a monolithic kernel like Linux.

Similarly, Asterisk is a monolithic program for all of its core routines -- including channel drivers.  Channel drivers have to be written properly with locking in mind or the core functions can fail.  There is no way within the core function to work around an improperly written channel driver.

If you want safety and security, stick to the AGI side, where no matter what your AGI script does, it doesn't risk Asterisk.
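
To illustrate the locking point above, here is a rough sketch of the general pattern, not chan_capi's or Asterisk's actual code. All names (drv_channel, drv_pvt, driver_detach_pvt, core_queue) are hypothetical, and plain pthread mutexes are assumed. The idea is that a NULL check in the core is only meaningful when the driver takes the same lock before it clears the pointer; if the driver clears pvt without the lock, the check and the dereference can still race.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for a channel and its driver-private data. */
struct drv_pvt {
    int readq_len;
};

struct drv_channel {
    pthread_mutex_t lock;
    struct drv_pvt *pvt;
};

/* Driver side: detach the private data while holding the channel lock.
 * The old pointer is returned so the caller can release it after unlock. */
struct drv_pvt *driver_detach_pvt(struct drv_channel *chan)
{
    struct drv_pvt *old;

    pthread_mutex_lock(&chan->lock);
    old = chan->pvt;
    chan->pvt = NULL;
    pthread_mutex_unlock(&chan->lock);
    return old;
}

/* Core side: the check and the use happen under the same lock, so the
 * pointer cannot be cleared by driver_detach_pvt() in between. */
int core_queue(struct drv_channel *chan)
{
    int res = -1;

    pthread_mutex_lock(&chan->lock);
    if (chan->pvt) {
        chan->pvt->readq_len++;
        res = 0;
    }
    pthread_mutex_unlock(&chan->lock);
    return res;
}
```

With both sides following this discipline, core_queue() either sees a valid pvt for the whole critical section or sees NULL and backs out; neither side can make that guarantee on its own.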