[Home]

Summary:ASTERISK-11408: Call from '' to extension '7104' rejected because extension not found
Reporter:Norman Franke (norman)Labels:
Date Opened:2008-02-08 22:52:19.000-0600Date Closed:2008-02-18 11:42:25.000-0600
Priority:CriticalRegression?No
Status:Closed/CompleteComponents:Channels/chan_sip/Interoperability
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) 20080212__bug11960.diff.txt
( 1) 20080213__bug11960.diff.txt
( 2) 20080215__bug11960__2.diff.txt
( 3) 20080215__bug11960.diff.txt
( 4) 2159_1.txt
( 5) asterisk_log_12b.txt_1.gz
( 6) gdb_crash.txt
( 7) valgrind_12b.txt
( 8) valgrind_s1.txt
( 9) valgrind.txt
(10) valgrind.txt_2
Description:I've had a few of these errors:

[Feb  8 16:35:33] NOTICE[2170]: chan_sip.c:13879 handle_request_invite: Call from '' to extension '202' rejected because extension not found.
[Feb  8 16:35:49] NOTICE[2170]: chan_sip.c:13879 handle_request_invite: Call from '' to extension '7104' rejected because extension not found.

And shortly there after, it crashes. I'll enclose my crash report with DONT_OPTIMIZE. The dead thread is ASTERISK-7 -1246659664 LWP 18653

Thread 11 (process 18653):
0  0xffffe410 in __kernel_vsyscall ()
No symbol table info available.
1  0xb7fc556e in __lll_mutex_lock_wait () from /lib/tls/i686/cmov/libpthread.so.0
No symbol table info available.
2  0xb7fc2179 in _L_mutex_lock_141 () from /lib/tls/i686/cmov/libpthread.so.0
No symbol table info available.
3  0x00000000 in ?? ()
No symbol table info available.

I've traced all of the channels that were in the main channels list, and all of them seemed normal. One (#1) was making an outgoing call, and waiting on the lock that thread 11 had before it crashed.

My boss dialed 202 and he noted that it died right after that. Is this really the cause of the crash? If so, what's going on? I do have a dial plan rule for _2XX and 7104, so it appears to have lost it's context.

Comments:By: Tilghman Lesher (tilghman) 2008-02-09 00:51:13.000-0600

Please follow the instructions in doc/valgrind.txt

By: Norman Franke (norman) 2008-02-09 13:08:48.000-0600

Unfortunately, valgrind doesn't seem to run under Debian Etch with kernel 2.6.23. It runs, but never quits and becomes a <defunct> process, while asterisk still runs. In addition, it spews some 50 errors while asterisk is loading.

I'll try again, but this only happens under load and valgrind doesn't support concurrent thread execution (I have a quad core PC), so it may not really help.

By: Norman Franke (norman) 2008-02-09 23:15:46.000-0600

This is likely a concurrency issue. Check out the latest crash. Here is the offending line:

chan_sip.c:
15363           if (!p->owner || !ast_channel_trylock(p->owner))

p->owner is null, despite the check in the first part. I've enclosed a "thread apply all bt full" as "gdb_crash.txt". I'll attempt to try valgrind on Monday.

This wasn't the same as the other crash, since that one had a lock on "channels" and this one doesn't, but concurrency issues can do all sorts of things and are even harder to debug with valgrind.



By: Norman Franke (norman) 2008-02-11 12:41:24.000-0600

I'm running valgrind today. Already I found two issues with 1.4.18

==17698== Thread 7:
==17698== Conditional jump or move depends on uninitialised value(s)
==17698==    at 0x80CEE33: ast_rtp_get_peer (rtp.c:2010)
==17698==    by 0x59FF172: ??? (chan_sip.c:15548)
==17698==    by 0x80FC35A: dummy_start (utils.c:852)
==17698==    by 0x403123F: start_thread (in /lib/tls/i686/cmov/libpthread-2.3.6.so)
==17698==    by 0x4FAF49D: clone (in /lib/tls/i686/cmov/libc-2.3.6.so)

If you look in ast_rtp_get_peer, it's comparing against the second argument:

from chan_sip.c:
                   struct sockaddr_in sin;
                   ast_rtp_get_peer(sip->rtp, &sin);

Which isn't initialized.

Next:

==17698== Thread 15:
==17698== Conditional jump or move depends on uninitialised value(s)
==17698==    at 0x5AB462D: iax_frame_free (iax2-parser.c:1015)
==17698==    by 0x5A84EA0: ??? (chan_iax2.c:1423)
==17698==    by 0x5A8608B: ??? (chan_iax2.c:1805)
==17698==    by 0x5A88A82: ??? (chan_iax2.c:2546)
==17698==    by 0x5AA4F0B: ??? (chan_iax2.c:8251)
==17698==    by 0x5AA545A: ??? (chan_iax2.c:8355)
==17698==    by 0x80FC35A: dummy_start (utils.c:852)
==17698==    by 0x403123F: start_thread (in /lib/tls/i686/cmov/libpthread-2.3.6.so)
==17698==    by 0x4FAF49D: clone (in /lib/tls/i686/cmov/libc-2.3.6.so)

(seems make after changing the build options didn't re-build the plugins. I'll make clean for next time.)

(malloc_debug.txt is zero length.)



By: Norman Franke (norman) 2008-02-11 13:00:02.000-0600

More:

==26116== Conditional jump or move depends on uninitialised value(s)
==26116==    at 0x1C552B88: ??? (chan_sip.c:15207)
==26116==    by 0x1C54DAF4: ??? (chan_sip.c:15388)
==26116==    by 0x80A61BF: ast_io_wait (io.c:279)
==26116==    by 0x1C53A71C: ??? (chan_sip.c:15601)
==26116==    by 0x80F8FC8: dummy_start (utils.c:852)
==26116==    by 0x1B91BB62: start_thread (in /lib/tls/libpthread-0.60.so)
==26116==    by 0x1BA75189: clone (in /lib/tls/libc-2.3.2.so)
==26116==
==26116== Conditional jump or move depends on uninitialised value(s)
==26116==    at 0x1C55FE02: ??? (chan_sip.c:12590)
==26116==    by 0x1C553752: ??? (chan_sip.c:15162)
==26116==    by 0x1C54DAF4: ??? (chan_sip.c:15388)
==26116==    by 0x80A61BF: ast_io_wait (io.c:279)
==26116==    by 0x1C53A71C: ??? (chan_sip.c:15601)
==26116==    by 0x80F8FC8: dummy_start (utils.c:852)
==26116==    by 0x1B91BB62: start_thread (in /lib/tls/libpthread-0.60.so)
==26116==    by 0x1BA75189: clone (in /lib/tls/libc-2.3.2.so)

By: Norman Franke (norman) 2008-02-11 19:44:43.000-0600

Interesting:

==17698== Thread 45:
==17698== Invalid read of size 4
==17698==    at 0x4033024: pthread_mutex_lock (in /lib/tls/i686/cmov/libpthread-2.3.6.so)
==17698==    by 0x80CD666: ast_mutex_lock (lock.h:701)
==17698==    by 0x80D3816: p2p_set_bridge (rtp.c:3103)
==17698==    by 0x80D4011: bridge_p2p_loop (rtp.c:3242)
==17698==    by 0x80D481A: ast_rtp_bridge (rtp.c:3387)
==17698==    by 0x8088BF2: ast_channel_bridge (channel.c:3997)
==17698==    by 0x58E53ED: ast_bridge_call (res_features.c:1420)
==17698==    by 0x675209B: try_calling (app_acd.c:4081)
==17698==    by 0x6755E78: handle_acd (app_acd.c:5090)
==17698==    by 0x6755532: acd_exec (app_acd.c:4917)
==17698==    by 0x80BAE73: pbx_exec (pbx.c:532)
==17698==    by 0x80BE23C: pbx_extension_helper (pbx.c:1851)
==17698==  Address 0x716ae88 is 8,576 bytes inside a block of size 10,720 free'd
==17698==    at 0x401D40C: free (vg_replace_malloc.c:323)
==17698==    by 0x80CF634: ast_rtp_destroy (rtp.c:2151)
==17698==    by 0x59C551F: __sip_destroy (chan_sip.c:3084)
==17698==    by 0x59FF37C: do_monitor (chan_sip.c:15586)
==17698==    by 0x80FC35A: dummy_start (utils.c:852)
==17698==    by 0x403123F: start_thread (in /lib/tls/i686/cmov/libpthread-2.3.6.so)
==17698==    by 0x4FAF49D: clone (in /lib/tls/i686/cmov/libc-2.3.6.so)

It appears the the chan_sip monitor thread is disposing the struct ast_rtp of when bridge_p2p_loop (after the bridge terminates in a different thread) thinks it has it. Why is this?

My app_acd is basically app_queue (from 1.4.18) with the realtime stuff redone and personal (agent-specific) sub-queues.

Our clients make calls in the background on the same SIP account, but that shouldn't be a problem, assume, since they are different INVITEs.

(I've had quite a few of these.) Some how malloc debug didn't get enabled, so that file is empty.



By: Norman Franke (norman) 2008-02-11 21:07:51.000-0600

Uploaded valgrind.txt_2 with a number of errors involving chan_sip, AFAIK.

By: Tilghman Lesher (tilghman) 2008-02-12 15:59:53.000-0600

Patch uploaded, needs testing.

By: Darren Philips (dazza76) 2008-02-12 16:58:36.000-0600

I have experienced a similar issue
[Feb 13 08:22:48] NOTICE[24218]: chan_sip.c:13866 handle_request_invite: Call from '' to extension '049999999' rejected because extension not found.
but mine does not crash.
i figured out that when my cisco phone calls in it ignores the context they were told to use and goes by the userscontext  .
Work around for me was to include the correct context in default.
but this is only a temporary solution.
ast ver 1.4.18
Regards
Darren



By: Norman Franke (norman) 2008-02-12 18:05:07.000-0600

I'm using a CISCO AS5400 and it does come into the correct context. I defined it in my sip.conf and that works. I'm getting these errors from PJSIP-based clients. I also get tons of these:

[Feb 12 19:02:13] WARNING[25773]: chan_sip.c:12735 handle_response: Remote host can't match request BYE to call '6f21b0f1417106f6456e991852b8103a@172.16.8.14'. Giving up.

By: Norman Franke (norman) 2008-02-12 18:09:24.000-0600

Thanks for the patch! I've installed the patch and it's running with MALLOC_DEBUG and DONT_OPTIMIZE.

Does this patch fix all of the errors in valgrind.txt_2? That would be great, if so. Could this cause the lockup and corrupt stack as noted in 2159_1.txt. Here's hoping so.

I'll let you know what happens. It may take a while, since 24 hours of valgrinding the old one didn't turn up anything, but 48 hours did. If it's good for two days, I'll run it without valgrind and see if we get crashes and/or freezes again.

By: Norman Franke (norman) 2008-02-13 10:17:09.000-0600

I guess I answered my own question. This patch does not fix all of the problems:

==12656== Invalid read of size 4
==12656==    at 0x403341D: pthread_mutex_unlock (in /lib/tls/i686/cmov/libpthread-2.3.6.so)
==12656==    by 0x80D116C: ast_mutex_unlock (lock.h:691)
==12656==    by 0x80D73E5: p2p_set_bridge (rtp.c:3105)
==12656==    by 0x80D7BC4: bridge_p2p_loop (rtp.c:3242)
==12656==    by 0x80D83CD: ast_rtp_bridge (rtp.c:3387)
==12656==    by 0x808B017: ast_channel_bridge (channel.c:3997)
==12656==    by 0x58E5504: ast_bridge_call (res_features.c:1420)
==12656==    by 0x675B36D: try_calling (app_acd.c:4081)
==12656==    by 0x675F1BA: ??? (app_acd.c:5090)
==12656==    by 0x675E874: acd_exec (app_acd.c:4917)
==12656==    by 0x80BE5EE: pbx_exec (pbx.c:532)
==12656==    by 0x80C19E7: pbx_extension_helper (pbx.c:1851)
==12656==  Address 0x6e2e744 is 8,668 bytes inside a block of size 10,828 free'd
==12656==    at 0x401D40C: free (vg_replace_malloc.c:323)
==12656==    by 0x8072362: __ast_free_region (astmm.c:174)
==12656==    by 0x80726EE: __ast_free (astmm.c:208)
==12656==    by 0x80D31E7: ast_rtp_destroy (rtp.c:2151)
==12656==    by 0x59CA701: ??? (chan_sip.c:3096)
==12656==    by 0x5A04647: ??? (chan_sip.c:15588)
==12656==    by 0x80FFFB4: dummy_start (utils.c:852)
==12656==    by 0x403123F: start_thread (in /lib/tls/i686/cmov/libpthread-2.3.6.so)
==12656==    by 0x4FAF49D: clone (in /lib/tls/i686/cmov/libc-2.3.6.so)

Appears to be an RTP issue. (See valgrind_12b.txt for more errors. As usual, malloc_debug.txt is empty.)



By: Norman Franke (norman) 2008-02-13 10:47:04.000-0600

This may help. In reviewing the logs, I did see a few instances of this a few days ago when I was running debug 3:

[Feb 12 18:39:28] DEBUG[29854]: rtp.c:3150 bridge_p2p_loop: Oooh, something is weird, backing out

And I get millions os "Ooh, empty read..."

(But it never crashed during this run.) It happened in the latest log as well, perhaps that will help. I've loaded it as asterisk_log_12b.txt.gz. Well, I would upload it, but even gziped it's too large.



By: Norman Franke (norman) 2008-02-13 11:05:11.000-0600

Splitting the files, I get this:

Database query failed. Error received from database was ASTERISK-1147: Got a packet bigger than 'max_allowed_packet' bytes for the query: INSERT INTO mantis_bug_file_table
(bug_id, title, description, diskfile, filename, folder, filesize, file_type, date_added, content)
VALUES

By: Norman Franke (norman) 2008-02-13 11:07:07.000-0600

OK, I got 1/5 of the file uploaded as asterisk_log_12b.txt_1.gz. Let me know if you'd like the rest Corydon76.

By: Tilghman Lesher (tilghman) 2008-02-13 11:40:04.000-0600

What's this app_acd in the middle of your backtrace?

By: Tilghman Lesher (tilghman) 2008-02-13 12:01:22.000-0600

Okay, another patch is uploaded.

By: Norman Franke (norman) 2008-02-13 12:54:21.000-0600

Sorry, in the middle it is calling:

res = try_calling(qe, options, &tries, &noption, agi);

And is in "handle_acd" as before. handle_acd basically waits for it's turn, then calls try_calling to call the queue member (like app_queue does.)

By: Norman Franke (norman) 2008-02-13 13:11:51.000-0600

I'll try that patch. I'm not sure it will help, since in main/rtp.c bridge_p2p_loop the channel is unlocked when
p2p_set_bridge(p1, NULL);
is executed at line 3242 of rtp.c. If the monitor has already disposed of the ast_rtp.

if (ast_test_flag(&sip->flags[0], SIP_NEEDDESTROY) && !sip->packets &&
   !sip->owner) {
ast_mutex_unlock(&sip->lock);
__sip_destroy(sip, 1);
goto restartsearch;
}

Should chan_sip:do_monitor (above) really be releasing the lock first, then destroying the members (e.g. ast_rtp) when rtp.c:bridge_p2p_loop doesn't have a lock on anything at line 3242? Perhaps bridge_p2p_loop needs to acquire the bridge lock before releasing the channel lock? Otherwise, it seems that do_monitor can release it.

By: Norman Franke (norman) 2008-02-13 14:24:07.000-0600

From that last valgrind, these are also odd:

==12656== Invalid read of size 1
==12656==    at 0x401E768: strlen (mc_replace_strmem.c:242)
==12656==    by 0x4F23163: vfprintf (in /lib/tls/i686/cmov/libc-2.3.6.so)
==12656==    by 0x4F40F80: vsnprintf (in /lib/tls/i686/cmov/libc-2.3.6.so)
==12656==    by 0x8100E03: ast_dynamic_str_thread_build_va (utils.c:1294)
==12656==    by 0x80B306C: ast_log (logger.c:783)
==12656==    by 0x5A03C66: ??? (chan_sip.c:15381)
==12656==    by 0x80AD830: ast_io_wait (io.c:279)
==12656==    by 0x5A046CF: ??? (chan_sip.c:15603)
==12656==    by 0x80FFFB4: dummy_start (utils.c:852)
==12656==    by 0x403123F: start_thread (in /lib/tls/i686/cmov/libpthread-2.3.6.so)
==12656==    by 0x4FAF49D: clone (in /lib/tls/i686/cmov/libc-2.3.6.so)
==12656==  Address 0x6274251 is 137 bytes inside a block of size 240 free'd
==12656==    at 0x401D40C: free (vg_replace_malloc.c:323)
==12656==    by 0x8072362: __ast_free_region (astmm.c:174)
==12656==    by 0x80726EE: __ast_free (astmm.c:208)
==12656==    by 0x808204A: ast_channel_free (channel.c:1254)
==12656==    by 0x8082A22: ast_hangup (channel.c:1496)
==12656==    by 0x80C3BCE: __ast_pbx_run (pbx.c:2563)
==12656==    by 0x80C3DB2: pbx_thread (pbx.c:2623)
==12656==    by 0x80FFFB4: dummy_start (utils.c:852)
==12656==    by 0x403123F: start_thread (in /lib/tls/i686/cmov/libpthread-2.3.6.so)
==12656==    by 0x4FAF49D: clone (in /lib/tls/i686/cmov/libc-2.3.6.so)


==12656== Invalid read of size 1
==12656==    at 0x401F2A0: mempcpy (mc_replace_strmem.c:676)
==12656==    by 0x4F46042: _IO_default_xsputn (in /lib/tls/i686/cmov/libc-2.3.6.so)
==12656==    by 0x4F22DA7: vfprintf (in /lib/tls/i686/cmov/libc-2.3.6.so)
==12656==    by 0x4F40F80: vsnprintf (in /lib/tls/i686/cmov/libc-2.3.6.so)
==12656==    by 0x8100E03: ast_dynamic_str_thread_build_va (utils.c:1294)
==12656==    by 0x80B306C: ast_log (logger.c:783)
==12656==    by 0x5A03C66: ??? (chan_sip.c:15381)
==12656==    by 0x80AD830: ast_io_wait (io.c:279)
==12656==    by 0x5A046CF: ??? (chan_sip.c:15603)
==12656==    by 0x80FFFB4: dummy_start (utils.c:852)
==12656==    by 0x403123F: start_thread (in /lib/tls/i686/cmov/libpthread-2.3.6.so)
==12656==    by 0x4FAF49D: clone (in /lib/tls/i686/cmov/libc-2.3.6.so)
==12656==  Address 0x6274267 is 159 bytes inside a block of size 240 free'd
==12656==    at 0x401D40C: free (vg_replace_malloc.c:323)
==12656==    by 0x8072362: __ast_free_region (astmm.c:174)
==12656==    by 0x80726EE: __ast_free (astmm.c:208)
==12656==    by 0x808204A: ast_channel_free (channel.c:1254)
==12656==    by 0x8082A22: ast_hangup (channel.c:1496)
==12656==    by 0x80C3BCE: __ast_pbx_run (pbx.c:2563)
==12656==    by 0x80C3DB2: pbx_thread (pbx.c:2623)
==12656==    by 0x80FFFB4: dummy_start (utils.c:852)
==12656==    by 0x403123F: start_thread (in /lib/tls/i686/cmov/libpthread-2.3.6.so)
==12656==    by 0x4FAF49D: clone (in /lib/tls/i686/cmov/libc-2.3.6.so)

These are also chan_sip.c:do_monitor related. It seems do_monitor really likes to free stuff while other threads are using it or use stuff other threads have disposed of. Did something change around 1.4.11 or so with do_monitor locking?

By: Norman Franke (norman) 2008-02-13 17:08:16.000-0600

I installed the latest patch a few hours ago. I'll let you know how it goes.

By: Norman Franke (norman) 2008-02-14 09:40:35.000-0600

Alas, that patch did not fix it. I have tons more of these:

==5364== Invalid read of size 4
==5364==    at 0x4033457: pthread_mutex_unlock (in /lib/tls/i686/cmov/libpthread-2.3.6.so)
==5364==    by 0x80D116C: ast_mutex_unlock (lock.h:691)
==5364==    by 0x80D73E5: p2p_set_bridge (rtp.c:3105)
==5364==    by 0x80D7BC4: bridge_p2p_loop (rtp.c:3242)
==5364==    by 0x80D83CD: ast_rtp_bridge (rtp.c:3387)
==5364==    by 0x808B017: ast_channel_bridge (channel.c:3997)
==5364==    by 0x58E5504: ast_bridge_call (res_features.c:1420)
==5364==    by 0x675B36D: try_calling (app_acd.c:4081)
==5364==    by 0x675F1B7: handle_acd (app_acd.c:5089)
==5364==    by 0x675E871: acd_exec (app_acd.c:4916)
==5364==    by 0x80BE5EE: pbx_exec (pbx.c:532)
==5364==    by 0x80C19E7: pbx_extension_helper (pbx.c:1851)
==5364==  Address 0x6f26870 is 8,672 bytes inside a block of size 10,828 free'd
==5364==    at 0x401D40C: free (vg_replace_malloc.c:323)
==5364==    by 0x8072362: __ast_free_region (astmm.c:174)
==5364==    by 0x80726EE: __ast_free (astmm.c:208)
==5364==    by 0x80D31E7: ast_rtp_destroy (rtp.c:2151)
==5364==    by 0x59CA728: __sip_destroy (chan_sip.c:3099)
==5364==    by 0x5A0466E: do_monitor (chan_sip.c: 15591)
==5364==    by 0x80FFFB4: dummy_start (utils.c:852)
==5364==    by 0x403123F: start_thread (in /lib/tls/i686/cmov/libpthread-2.3.6.so)
==5364==    by 0x4FAF49D: clone (in /lib/tls/i686/cmov/libc-2.3.6.so)

By: Norman Franke (norman) 2008-02-15 10:10:15.000-0600

I'm getting more errors like this one:

==5364== Invalid read of size 1
==5364==    at 0x401F2EA: mempcpy (mc_replace_strmem.c:676)
==5364==    by 0x4F46042: _IO_default_xsputn (in /lib/tls/i686/cmov/libc-2.3.6.so)
==5364==    by 0x4F22DA7: vfprintf (in /lib/tls/i686/cmov/libc-2.3.6.so)
==5364==    by 0x4F40F80: vsnprintf (in /lib/tls/i686/cmov/libc-2.3.6.so)
==5364==    by 0x8100E03: ast_dynamic_str_thread_build_va (utils.c:1294)
==5364==    by 0x80B306C: ast_log (logger.c:783)
==5364==    by 0x5A03C8D: ??? (chan_sip.c:15384)
==5364==    by 0x80AD830: ast_io_wait (io.c:279)
==5364==    by 0x5A046F6: ??? (chan_sip.c:15606)
==5364==    by 0x80FFFB4: dummy_start (utils.c:852)
==5364==    by 0x403123F: start_thread (in /lib/tls/i686/cmov/libpthread-2.3.6.so)
==5364==    by 0x4FAF49D: clone (in /lib/tls/i686/cmov/libc-2.3.6.so)
==5364==  Address 0x72411ec is 140 bytes inside a block of size 240 free'd
==5364==    at 0x401D40C: free (vg_replace_malloc.c:323)
==5364==    by 0x8072362: __ast_free_region (astmm.c:174)
==5364==    by 0x80726EE: __ast_free (astmm.c:208)
==5364==    by 0x808204A: ast_channel_free (channel.c:1254)
==5364==    by 0x8082A22: ast_hangup (channel.c:1496)
==5364==    by 0x80C3BCE: __ast_pbx_run (pbx.c:2563)
==5364==    by 0x80C3DB2: pbx_thread (pbx.c:2623)
==5364==    by 0x80FFFB4: dummy_start (utils.c:852)
==5364==    by 0x403123F: start_thread (in /lib/tls/i686/cmov/libpthread-2.3.6.so)
==5364==    by 0x4FAF49D: clone (in /lib/tls/i686/cmov/libc-2.3.6.so)
==5364==    
==5364== More than 100 errors detected.  Subsequent errors
==5364== will still be recorded, but in less detail than before.

This is using the new patch. I can restart soon if another patch is ready.



By: Tilghman Lesher (tilghman) 2008-02-15 12:30:51.000-0600

Okay, let's try yet another patch.

By: Norman Franke (norman) 2008-02-15 13:00:52.000-0600

Alas, I can't even start asterisk with that patch:

(gdb) bt
# 0  0xb7fad7f1 in pthread_mutex_lock () from /lib/tls/libpthread.so.0
# 1  0x080d804e in ast_mutex_lock (pmutex=0x2174) at lock.h:701
# 2  0x080d1bbe in ast_rtp_get_bridged (rtp=0x0) at rtp.c:2030
# 3  0xb748b1cf in __sip_destroy (p=0x819beb0, lockowner=1) at chan_sip.c:3099
# 4  0xb74c460c in do_monitor (data=0x0) at chan_sip.c:15595
# 5  0x080fdfc2 in dummy_start (data=0xb730a798) at utils.c:852
# 6  0xb7fabb63 in start_thread () from /lib/tls/libpthread.so.0
# 7  0xb7ed718a in clone () from /lib/tls/libc.so.6



By: Tilghman Lesher (tilghman) 2008-02-15 13:15:14.000-0600

Oops; sorry about that.  This one should load, at least.

By: Norman Franke (norman) 2008-02-15 13:47:59.000-0600

Thanks, that starts up OK. I'll let you know how it goes.

By: Norman Franke (norman) 2008-02-16 12:08:08.000-0600

That may have fixed some. I got three of these still:

==2687== Invalid read of size 1
==2687==    at 0x401F2AA: mempcpy (mc_replace_strmem.c:676)
==2687==    by 0x4F46042: _IO_default_xsputn (in /lib/tls/i686/cmov/libc-2.3.6.so)
==2687==    by 0x4F22DA7: vfprintf (in /lib/tls/i686/cmov/libc-2.3.6.so)
==2687==    by 0x4F40F80: vsnprintf (in /lib/tls/i686/cmov/libc-2.3.6.so)
==2687==    by 0x8100E8B: ast_dynamic_str_thread_build_va (utils.c:1294)
==2687==    by 0x80B306C: ast_log (logger.c:783)
==2687==    by 0x5A03D13: ??? (chan_sip.c:15391)
==2687==    by 0x80AD830: ast_io_wait (io.c:279)
==2687==    by 0x5A0477C: ??? (chan_sip.c:15613)
==2687==    by 0x810003C: dummy_start (utils.c:852)
==2687==    by 0x403123F: start_thread (in /lib/tls/i686/cmov/libpthread-2.3.6.so)
==2687==    by 0x4FAF49D: clone (in /lib/tls/i686/cmov/libc-2.3.6.so)
==2687==  Address 0x537dc6d is 157 bytes inside a block of size 240 free'd
==2687==    at 0x401D40C: free (vg_replace_malloc.c:323)
==2687==    by 0x8072362: __ast_free_region (astmm.c:174)
==2687==    by 0x80726EE: __ast_free (astmm.c:208)
==2687==    by 0x808204A: ast_channel_free (channel.c:1254)
==2687==    by 0x8082A22: ast_hangup (channel.c:1496)
==2687==    by 0x80C3BCE: __ast_pbx_run (pbx.c:2563)
==2687==    by 0x80C3DB2: pbx_thread (pbx.c:2623)
==2687==    by 0x810003C: dummy_start (utils.c:852)
==2687==    by 0x403123F: start_thread (in /lib/tls/i686/cmov/libpthread-2.3.6.so)
==2687==    by 0x4FAF49D: clone (in /lib/tls/i686/cmov/libc-2.3.6.so)

By: Norman Franke (norman) 2008-02-18 09:23:19.000-0600

Corydon76 - that last patch may have fixed the first problem. Still none of those, just the others I posted above. So, that's an improvement.

I'm still getting tons of:

Call from '' to extension '7104' rejected because extension not found

And I don't know why that is.

By: Tilghman Lesher (tilghman) 2008-02-18 11:16:27.000-0600

I'm going to commit this patch and close this issue, since we've found the cause of the crash and are fixing it.  Please open a new issue with a 'sip set debug' while the 'extension not found' issue is happening.  It's probably one of the following:  you have some sip peers that are set to arrive in a different context and '7104' does not exist in that context, OR you have some sip clients that are not authenticating and are arriving in the 'default' context (as is specified in the [general] section of your sip.conf, and '7104' does not exist in your 'default' context.

By: Digium Subversion (svnbot) 2008-02-18 11:28:27.000-0600

Repository: asterisk
Revision: 103780

U   branches/1.4/channels/chan_sip.c
U   branches/1.4/main/rtp.c

------------------------------------------------------------------------
r103780 | tilghman | 2008-02-18 11:28:27 -0600 (Mon, 18 Feb 2008) | 9 lines

When a SIP channel is being auto-destroyed, it's possible for it to still be
in bridge code.  When that happens, we crash.  Delay the RTP destruction until
the bridge is ended.
(closes issue ASTERISK-11408)
Reported by: norman
Patches:
      20080215__bug11960__2.diff.txt uploaded by Corydon76 (license 14)
Tested by: norman

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=103780

By: Digium Subversion (svnbot) 2008-02-18 11:42:25.000-0600

Repository: asterisk
Revision: 103781

_U  trunk/
U   trunk/channels/chan_sip.c
U   trunk/main/rtp.c

------------------------------------------------------------------------
r103781 | tilghman | 2008-02-18 11:42:24 -0600 (Mon, 18 Feb 2008) | 17 lines

Merged revisions 103780 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r103780 | tilghman | 2008-02-18 11:31:52 -0600 (Mon, 18 Feb 2008) | 9 lines

When a SIP channel is being auto-destroyed, it's possible for it to still be
in bridge code.  When that happens, we crash.  Delay the RTP destruction until
the bridge is ended.
(closes issue ASTERISK-11408)
Reported by: norman
Patches:
      20080215__bug11960__2.diff.txt uploaded by Corydon76 (license 14)
Tested by: norman

........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=103781