|Summary:||ASTERISK-07473: Asterisk core dump using ast_aji_send|
|Date Opened:||2006-08-07 03:50:28||Date Closed:||2007-05-24 10:27:20|
|Environment:||Attachments:||( 0) chan_gtalk-thread_safety-branch_1.4.patch|
( 1) chan_gtalk-thread_safety-trunk.patch
( 2) gdb20060808-0915.txt
|Description:||We are using res_jabber to send messages to our application, and are processing thousands of messages a day. However, two or three times a day, asterisk core dumps (6 - abort). |
All of these core dumps point to the same place - it looks as if the client->p (the parser from what I can understand) is nul
|Comments:||By: jmls (jmls) 2006-08-07 03:51:25|
I have several cores that show the same information. I cannot reproduce this on demand - we had three crashes in two weeks and then 4 in one day.
By: jmls (jmls) 2006-08-13 13:38:06
anything I can do or supply to push this forward ?
By: jmls (jmls) 2006-08-20 03:03:00
this has happened again several times this week. I have more cores if required
By: jmls (jmls) 2006-08-24 17:11:15
help! please! this is really giving me grief at work :)
By: Matt O'Gorman (mogorman) 2006-08-25 17:05:40
Julian, I think I might have narrowed it down a bit. It seems to be a problem with the underlying library, iksemel: I compiled the library without any optimizations and it seems not to have this problem. To test it, I have it sending a message out every 500ms; I have sent over 11,000 messages, which is much more than I used to be able to send before it crashed. I imagine the problem is still there, just less apparent. I have started diving into the iksemel stream code to try to fix this.
By: jmls (jmls) 2006-08-26 01:18:08
thanks very much for the info and help. What version of iksemel are you compiling, and how are you compiling it? Great work.
By: jmls (jmls) 2006-08-29 08:14:10
I had another 3 crashes today. Please could you tell me how to compile the iksemel library without optimizations so that I can try your theory :) I'm really getting my butt kicked over here ...
By: jmls (jmls) 2006-09-08 07:05:19
I have compiled without optimisations, and it seems to have removed this particular problem. However, I am still getting system crashes, always in this area:
#6 0x008511ac in _gnutls_encrypt () from /usr/lib/libgnutls.so.11
#7 0x0084f3c1 in _gnutls_send_int () from /usr/lib/libgnutls.so.11
#8 0x00850818 in gnutls_record_send () from /usr/lib/libgnutls.so.11
#9 0x00b4b92b in iks_send_raw (prs=0x82a51cc,
xmlstr=0x843e0b8 "<message type='chat' to='[snip]
By: jmls (jmls) 2006-09-11 03:20:50
As the errors in the bt seem to be in the tls portions of libiksemel, I am trying with asterisk logged in to the wildfire server *without* tls (usetls=no). I'll post the results later.
By: jmls (jmls) 2006-09-12 09:48:05
I now have usetls=no in jabber.conf, and have not had a crash for nearly two days.
By: jmls (jmls) 2006-09-16 06:46:58
System uptime: 5 days, 5 hours, 23 minutes, 45 seconds
Compiling iksemel with no optimisations and usetls=no in jabber.conf seems to have made all the difference. We had reached the stage where we were crashing several times a day. This seems much much more stable.
By: jmls (jmls) 2006-09-28 01:57:20
we have not had a single jabber related crash since we set usetls=no and compiled iksemel without optimizations.
I don't know if you want to close this bug or not, as there are obviously problems with usetls=yes, but it now works fine for our purposes
By: Anthony LaMantia (alamantia) 2006-09-29 17:49:25
My thoughts: we should close this bug, but also alert the iksemel team to the problem with their tls code, or work out the problem ourselves and post a patch to their project, as iksemel is a critical part of res_jabber.c.
By: jmls (jmls) 2006-10-01 12:39:22
I think that mog is already working on the iksemel libraries to find the bugs.
By: jmls (jmls) 2006-11-01 06:46:52.000-0600
hey mog! how's this going ?
By: Anthony LaMantia (alamantia) 2006-11-01 15:31:55.000-0600
Closing this, as mog is currently working on replacing gnutls with openssl, and the iksemel team is also aware of the problem and working on it.
By: Leif Madsen (lmadsen) 2007-05-04 16:47:59
Reopened per Hans Zandbelt on asterisk-dev mailing list.
By: philipp2 (philipp2) 2007-05-05 20:29:55
@jmls: could you explain what exactly you did when you say 'compiled iksemel without optimizations'?
jabber/gtalk now sort of works for me, however a 'restart' needs 4-5 segfaults until safe_asterisk manages to bring up asterisk again (Debian Sarge, iksemel compiled from source as I couldn't get jabber to work with the Sarge iksemel package; with Debian Etch the iksemel/gnutls packages are fine).
BTW it seems that I need to set usetls=yes as otherwise I'll get a "JABBER: socket read error".
By: jmls (jmls) 2007-05-06 01:24:50
I had to build my own iksemel from source, and modified the compile script to add -O0 in.
make sure that your jabber server accepts non-secure connections
By: phsultan (phsultan) 2007-05-06 08:44:07
philipp2: did you try SVN (trunk or 1.4) last revision? I used to experience many crashes at startup, recently fixed here : http://bugs.digium.com/view.php?id=9667
Also, Hans has been working on fixing TLS related bugs he and other people detected. He'll soon make a valuable patch available here.
By: philipp2 (philipp2) 2007-05-06 10:42:26
Yep, 9667 solved it! :-)
BTW, I'd appreciate if you guys could take a look at http://www.voip-info.org/wiki/view/Asterisk+Google+Talk and correct any errors you see there; I tried to beef up the available info with what I could find here and there.
By: zandbelt (zandbelt) 2007-05-06 14:41:12
Here it is: a trivial patch that modifies chan_gtalk code so gnutls/crypto functionality can be accessed in a thread-safe manner. This solves the random crashes that chan_gtalk has suffered from on incoming/outgoing calls. For those who did: it is _not_ (no longer) required to modify iksemel code for that; standard iksemel release 1.2 or current trunk is OK.
I provided patches against the 1.4 branch and the trunk:
(I just faxed a disclaimer)
By: zandbelt (zandbelt) 2007-05-06 14:43:46
NB: modifying the iksemel compilation (turning optimization off) is also no longer necessary. Of course, avoiding gnutls also works for Jabber clients other than Gtalk, but for Gtalk clients this is not an option.
By: phsultan (phsultan) 2007-05-09 04:30:43
Hans, an explicit comment before calling gcry_control() would be nice, too :)
Thank you for this patch, hope it will prevent most crashes now.
By: Olle Johansson (oej) 2007-05-24 10:26:28
zandbelt: Thanks for contributing to Asterisk!
Patch committed to 1.4 svn rev 65901 and trunk.