Summary: ASTERISK-14807: Corrupt Memory Issue - with Valgrind Trace
Reporter: Leo Brown (netfuse)
Labels:
Date Opened: 2009-09-09 15:58:45
Date Closed: 2011-06-07 14:07:47
Priority: Minor
Regression? No
Status: Closed/Complete
Components: Core/General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: (0) valgrind.txt
(1) valgrind-1.6.2.0-rc1.txt
Description:
Hi,

Have a CentOS 5 system running Asterisk 1.6.0.9 (have tried more recent builds too but know this version well) and it is becoming unstable when loading IAX2. Once the module is loaded, the asterisk console prompt no longer appears unless connecting remotely, and even then it only gives output to some commands.

Without IAX2 loaded the system is stable for a few hours, or shall I say a few calls.

I know that there can be an issue with corrupt memory, but I am not sure what this points to. I have created a valgrind trace and hopefully this may shed some light on what is at fault.

I should also note that astcanary is tweeting. Everything else on the system (webserver etc.) is running fine and stable. The problem has persisted through multiple reboots and disk checks.

Cheers
Leo
Comments:By: Leo Brown (netfuse) 2009-09-09 16:14:38

PS. malloc_debug.log wrote 21 lines of
WARNING: Freeing unused memory at 0x[address], in complete_fn of cli.c, line 139

By: Leo Brown (netfuse) 2009-09-09 16:19:42

Took another valgrind on 1.6.2.0-rc1, attaching.

By: Leo Brown (netfuse) 2009-09-09 17:30:31

Since it's chan_iax2.so that's stopping the system from working almost totally (loading it causes chan_sip.so to stop talking to endpoints, for a start), I thought I'd look at threads.

Before loading chan_iax2.so:

asterisk1*CLI> core show threads
0xb7c87bb0 ast_make_file_from_fd started at [  161] tcptls.c ast_tcptls_server_root()
0xb7cc3bb0 netconsole           started at [ 1088] asterisk.c listener()
0xb7cffbb0 do_monitor           started at [19479] chan_sip.c restart_monitor()
0xb7ea9bb0 device_state_thread  started at [ 8362] pbx.c load_pbx()
0xb7e6dbb0 do_parking_thread    started at [ 4000] features.c ast_features_init()
0xb7fd8bb0 ast_event_dispatcher started at [  818] event.c ast_event_init()
0xb7f9cbb0 listener             started at [ 1144] asterisk.c ast_makesocket()
0xb7f60bb0 logger_thread        started at [  928] logger.c init_logger()
0xb7f21bb0 desc->accept_fn      started at [  346] tcptls.c ast_tcptls_server_start()
0xb7ee5bb0 do_devstate_changes  started at [  529] devicestate.c ast_device_state_engine_init()
10 threads listed.

After loading chan_iax2.so:

0xb76e9bb0 ast_make_file_from_fd started at [  161] tcptls.c ast_tcptls_server_root()
0xb7761bb0 netconsole           started at [ 1088] asterisk.c listener()
0xb78d3bb0 iax2_process_thread  started at [10366] chan_iax2.c start_network_thread()
0xb7897bb0 iax2_process_thread  started at [10366] chan_iax2.c start_network_thread()
0xb785bbb0 iax2_process_thread  started at [10366] chan_iax2.c start_network_thread()
0xb781fbb0 iax2_process_thread  started at [10366] chan_iax2.c start_network_thread()
0xb77e3bb0 sched_thread         started at [10376] chan_iax2.c start_network_thread()
0xb77a7bb0 network_thread       started at [10377] chan_iax2.c start_network_thread()
0xb7c4bbb0 iax2_process_thread  started at [10366] chan_iax2.c start_network_thread()
0xb79ffbb0 iax2_process_thread  started at [10366] chan_iax2.c start_network_thread()
0xb79c3bb0 iax2_process_thread  started at [10366] chan_iax2.c start_network_thread()
0xb7987bb0 iax2_process_thread  started at [10366] chan_iax2.c start_network_thread()
0xb794bbb0 iax2_process_thread  started at [10366] chan_iax2.c start_network_thread()
0xb790fbb0 iax2_process_thread  started at [10366] chan_iax2.c start_network_thread()
0xb7c87bb0 ast_make_file_from_fd started at [  161] tcptls.c ast_tcptls_server_root()
0xb7cc3bb0 netconsole           started at [ 1088] asterisk.c listener()
0xb7cffbb0 do_monitor           started at [19479] chan_sip.c restart_monitor()
0xb7ea9bb0 device_state_thread  started at [ 8362] pbx.c load_pbx()
0xb7e6dbb0 do_parking_thread    started at [ 4000] features.c ast_features_init()
0xb7fd8bb0 ast_event_dispatcher started at [  818] event.c ast_event_init()
0xb7f9cbb0 listener             started at [ 1144] asterisk.c ast_makesocket()
0xb7f60bb0 logger_thread        started at [  928] logger.c init_logger()
0xb7f21bb0 desc->accept_fn      started at [  346] tcptls.c ast_tcptls_server_start()
0xb7ee5bb0 do_devstate_changes  started at [  529] devicestate.c ast_device_state_engine_init()
24 threads listed.

After loading the module, the console that I issued the load command from doesn't process new commands; I have to connect another session.
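
To compare two such listings mechanically, a small script can count threads by their entry function. This is only a sketch: the names `count_threads` and `new_threads` and the regular expression are assumptions based on the output format shown above, not anything shipped with Asterisk.

```python
import re
from collections import Counter

# Matches lines of the form shown above, for example:
# 0xb7cc3bb0 netconsole           started at [ 1088] asterisk.c listener()
THREAD_LINE = re.compile(
    r"^(0x[0-9a-fA-F]+)\s+(\S+)\s+started at \[\s*(\d+)\]\s+(\S+)\s+(\S+)\(\)"
)


def count_threads(listing: str) -> Counter:
    """Count threads per entry function in pasted 'core show threads' output."""
    counts = Counter()
    for line in listing.splitlines():
        match = THREAD_LINE.match(line.strip())
        if match:
            counts[match.group(2)] += 1  # group 2 is the thread function name
    return counts


def new_threads(before: str, after: str) -> Counter:
    """Return the functions that have more threads in 'after' than in 'before'."""
    # Counter subtraction keeps only positive differences.
    return count_threads(after) - count_threads(before)
```

Feeding it the two listings above would show which entry functions gained threads after the module load; lines that are not thread entries (the prompt, the "threads listed" footer) are ignored.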

By: Leo Brown (netfuse) 2009-09-09 18:18:35

Oh god - the astdb had become corrupted. Somehow its entries were chained so that any query returned infinite rows. This is what led many modules (including chan_iax2.so, as it uses the astdb for registry) to stall. Worthy of a ticket to get * to behave better under this circumstance?

By: Leif Madsen (lmadsen) 2009-09-10 07:34:11

I've assigned this to Tilghman to review and determine how he would like to proceed based on your last comment.

Thanks for the thorough bug report!

By: Leo Brown (netfuse) 2009-09-10 07:46:09

No probs. A corrupt astdb really does wreak havoc (on people as well as code, as you can see!). I can provide a copy of the relevant astdb (not here, obviously) if someone wants to see what happens :)

By: Tilghman Lesher (tilghman) 2009-09-10 12:48:35

Moving issue to chan_sip

By: cappucinoking (cappucinoking) 2010-05-04 16:20:47

I don't think you should expect * to behave properly with a corrupt DB.
So how about instead, an astdb diagnostics/consistency verifier tool?

By: Leo Brown (netfuse) 2010-05-05 06:07:20

Interesting, I don't think DB1 supports any inherent consistency checking. The DB code is tightly coupled with Asterisk, and so it should be possible to check that we're only reading a reasonable amount of data/rows first. I'm now worried that I don't have a copy of this DB any more, and am presuming it's the only thing that would help a BDB1 expert resolve this. I will persevere in finding this again.
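
As a rough illustration of the "reasonable amount of rows" idea, the sketch below wraps any sequence of key/value records with a row cap and a repeated-key check. It is not Asterisk code and does not touch DB1; `bounded_scan`, `looks_consistent`, `SuspectedCorruption` and the default limit are all hypothetical names and values chosen for the example.

```python
from typing import Iterable, Iterator, Tuple


class SuspectedCorruption(Exception):
    """Raised when a scan appears to loop or to run far past a sane size."""


def bounded_scan(
    records: Iterable[Tuple[str, str]], max_rows: int = 100_000
) -> Iterator[Tuple[str, str]]:
    """Yield records, stopping with an error on a repeated key or too many rows."""
    seen = set()
    for count, (key, value) in enumerate(records, start=1):
        if key in seen:
            # A sequential scan should visit each key once; a repeat suggests
            # the entries are chained in a loop.
            raise SuspectedCorruption(f"key {key!r} returned twice")
        if count > max_rows:
            raise SuspectedCorruption(f"more than {max_rows} rows returned")
        seen.add(key)
        yield key, value


def looks_consistent(records: Iterable[Tuple[str, str]], max_rows: int = 100_000) -> bool:
    """Return False instead of raising, for use as a quick pre-flight check."""
    try:
        for _ in bounded_scan(records, max_rows):
            pass
    except SuspectedCorruption:
        return False
    return True
```

The point of the design is that a caller stalls for at most `max_rows` records rather than forever, and gets an explicit error it can log.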

By: Leo Brown (netfuse) 2010-05-05 06:12:52

PS. If anything, surely this should be Applications/app_db?

By: cappucinoking (cappucinoking) 2010-05-05 18:56:44

According to the following link there is a DB verification tool - not sure what version it's in though.
http://wdc.opusa.com:8458/en/db/utility/db_verify.html

The functionality of this could also be combined with some astdb verification of key/value pairs as well?
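
A key/value check of that kind might look roughly like the sketch below. It assumes astdb keys follow a `/family/key` layout and that keys and values are printable text; both are assumptions made for the example rather than something confirmed in this ticket, and `entry_problems` is a hypothetical helper name.

```python
from typing import List


def entry_problems(key: str, value: str) -> List[str]:
    """Return a list of human-readable problems found in one key/value pair."""
    problems = []
    # "/family/key" splits into ["", "family", "key", ...]
    parts = key.split("/")
    if len(parts) < 3 or parts[0] != "" or not parts[1] or not parts[2]:
        problems.append("key is not of the form /family/key")
    if any(not ch.isprintable() for ch in key):
        problems.append("key contains non-printable characters")
    if any(not ch.isprintable() for ch in value):
        problems.append("value contains non-printable characters")
    return problems
```

An empty list means the entry passed these (deliberately minimal) checks; a structural check of the database file itself would still be a separate step.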

By: Leif Madsen (lmadsen) 2010-05-10 10:57:21

Changed category to Core/General.

By: Leif Madsen (lmadsen) 2010-05-25 14:49:36

As this issue is so old, I'm going to close it out. If the reporter is still having issues then please feel free to reopen the issue and provide some additional information. Please test against the latest 1.6.2 release.