Summary:ASTERISK-10254: crash in ast_obj2 (deletion)
Reporter:Steve Murphy (murf)Labels:
Date Opened:2007-09-07 12:03:50Date Closed:2011-06-07 14:00:55
Versions:Frequency of
Environment:Attachments:( 0) astobj2test.patch
( 1) segv-sequence.txt
What happens: in ao2_ref(),

/* for safety, zero-out the astobj2 header and also the
* first word of the user-data, which we make sure is always
* allocated. */
bzero(obj, sizeof(struct astobj2 *) + sizeof(void *) );
ast_atomic_fetchadd_int(&ao2.total_objects, -1);

Freeing the obj is lethal, because it is still referenced by the bucket.
From looking at the code, apparently the ao2_ref() does not have access to
the container, so it cannot unlink the bucket.

Because the bucket is still there, and points to unallocated mem, subsequent memory allocations will re-use the space, and searching for key "A", which would otherwise yield no match, trips over garbaged memory and the process aborts.

This is not an unusual situation.

My personal guess is that memory should not be freed in ao2_ref(), and that buckets with 0 refs need to be purged in more places. Actually, every possible operation that might bump into them at the container level. This is a bit messy.

****** STEPS TO REPRODUCE ******

In a nutshell: Add element with key "A".
              Delete it (unref)
              Add a bunch of other elements.
              look for "A".


I found this problem by converting a test program that I wrote for a different hashtable implementation to work with ast_obj2 instead. This was the first problem I found, that would involve changing the ast_obj2 code to get around it.

I will attach a patch that includes the test program. It basically does additions, removals, and random lookups in a hash table. I don't think that any of its operations are unreasonable or unusual. But it is devilishly good at finding weak spots in algorithms. The program takes one arg on the command line, a number, that tells it how many threads to run. I've been running with "1", until it completes successfully, then I'd run it with "10". There may be other bugs, both in my test program, and in the ast_obj2 routines, but they'll have to be solved in order. Please, let me know about any changes you may have to make to the test program; I am planning on adding some more code to randomly iterate over the objects in the hash table, to see if there's any weaknesses there.
Comments:By: Russell Bryant (russell) 2007-09-07 12:40:23

I don't think this is correct.  This free() will never occur if the object is still in a container.  The reference count includes a reference for each container it is in.

By: Steve Murphy (murf) 2007-09-07 13:01:07

A misconception on my part; I was using ao2_ref(obj,-2) to unlink the container,
when I should have been using ao2_unlink(c,obj) instead. Sorry!

The test harness passes just fine this way, even with 10 threads. The algorithm is vindicated!