Summary:ASTERISK-05127: app_pickup deadlocks asterisk
Reporter:xrobau (xrobau)Labels:
Date Opened:2005-09-22 03:53:41Date Closed:2008-01-15 15:48:52.000-0600
Versions:Frequency of
Environment:Attachments:( 0) fix_dp.diff
Description:I've been trying to use the new 'app_pickup' introduced in ASTERISK-4743, and as soon as you try to actually do a pickup, asterisk deadlocks everything 8-( - you can't even do a 'stop now'


Phone 1 calling phone 2:
   -- Executing Dial("SIP/303-b452", "SIP/300|15|tr") in new stack
   -- Called 300
   -- SIP/300-aad0 is ringing
Phone 3, using "exten => _**XXX,1,Pickup(${EXTEN:2})":
   -- Executing Pickup("SIP/301-70f4", "300") in new stack
   -- Executing Macro("SIP/301-70f4", "hangupcall") in new stack
   -- Executing ResetCDR("SIP/301-70f4", "w") in new stack
   -- Executing NoCDR("SIP/301-70f4", "") in new stack
   -- Executing Wait("SIP/301-70f4", "5") in new stack
   -- Executing Hangup("SIP/301-70f4", "") in new stack

(Note, I do AMP development, so this is an AMP based dialplan, in case it looks familiar)

At this stage, asterisk is wedged, and only a 'killall -9 asterisk' un-wedges it.
Comments:By: Andrew Lindh (andrew) 2005-09-22 07:06:19

I have the same problem with the new directed pickup app.

By: Tilghman Lesher (tilghman) 2005-09-22 08:52:22

After running a 'stop now', can you attach with gdb, get a backtrace, and upload it to this bug?

By: xrobau (xrobau) 2005-09-22 15:33:52

There's nothing actually in the backtrace. 'thread apply all bt' gives nothing, and a 'bt full' just has main() in it. However, after turning on all the debugging, I notice that /var/log/asterisk/full is full of:

Sep 23 06:29:32 DEBUG[11929] chan_sip.c: Failed to grab lock, trying again...

Approx 20 pages of that per second.

By: xrobau (xrobau) 2005-09-22 16:08:34

OK. Broken-by-design, and this would have never worked.

Line 54 should be 'int res = 0' rather than int res = -1, as -1 is meant to be set  only when there's no target channel found (line 89).

Line 98 does a if (res) [which will match the -1 originally set] and then exits.

However, this shouldn't be crashing asterisk like this. From the intent of the code, this should just exit unhappily, which it does - but with the unfortunate side effect of the lockup when the _called_ party hangs up. Is the 'ast_mutex_unlock' causing it grief?

By: Russell Bryant (russell) 2005-09-22 23:04:23

Can you take a look at this, please?

By: Joshua C. Colp (jcolp) 2005-09-23 11:18:42

Please find attached a patch to fix Directed Pickup. Essentially what was happening was that the code to actually do the pickup was not being executed due to a wrong initialization value. It was also not unlocking the target channel it found, thus causing a deadlock. This patch changes the initialization value to 0, and adds some extra unlocking in scenarios when an error would occur - previously the target channel would also not be unlocked there as well.

By: Andrew Lindh (andrew) 2005-09-23 11:49:29

Much better! (dare I say fixed?) I can pickup calls now and so far...no deadlocks. This is worth a CVS commit just to fix what's very broken now!

It would also be nice to pickup calls on "hold" too, not just ringing.

Should these be application options? It would be nice:
Pickup(ext@context,rnha) (Rining=default, NO ringing, Holding, steal ANY call state)

Options would allow for different types of users and policies for places that want on-hold pickup (like me) and some that don't....

By: Joshua C. Colp (jcolp) 2005-09-23 11:56:30

I'll play around and see if I can expand it further. But due to the architecture of things it's not exactly easy to steal away a call as you may think. Nonetheless if there's interest I'll pursue it in due time.

By: Andrew Lindh (andrew) 2005-09-23 12:05:51

The @context part is still not working for me.

When there is an incoming call to x310 then Pickup(310) works, but Pickup(310@netplex-inside) does not.

When I do a "show channel SIP/310-d875" (the rining call at the time), State is Rinning(5) and Context is netplex-inside. But the Pickup does not work.

By: Joshua C. Colp (jcolp) 2005-09-23 15:49:08

Okay folks - I've talked to andrew and we've figured out the source of his problem. Keep in mind that when you're using something like a Goto, you need to use the extension and context that triggered the Dial for Pickup to work. For example:

exten => 145,1,Goto(phones,145,1)

exten => 145,1,Dial(SIP/polycom1_line1)

For pickup you would need to use 145@phones - NOT 145@internal

By: Andrew Lindh (andrew) 2005-09-23 16:25:22

Thanks for the good work! The context stuff will be a user-support issue that they'll have to figure out...

It would be nice to have Pickup return on failure (or optionally after complete) like Dial with a status code so that more than one pickup can be tried. This would be needed if there is more than one context where a call can hide. IE. internal calls (exten to exten) and inbound external calls (gateway to exten) that exist in different (inside vs. outside) contexts.

Or to be able to play a file on pickup failure...IE "no call found to pickup" rather than just congestion.

By: Andrew Lindh (andrew) 2005-09-23 16:28:16


exten => _*3XX,1,Pickup(${EXTEN:1}@netplex-inside2)
exten => _*3XX,n,Pickup(${EXTEN:1}@netplex-main)
exten => _*3XX,n,Congestion

By: Digium Subversion (svnbot) 2008-01-15 15:48:52.000-0600

Repository: asterisk
Revision: 6634

U   trunk/apps/app_directed_pickup.c

r6634 | markster | 2008-01-15 15:48:51 -0600 (Tue, 15 Jan 2008) | 2 lines

Fix directed pickup deadlock bug (bug ASTERISK-5127)