Summary: | ASTERISK-12695: wait_for_answer never receives HANGUP frame sent via ast_queue_hangup | ||
Reporter: | guy viviers (gui) | Labels: | |
Date Opened: | 2008-09-08 14:24:27 | Date Closed: | 2011-06-07 14:02:49 |
Priority: | Major | Regression? | No |
Status: | Closed/Complete | Components: | Applications/app_dial |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ||
Description: |
1) A user calls into our Asterisk pbx via one of our PSTN lines and dials a SIP extension.
2) The caller hangs up the PSTN line before anyone answers the SIP extension.
3) Our channel driver calls ast_queue_hangup to inform Asterisk of the hangup.
4) Asterisk never acknowledges the hangup and the call eventually times out.

****** ADDITIONAL INFORMATION ******

I made a change in our code that fixes this problem but I wanted to run this by you guys because I think it points to a larger problem. The change is ...

<Code removed. Code _must_ be included as an attachment.>

The wait_for_answer function sleeps in ast_waitfor_n until something causes the poll system call to return. When the caller hangs up before anyone answers our channel driver calls ast_queue_hangup, which queues a HANGUP control frame on the channel's read queue and sends an interrupt signal to the sleeping thread. The poll function in ast_waitfor_n returns -1 because of the interrupt which causes ast_waitfor_n to return 0. When wait_for_answer receives a 0 return value from ast_waitfor_n it doesn't check the caller's read queue.

The change I made causes wait_for_answer to check the calling channel's read queue upon return from ast_waitfor_n, but I believe the burden should be on ast_waitfor_n to return successful status not only when it receives an RTP frame but when it receives a frame via a channel's read queue too. I didn't make this change to ast_waitfor_n (actually ast_waitfor_nandfds) myself because it is called by many other functions and the changes that I make could cause unintended side-effects to code that is used to its current behavior.

Regards, gui | ||
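Since the reporter's patch was stripped from the ticket, the following is only a rough, self-contained sketch of the idea described above, not the actual change and not Asterisk code. Every name in it (`struct channel`, `queue_frame`, `wait_for_fd`, `wait_step`, and the enums) is a hypothetical stand-in: `wait_for_fd` models a wait that reports a channel only when its descriptor is readable, and `wait_step` models a caller that also inspects the read queue when the wait comes back empty.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are NOT Asterisk types. */
enum frame_type { FRAME_VOICE, FRAME_CONTROL_HANGUP };
enum wait_result { WAIT_NOTHING, WAIT_FRAME, WAIT_HANGUP };

struct frame {
    enum frame_type type;
    struct frame *next;
};

struct channel {
    struct frame *readq; /* frames queued by the driver, oldest first */
    int fd_ready;        /* models "poll() reported this channel readable" */
};

/* Models a driver queuing a frame on the tail of the channel's read queue. */
static void queue_frame(struct channel *chan, struct frame *f)
{
    struct frame **tail;

    assert(chan != NULL && f != NULL);
    f->next = NULL;
    for (tail = &chan->readq; *tail != NULL; tail = &(*tail)->next) {
        /* walk to the end of the queue */
    }
    *tail = f;
}

/* Models a wait that looks only at descriptor readiness, so a frame that
 * was queued without making the descriptor readable goes unnoticed. */
static struct channel *wait_for_fd(struct channel *chan)
{
    return chan->fd_ready ? chan : NULL;
}

/* Removes and returns the oldest queued frame, or NULL if the queue is empty. */
static struct frame *dequeue_frame(struct channel *chan)
{
    struct frame *f = chan->readq;

    if (f != NULL) {
        chan->readq = f->next;
        f->next = NULL;
    }
    return f;
}

/* One iteration of a wait loop with the workaround applied: when the wait
 * reports no ready channel, check the caller's read queue before sleeping again. */
static enum wait_result wait_step(struct channel *caller)
{
    struct channel *winner = wait_for_fd(caller);
    struct frame *f;

    if (winner == NULL && caller->readq != NULL) {
        winner = caller; /* a queued frame is pending even though the wait saw nothing */
    }
    if (winner == NULL) {
        return WAIT_NOTHING;
    }

    f = dequeue_frame(winner);
    if (f == NULL) {
        return WAIT_NOTHING;
    }
    return f->type == FRAME_CONTROL_HANGUP ? WAIT_HANGUP : WAIT_FRAME;
}
```

In this model, queuing a `FRAME_CONTROL_HANGUP` frame while `fd_ready` is 0 leaves `wait_for_fd` returning NULL, yet `wait_step` still reports `WAIT_HANGUP`. The alternative the reporter argues for would move that queue check into the wait function itself, so that every caller benefits.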
Comments: |
By: Mark Michelson (mmichelson) 2008-09-08 17:38:06

Hmm, I have a feeling that there's nothing broken, per se, but that documentation regarding expected return values of certain functions is lacking. For instance, you've stated that the AST_CONTROL_HANGUP frame is not detected properly in app_dial's wait_for_answer function.

Here's something important to note. If a channel is ever returned by ast_waitfor_n and then an ast_read of that channel returns NULL, then you can safely take that to mean that the channel has hung up. For an example, look at app_dial's wait_for_answer function. If you look at the 1.4.21.2 tag and look at line 563, you'll see that a null frame pointer is taken to mean that a hangup has occurred. Another example of this is on line 703.

I hope this has been helpful. If it hasn't and I'm completely missing the point of this bug report, let me know.

By: guy viviers (gui) 2008-09-09 09:43:54

Hi putnopvut,

Thanks for your reply. Having spent 2 full days tracking down the source of this bug, I am aware of how ast_read behaves when it is called and finds an AST_CONTROL_HANGUP frame on its read queue. The change that I made simply causes ast_read to be called when ast_waitfor_n returns and the code finds something on its input channel's read queue.

The only point that I was trying to make was that it would be nice if ast_waitfor_n returned successful status any time it was safe to call ast_read, which is what is implied. The reason I mentioned this is that I suspect there are other instances of code within Asterisk that call ast_waitfor_n and, like wait_for_answer, don't behave as expected.

As far as we're concerned this issue is closed because wait_for_answer is now behaving as we expect. I only made this bug report in the "spirit of sharing" (yeeewwww ... being an evil capitalist at heart, that phrase gives me the creeps!)

Regards, gui

By: Joel Vandal (jvandal) 2008-09-16 09:02:44

Using the latest branches/1.4 (SVN rev 143202), I got these locks. Maybe this is related to this ticket?

=======================================================================
=== Currently Held Locks ==============================================
=======================================================================
===
=== <file> <line num> <function> <lock name> <lock addr> (times locked)
===
=== Thread ID: 2997132192 (pbx_thread started at [ 2645] pbx.c ast_pbx_start())
=== ---> Lock #0 (channel.c): MUTEX 1451 ast_hangup &chan->lock 0xb28b59a8 (1)
=== ---> Lock #1 (chan_local.c): MUTEX 519 local_hangup &p->lock 0xb28ff118 (1)
=== ---> Tried and failed to get Lock #2 (channel.c): MUTEX 962 ast_queue_hangup &chan->lock 0xb2823b10 (1)
=== -------------------------------------------------------------------
===
=== Thread ID: 2960927648 (pbx_thread started at [ 2645] pbx.c ast_pbx_start())
=== ---> Lock #0 (channel.c): MUTEX 2613 ast_write &chan->lock 0xb2823b10 (1)
=== ---> Waiting for Lock #1 (chan_local.c): MUTEX 313 local_write &p->lock 0xb28ff118 (1)
=== --- ---> Locked Here: chan_local.c line 519 (local_hangup)
=== -------------------------------------------------------------------
===
=======================================================================

By: Russell Bryant (russell) 2008-10-05 16:37:29

There are a number of problems that could be the cause of this within the channel driver itself. Since you're using a custom channel driver, we cannot support you here. If you are able to reproduce this without any custom code in use, then feel free to reopen this issue. |