Summary: | ASTERISK-05861: [patch] possible incomplete locking in sip_pvt's packets list | ||
Reporter: | Luigi Rizzo (rizzo) | Labels: | |
Date Opened: | 2005-12-17 12:11:11.000-0600 | Date Closed: | 2006-03-06 18:08:18.000-0600 |
Priority: | Minor | Regression? | No |
Status: | Closed/Complete | Components: | Core/General |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) packets_lock.diff | |
Description: | I am not 100% sure, but i am under the impression that the "packets" list in chan_sip.c :: struct sip_pvt is not locked properly. Specifically, __sip_ack() navigates the list without holding p->lock, and only acquires that when removing the target from the list. There is however another procedure, retrans_pkt(), which can remove records from the list (correctly protected by the p->lock), which means that there might be a race condition where __sip_ack() will try to dereference a freed entry. The attached patch moves the lock outside so that list navigation is protected. Of course I may be wrong, but it is not obvious at all what whold prevent the above race, so if the current code is correct then a proper comment explaining why it is correct should be committed. | ||
Comments: | By: Luigi Rizzo (rizzo) 2005-12-18 17:32:39.000-0600 followup with more details: __sip_ack() most of the times is called with p already locked. There is however one exception in the path sip_reg_timeout() ==> __sip_pretend_ack() ==> __sip_ack() where __sip_ack() is called without the lock. Unfortunately in the above case __sip_pretend_ack() also accesses the p->packets list without holding the lock, so grabbing the lock in __sip_ack() as in the original code or in my patch is still not sufficient. I now think the proper fix would be to: - adequately comment the assumptions and code paths that use __sip_ack() - acquire and release the lock around the call to __sip_pretend_ack() in sip_reg_timeout() - possibly remove the lock in __sip_ack() because it is now unnecessary (doesn't harm though); Note, i am not very familiar with this code so it would be nice if the original author or someone more expert than me could review the above analysis. By: Olle Johansson (oej) 2006-01-09 14:39:44.000-0600 I can't possibly comment on this, since the locking is still a bit of magic to me. Just happy someone tries to clear up that mess... Anyone else that can make a comment? By: Olle Johansson (oej) 2006-01-30 04:38:14.000-0600 We still need a locking-expert to look at this issue. /O By: Tilghman Lesher (tilghman) 2006-03-06 18:07:58.000-0600 You're correct, although I can see where it could have been missed. The issue is that there are only two threads which are allowed to access that list: the channel thread and the monitor thread. There exists a race condition right now where the monitor thread could queue a packet that could be lost, if the formerly-first packet on the list is to be ack'ed. Committed the portion related to this race to 1.2, merged to trunk. |