Summary: | ASTERISK-09912: chan_iax2 VNAK storm seen | ||
Reporter: | Steve Davies . (stevedavies) | Labels: | |
Date Opened: | 2007-07-20 04:47:24 | Date Closed: | 2007-08-01 17:06:59 |
Priority: | Minor | Regression? | No |
Status: | Closed/Complete | Components: | Channels/chan_iax2 |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ||
Description: | Saw this happening on a customer between two boxes: Tx-Frame Retry[-01] -- OSeqno: 064 ISeqno: 060 Type: IAX Subclass: LAGRP Timestamp: 200069ms SCall: 00061 DCall: 00059 [196.40.106.94:4569] Rx-Frame Retry[ No] -- OSeqno: 011 ISeqno: 011 Type: IAX Subclass: VNAK Timestamp: 29274ms SCall: 00047 DCall: 00061 [196.40.106.94:4569] Tx-Frame Retry[-01] -- OSeqno: 064 ISeqno: 060 Type: IAX Subclass: LAGRP Timestamp: 200069ms SCall: 00061 DCall: 00059 [196.40.106.94:4569] Rx-Frame Retry[ No] -- OSeqno: 011 ISeqno: 011 Type: IAX Subclass: VNAK Timestamp: 29285ms SCall: 00047 DCall: 00061 [196.40.106.94:4569] Tx-Frame Retry[-01] -- OSeqno: 064 ISeqno: 060 Type: IAX Subclass: LAGRP Timestamp: 200069ms SCall: 00061 DCall: 00059 [196.40.106.94:4569] Rx-Frame Retry[ No] -- OSeqno: 011 ISeqno: 011 Type: IAX Subclass: VNAK Timestamp: 29294ms SCall: 00047 DCall: 00061 [196.40.106.94:4569] Tx-Frame Retry[-01] -- OSeqno: 064 ISeqno: 060 Type: IAX Subclass: LAGRP Timestamp: 200069ms SCall: 00061 DCall: 00059 [196.40.106.94:4569] Rx-Frame Retry[ No] -- OSeqno: 011 ISeqno: 011 Type: IAX Subclass: VNAK Timestamp: 29305ms SCall: 00047 DCall: 00061 [196.40.106.94:4569] Tx-Frame Retry[-01] -- OSeqno: 064 ISeqno: 060 Type: IAX Subclass: LAGRP Timestamp: 200069ms SCall: 00061 DCall: 00059 [196.40.106.94:4569] Rx-Frame Retry[ No] -- OSeqno: 011 ISeqno: 011 Type: IAX Subclass: VNAK Timestamp: 29314ms SCall: 00047 DCall: 00061 [196.40.106.94:4569] Tx-Frame Retry[-01] -- OSeqno: 064 ISeqno: 060 Type: IAX Subclass: LAGRP Timestamp: 200069ms SCall: 00061 DCall: 00059 [196.40.106.94:4569] Rx-Frame Retry[ No] -- OSeqno: 011 ISeqno: 011 Type: IAX Subclass: VNAK Timestamp: 29341ms SCall: 00047 DCall: 00061 [196.40.106.94:4569] Tx-Frame Retry[-01] -- OSeqno: 064 ISeqno: 060 Type: IAX Subclass: LAGRP Timestamp: 200069ms SCall: 00061 DCall: 00059 [196.40.106.94:4569] Rx-Frame Retry[ No] -- OSeqno: 011 ISeqno: 011 Type: IAX Subclass: VNAK Timestamp: 29351ms SCall: 00047 DCall: 00061 [196.40.106.94:4569] Tx-Frame Retry[-01] -- OSeqno: 064 ISeqno: 060 Type: IAX Subclass: LAGRP Timestamp: 200069ms SCall: 00061 DCall: 00059 [196.40.106.94:4569] Rx-Frame Retry[ No] -- OSeqno: 011 ISeqno: 011 Type: IAX Subclass: VNAK Timestamp: 29360ms SCall: 00047 DCall: 00061 [196.40.106.94:4569] Tx-Frame Retry[-01] -- OSeqno: 064 ISeqno: 060 Type: IAX Subclass: LAGRP Timestamp: 200069ms SCall: 00061 DCall: 00059 [196.40.106.94:4569] Rx-Frame Retry[ No] -- OSeqno: 011 ISeqno: 011 Type: IAX Subclass: VNAK Timestamp: 29369ms SCall: 00047 DCall: 00061 [196.40.106.94:4569] Tx-Frame Retry[-01] -- OSeqno: 064 ISeqno: 060 Type: IAX Subclass: LAGRP Timestamp: 200069ms SCall: 00061 DCall: 00059 [196.40.106.94:4569] Rx-Frame Retry[ No] -- OSeqno: 011 ISeqno: 011 Type: IAX Subclass: VNAK Timestamp: 29378ms SCall: 00047 DCall: 00061 [196.40.106.94:4569] Tx-Frame Retry[-01] -- OSeqno: 064 ISeqno: 060 Type: IAX Subclass: LAGRP Timestamp: 200069ms SCall: 00061 DCall: 00059 [196.40.106.94:4569] Rx-Frame Retry[ No] -- OSeqno: 011 ISeqno: 011 Type: IAX Subclass: VNAK Timestamp: 29387ms SCall: 00047 DCall: 00061 [196.40.106.94:4569] etc etc etc etc It did stop eventually when: Rx-Frame Retry[ No] -- OSeqno: 010 ISeqno: 014 Type: IAX Subclass: HANGUP Timestamp: 30238ms SCall: 00047 DCall: 00061 [196.40.106.94:4569] CAUSE CODE : 0 Tx-Frame Retry[-01] -- OSeqno: 014 ISeqno: 011 Type: IAX Subclass: ACK Timestamp: 30238ms SCall: 00061 DCall: 00047 [196.40.106.94:4569] Rx-Frame Retry[ No] -- OSeqno: 014 ISeqno: 014 Type: IAX Subclass: VNAK Timestamp: 30245ms SCall: 00047 DCall: 00061 [196.40.106.94:4569] Tx-Frame Retry[ No] -- OSeqno: 000 ISeqno: 000 Type: IAX Subclass: INVAL Timestamp: 00000ms SCall: 00061 DCall: 00047 [196.40.106.94:4569] | ||
Comments: | By: Russell Bryant (russell) 2007-07-20 17:20:58 There have been some fixes in the past week related to this. Please try with the latest code in the 1.4 branch. By: Digium Subversion (svnbot) 2007-08-01 16:58:55 Repository: asterisk Revision: 77887 ------------------------------------------------------------------------ r77887 | russell | 2007-08-01 16:58:53 -0500 (Wed, 01 Aug 2007) | 23 lines Fix some race conditions which have been causing weird problems in chan_iax2. The most notable problem is that people have been seeing storms of VNAK frames being sent due to really old frames mysteriously being in the retransmission queue and never getting removed. It was possible that a dynamic thread got created, but did not acquire its lock before the thread that created it signals it to perform an action. When this happens, the thread will sleep until it hits a timeout, and then get destroyed. So, the action never gets performed and in some cases, means a frame doesn't get transmitted and never gets freed since the scheduler never gets a chance to reschedule transmission. Another less severe race condition is in the handling of a timeout for a dynamic thread. It was possible for it to be acquired to perform at action at the same time that it hit a timeout. When this occurs, whatever action it was acquired for would never get performed. (patch contributed by Mihai and SteveK) (closes issue ASTERISK-9946) (closes issue ASTERISK-9912) (closes issue ASTERISK-9902) (possibly related to issue ASTERISK-10004) ------------------------------------------------------------------------ By: Digium Subversion (svnbot) 2007-08-01 17:06:59 Repository: asterisk Revision: 77889 ------------------------------------------------------------------------ r77889 | russell | 2007-08-01 17:06:58 -0500 (Wed, 01 Aug 2007) | 31 lines Merged revisions 77887 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4 ........ r77887 | russell | 2007-08-01 17:16:17 -0500 (Wed, 01 Aug 2007) | 23 lines Fix some race conditions which have been causing weird problems in chan_iax2. The most notable problem is that people have been seeing storms of VNAK frames being sent due to really old frames mysteriously being in the retransmission queue and never getting removed. It was possible that a dynamic thread got created, but did not acquire its lock before the thread that created it signals it to perform an action. When this happens, the thread will sleep until it hits a timeout, and then get destroyed. So, the action never gets performed and in some cases, means a frame doesn't get transmitted and never gets freed since the scheduler never gets a chance to reschedule transmission. Another less severe race condition is in the handling of a timeout for a dynamic thread. It was possible for it to be acquired to perform at action at the same time that it hit a timeout. When this occurs, whatever action it was acquired for would never get performed. (patch contributed by Mihai and SteveK) (closes issue ASTERISK-9946) (closes issue ASTERISK-9912) (closes issue ASTERISK-9902) (possibly related to issue ASTERISK-10004) ........ ------------------------------------------------------------------------ |