ASTERISK-20754: rtp_engine's RTCP{Sent,Received} events should contain the Channel name

[Home]

Summary: ASTERISK-20754: rtp_engine's RTCP{Sent,Received} events should contain the Channel name

Reporter: Jaco Kroon (jkroon) Labels:

Date Opened: 2012-11-29 09:16:20.000-0600 Date Closed: 2013-07-05 12:35:12

Priority: Trivial Regression? No

Status: Closed/Complete Components: Resources/res_rtp_asterisk

Versions: 1.8.18.0 Frequency of
Occurrence

Related
Issues:
must be completed before resolving ASTERISK-21471 Stasis Core - Refactor RTP/RTCP Events

is related to ASTERISK-16404 [branch] Pinefrog - RTCP improvements

Environment: Attachments: ( 0) asterisk-rtcp-channel.patch

Description: It's all good and well being able to detect bad links, but in some cases we really want to be able to associate which calls were affected afterwards. In combination with tracking which calls are using which channels (monitoring the AMI) and having the RTCP data associated with a specific channel this can be achieved externally to asterisk, hopefully allowing us to figure out which calls exactly were bad.

This is particularly important for us since we communicate with SBCs, but just knowing that we had packet loss coming from the SBC is not good enough to figure out where the loss originated, and we need to be able to provide A and B numbers to our peering partners to be able to track the path that the calls took on their networks. With this information I can pro-actively monitor RTP traffic and automatically file appropriate reports, and instead of becoming reative when my clients reports problems I can find them and fix them before they moan at me (and just notify them that some bad link got sorted out).

Comments: By: Jaco Kroon (jkroon) 2012-11-29 12:20:25.346-0600

Ok, what I thought would be a working patch, but it turns out that the sip_pvt's owner in the dialog setup when rtp get's setup is NULL, so that doesn't help. Am I just missing something obvious or is there a good reason for this?
By: Jaco Kroon (jkroon) 2012-11-30 08:48:20.990-0600

This one fixes it for at least SIP. Multicast RTP is simple enough that I can almost give you a letter it should just work.

Unfortunately I was braindead enough to base this against the 1.8.18.0 version of asterisk ... just completed a checkout of trunk, will rebase against that, but I do need it for 1.8.18.0 for the moment at least.

Code audit is in order. For each channel type I basically split the updating of ->owner into a function that iterates the various ->rtp, ->vrtp and ->trtp (as appropriate) members and updating their owners too.

It should be noted that up until now there were no users of the ast_rtp_instance_get_chan() function that I could find (at least not in 1.8.18.0), thus these changes should not affect existing functionality (except if I somehow managed to introduce a new bug).
By: Rusty Newton (rnewton) 2012-12-06 16:21:48.179-0600

Acknowledged this. It may get changed into New Feature after others look at it. Then it would likely only go into Trunk, so if you can get a trunk patch that would be great.
By: Jaco Kroon (jkroon) 2013-05-21 04:32:16.729-0500

Matt, yes! Related, the code from Olle to get the channel name listed is actually a much cleaner mechanism than what mine is (much less intrusive and doesn't rely on channel drivers doing the "right" thing). So I think you really rather snarf the appropriate bits from that patch. If my head comes for gulping some air in the next few days I'll see if I can do that for you. However, I really am after a bigger patch that improves the RTCP AMI events in general. Got a proposal but it may be too much of a change according to some for going into 11, and there is already refactoring for trunk ... so that leaves us in a bit of a pinch. I did make it configurable that you get the current behaviour by default and my alternate behaviour on explicit configuration.
By: Matt Jordan (mjordan) 2013-05-21 06:42:11.109-0500

Any improvements to RTCP events would have to be done in trunk. There's a lot of significant work being done in the interfaces area that will actually make it easier to use RTCP information throughout Asterisk, not just for AMI. Since this patch was related to that, I figured I'd take a look at it to see what could be pulled up in relation to that work.

Pinefrog obviously also comes to mind as something that benefits from this, as it uses the RTCP related information for CQR as well.

While it is important to have the channel information related to the RTP instance, that actually already occurs when two RTP capable channels are natively bridged. That isn't quite as extensive as what you do here in this patch, but I'm thinking of trying to marry those two concepts.

One problem this patch does appear to have are masquerades. When a masquerade occurs, the channel pointer becomes invalid as the RTP instance will be put onto another channel object. The original channel object can be disposed of, since the RTP instance doesn't ref bump the channel object (as an aside, it shouldn't ref bump the channel object as it doesn't own the channel object). This means there are potential crashes here during transfer operations, call pickup, and other scenarios that currently use masquerades.

While masquerades are much less frequent in Asterisk 12, they can still occur in rare cases, so having a perma-pointer back to the channel may be tricky. There are masquerade fixup callbacks that could be used to update the channel pointer in the RTP instance; we may have to use them.
By: Matt Jordan (mjordan) 2013-05-21 13:13:50.420-0500

I've been thinking about this some more.

One of the problems with using the actual {{ast_channel}} object in the RTP stack is that the {{ast_channel}} objects can get destroyed out from underneath you. You can either attempt to fixup the RTP instances whenever the {{ast_channel}} object goes away, or you can ref count the {{ast_channel}} object in the RTP stack. The first option is prone to missing a fixup spot and getting a seg fault; the latter is prone to having ref count leaks (and, since it creates a circular dependency, is a bad idea (tm))

However, we really don't need the {{ast_channel}} object, as we're really just interested in getting some aspect of the channel's state when an AMI event needs to be fired. In general, we just want the name - but we're not looking to actually change the channel state. We really want a read/only version of the channel.

There is an option in trunk that provides that. The Stasis-Core cache maintains the state of all channels. If we store the channel's uniqueid - which never changes, even during a masquerade - we can use that to snag the channel state for our events. And we don't have to worry about all of the masquerades and fixups and ref leaks.

I think I'm going to go with that approach here, as it feels a lot safer and makes use of the channel state cache (which is awfully handy for things like this)