Looking for a way to configure UDP ports used in Peer-to-peer for 1 to 1 sessions (P2P4121)

Hi there folks

Thanks for this wonderful project, we’ve started to play around with it during the pandemic and so far we are having a lot of fun with it.

Keywords

  • QoS / DSCP marking
  • UDP port configuration for p2p4121 meetings (1:1)

Background / Scope
Healthcare provider with lots of branch offices (read: complex network), that’s behind at least 72 corporate firewalls and uses at least 892374 different applications and OS’s (read: network congestion is a problem at times).

What we use Jitsi for
Mostly for 1:1 meetings. During the first wave of the pandemic our doctors used it to talk to patients (at home) and somehow it grew into their preferred application for 1:1 video calls. While it was never intended nor set up to be used in production for a prolonged period of time, they have started to use it for internal video calls between the branches too. And obviously they don’t want to have it taken away anymore, so we want to integrate it properly.

Our problem
We noticed that internal 1:1 calls get set up in true p2p fashion between the two participants. This behaviour is great and the blog posting and presentation below shed some light on it:
https://jitsi.org/blog/p2p4121/
https://www.slideshare.net/GeorgePolitis2/optimizing-the-infrastructure-costs-and-call-quality-of-web-rtc-based-group-calls?next_slideshow=1

What we also notice is that the UDP ports used for such p2p4121 media streams seem to fall within the range 49152 to 65535. According to some random guy on the internet this is IANA’s dynamic/private port range, which gets used when an application doesn’t select a particular port (i.e., in sockets terminology, it doesn’t bind() its socket to a particular port).

stackoverflow [dot] com/questions/63152801/how-client-port-number-of-websocket-get-in-google-chrome

So far we couldn’t figure out how we could change that port range. Does anyone have an idea, how the UDP port range used for media during p2p videocalls can be changed/configured?

Why?
QoS. If we can specify that p2p media streams from Jitsi are always carried within a specific UDP port range, we can tag those packets with the corresponding DSCP value. This helps us to have flawless video calls during times of congestion on the internal network.
Right now no QoS mechanisms are applied to the videostreams and they can stutter at times of heavy network load as they get no special treatment.

tl;dr - Goal we want to achieve?
We want internal jitsi p2p mediastreams to get prioritized handling during network congestion / EF marking which currently is not the case.

I’m the VoIP guy so forgive me if the described networking part isn’t very precise. Also apologies if this can be configured within the config files - I couldn’t find it / didn’t know where I should look.

Any ideas or suggestions on how we could achieve an EF marking on the media stream during 1:1 calls? We don’t like marking everything within the 49152 to 65535 range as EF, as it could be almost anything.

Best regards
Pongo Abelii

PS
This is where I’m coming from / workflow I’m used to to achieve what I’m looking for:

  1. Define a port range for RTP media streams (which I currently don’t know how to do with Jitsi for p2p calls)
  2. Make the client use this range
  3. Either make the client itself mark the packets within that range as EF traffic
  4. Or do the marking at the access layer switch

www [dot] cisco [dot] com/c/en/us/td/docs/voice_ip_comm/jabber/11_5/CJAB_BK_D00D8CBD_00_deployment-installation-guide-cisco-jabber115/CJAB_BK_D00D8CBD_00_deployment-installation-guide-cisco-jabber115_chapter_010001.html#CJAB_TK_DD601B77_00


The browser chooses these ports, and since they are ephemeral they (correctly) come from the ephemeral range. Jitsi Meet is working at a higher layer (WebRTC RTCPeerConnection) than the UDP sockets, so it doesn’t have much control over this.
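To see which ports the browser actually picked for a call, the ICE candidate strings can be inspected. A minimal sketch, assuming the standard candidate layout (foundation, component, transport, priority, address, port, `typ`, type); the helper name `parseIceCandidate` and the sample candidate are made up for illustration:

```javascript
// Hypothetical helper: extracts transport, address, port and type from an
// ICE candidate string such as the one found in RTCIceCandidate.candidate.
function parseIceCandidate(candidateLine) {
  const parts = candidateLine.replace(/^a=/, '').trim().split(/\s+/);
  // Layout: candidate:<foundation> <component> <transport> <priority>
  //         <address> <port> typ <type> ...
  if (parts.length < 8 || !parts[0].startsWith('candidate:') || parts[6] !== 'typ') {
    return null;
  }
  const port = Number(parts[5]);
  if (!Number.isInteger(port) || port < 0 || port > 65535) {
    return null;
  }
  return {
    transport: parts[2].toLowerCase(),
    address: parts[4],
    port,
    type: parts[7],
  };
}

// Illustrative candidate, not taken from a real trace.
const sample = 'candidate:1 1 udp 2122260223 192.0.2.10 53211 typ host';
const parsed = parseIceCandidate(sample);
console.log(parsed.port >= 49152 && parsed.port <= 65535
  ? 'port is in the 49152-65535 range'
  : 'port is outside the 49152-65535 range');
```

In a browser, the input would be `event.candidate.candidate` from the peer connection's `icecandidate` event.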

The best way to achieve this would be to have the browser do the DSCP tagging on the RTP traffic, as otherwise you’ll struggle to identify it once it’s out in the network. Google Chrome already supports this with a googDscp constraint, which enables DSCP marking on the RTP traffic associated with the RTCPeerConnection. For example:

new RTCPeerConnection({"iceServers": []}, {optional: [{googDscp: true}]});

A quick search of the lib-jitsi-meet codebase for googDscp doesn’t turn up anything, so you’d need to patch support in, which would be straightforward.
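A rough sketch of what such a patch could boil down to. The option name `enableDscp` and the helper `buildPeerConnectionConstraints` are hypothetical and not taken from lib-jitsi-meet; only the `googDscp` constraint itself comes from the example above. The second constructor argument is a legacy, Chrome-specific extension, so this is not portable code:

```javascript
// Hypothetical helper: turns a config flag into the legacy, Chrome-specific
// constraints object passed as the second RTCPeerConnection argument.
function buildPeerConnectionConstraints(options) {
  const optional = [];
  if (options && options.enableDscp) {
    optional.push({ googDscp: true });
  }
  return { optional };
}

const constraints = buildPeerConnectionConstraints({ enableDscp: true });
console.log(JSON.stringify(constraints));

// Only open a connection where the API exists (i.e. in a browser).
if (typeof RTCPeerConnection !== 'undefined') {
  const pc = new RTCPeerConnection({ iceServers: [] }, constraints);
  pc.close();
}
```

Keeping the flag off by default would leave existing deployments unchanged, which matters because DSCP-marked packets are not guaranteed to be handled well on every network.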

Firefox doesn’t support it yet but implementation is being tracked here. Since it shares the same WebRTC codebase I guess it wouldn’t be a huge implementation effort if someone was motivated to do it. Not sure about Safari.


Hi jbg

You’re on the right track, thank you for the input. I’ll have a talk with the devs that set up the whole Jitsi instance and will see if they somehow can patch it in for us.

Seems like the general consensus in the industry is to push the trust boundary of DSCP markings towards the application(s). Some studies* suggest that as of today there are very few adverse effects of marking the traffic at the application level and sending it over the global network (such as low-latency requests getting marked down to a lower-than-best-effort marking somewhere along the path). It also seems to be somewhat of a hot topic in the industry, so I expect more development and solutions in the coming months and years.

  • On the utility of unregulated IP DiffServ Code Point (DSCP) usage by end systems
    www [dot] sciencedirect [dot] com/science/article/pii/S0166531619300203

In January 2021 some of the big players created this RFC with the recommended DSCP marking for WebRTC QoS which basically is what we’re also looking for and hoping to see more of in the future:
https://datatracker.ietf.org/doc/html/rfc8837

Here is yet another blog posting on the topic that’s worth a read:

Since Chrome and Chromium-based web browsers have a whopping ~70% market share as of today, I guess in our case all the other browsers would fall victim to Occam’s razor and we’d simply let time sort things out for us without investing too much of our own resources into the problem.

Cheers
Pongo

PS
One workaround seems to be to switch off p2p4121 and relay all traffic through the TURN server. If you flip through the presentation linked in the initial posting it becomes clear that you’ll be paying a rather high price for the DSCP marking if you go down that path. Since everything gets routed through the TURN server, you’ll have higher latency and far more resource usage (CPU/RAM/bandwidth) on the TURN server than you would if you let things run in a p2p4121 fashion.

Hi all

We’ve figured it out, at least for Windows Clients that use Chrome / Chromium based engines.

There is a regkey you can set / distribute via Group Policy that allows you to define which UDP source port range should be used in WebRTC calls.

Restrict the range of local UDP ports used by WebRTC
https://admx.help/?Category=Chrome&Policy=Google.Policies.Chrome::WebRtcUdpPortRange

Registry Hive: HKEY_LOCAL_MACHINE or HKEY_CURRENT_USER
Registry Path: Software\Policies\Google\Chrome
Value Name: WebRtcUdpPortRange
Value Type: REG_SZ
Default Value:
Example: 10000-11999

Careful, this modifies the source port range for ALL local UDP ports used by WebRTC and not only those for Jitsi. Please carefully check what other WebRTC applications you are running in your environment before modifying it, as it might have an unexpected impact on production.
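When checking captures after rolling out the policy, it helps to test the observed source ports against the configured range. A small sketch, assuming the value uses the `low-high` form shown in the example above; `parsePortRange` and `isInRange` are names invented here, and the observed ports are illustrative:

```javascript
// Parses a "low-high" string such as the WebRtcUdpPortRange example value.
function parsePortRange(value) {
  const match = /^\s*(\d+)\s*-\s*(\d+)\s*$/.exec(value);
  if (!match) {
    throw new Error(`Not a port range: ${value}`);
  }
  const low = Number(match[1]);
  const high = Number(match[2]);
  if (low < 1 || high > 65535 || low > high) {
    throw new Error(`Invalid port range: ${value}`);
  }
  return { low, high };
}

// True when the port lies inside the range (bounds included).
function isInRange(port, range) {
  return port >= range.low && port <= range.high;
}

const range = parsePortRange('10000-11999');
const observedPorts = [10042, 53211]; // illustrative source ports from a capture
for (const port of observedPorts) {
  console.log(port, isInRange(port, range) ? 'inside range' : 'outside range');
}
```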

Initially we thought streaming services like YouTube could be affected as well (with YouTube video traffic then getting prioritized treatment too), since they seem to have moved towards WebRTC. I couldn’t verify this in Wireshark. At least in our case a YouTube video gets sent over TCP, thus the above setting has no impact.

Hope this helps.

Best regards
Pongo

Edit
On second thought, this doesn’t really change anything except moving the whole WebRTC port range from 49152 - 65535 to a custom range. It doesn’t allow us to specifically give Jitsi media streams prioritized treatment. Implementing jbg’s suggested patch in the application indeed seems to be the right way to go about this.

Edit2:
The w3c is already looking into it. It’s in Candidate Recommendation status (thus probably only a question of time until it becomes a w3c recommendation and somewhere down the road part of WebRTC).
WebRTC Priority Control API
www [dot] w3 [dot] org/TR/webrtc-priority/

7.1.1 Maturity Levels When Advancing a Technical Report Towards Recommendation

Working Draft (WD)
A Working Draft is a document that W3C has published for review by the community, including W3C Members, the public, and other technical organizations.

Candidate Recommendation (CR)
A Candidate Recommendation is a document that W3C believes has been widely reviewed and satisfies the Working Group’s technical requirements. W3C publishes a Candidate Recommendation to gather implementation experience.

Proposed Recommendation (PR)
A Proposed Recommendation is a mature technical report that, after wide review for technical soundness and implementability, W3C has sent to the W3C Advisory Committee for final endorsement.

W3C Recommendation (REC)
A W3C Recommendation is a specification or set of guidelines that, after extensive consensus-building, has received the endorsement of W3C Members and the Director. W3C recommends the wide deployment of its Recommendations. Note: W3C Recommendations are similar to the standards published by other organizations.
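For reference, a sketch of how the Priority Control API mentioned above is meant to be used from application code: it adds a `networkPriority` field (`very-low`, `low`, `medium` or `high`) to each entry of a sender's encoding parameters. The helper name `setNetworkPriority` is hypothetical, browser support should be verified before relying on it, and the fake sender at the end only exists so the snippet can be tried outside a browser:

```javascript
const PRIORITIES = ['very-low', 'low', 'medium', 'high'];

// Hypothetical helper: sets networkPriority on every encoding of a sender.
// `sender` is expected to behave like an RTCRtpSender.
async function setNetworkPriority(sender, priority) {
  if (!PRIORITIES.includes(priority)) {
    throw new Error(`Unknown priority: ${priority}`);
  }
  const params = sender.getParameters();
  for (const encoding of params.encodings || []) {
    encoding.networkPriority = priority;
  }
  await sender.setParameters(params);
  return params;
}

// Stand-in for an RTCRtpSender so the sketch runs without a real call.
const fakeSender = {
  params: { encodings: [{ active: true }] },
  getParameters() { return this.params; },
  setParameters(p) { this.params = p; return Promise.resolve(); },
};

const done = setNetworkPriority(fakeSender, 'high').then((params) => {
  console.log(params.encodings[0].networkPriority);
  return params;
});
```

In a real call the sender would come from `pc.getSenders()`, and how a given priority translates into an actual DSCP value is up to the browser.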


“You’re on the right track, thank you for the input. I’ll have a talk with the devs that set up the whole Jitsi instance and will see if they somehow can patch it in for us.”

You’ve piqued my interest in this so I’m looking at submitting a PR to add support in lib-jitsi-meet so it can just be turned on with a config option.

DSCP-by-default is likely to be a long way off, as the wider Internet still has a lot of broken middleboxes that fail to forward packets with DSCP bits set. That’s the reason no browsers mark RTP for expedited/assured forwarding by default.

“One workaround seems to be to switch off p2p4121 and relay all traffic through the TURN server. If you flip through the presentation linked in the initial posting it becomes clear that you’ll be paying a rather high price for the DSCP marking if you go down that path. Since everything gets routed through the TURN server, you’ll have higher latency and far more resource usage (CPU/RAM/bandwidth) on the TURN server than you would if you let things run in a p2p4121 fashion.”

This somewhat works around it, although you’d only be able to mark the traffic after the TURN server, so the flow from the client to the TURN server would still be unmarked. And yes, it’s a big price to pay.


Interesting and very exciting topic indeed, at least for me as a Voice/Collab guy.
Things are evolving fast and from what I can tell softphones and WebRTC also have a significant impact on the QoS architecture of existing and future networks.

The pandemic has accelerated WebRTC adoption and thus the need to shift trust boundaries for prioritized traffic away from the access layer switch towards the client from what I can tell.

A little history for the rookies that will find this posting in the future

First hardphones turned into softphones on the client
Back in the days when most employees still had hardphones on their desk in the office it was rather easy to set the trust boundary to decide which traffic markings can be trusted and which ones not. In the enterprise networks I’ve seen we usually set the trust boundary at the access layer switch. The hardphones and desktop clients would reside each in their own VLAN (one Data VLAN and one Voip VLAN) and sometimes we’d extend the trust boundary to the VoIP hardphone if conditional trust was configured, so VoIP traffic would always be trusted and get priority treatment.
But we’d hardly ever trust the “dirty” data traffic coming from the desktop clients. Except for a handful of clients that did some time-sensitive database requests, there usually was no need to trust any of their DSCP markings.

Check page 20 (Trust Boundaries) if you’re interested in this topic.

Well, times have changed, and over time the hardphones slowly disappeared while softclients such as MSN Messenger, Skype, Jabber, Discord, etc. started to appear on the client machines. We didn’t move our trust boundary just yet, as we could do the DSCP marking by means of ACLs that would match on a specific (RTP) port range used only by those dedicated VoIP applications. So we didn’t have to touch/break too much on the network when we moved away from hardphones. Usually we could define a dedicated UDP port range for each realtime application, which allowed us to distinguish prioritized traffic from the other traffic on the network.

Then video has moved from dedicated hardware to the client (WebRTC) as well
Most videoconferences however were still carried out through what Cisco calls “collaboration endpoints”. That’s dedicated hardware (Camera/Microphone) in conference rooms specifically built for conferences. People would physically gather in conference rooms and if needed a videoconference would be set up between two or multiple sites through this dedicated hardware. Since those physical devices would sit in their own VLAN we could still rely on the old QoS architecture where we’d have our trust boundary at the access layer. In our case this was the standard until early 2020, when suddenly all the offices were empty from one day to the other due to Covid-19.

More recently we saw a sharp increase in individual, client based video traffic (users doing WebRTC based conferences on their own machines) while those conference rooms where a group of people would gather in one room quickly lost a lot of their former popularity.

Impact
This move away from dedicated hardware for videoconferences to software on the client has interesting implications for the network - in particular the QoS architecture - as well.

First we saw phones moving away from dedicated hardware to software on the client. For VoIP we rely on dedicated applications that make use of specific port ranges for the media streams that can be individually configured for each application. We don’t do voice calls through the browser (just yet), we still do most of the voice calling through dedicated applications such as Cisco Jabber.

Now video has very quickly evolved from dedicated hardware into software on the client as well. In our case video has skipped the stage of dedicated software (there is no video equivalent of the VoIP softphone) and has jumped directly into the browser thanks to WebRTC. This means that we no longer have several dedicated applications that we can assign separate port ranges to, but only one application: the browser, which uses the 49152 to 65535 UDP port range regardless of which WebRTC service has established the media connection(s). Unlike with voice traffic, we therefore have no way to tell from the port range which traffic should get prioritized treatment and which should not.

The old approach of having dedicated VLANs for VoIP or video endpoints doesn’t work anymore, and dedicated port ranges don’t do the trick anymore either.
We need new solutions to prioritize traffic, now that we have a whole lot of people doing realtime videocalls through their browsers to get their work done.

What’s on the horizon / What network engineers need to know
Luckily the brilliant guys over at Google and the W3C have already thought about solutions.

Google has already implemented their proprietary googDscp feature to mark RTP traffic, and the W3C is currently also working on a solution to give WebRTC traffic a DSCP marking.

Both approaches mean that the trust boundary likely needs to be moved away from the access layer switch towards the client itself as the individual applications/services will set their DSCP markings in the future.

So Jitsi being able to mark its media streams with (user defined?) DSCP values definitely is something very exciting that’s definitely going to be used in the future, if it gets implemented.

Also forgive me if these things aren’t described with 100% technical accuracy. Take it with a grain of salt. Our way of doing things might not be best practice. Always check the reference guides of your network partner to learn about best practices, do not trust random guys on the internet. Always get your information from the primary source whenever possible, it will save you from a lot of hassle.
I’m at home in the upper layers of the network stack, so this obviously is the VoIP/Collab guy’s perspective on things. I only occasionally have to deal with the layers below my VoIP/video applications, so I am not too familiar with all the nitty-gritty details of QoS. I usually only show up at the network engineers’ office when the media quality isn’t acceptable. We got VoIP working pretty nicely by now, but WebRTC is giving me interesting challenges to solve to get the quality I want for those video calls launched from a web browser.

Jitsi was planned as a quick and dirty solution for a problem that appeared during the Covid pandemic but it seems like Jitsi is here to stay. The users like it and they want to use it.

(The problem that needed to be solved during the pandemic: clients being scared to show up at the hospital, so doctors needed a simple way to communicate with them while they stayed at home. Cloud-based solutions like Zoom were completely out of the question due to legal/regulatory concerns. In healthcare we’re dealing with highly sensitive personal data, so it was clear that it needed to be an on-prem solution, and Jitsi was the best at hand at the time.)

Looking forward. Interesting times ahead.

Cheers, have a great weekend everybody!

Edit:
For people interested in the history and traditional QoS designs for voice/video up until now, have a look at this paper. On page 44 you can see that not even Cisco trusted corporate PCs but I think with WebRTC we need to rethink how we want to do QoS in the future.
QoS Strategies and Smart Media Techniques for Collaboration Deployments


Got around to this today: “added support for DSCP marking outbound media packets on Chromium” by jbg · Pull Request #1684 · jitsi/lib-jitsi-meet


Oh wow, that was fast! Drinks definitely are on me this weekend.

Will try to get it running in our testlab next week and do a few Wireshark traces to see how it behaves on our machines.

One thing I was wondering about was whether Chrome / Chromium based web browsers* do the DSCP mapping according to the recommendations in RFC8837 or not, which would mean:

Audio (High priority) = EF
Video (High priority) = AF41

RFC8837
https://datatracker.ietf.org/doc/html/rfc8837#section-5

Apparently, yes.
I had to go deep down the rabbit hole but found two comments in the source code of chromium.
(Developers, feel free to verify this, as I do not do any coding myself and thus could be wrong)

WebRTC Video:
https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/media/engine/webrtc_video_engine.cc;l=1096?q=_AF41&ss=chromium

WebRTC Audio:
https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/media/engine/webrtc_voice_engine.cc;l=1475

Both sections contain a comment along the lines

// Note that these values come from:
// https://tools.ietf.org/html/draft-ietf-tsvwg-rtcweb-qos-16#section-5
// TODO(deadbeef): Change values depending on whether we are sending a
// keyframe or non-keyframe.

Which suggests that everything based on a vanilla Chromium base* should behave according to RFC8837 when googDscp is set to true. (Everybody who builds their software on Chromium is free to modify things though, so better confirm how things behave with a Wireshark trace before you start building your systems/network.)

* There is a whole lot of software out there that is built on Chromium.
Web browsers like MS Edge, Opera, Chrome
Streaming and communication platforms such as WhatsApp, Twitch, Skype, Slack, Discord
But also software like Cisco Jabber or Tesla’s Model S in-car UI
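When confirming the markings in a Wireshark trace as suggested above, keep in mind that the DSCP is the upper six bits of the IPv4 ToS / IPv6 Traffic Class byte. A small sketch for decoding it; the helper names are made up, while the code points (EF = 46, AF41 = 34, AF42 = 36, CS1 = 8, DF = 0) are the standard values:

```javascript
// Standard DSCP code points relevant to the mapping discussed above.
const DSCP_NAMES = { 0: 'DF', 8: 'CS1', 34: 'AF41', 36: 'AF42', 46: 'EF' };

// DSCP occupies the upper 6 bits of the ToS / Traffic Class byte;
// the lower 2 bits carry ECN and are dropped here.
function dscpFromTos(tosByte) {
  return (tosByte & 0xff) >> 2;
}

// Returns the well-known name if there is one, else the numeric DSCP.
function describeTos(tosByte) {
  const dscp = dscpFromTos(tosByte);
  return DSCP_NAMES[dscp] || `DSCP ${dscp}`;
}

console.log(describeTos(0xb8)); // EF   (expected for marked audio)
console.log(describeTos(0x88)); // AF41 (expected for marked video)
console.log(describeTos(0x00)); // DF   (unmarked, best effort)
```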

Useful knowledge for troubleshooting and testing WebRTC related topics

  • edge://webrtc-internals/ - Open it in a separate tab while you have an ongoing video call and you’ll see a lot of stats (jitter, RTT, packet loss, but also things like the frame rate). This can be used to verify whether your QoS configuration is working properly or not. The results should look vastly different for googDscp=true and googDscp=false while the network is under load.
  • Chromium Command Line Switches: “List of Chromium Command Line Switches” by Peter Beverloo (a lot of software that’s based on Chromium can be started from the command line with those switches; there are a lot of interesting ones for WebRTC related settings)
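The figures shown on the webrtc-internals page can also be read programmatically through `RTCPeerConnection.getStats()`. A minimal sketch that pulls jitter, round-trip time and packet loss out of the `remote-inbound-rtp` entries; `summarizeRemoteInbound` is a hypothetical name, and the sample report is fabricated purely so the snippet runs outside a browser:

```javascript
// Collects per-stream quality figures from an RTCStatsReport-like object
// (anything with forEach over stats dictionaries, such as a Map).
function summarizeRemoteInbound(report) {
  const rows = [];
  report.forEach((stat) => {
    if (stat.type === 'remote-inbound-rtp') {
      rows.push({
        kind: stat.kind,
        jitterMs: stat.jitter * 1000,     // jitter is reported in seconds
        rttMs: stat.roundTripTime * 1000, // roundTripTime is in seconds
        packetsLost: stat.packetsLost,
      });
    }
  });
  return rows;
}

// Fabricated sample data for illustration only.
const sampleReport = new Map([
  ['a', { type: 'remote-inbound-rtp', kind: 'audio', jitter: 0.004, roundTripTime: 0.025, packetsLost: 0 }],
  ['b', { type: 'outbound-rtp', kind: 'video', packetsSent: 1200 }],
  ['c', { type: 'remote-inbound-rtp', kind: 'video', jitter: 0.012, roundTripTime: 0.03, packetsLost: 7 }],
]);

console.table(summarizeRemoteInbound(sampleReport));
```

In a browser the call would look like `summarizeRemoteInbound(await pc.getStats())`, sampled once with and once without DSCP marking while the network is under load.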

All right folks, have a great weekend!
Pongo

Edit:
Whoever is interested in the topic and current state of things, don’t miss RFC8825.
Overview: Real-Time Protocols for Browser-Based Applications
https://datatracker.ietf.org/doc/html/rfc8825