SFU behavior

Hello Folks,

I understand Jitsi is a pure SFU WebRTC solution. I am trying to understand its actual behavior. For example, let's assume 10 people join a conference.

  • Does this mean everybody will send one stream out and get 9 streams in?

  • At what point will the video streams black out when bandwidth constraints kick in?

  • Whose video streams will black out amongst the 10 participants? How is it decided?

  • What resolution does each participant receive for the 9 inbound video streams? I.e. if each participant is sending 720p out, does each of them end up receiving 720p in?

  • Where does the downscaling of video stream take place? Is it on receiver’s end?

  • Lastly, I know it has been asked multiple times why Jitsi uses an SFU instead of an MCU. I did read that SFU is next-gen tech and better than MCU. But if 10 people join and, due to bandwidth, a few people's video is not available, it comes across as a bad experience. So can somebody from Jitsi (@damencho) please let us know the reasoning behind using an SFU?

  • Is there any way to introduce an MCU into the Jitsi framework?


When there is not enough download bandwidth at an endpoint, videos will start switching off, and they will come back on once the network again has enough capacity to receive all videos.

Nope, the received resolution depends on the size of the screen and the size of the thumbnails if it is tile view. If it is stage view, 180p will be received for the thumbnails and 720p, if it is available, for the participant on stage.

If there are 100 participants with video and you look at tile view and see 25 participants, you will receive just 25 videos; the resolution depends on the size of the element.
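To make the tile-view rule concrete, here is a minimal sketch of how a client could decide what to request. It is not Jitsi's actual code: the function names are hypothetical, and the 360p middle layer is an assumption (the thread only mentions 180p and 720p).

```python
# Simulcast layer heights. 180 and 720 come from the answer above;
# 360 as the middle layer is an assumption for this sketch.
LAYER_HEIGHTS = [180, 360, 720]


def pick_layer(element_height_px: int) -> int:
    """Return the smallest layer height that covers the rendered element."""
    for height in LAYER_HEIGHTS:
        if height >= element_height_px:
            return height
    # Element is taller than the best layer: take the best one available.
    return LAYER_HEIGHTS[-1]


def requested_streams(senders, visible_count, element_height_px):
    """Request video only for the tiles actually on screen."""
    visible = senders[:visible_count]
    return {sender: pick_layer(element_height_px) for sender in visible}


# 100 participants with video, 25 tiles on screen, each tile 200 px tall.
senders = [f"user{i}" for i in range(100)]
streams = requested_streams(senders, visible_count=25, element_height_px=200)
print(len(streams), set(streams.values()))
```

With 25 visible tiles only 25 streams are requested, whatever the size of the conference.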

Nope, the downscaling does not happen on the receiver's end. This is simulcast: every participant sends three resolutions if enough bandwidth is available.
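The two answers above (videos switching off, and simulcast layers) can be tied together with a rough sketch of how a router might fit streams into one receiver's downlink. This is only an illustration, not the bridge's real allocation algorithm; the per-layer bitrates and the `allocate` function are assumptions made up for the example.

```python
# Illustrative bitrates in kbps per simulcast layer. These numbers are
# assumptions for the sketch, not measured Jitsi values.
LAYER_KBPS = {180: 200, 360: 500, 720: 2500}


def allocate(wanted: dict, downlink_kbps: int) -> dict:
    """wanted maps sender -> ideal layer height, in priority order.

    Returns sender -> forwarded layer height, where 0 means the video
    is switched off for this receiver.
    """
    heights = sorted(LAYER_KBPS)
    lowest = heights[0]
    result = {sender: 0 for sender in wanted}
    budget = downlink_kbps

    # Pass 1: give every sender the lowest layer while the budget lasts.
    for sender in wanted:
        if LAYER_KBPS[lowest] <= budget:
            result[sender] = lowest
            budget -= LAYER_KBPS[lowest]

    # Pass 2: upgrade in priority order, never above the ideal layer.
    for sender, ideal in wanted.items():
        if result[sender] == 0:
            continue
        for height in heights[1:]:
            if height > ideal:
                break
            extra = LAYER_KBPS[height] - LAYER_KBPS[result[sender]]
            if extra > budget:
                break
            budget -= extra
            result[sender] = height
    return result


wanted = {"on_stage": 720, "thumb1": 180, "thumb2": 180}
print(allocate(wanted, 3000))  # enough downlink for everything
print(allocate(wanted, 500))   # constrained: the last video is switched off
```

Because the sender already uploads all layers, the router only chooses which one to forward; nothing is re-encoded on the server or on the receiver.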

Lightweight server. It is easier and cheaper to run multiple SFU routers than to mix video.

Currently, Jibri is doing something similar, delivering a mix as a recording or live stream. And you can see around the forum people constantly having problems with the resources for it. And that is just one session; in a meeting with 10 participants you would need 10 such sessions … what about 100 meetings with 10 participants, or 1000? If everything was MCU based, meet.jit.si would not exist today, and probably the same goes for Jitsi.

Many thanks @damencho
How does Zoom display video of 300 participants simultaneously for business users? They claim to be using something called a Multimedia Router. Is it the same as the SFU router you mentioned?

I think it is the same. And they do it the same way we do it: route only what is needed.

How does zoom display video of 300 participants simultaneously for business users ?

I already answered this in my previous comment.

Thank you @damencho. However, my experience with Jitsi hosted on AWS vs Zoom is quite different. I have hosted it on a t3a.medium instance on AWS in us-east, and even when only 3 people join the conference, the video of one of them sometimes goes off in tile mode. I have never seen this happen with Zoom.

  1. How can this be solved? What does Zoom have that is different (private data centers, dedicated bandwidth)?
  2. Also, if the bandwidth goes down, is there a way for Jitsi to switch automatically to lower-resolution streams, e.g. from tile view to filmstrip view?

t3 is a budget, low-cost instance type. Use m5/c5 (or m6g/c6g ARM instances). You can put 300 in a call with Jitsi, you just need to spec and configure your servers properly.

I wonder whether it is a problem with the AWS EC2 t3a.medium instance. I only have one meeting with 3 people in it. If it were multiple meetings with many people, it would make sense that a t3a.medium cannot handle the network load.

In fact, I have also used an m5 for this single meeting with 3 people and it still gave issues. Is there a way to find out through Jitsi where the network bandwidth is narrow? Is it at the receiver's end or at the server's end?

How do I test this? Is there a test setup? Does jitsi-torture work, or is there any other automated test framework?

If you’re using an m5 instance on AWS and there are only 3 participants, the bandwidth problem is certainly not on the AWS side. Even a t3 should have enough bandwidth; with t3 the concern would be the limited CPU resources.

Especially since you mentioned only one participant’s video goes black, it’s most likely constrained bandwidth between them and the JVB.

Is there a way to know through jitsi about where is the network bandwidth narrow ? Is it at the receiver’s end or is it at the server’s end.

All jitsi (or any application) can know is that packets were lost somewhere in the path between the user and the server, not where they were lost in that path.

To try to determine where, you could use tools like iperf to test possible throughput, packet rate, loss and jitter. By running it between different combinations of endpoints (e.g. between your server and another server, between your server and the user, between the user and another server) you can start to infer the likely location of a bandwidth constraint. If you’re using iperf, run it in UDP mode with a target bitrate. TCP speed tests (like most of the browser-based ones) can fail to detect network problems that would affect real-time communication.
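As an illustration of what such a UDP test measures, here is a small sketch that computes the loss fraction and a smoothed interarrival jitter (the RFC 3550 style estimator with gain 1/16) from per-packet timestamps. The `loss_and_jitter` function and its input format are hypothetical, not the output of iperf or of any Jitsi tool, and it assumes sender and receiver clocks are comparable. A typical iperf3 run in UDP mode with a target bitrate looks like `iperf3 -c <server> -u -b 2M`, where the 2 Mbit/s target is only an example value.

```python
def loss_and_jitter(packets, sent_count):
    """Summarise a one-way UDP test.

    packets: list of (seq, send_time_s, recv_time_s) for packets that arrived.
    sent_count: how many packets the sender transmitted.
    Returns (loss_fraction, jitter_seconds).
    """
    loss = 1.0 - len(packets) / sent_count
    jitter = 0.0
    prev_transit = None
    # Process packets in arrival order.
    for _seq, sent, recv in sorted(packets, key=lambda p: p[2]):
        transit = recv - sent
        if prev_transit is not None:
            # Change in transit time between consecutive packets,
            # smoothed with the 1/16 gain used by RTP receivers.
            delta = abs(transit - prev_transit)
            jitter += (delta - jitter) / 16.0
        prev_transit = transit
    return loss, jitter


# Four packets sent, one lost, constant 0.5 s transit for the rest.
steady = [(1, 0.0, 0.5), (2, 1.0, 1.5), (4, 3.0, 3.5)]
print(loss_and_jitter(steady, sent_count=4))
```

Comparing these two numbers across different pairs of endpoints is what lets you narrow down which leg of the path is dropping or delaying packets.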

Having said all that, though, with only a few participants you can be reasonably sure your server is not the bandwidth constraint if you’re using AWS or another host with good quality connectivity. If issues affect only a few participants, and consistently those ones, it’s quite likely that the problem is with those participants’ ISP, WiFi or computer. Or maybe they’re all from the same country, with limited international capacity given to residential end-users, and your JVB is in another country. Look for the simple answers first before diving headfirst into network testing tools.