Jitsi Meet performance: comparison to Hangouts, Teams and BigBlueButton

If you installed your own instance, you can configure the bitrate via the config.js file, as described here.
Zoom is a privacy mess (link1 and link2). Moreover, it supports H.264, not H.265. H.265 is very expensive in terms of royalties; I do not think anyone in the open source world will use it.

I agree that the adoption of VP9 is crucial; you can vote for the issue here.

I wonder if there’s the option of going one better than VP9 and leaping straight to AV1?!

That would really put Jitsi on the map :muscle:

No, since browsers do not support AV1 over WebRTC. Moreover, AV1 is CPU-intensive and at the moment there is no hardware acceleration support, so it is not a good fit for mobile devices.

:cry: Oh that’s a shame. Thanks for taking the time to reply though.

I’m still evaluating the performance of Jitsi. Is there any way to know the limits of Jitsi? Number of users per room? Users per server? This study is old and this project is not usable.
I found this fairly recent paper from 2018 that compares Jitsi, Kurento, Janus and Medooze. Jitsi and Medooze were the best SFUs in terms of number of users (490 users in 70 rooms) and RTT, while Medooze was the best in terms of image quality.
Thank you

I actually don’t think VP9 will be a game changer here. I think Zoom does not work like an SFU, but rather generates 2 mixed video signals on the server and lets the client download one of them (gallery view or speaker view). I could be wrong though.
This way, every user downloads only 1 stream, instead of the 30 you need if you try to see 30 faces with Jitsi.
I think working in this direction is the only way to make Jitsi a viable option for large groups/classes (30 participants) that want to see everyone on video.
The fact that you can already stream a mixed video to YouTube tells me that some of the needed parts are there. It is more a matter of working out how to restructure things so that participants stream their signal to the server but get only one mixed signal back. And yes, this will cost server performance.
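
To put rough numbers on the difference, here is a back-of-the-envelope sketch in Python. The bitrates are made-up illustrative values, not measurements of Jitsi or Zoom, and the function names are hypothetical.

```python
def sfu_download_kbps(participants: int, per_stream_kbps: float) -> float:
    """Forwarding (SFU): each client downloads one stream per remote participant."""
    return max(participants - 1, 0) * per_stream_kbps


def mixed_download_kbps(mixed_stream_kbps: float) -> float:
    """Server-side mixing: each client downloads a single composed stream."""
    return mixed_stream_kbps


if __name__ == "__main__":
    # Assumed values, for illustration only.
    participants = 30
    thumbnail_kbps = 150   # hypothetical bitrate of one small video tile
    mixed_kbps = 1200      # hypothetical bitrate of one composed grid stream

    print("forwarded:", sfu_download_kbps(participants, thumbnail_kbps), "kbps")
    print("mixed:    ", mixed_download_kbps(mixed_kbps), "kbps")
```

With forwarding, the download grows linearly with the number of participants; with mixing it stays flat, and the cost moves to the server that has to decode and re-encode everything.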

Is that a direction that some folks are already thinking about?


I think this is the interesting point: what is the real added value of seeing every video?
I think it’s only psychological (or for monitoring), because in fact there is no value in seeing everyone move their head.
So I think that ChannelLastN=5, plus keeping the last image in the thumbnail instead of the empty black thumbnail, could be a good compromise.
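
A rough sketch of what such a cap would mean for each client's download, in Python. The helper names and the per-stream bitrate are hypothetical; I am assuming a negative last-N value means "no limit" and that a frozen thumbnail costs no bandwidth once it has been delivered.

```python
def forwarded_video_streams(participants: int, last_n: int) -> int:
    """Number of live video streams a single client receives under a last-N cap."""
    remote = max(participants - 1, 0)   # a client never receives its own video
    if last_n < 0:                      # assumption: negative means unlimited
        return remote
    return min(last_n, remote)


def download_kbps(participants: int, last_n: int, per_stream_kbps: float) -> float:
    """Video download per client; frozen thumbnails are assumed to cost nothing."""
    return forwarded_video_streams(participants, last_n) * per_stream_kbps


if __name__ == "__main__":
    per_stream_kbps = 150  # hypothetical bitrate of one thumbnail stream
    for last_n in (-1, 5):
        print(last_n, download_kbps(30, last_n, per_stream_kbps), "kbps")
```

Under these assumptions the download no longer depends on the room size once the room has more than last-N remote participants.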

That is just an idea …

Maybe not a game changer, but a great bandwidth improvement at a time when the whole planet is on the internet!

Just my point of view.


Good idea to keep the last frame in the thumbnail, maybe with some kind of icon that explains that it is a still image.
Generally speaking, there is a lot of value in seeing everyone as video in grid view, especially in a teacher/student scenario. Teachers said they would know whether kids are paying attention or running off, especially the little ones.

Yes, VP9 makes a lot of sense and will likely help a lot here as well, but to get an experience similar to Zoom, especially with low-bandwidth endpoints, the general idea needs to be thought through. Having 1 or 2 streams that show all faces makes sense; having 30 streams makes no sense, or will not be feasible, when you have 1-2 MBit of download.

Well, VP9 lets you increase the number of users by about 40% with the same bandwidth.
There are two solutions to the conferencing problem: MCU and SFU. The first is better in terms of bandwidth but much worse in terms of CPU load, while the second is the opposite. Zoom uses the MCU approach. In general, MCU is considered the old way. With an SFU you can limit the streams sent to each participant to those of the people who are actually speaking.
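
To make the 40% figure concrete, here is a small Python sketch. It takes the "about 40%" number at face value; the download budget and per-stream bitrate are made-up values and the function names are hypothetical.

```python
import math

VP9_USER_GAIN = 1.4  # the "about 40%" figure, taken at face value


def streams_in_budget(budget_kbps: float, per_stream_kbps: float) -> int:
    """How many equal-bitrate streams fit into a fixed download budget."""
    return math.floor(budget_kbps / per_stream_kbps)


def equivalent_vp9_bitrate(vp8_stream_kbps: float, gain: float = VP9_USER_GAIN) -> float:
    """Per-stream bitrate that lets `gain` times as many streams share the same budget."""
    return vp8_stream_kbps / gain


if __name__ == "__main__":
    budget_kbps = 2000      # hypothetical 2 MBit download link
    vp8_stream_kbps = 150   # hypothetical per-stream bitrate today

    print("today:", streams_in_budget(budget_kbps, vp8_stream_kbps))
    print("VP9:  ", streams_in_budget(budget_kbps, equivalent_vp9_bitrate(vp8_stream_kbps)))
```

The gain is a constant factor, so it raises the ceiling but does not change the fact that a forwarding setup grows linearly with the number of streams each client receives.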


Good points on SFU and MCU. I have just observed some conferences where I cannot imagine an SFU working well and providing a good experience.
Yes, CPU might be an issue, but these days CPU is cheap and bandwidth is scarce.
If you have users with 1-2 MBit connections, you don’t like the idea of downloading 30 streams, even if they are just audio.

If 30 are talking, I mute them all!!!
handshake or /ban !!
:slight_smile: :slight_smile: :slight_smile:

Hey, I’m especially curious to hear more about the comparison results of Jitsi vs. BBB. Are they roughly the same? I ask because I’m evaluating video meeting solutions for a non-commercial education organization, and at the moment I cannot really test Jitsi vs. BBB with a larger number of users.

MCU is the old way of videoconferencing: it does not scale well to high numbers of users (you need a server farm) and it is too centralized. With an SFU you can scale up to thousands or tens of thousands of users. Moreover, you can use an SFU with e2e encryption (PERC project, Matrix protocol analysis). In general, it is not necessary to transmit all the streams to all the participants. You have several options for saving bandwidth.

I could not find any benchmark, but I’m experimenting with both solutions, and recently with multiparty meeting too. So far, BBB is the solution that scales best, with up to 100-150 users per room. BBB uses Kurento and FreeSWITCH as its main components. Kurento is configured as an SFU for the HTML5 WebRTC client, while FreeSWITCH is used for audio conferencing so that SIP users are supported too. The main drawback of BBB is that it relies on old software (Ubuntu 16.04, still supported, and Node.js 8, deprecated).

Thanks, that already helps. I think the Ubuntu 16 part is not so bad atm, but I also don’t like the Node.js 8 part.

Pinging @Arthur_TOUMASSIAN since he did some work with VP9 and libjitsi/videobridge1. Perhaps he is still around and is able to help! https://github.com/jitsi/jitsi-videobridge/issues/1133

I think Jitsi should give this maximum priority. Many users are moving to multiparty meeting because it supports VP9: its video is considerably smoother, sharper, better synchronized and higher resolution than Jitsi’s.


The work I did on JVB1 is only for plain VP9. To get the full power of the Jitsi SFU we need an implementation of VP9 SVC. That’s what really optimizes the traffic.

However, as the stable version is out, my company is willing to reimplement at least VP9 on JVB2. Because yes, there is a huge difference. We managed to get nice 720p with VP9 while limiting the bandwidth to audio + video < 1000 Kbps.


Great. Of course SVC would be better. However, SVC is still a draft in WebRTC.
JVB2 has been released as stable. We are waiting for your updates as soon as possible :slight_smile: