Support for large (100+ user) conferences: Timeline, and contribution

Hi All,

Thank you in advance for your great work on Jitsi!

I am working with a Cyber Charter K-12 organization that is a heavy user of video classes, to tailor and launch a self-hosted Jitsi system for online classes. Currently we’re using conferencing from another vendor.

Most classes are quite small and are a great fit for Jitsi in its current state. However, a few courses have over 100 participants. Initially we were planning on modifying the jitsi-meet client to display a CDN-distributed livestream for these large classes while still allowing participation and questions from students (with LastN=1 or 2, of course). More recently, we have been hoping to (mis?)use Octo to distribute large single conferences over multiple JVBs, even though the users are all geographically nearby. This has only reached the small proof-of-concept stage.

In this thread Damencho mentions:

Currently you cannot create a meeting with 200 people. We have a hard limit of 75 participants, but even more than 35, the experience will suffer. But we are working on adding big meetings with more participants (more than 100).

Could anyone provide an update on the timeline for this feature, and also point us in the direction of a way to try even a rough beta version of large rooms if possible? We are also happy to participate in any way possible on the development and testing of this feature.

Thanks!

Blake

–edit–
I see in [jitsi-dev] JVB scalability from Boris that there isn’t a specific timeline, but we’re still happy to assist in development of this feature in any way possible.

Did you get anything out of it? We are trying to enable a large room (around 250 participants) and followed the same path as you: we tried a very low LastN and tried disabling every webcam and mic but one, but it wasn’t enough. We are now thinking of trying to (mis? :grinning:)use Octo just to be able to distribute a single room across many JVBs. What were the first results of your proof of concept?

Hi Arzar,
Never got to the point of testing real load, but I did successfully get multiple users onto two JVBs, running in docker on a single host, by enabling Octo with

org.jitsi.jicofo.BridgeSelector.BRIDGE_SELECTION_STRATEGY=SplitBridgeSelectionStrategy

so that it tried to push each new user to a different bridge, without having to assign users to different regions.
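
As a rough illustration of the behaviour described here (each new participant is pushed to a bridge the conference is not yet using), the selection could be sketched as below. This is not Jicofo's code: the function name `pick_split_bridge`, the stress values, and the fallback to the least-stressed bridge once every bridge is in use are all assumptions made for the example.

```python
def pick_split_bridge(bridges, conference_bridges):
    """Pick a bridge for a new participant, spreading the conference out.

    bridges: dict mapping bridge name -> stress (illustrative values)
    conference_bridges: set of bridge names the conference already uses
    """
    if not bridges:
        raise ValueError("no bridges available")
    # Prefer bridges this conference is not using yet.
    unused = [name for name in bridges if name not in conference_bridges]
    # Assumed fallback: once every bridge is in use, consider all of them.
    candidates = unused if unused else list(bridges)
    # Among the candidates, take the one with the lowest stress.
    return min(candidates, key=lambda name: bridges[name])


bridges = {"jvb1": 0.30, "jvb2": 0.10}  # hypothetical bridges and stress
in_use = set()
for participant in ("alice", "bob", "carol"):
    chosen = pick_split_bridge(bridges, in_use)
    in_use.add(chosen)
    print(participant, "->", chosen)
```

With two bridges, the first two participants land on different bridges, which matches what the proof of concept was trying to achieve without assigning users to regions.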

Hi @blivingston, is the dockerized JVB taken from https://github.com/jitsi/docker-jitsi-meet ? Also, importantly, can the Docker setup scale to more than two JVBs on a single host (I assume it could then handle more rooms and participants, since there would be more JVBs)?

Thanks for the enlightenment :slight_smile:

Hi Janto, I was testing out using https://github.com/jitsi/docker-jitsi-meet . To add another JVB, I had to go through a few steps. I copied the videobridge declaration in docker-compose.yml and made another one, and assigned it new ports and a new config directory. Then, after a first run, I edited the generated config for the container to use the new ports.

I don’t think that this would actually provide better performance though - JVB2 at least seems to use as many cores as you have (I could be wrong though). I was mostly using it to test out Octo configuration without the trouble of multiple VMs.

Has anybody been able to figure this out? “Split bridge selection strategy” is a good start for testing purposes, but I am looking for a final solution.
Jicofo does an excellent job of load balancing conferences across different bridges, but not of load balancing multiple participants in the same meeting.

We do prefer to put participants in the same meeting on the same bridge, but we won’t go past the thresholds, so if a JVB is overloaded we’ll start putting participants on a new bridge.
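
A minimal sketch of the preference described above, in Python. It is only an illustration, not Jicofo's implementation: the function name `pick_bridge` and the threshold value `0.8` are hypothetical, and the real overload check lives in Jicofo's source.

```python
def pick_bridge(bridges, conference_bridges, threshold=0.8):
    """Pick a bridge for a new participant.

    bridges: dict mapping bridge name -> stress (illustrative values)
    conference_bridges: set of bridge names the conference already uses
    threshold: hypothetical stress level at which a bridge counts as overloaded
    """
    if not bridges:
        raise ValueError("no bridges available")
    # Prefer a bridge that already hosts this conference and is not overloaded.
    preferred = [
        name for name in conference_bridges
        if name in bridges and bridges[name] < threshold
    ]
    if preferred:
        return min(preferred, key=lambda name: bridges[name])
    # Otherwise fall back to the least-stressed bridge overall.
    return min(bridges, key=lambda name: bridges[name])


# The conference stays on jvb1 while it is below the threshold...
print(pick_bridge({"jvb1": 0.5, "jvb2": 0.1}, {"jvb1"}))
# ...and spills over to jvb2 once jvb1 is considered overloaded.
print(pick_bridge({"jvb1": 0.9, "jvb2": 0.1}, {"jvb1"}))
```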

This is great information that I didn’t know. What is the threshold? I would like to confirm that my configuration acts this way before committing to my leaders that I can host a meeting with 300 people.

You can start here to see how it’s determined if a bridge is ‘overloaded’

To handle large rooms, we run our videobridges on bare metal with realtime kernels, not containered or virtualized.

Hi @rasos - that’s very interesting! If you have time to share, I’d love to hear more about the observations and performance measurements that led to using bare metal and an RT kernel. For instance, is high-volume UDP traffic much happier with an RT kernel or outside of virtualization?

In the last few days, I’ve load tested using JitsiMalleus and Selenium Grid scaled out on AWS Elastic Container Service to add up to 100 observers and 1 real presenter on a 2x8 core m4.2xlarge videobridge setup using Octo and SplitBridgeSelectionStrategy. While the CPU had a lot of headroom on both bridges, the presenter video and audio were visibly a bit choppy.

Our implementation was able to reach 250+ users in a conference room by running Octo with SplitBridgeSelectionStrategy and channelLastN = 6, on c5.metal bridge servers in AWS. It seems the issue we’re facing in reaching more than 300 is client-side CPU and bandwidth limits.

Hello @Anthony_Garcia ,

This is cool. I’m curious to know more about how you achieved this, and about the multi-JVB setup too. And where is this “SplitBridgeSelectionStrategy”?

Regards,

Hi @bbaldino, is this feature implemented in the stable Debian repo? I’m asking because yesterday we had one bridge overloaded (over 400 Mbit/s of traffic) while the second had little work (30 Mbit/s of traffic). I think the server stress threshold was reached, but all participants were still added to the single bridge where the conference was created. Here is part of the Jicofo log with stress values:

Jicofo 2020-05-29 09:09:38.254 INFO: [30] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Expiring channels for: kapusta@conference.meet.mydomain.com/fbb1d157 on: Bridge[jid=jvbbrewery@internal.auth.meet.mydomain.com/0b4864ed-b468-4760-b358-19d4d8b40fb4, relayId=xx.xx.xx9:4096, region=UK, stress=1.56]
Jicofo 2020-05-29 09:10:21.849 INFO: [30] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Expiring channels for: kapusta@conference.meet.mydomain.com/1beb3565 on: Bridge[jid=jvbbrewery@internal.auth.meet.mydomain.com/0b4864ed-b468-4760-b358-19d4d8b40fb4, relayId=xx.xx.xx9:4096, region=UK, stress=1.30]
Jicofo 2020-05-29 09:10:41.217 INFO: [30] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Expiring channels for: kapusta@conference.meet.mydomain.com/b44cb127 on: Bridge[jid=jvbbrewery@internal.auth.meet.mydomain.com/0b4864ed-b468-4760-b358-19d4d8b40fb4, relayId=xx.xx.xx9:4096, region=UK, stress=1.55]
Jicofo 2020-05-29 09:12:47.410 INFO: [30] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Expiring channels for: kapusta@conference.meet.mydomain.com/b99d0283 on: Bridge[jid=jvbbrewery@internal.auth.meet.mydomain.com/0b4864ed-b468-4760-b358-19d4d8b40fb4, relayId=xx.xx.xx9:4096, region=UK, stress=1.09]
Jicofo 2020-05-29 09:13:48.473 INFO: [30] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Expiring channels for: kapusta@conference.meet.mydomain.com/719f39ed on: Bridge[jid=jvbbrewery@internal.auth.meet.mydomain.com/0b4864ed-b468-4760-b358-19d4d8b40fb4, relayId=xx.xx.xx9:4096, region=UK, stress=1.00]
Jicofo 2020-05-29 09:21:12.203 INFO: [30] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Expiring channels for: kapusta@conference.meet.mydomain.com/cb98a8ea on: Bridge[jid=jvbbrewery@internal.auth.meet.mydomain.com/0b4864ed-b468-4760-b358-19d4d8b40fb4, relayId=xx.xx.xx9:4096, region=UK, stress=0.97]
Jicofo 2020-05-29 09:21:31.489 INFO: [30] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Expiring channels for: kapusta@conference.meet.mydomain.com/eb1ec630 on: Bridge[jid=jvbbrewery@internal.auth.meet.mydomain.com/0b4864ed-b468-4760-b358-19d4d8b40fb4, relayId=xx.xx.xx9:4096, region=UK, stress=1.28]
Jicofo 2020-05-29 09:32:16.928 INFO: [30] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Expiring channels for: kapusta@conference.meet.mydomain.com/89a56655 on: Bridge[jid=jvbbrewery@internal.auth.meet.mydomain.com/0b4864ed-b468-4760-b358-19d4d8b40fb4, relayId=xx.xx.xx9:4096, region=UK, stress=1.01]
Jicofo 2020-05-29 09:33:34.724 INFO: [30] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Expiring channels for: kapusta@conference.meet.mydomain.com/4c08043d on: Bridge[jid=jvbbrewery@internal.auth.meet.mydomain.com/0b4864ed-b468-4760-b358-19d4d8b40fb4, relayId=xx.xx.xx9:4096, region=UK, stress=1.50]
Jicofo 2020-05-29 09:35:49.471 INFO: [30] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Expiring channels for: kapusta@conference.meet.mydomain.com/af686f32 on: Bridge[jid=jvbbrewery@internal.auth.meet.mydomain.com/0b4864ed-b468-4760-b358-19d4d8b40fb4, relayId=xx.xx.xx9:4096, region=UK, stress=0.57]
Jicofo 2020-05-29 09:38:39.776 INFO: [30] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Expiring channels for: kapusta@conference.meet.mydomain.com/6249c6f2 on: Bridge[jid=jvbbrewery@internal.auth.meet.mydomain.com/0b4864ed-b468-4760-b358-19d4d8b40fb4, relayId=xx.xx.xx9:4096, region=UK, stress=0.46]

The Octo split_bridge strategy was disabled, as you recommended. The situation arose this way: the second JVB hosted one conference with 10 video participants, and Jicofo created two new conferences on the first JVB, but these two conferences grew to 60 video participants over time :frowning: So we had one overloaded JVB and one lightly loaded one.

With split bridge enabled, we did not have this situation.

What is the main reason for not recommending the split bridge strategy with Octo? Added latency? Can you give us a little more info, please?

Thank you,

Milan

The default selection strategy appears to be SingleBridgeSelectionStrategy, which means we won’t use Octo at all.

SplitBridgeSelectionStrategy is going to forcefully spread endpoints across bridges and is something we use to test Octo. You may find it better than forcing a conference on a single bridge, but, it’s not going to be ideal for production. The thing you need to debug is Jicofo’s view of the ‘stress levels’ of the bridges it sees. The Bridge toString() will include this info: https://github.com/jitsi/jicofo/blob/master/src/main/java/org/jitsi/jicofo/bridge/Bridge.java#L306
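
To follow Jicofo's view of stress over time, the `stress=` field can be pulled out of log lines like the ones posted earlier in this thread. The sketch below is a hedged example in Python: the regular expression assumes the `Bridge[jid=..., ..., stress=N]` layout seen in those lines, and the helper name `stress_by_bridge` is made up for illustration. Other Jicofo versions may format this differently.

```python
import re
from collections import defaultdict

# Assumes the "Bridge[jid=..., relayId=..., region=..., stress=N]" layout
# shown in the log excerpt earlier in this thread.
BRIDGE_RE = re.compile(
    r"Bridge\[jid=(?P<jid>[^,\]]+),.*?stress=(?P<stress>\d+(?:\.\d+)?)\]"
)


def stress_by_bridge(lines):
    """Return {bridge jid: [stress readings in log order]}."""
    readings = defaultdict(list)
    for line in lines:
        match = BRIDGE_RE.search(line)
        if match:
            readings[match.group("jid")].append(float(match.group("stress")))
    return dict(readings)


# Two lines copied from the excerpt above.
SAMPLE = [
    "Jicofo 2020-05-29 09:09:38.254 INFO: [30] "
    "org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Expiring channels for: "
    "kapusta@conference.meet.mydomain.com/fbb1d157 on: "
    "Bridge[jid=jvbbrewery@internal.auth.meet.mydomain.com/"
    "0b4864ed-b468-4760-b358-19d4d8b40fb4, relayId=xx.xx.xx9:4096, "
    "region=UK, stress=1.56]",
    "Jicofo 2020-05-29 09:38:39.776 INFO: [30] "
    "org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Expiring channels for: "
    "kapusta@conference.meet.mydomain.com/6249c6f2 on: "
    "Bridge[jid=jvbbrewery@internal.auth.meet.mydomain.com/"
    "0b4864ed-b468-4760-b358-19d4d8b40fb4, relayId=xx.xx.xx9:4096, "
    "region=UK, stress=0.46]",
]

for jid, values in stress_by_bridge(SAMPLE).items():
    print(jid, "peak stress:", max(values))
```

Running this over a full log for each bridge should make it easier to see whether a bridge was reported as stressed at the moment new participants were still being added to it.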

Hey @bbaldino - thanks for your input on SplitBridgeSelectionStrategy. I was planning to use this in production, but it seems this is not recommended. I have some questions regarding the bridge selection strategies.

  1. Does it happen only with an Octo setup, or will this happen without Octo too?
  2. I came across IntraRegionBridgeSelectionStrategy (https://github.com/jitsi/jicofo/blob/master/src/main/java/org/jitsi/jicofo/bridge/IntraRegionBridgeSelectionStrategy.java). How exactly does it distribute the streams?
  3. I came across the above configurations from another thread, and they look interesting. Do they work well with IntraRegionBridgeSelectionStrategy? Is there any recommendation on determining the values for the above parameters?

Thanks in advance.

We only split meetings across bridges when Octo is enabled.

It spreads a conference across bridges, but only bridges within the same region (the ‘home’ region of the conference).

They would, yeah. No recommendations there other than it’ll depend on the capability of your bridge machines.

Hi @bbaldino, thank you for your suggestion. I’ve been monitoring the stress levels reported by all my JVBs for several days and they seem OK to me. I’ve removed the split_bridge strategy from the config files, and that may be it, as no strategy was then selected. Now I’m running IntraRegionBridgeSelectionStrategy with three JVBs, but stress doesn’t currently exceed 0.8 on my JVBs, so I can’t tell whether it works for me.

Thank you anyway for the support!

Milan

Hi @bbaldino,
We now have one shard with 3 JVBs in only one region (for now).

  • I need all meetings to be split across all 3 bridges.
  • I also need big meetings to be split across bridges.

I would like a big meeting to start on one bridge, with all participants of that meeting connecting to that bridge until some point (for example, when the meeting reaches 50 participants) or until that bridge is overloaded (high stress metric).
What do I need to configure?
Which selection strategy do I need to choose?

PS. We currently have at most about 350 connected users, and at most about 35-40 users in one meeting.
We plan to be able to handle up to 300 users in one conference.

If I’m understanding you right, the behavior you’re looking for is exactly how Octo works. You can use the strategy mentioned above.