I want to create a Jitsi deployment that can handle 1000 participants in one conference (voice only). Is that possible?
It is currently not possible to have that many participants in one conference. We are working on supporting big conferences, meaning around a hundred participants (audio & video) with 20 senders.
1000 participants sounds like a broadcast scenario, where you have a few speakers in the conference and the rest just listen to the stream. You can test this on meet.jit.si by live-streaming to YouTube, where you can have millions of viewers, not just 1000.
I want to make an application like the Discord app, which is not broadcasting: all users in the application can communicate. Does that mean Jitsi can't handle a big conference yet? Is scaling the videobridge not a solution?
When will a Jitsi release be able to create big conferences?
We expect to be able to do a 100-participant call in a month or so.
It is not just the bridge, it is also the UI.
What's the current maximum number of simultaneous senders?
And how many people can join one room?
If I don't use the UI / video, just voice, what is the maximum number of users that can join the conference?
Client computer bottleneck
Jitsi Meet works out of the box with up to about 15 participants. Beyond that, users with slow computers start to hit bottlenecks: the user interface redraws itself too often and consumes 100% CPU.
This can be diagnosed by profiling with the Chrome performance monitor and disabling everything that consumes CPU time.
For example, the audio-level monitoring that renders the blue dots when someone talks triggers GUI updates. With many users, these updates cause the client's web browser to redraw so frequently that users get a bad audio experience from missed audio updates. The solution is simply to disable audio-level monitoring.
On the left, only 10% CPU with audio levels disabled. On the right, 100% CPU with audio levels enabled, due to frequent audioLevelReport processing and GUI re-layout.
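In current Jitsi Meet builds, audio-level monitoring is exposed as the `disableAudioLevels` option in `config.js`. A minimal sketch of the change (the file path varies per installation; verify the option name against your deployment's `config.js`):

```javascript
// /etc/jitsi/meet/<your-domain>-config.js (path is an assumption; adjust per install)
var config = {
    // ... existing options ...

    // Skip audio-level measurement and the "blue dot" active-speaker
    // indicators, avoiding the frequent GUI re-layouts described above.
    disableAudioLevels: true,
};
```

After editing, clients pick up the change on the next page load; no service restart is needed for `config.js` changes.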
@damencho - Is there any update on the 100-participant call rollout?
For anyone who may be interested, these are the test results so far for creating a 50-participant room.
At the moment, we seem to be having problems running more than 20 participants in a single room on an AWS m4.2xlarge instance. From the AWS console, the network throughput flattens out, so I am assuming it is a network problem.
VideoParser.transform#58: Unable to find encoding matching packet! packet=RtpPacket: PT=100, Ssrc=1208934469, SeqNum=16985, M=true, X=true, Ts=487707000, encodings=
TransportCcEngine.tccReceived#157: TCC packet contained received sequence numbers: 651-653. Couldn’t find packet detail for the seq nums: 651-653. Latest seqNum was 1844, size is 1000. Latest RTT is 1496.352784 ms.
ConnectivityCheckClient.processTimeout#857: timeout for pair: 22.214.171.124:10000/udp/srflx -> 126.96.36.199:37717/udp/prflx (stream-3af6937c.RTP), failing.
These seem to be the relevant messages, repeated a lot!
We tested with 50 participants (real people, on PCs + the mobile app), but it was impossible to see/hear correctly above 20 users.
I will be trying again this afternoon with a bigger AWS instance (c5n.9xlarge), which supposedly has 50 Gbit/s of network bandwidth. I will post the results here.
I will try the disableAudioLevels recommendation in case client-side performance is an issue.
Meantime, if anyone has any recommendations or clues as to what to look for in JVB logs, they would be much appreciated.
UPDATE- c5n.9xlarge TESTS
The c5n.9xlarge instance supposedly has a 50 Gbit/s link, so I think we can eliminate server-side network bottlenecks as a cause of problems for 50 users in one room.
The conference was initially reasonable, but degraded after a couple of minutes at 50 participants, with video and audio problems, participants losing connections, etc.
Here are the graphs for network & CPU usage (50 participants joining from 15:05 to about 15:40, then 20 participants until 16:15):
As can be seen from the Network Out graph, values are significantly higher for the second test, which indicates that the m4.2xlarge instance was introducing a network bottleneck in the first test.
Logs are showing the same type of errors as those previously shown.
Just in case it is of importance: in the second test, the Jitsi Meet server and the JVB were running on separate instances (JMS on the m4.2xlarge with its local videobridge shut down, JVB on the c5n.9xlarge).
Fascinating… Do you have any graphs for UDP and TCP traffic? I suspect some participants connected through TCP and some through UDP. Running Prometheus + Grafana will give you some very good graphs to analyze.
I am also interested in running the JVB on, say, an a1.2xlarge (AWS Graviton processor with 64-bit Arm Neoverse cores). The instance gives you 10 Gbit/s throughput easily. I was running some tests this morning.
The following are the results from an iperf UDP test against a public iperf server:
| Instance | Cloud | Available bandwidth (iperf UDP test) | Comments |
| --- | --- | --- | --- |
| n1-standard-4 | GCP | 474 Mbit/s | Standard Google Cloud instance |
| t2.micro | AWS | 656 Mbit/s | Standard AWS instance |
| a1.large | AWS | 1.67 Gbit/s | Special instance (ARM processor) with more network throughput |
| a1.large | AWS | 3.58 Gbit/s | Same, with the small network tweak described at https://aws.amazon.com/blogs/compute/optimizing-network-intensive-workloads-on-amazon-ec2-a1-instances/ |
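A rough way to interpret these throughput numbers is to ask how many forwarded video streams each link could carry. This sketch assumes ~1.5 Mbit/s per simulcast stream, which is an illustrative figure, not something measured in this thread:

```python
# Back-of-envelope: how many forwarded video streams fit in a measured
# iperf throughput?  The per-stream bitrate is an assumed value.

PER_STREAM_MBPS = 1.5  # assumption: typical mid-quality simulcast layer

def max_streams(link_mbps, per_stream_mbps=PER_STREAM_MBPS):
    """Streams a link can carry, ignoring RTP/RTCP and retransmit overhead."""
    return int(link_mbps // per_stream_mbps)

# Apply to the iperf results from the table above.
for name, mbps in [("n1-standard-4", 474),
                   ("t2.micro", 656),
                   ("a1.large (tweaked)", 3580)]:
    print(f"{name}: ~{max_streams(mbps)} streams")
```

Real headroom will be lower, since the estimate ignores RTP/RTCP overhead, retransmissions, and burstiness, but it gives a feel for which instances are even in the right ballpark.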
As far as I know, Octo was introduced recently to cascade videobridges, i.e., to serve one conference via several videobridges. Maybe Octo can help you host such a large conference (I have never tried it myself).
I think the whole idea is to understand the constraints: is it CPU usage, memory usage, or bandwidth? So far the problem looks like bandwidth to me, but this requires a lot of analysis to scale up. Using Octo makes sense if you have people logging in from different regions. But say you have a school running online classes for its students: most of them log in from the same region.
My thinking was that the bandwidth usage of a single JVB would no longer grow as steeply when using Octo with no regions configured (if that's even possible). However, I have no experience with Octo or with hosting such large conferences, so I may be completely wrong here.
@cosmo83 I have not had time to set up Prometheus + Grafana, but it is on the to-do list.
As has been mentioned in many threads, bandwidth appears to be the predominant factor, but it looks to me like CPU is also a factor to consider.
This is the CPU usage for a 27-participant conference tested on a c5n.9xlarge for the JVB and a c5n.large for Jitsi Meet (no JVB running).
Just ran the tests with 10 streamers and the video looked pretty good to me. Also captured some Prometheus graphs…
Hi @cosmo83, may I ask what software you are using for hammer testing? The official hammer test is not compatible with the recent JVB2, according to the message in git.
I am using Selenium Grid.
I thought I might provide some feedback from experience.
I really appreciate the work that @damencho and the Jitsi team are putting in. My team hopes to finally contribute back to their great work by investigating how to run Octo with a user-based load-balancing model, which is currently only possible if you aren't using Octo; if you need geographic load balancing, you can currently only do server-based load balancing.
My team is using a 7 region OCTO deployment.
West US, East US, West Asia, East Asia, Europe, South America, Australia
We have a very stable build now.
We have a separate server running jicofo, prosody, etc. (a t3.medium), and we use c5n.xlarge instances for JVBs, set up with autoscaling based on CPU usage.
We switched this from network-based scaling because network usage is fairly bursty, while CPU usage grows more linearly with the number of participants.
We run 2 modes for usage.
50 users in a Last-12 mode.
100 users in a Last-2 mode.
We chose 50 users on Last-12 based on user experience plus testing results. 50 users on Last-12 gives decent utilisation of end-user CPUs; anything higher than Last-12 starts to use significant processing power and bandwidth. A 50-person Last-12 conference will use ~1 Gbit/s of bandwidth on your server, hence the need for the network-optimised c5n instead of the standard c5.
A 70-person Last-2 conference runs at < 1 Gbit/s on your server (the largest session we've actually had).
This will use about 80% of the c5n.xlarge CPU, however.
Our experience says: in Last-12 mode, max participants is 50 using a c5n.xlarge (a c5.xlarge will sometimes run out of available network when bandwidth spikes).
In Last-2 mode, max participants is 100 using a c5n.xlarge (the c5n.xlarge will run out of CPU at this point).
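The bandwidth figures above can be sanity-checked with simple arithmetic: in a Last-N conference, every participant receives up to N forwarded streams, so the bridge sends roughly participants × N streams. The per-stream bitrate below is an assumption for illustration, not a number from this thread:

```python
# Rough JVB egress for a Last-N conference.  Every participant receives
# up to last_n forwarded streams, so the bridge sends about
# participants * last_n streams in total.  Per-stream bitrate is assumed.

def egress_mbps(participants, last_n, per_stream_mbps=1.5):
    """Approximate JVB outbound bandwidth in Mbit/s, ignoring overhead."""
    return participants * last_n * per_stream_mbps

# 50 users, Last-12: 50 * 12 = 600 forwarded streams
print(egress_mbps(50, 12))   # 900.0 Mbit/s, consistent with the ~1 Gbit/s observed
# 100 users, Last-2: 200 forwarded streams
print(egress_mbps(100, 2))   # 300.0 Mbit/s, which is why Last-2 hits CPU before network
```

At this assumed bitrate, 600 streams work out to ~900 Mbit/s, which lines up with the reported server bandwidth for the 50-person Last-12 case.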
Note: because some of the constraints are actually on the end user's device, the reduction of video streams is partly to save their machines. We have also switched off the visual audio monitor, as other threads reported it being a problem. This is also really important to do, as your end users' machines won't cope otherwise.
(see: High CPU utilization on client end)
Also of note: we completely block all browsers that do not simulcast (e.g. Firefox); the lack of simulcast significantly disrupts large conferences due to the increased bandwidth consumption.
Thanks a lot for your nice description.
I have additional questions about your settings.
Our system needs multiple different settings (not at the same time):
Thanks a lot for your answers
This roughly equates to what we also happened to come up with through experience.
5 videos x 120 participants = 600 video streams (this is roughly what the other thread mentioned)
12 videos x 50 participants = 600 video streams (this is roughly what our last-12 configuration at 50 people is)
My guess is a Last-2 configuration can theoretically go much higher than 100 participants (but the c5n.xlarge will run out of CPU to process the attendees unless it is running on a user-based load-balancing setup).
Of note: the Last-N room configuration can be set programmatically, which is what we do.
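For reference, the Last-N cap is the `channelLastN` option in `config.js`, and like most `config.js` options it can also be overridden per-room via the URL hash. A sketch (verify the option name against your deployment's `config.js`; the domain below is a placeholder):

```javascript
// config.js: cap each receiver at the 12 most recently active video streams
var config = {
    channelLastN: 12,   // -1 means unlimited (the default)
};
```

Per-room override via URL parameter, e.g. `https://meet.example.com/MyRoom#config.channelLastN=12`, which is one way to run different Last-N modes without maintaining separate deployments.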