AWS Scalability Help for Jitsi and Jibri


#1

I am in the process of building an environment based on the Jitsi platform. My challenge is scaling it to handle my potential client load.

The sales arm of my company has received commitments forecasting the possibility of 100,000 paying clients in the next 90 to 120 days. I not only need to make sure my AWS servers scale to accommodate the client connections, but also that I can scale the Jibri recording service at an efficient and affordable price.

I need help working on the server side of things to make this all come together so that it supports an unlimited amount of traffic automatically.

My current developers are very knowledgeable on the AWS side of things but are struggling to understand how to automate the scaling part of Jitsi.

I welcome any help at all on deploying this. I am on somewhat of a tight deadline and need to roll this out by the end of January at the latest.

Thanks in advance for any input that you might have. I am also ready and willing to hire anyone that can help to fast-track this.

Thanks…Mark


#2

Hi @mahilton2000 ,

Thanks for reaching out; sorry I missed the Community Call this week and wasn’t around to answer your questions in person. As a caveat, all the parameters below will probably require tweaking for your specific needs and environment, so take them with a grain of salt.

We run meet.jit.si in a scalable way to accommodate users across the globe. Our current scaling model for the basic jitsi meet experience is to run it in “shards”. This means we have one prosody/jicofo/jitsi-meet server and an Autoscaling Group of JVBs in each region. We currently run in 6 regions in AWS, and use a mesh of HAProxy servers to direct traffic to the appropriate shard. Each conference only ever runs on one shard.

The JVBs are autoscaled up and down based on average outbound bandwidth across the group, and we rely on Jicofo to do the load balancing between JVBs. We run at least 2 JVBs (c5.xlarge with Dedicated tenancy) in each shard, and scale based on the EC2 CloudWatch metric NetworkOut. We scale up at NetworkOut >= 750000000 for 10 consecutive periods of 60 seconds. We scale back down at NetworkOut <= 375000000 for 10 consecutive periods of 60 seconds.
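
For illustration, here is a rough boto3 sketch of that scale-up rule. The Auto Scaling group name "jvb-asg" is a placeholder, and this assumes detailed EC2 monitoring is enabled so NetworkOut can be aggregated per Auto Scaling group; adapt the values to your own setup.

```python
# Sketch: scale-up rule for the JVB Auto Scaling group ("jvb-asg" is hypothetical).
import boto3

autoscaling = boto3.client("autoscaling")
cloudwatch = boto3.client("cloudwatch")

# Simple scaling policy: add one JVB instance when the alarm fires.
policy = autoscaling.put_scaling_policy(
    AutoScalingGroupName="jvb-asg",
    PolicyName="jvb-scale-up",
    PolicyType="SimpleScaling",
    AdjustmentType="ChangeInCapacity",
    ScalingAdjustment=1,
    Cooldown=300,
)

# Alarm: average NetworkOut >= 750000000 for 10 consecutive 60-second periods.
cloudwatch.put_metric_alarm(
    AlarmName="jvb-networkout-high",
    Namespace="AWS/EC2",
    MetricName="NetworkOut",
    Dimensions=[{"Name": "AutoScalingGroupName", "Value": "jvb-asg"}],
    Statistic="Average",
    Period=60,
    EvaluationPeriods=10,
    Threshold=750000000,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    AlarmActions=[policy["PolicyARN"]],
)
```

The scale-down side is the mirror image: a policy with ScalingAdjustment=-1 and an alarm on NetworkOut <= 375000000.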

As far as jibri is concerned, we have a cron job which polls the Jibri REST interface to determine whether it’s a) healthy and b) IDLE vs. BUSY. If it’s unhealthy, we take it down and replace it. We use the IDLE/BUSY state to send a custom CloudWatch metric which we call “jibri_available”: each jibri sends a 0 when it’s BUSY or a 1 when it’s IDLE. Our autoscaling rule then simply specifies jibri_available < 2 for 5 consecutive periods of 60 seconds to scale up, and jibri_available > 3 for 10 consecutive periods of 60 seconds to scale down. This means we start with 2 jibris in a region, attempt to always have 2 available, and end up at a steady state of 3 jibris when none are in use.
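
As a rough illustration, the cron job boils down to something like the sketch below. The health endpoint/port, the response shape, and the "Jibri" metric namespace are assumptions about a default Jibri install, so check them against your own deployment.

```python
# Sketch: poll the local Jibri REST interface and publish a jibri_available metric.
# Endpoint, response fields and namespace are assumptions, not guaranteed values.
import boto3
import requests

JIBRI_HEALTH_URL = "http://localhost:2222/jibri/api/v1.0/health"  # assumed default

def report_jibri_availability():
    resp = requests.get(JIBRI_HEALTH_URL, timeout=5)
    status = resp.json().get("status", {})
    healthy = status.get("health", {}).get("healthStatus") == "HEALTHY"
    idle = status.get("busyStatus") == "IDLE"

    if not healthy:
        # Unhealthy instances get taken out of service and replaced
        # (e.g. terminated so the Auto Scaling group launches a fresh one).
        return

    # Publish 1 when IDLE, 0 when BUSY; the scaling alarms watch this metric
    # summed across the group.
    boto3.client("cloudwatch").put_metric_data(
        Namespace="Jibri",  # hypothetical custom namespace
        MetricData=[{
            "MetricName": "jibri_available",
            "Value": 1.0 if idle else 0.0,
            "Unit": "Count",
        }],
    )

if __name__ == "__main__":
    report_jibri_availability()
```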

Let me know if this is helpful, and if we can answer any other specific questions you might have.

Cheers,

-Aaron


#3

Thanks Aaron…I appreciate the information. I have forwarded that to my developers. In the meantime, Emil suggested that I reach out to Lindaes.com for possible help. I will update you if I require more help.

Thanks…Mark

TokBird, Inc

Phone 877-357-6338 x 700

Mobile 832-524-4391

mark.hilton@tokbird.com

Skype: mahilton2000