Multi-shard auto-scalable Kubernetes setup

Hi all!

We are currently working on a scalable Kubernetes setup for Jitsi. We have found a lot of valuable information in this forum and therefore also want to contribute back.

Our setup is open source and can be found on Github: https://github.com/schul-cloud/jitsi-deployment

Feel free to use any of the resources there!

We have managed to set up Jitsi Meet in a multi-shard, multi-JVB environment on a single Kubernetes cluster. In addition to that, we also have monitoring (kube-prometheus), logging (ECK), and certificate handling (cert-manager) in place.

Our general architecture is pictured here: https://raw.githubusercontent.com/schul-cloud/jitsi-deployment/master/docs/architecture/build/jitsi_meet.png

We would also appreciate it if you pointed out any issues you find!


Thank you for sharing this! How many users and simultaneous connections are you handling? What has been your maximum?

We have done some load testing on our setup. The results are documented at https://github.com/schul-cloud/jitsi-deployment/blob/master/docs/loadtests/loadtestresults.md


Thank you!

Hi @wolfm89,
It is the best documentation of Jitsi in Kubernetes ever created in this community. :slight_smile:

What is the maximum number of JVBs set in your K8s cluster, and how can we customize this number? By the way, do you plan to scale Jibri in your K8s deployment as well?

Thank you,

Hi Janto,
Thank you for your kind words!
The maximum (and minimum) number of JVB instances per shard is set in the configuration of the HPA (Horizontal Pod Autoscaler) in the overlays: e.g. for our dev environment https://github.com/schul-cloud/jitsi-deployment/blob/master/overlays/development/jitsi-base/jvb-hpa-patch.yaml
Just increase the number of maxReplicas there.
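
For illustration, such a patch might look roughly like the following; the resource name and the numbers below are placeholders, not values copied from the repository:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: jvb-hpa
spec:
  minReplicas: 1
  maxReplicas: 4   # raise this to allow more JVB pods per shard
  targetCPUUtilizationPercentage: 70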

Unfortunately we’re not planning to integrate Jibri for now. But it would be great to see some results if you manage to get that working, too!

Hmm… I can only guess because I haven't seen this error before. It looks as if the custom resource "DecoratorController" is not available. It should become available after applying the MetaController kustomization.yaml in https://github.com/schul-cloud/jitsi-deployment/tree/master/base/ops/metacontroller
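
For reference, a minimal sketch of applying it from the repository root (the path is taken from the link above):

kustomize build base/ops/metacontroller | kubectl apply -f -

After that, kubectl get crd | grep metacontroller should list the DecoratorController CRD.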

Thank you for sharing!
I tried to run it on a Mac with the following software versions:
[kustomize/v3.5.4]
[minikube 1.12.1]
[kubectl version --client
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.0", GitCommit:"9e991415386e4cf155a24b1da15becaa390438d8", GitTreeState:"clean", BuildDate:"2020-03-25T14:58:59Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"darwin/amd64"}]
This is what I get when I try to build and apply the development overlay:

kustomize build . | kubectl apply -f -
unable to recognize "STDIN": no matches for kind "Certificate" in version "cert-manager.io/v1alpha2"
unable to recognize "STDIN": no matches for kind "ClusterIssuer" in version "cert-manager.io/v1alpha2"
unable to recognize "STDIN": no matches for kind "Elasticsearch" in version "elasticsearch.k8s.elastic.co/v1"
unable to recognize "STDIN": no matches for kind "Kibana" in version "kibana.k8s.elastic.co/v1"
unable to recognize "STDIN": no matches for kind "DecoratorController" in version "metacontroller.k8s.io/v1alpha1"
unable to recognize "STDIN": no matches for kind "DecoratorController" in version "metacontroller.k8s.io/v1alpha1"
unable to recognize "STDIN": no matches for kind "DecoratorController" in version "metacontroller.k8s.io/v1alpha1"
unable to recognize "STDIN": no matches for kind "DecoratorController" in version "metacontroller.k8s.io/v1alpha1"
unable to recognize "STDIN": no matches for kind "Alertmanager" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "PodMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "PodMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "PodMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "Prometheus" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "PrometheusRule" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"

I managed to do a kustomize build . | kubectl apply -f - without any errors, but
it does not really look like it's working in the end.

  • the dashboard is quite broken - it shows no namespaces, pretty much nothing is there
  • the Kibana server only displays 'Kibana server is not ready yet'

Hello,

I have been trying to run Jitsi in Kubernetes for a while. This seems like the perfect setup.

I am having trouble setting this up on GKE though. I simplified the setup by taking out all the BBB,
logging, and Grafana stuff, as I wanted to test this as bare-bones and as simple as possible.
(The full deployment also failed for me in many ways.)

I got P2P sessions to work with JWT enabled, but anything involving the JVB won't work. I'm having trouble understanding how the connection is made to the JVB servers as they scale. How does HAProxy in sticky mode route the UDP video stream to the right JVB and room?

It seems so close to working, but I guess the hard part is the JVB scaling.

HAProxy is not involved in any way with video streams; it's only there for the "web" part of a shard, that is, for handling HTTP traffic (redirecting to the right JMS instance based on the room parameter), which video streams are not.
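For context, "sticky on room" for the HTTP side can be expressed in HAProxy roughly like the sketch below; the backend and server names are made up, not taken from the linked deployment:

# hypothetical sketch: pin HTTP requests to a shard based on the "room" URL parameter
backend jitsi-meet-shards
    balance url_param room
    hash-type consistent
    server shard-0 shard-0-web.jitsi.svc.cluster.local:80 check
    server shard-1 shard-1-web.jitsi.svc.cluster.local:80 check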
Maybe if you describe what JVB problems you have, we can help. I suppose you have some general JVB problem, plenty of which have been solved in this forum.


Should the JVB instances then create their own external ingress with their own external IPs? Or is the same ingress IP used as for the jitsi-web service, but with ports 30300 + JVB number for shard-0 and 30400 + JVB number for shard-1?

We run Jitsi in Kubernetes and we have configured JVB so that the pods listen on a predefined hostPort; from Kubernetes' point of view they are not exposed to the public via a Service.
We run each JVB pod on a different VM; the address of each JVB is the VM's IP address plus the port specified in hostPort. I would avoid running multiple JVB instances on the same VM.

An Ingress is thus not needed. I think you should basically try to avoid any component that is not necessary for the video streams and make the "path" between clients and JVB as short as possible in terms of components. Clients basically connect directly to the JVB's exposed hostPort; Jicofo informs all clients where to connect, i.e. it tells the clients where the JVB bridges are (the IP address and port on which each JVB listens).

So in your setup I would make all JVBs listen on one port (i.e. the default 10000/UDP) and roll out a VM with its own IP for each JVB. That scales very easily; we use it that way and it's super simple to add a new JVB instance (the only variable here is the JVB nickname, which needs to be different for each JVB, so we basically use the hostname and that's it). A rough sketch of this idea is shown below.
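
A minimal sketch of that idea, assuming a DaemonSet so that each VM runs exactly one JVB on the default UDP port (resource names and image tag are illustrative, not taken from the repo):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: jvb
spec:
  selector:
    matchLabels:
      app: jvb
  template:
    metadata:
      labels:
        app: jvb
    spec:
      containers:
        - name: jvb
          image: jitsi/jvb:stable-4627-1
          ports:
            - containerPort: 10000
              protocol: UDP
              hostPort: 10000   # clients connect directly to the VM's IP on this port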

As usual, correct me if I'm wrong :wink:

Btw, the k8s Ingress object is for processing HTTP requests, which video stream data basically is not. Thus HAProxy and Ingress don't apply here at all. I'm confused that you are still trying to use them where they can't be applied, because they are totally different things. You might think of it as trying to throw x265 video data at HAProxy or nginx.


@wolfm89 I was trying a similar setup, but not exactly the same. Have you ever seen a problem where each HAProxy instance assigns different shard names when servers are resolved by DNS? Is there anything special done in your setup that preserves the order of shards in a DNS response?

This comment says that using an SRV query instead of an A query preserves the order:

Does that mean that when you run the dig -t SRV _http._tcp.web.jitsi.svc.cluster.local command, it always gives you the results in the same order? Is it something special about kube-dns, or is there a special config that can achieve that? We're using CoreDNS and the SRV records always appear in random order in the response, so each HAProxy instance assigns different server numbers depending on luck. This seems to be a common issue judging by the comments on this HAProxy issue:
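
For illustration, SRV-based discovery in HAProxy typically combines a resolvers section with server-template, roughly as below (the addresses and names are examples, not from the linked setup); note that this alone does not guarantee a stable ordering of the discovered servers:

resolvers kubedns
    nameserver dns1 10.96.0.10:53   # example cluster DNS service IP
    resolve_retries 3
    timeout retry 1s
    hold valid 10s

backend jitsi-meet-shards
    server-template shard 2 _http._tcp.web.jitsi.svc.cluster.local resolvers kubedns check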

Hello @simoncolincap @janrenz @mvakert @wolfm89

Everything went well. I have deployed it on AKS.

I just wanted to know: how do I access the Jitsi frontend?

For your information, I have also replaced the placeholders for the base64 credentials in

/overlays/production/ops/bbb-basic-auth-secret.yaml
/base/jitsi/jitsi-secret.yaml
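
For reference, a base64 value for those placeholders can be generated like this (the credential shown is just an example):

echo -n 'supersecretpassword' | base64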

Below is the screenshot when I run kubectl get all -n jitsi


Hi everyone, same result as @sunilkumarjena21.



Hi @namtel
Which cloud are you using?

Just check with the describe command, like below. In my case it was a CPU/memory allocation issue and hardcoded values in the deployment.

kubectl describe po prometheus-k8s-0 -n jitsi

  • Using DO, and still stuck here.

Comment out these lines throughout the project:

spec:
  # avoid that pods of different shards share a zone
  # nodeSelector:
  #   topology.kubernetes.io/zone: ZONE_1

Nice work, guys, and great documentation to start with. Hats off :slight_smile:

One small issue with Jicofo when deploying in Kubernetes. @wolfm89 @nosmo @mvakert @janrenz @simoncolincap

Jicofo fails to start:

Events:
Type Reason Age From Message

Normal Scheduled 13m default-scheduler Successfully assigned jitsi/shard-0-jicofo-6c85888786-lkk9w to ip-172-31-41-106.ap-south-1.compute.internal
Normal Pulling 13m kubelet, ip-172-31-41-106.ap-south-1.compute.internal Pulling image "jitsi/jicofo:stable-4627-1"
Normal Pulled 13m kubelet, ip-172-31-41-106.ap-south-1.compute.internal Successfully pulled image "jitsi/jicofo:stable-4627-1"
Normal Created 13m kubelet, ip-172-31-41-106.ap-south-1.compute.internal Created container jicofo
Normal Started 13m kubelet, ip-172-31-41-106.ap-south-1.compute.internal Started container jicofo
Warning Unhealthy 3m23s (x60 over 13m) kubelet, ip-172-31-41-106.ap-south-1.compute.internal Readiness probe failed: Get http://172.31.47.68:8888/about/health: dial tcp 172.31.47.68:8888: connect: connection refused