Taking down a JVB leads to crashed meetings on other JVBs

I run Jitsi Meet on Kubernetes. Today I wanted to scale my setup from 1 JVB to 3 JVBs. That is, I went from one Service (jvb) and one Deployment (jvb) to three Services (jvb-0, jvb-1, jvb-2), each with a different public external IP, and three corresponding Deployments (jvb-0, jvb-1, jvb-2). The Pod of each Deployment learns its external IP through the DOCKER_HOST_ADDRESS environment variable. The other environment variables (including JVB_AUTH_USER and JVB_AUTH_PASSWORD) did not change at all. It seemed to work: different conferences are scheduled to the different JVBs and work just fine.
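
Roughly, each Deployment sets DOCKER_HOST_ADDRESS in its container's env section to the external IP of its own Service. A minimal sketch for jvb-0, with a placeholder IP:

        env:
          - name: DOCKER_HOST_ADDRESS
            value: "203.0.113.10"   # placeholder: the external IP of the jvb-0 Service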

The problem

Whenever I delete one of the JVB Pods, conferences on other JVBs crash, too. Furthermore, they can’t even be recreated by refreshing the page; one has to change the conference name.

For example, I had a conference running on jvb-2 and deleted the Pod of jvb-0. The conference crashed. The following is the Jicofo log, beginning just before I deleted the Pod of jvb-0:

Jicofo 2021-02-06 18:36:50.013 INFO: [28] org.jitsi.jicofo.xmpp.BaseBrewery.log() Removed brewery instance: jvbbrewery@internal-muc.meet.jitsi/jvb-0-5869cdfdf9-tjj56
Jicofo 2021-02-06 18:36:50.013 INFO: [28] org.jitsi.jicofo.xmpp.BaseBrewery.log() A bridge left the MUC: jvbbrewery@internal-muc.meet.jitsi/jvb-0-5869cdfdf9-tjj56
Jicofo 2021-02-06 18:36:50.014 INFO: [28] org.jitsi.jicofo.bridge.BridgeSelector.log() Removing JVB: jvbbrewery@internal-muc.meet.jitsi/jvb-0-5869cdfdf9-tjj56
Jicofo 2021-02-06 18:36:50.014 INFO: [28] org.jitsi.jicofo.bridge.JvbDoctor.log() Stopping health-check task for: jvbbrewery@internal-muc.meet.jitsi/jvb-0-5869cdfdf9-tjj56
Jicofo 2021-02-06 18:36:50.015 INFO: [31] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Creating an Octo participant for Bridge[jid=jvbbrewery@internal-muc.meet.jitsi/jvb-2-68bbfc99df-9jj4w, relayId=null, region=null, stress=0.03] in JitsiMeetConferenceImpl[gid=119266, name=passiveemergenciessoarsufficiently2@muc.meet.jitsi]
Jicofo 2021-02-06 18:36:50.021 INFO: [342] org.jitsi.jicofo.AbstractChannelAllocator.log() Using jvbbrewery@internal-muc.meet.jitsi/jvb-2-68bbfc99df-9jj4w to allocate channels for: OctoParticipant[relays=[]]@915391138
Jicofo 2021-02-06 18:36:50.037 SEVERE: [342] org.jitsi.jicofo.AbstractChannelAllocator.log() jvbbrewery@internal-muc.meet.jitsi/jvb-2-68bbfc99df-9jj4w - failed to allocate channels, will consider the bridge faulty: XMPP error: <iq to='focus@auth.meet.jitsi/focus506067324354232' from='jvbbrewery@internal-muc.meet.jitsi/jvb-2-68bbfc99df-9jj4w' id='MGQZm-10845' type='error'><error type='cancel'><internal-server-error xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/><text xmlns='urn:ietf:params:xml:ns:xmpp-stanzas' xml:lang='en'>Couldn&apos;t get OctoRelayService</text></error></iq>
org.jitsi.protocol.xmpp.colibri.exception.ColibriException: XMPP error: <iq to='focus@auth.meet.jitsi/focus506067324354232' from='jvbbrewery@internal-muc.meet.jitsi/jvb-2-68bbfc99df-9jj4w' id='MGQZm-10845' type='error'><error type='cancel'><internal-server-error xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/><text xmlns='urn:ietf:params:xml:ns:xmpp-stanzas' xml:lang='en'>Couldn&apos;t get OctoRelayService</text></error></iq>
	at org.jitsi.impl.protocol.xmpp.colibri.ColibriConferenceImpl.maybeThrowOperationFailed(ColibriConferenceImpl.java:364)
	at org.jitsi.impl.protocol.xmpp.colibri.ColibriConferenceImpl.createColibriChannels(ColibriConferenceImpl.java:268)
	at org.jitsi.jicofo.OctoChannelAllocator.doAllocateChannels(OctoChannelAllocator.java:95)
	at org.jitsi.jicofo.AbstractChannelAllocator.allocateChannels(AbstractChannelAllocator.java:271)
	at org.jitsi.jicofo.AbstractChannelAllocator.doRun(AbstractChannelAllocator.java:190)
	at org.jitsi.jicofo.AbstractChannelAllocator.run(AbstractChannelAllocator.java:150)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Jicofo 2021-02-06 18:36:50.037 SEVERE: [342] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() One of our bridges failed: jvbbrewery@internal-muc.meet.jitsi/jvb-2-68bbfc99df-9jj4w
Jicofo 2021-02-06 18:36:50.037 INFO: [342] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Region info, conference=119266: [[null, null]]
Jicofo 2021-02-06 18:36:50.037 INFO: [342] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Region info, conference=119266: [[null]]
Jicofo 2021-02-06 18:36:50.037 INFO: [342] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Region info, conference=119266: [[null]]
Jicofo 2021-02-06 18:36:50.038 WARNING: [342] org.jitsi.jicofo.AbstractParticipant.log() Canceling OctoChannelAllocator[BridgeSession[id=119266_af64c7, bridge=Bridge[jid=jvbbrewery@internal-muc.meet.jitsi/jvb-2-68bbfc99df-9jj4w, relayId=null, region=null, stress=0.03]]@510813779, OctoParticipant[relays=[]]@915391138]@133371104
Jicofo 2021-02-06 18:36:50.040 INFO: [342] org.jitsi.jicofo.bridge.BridgeSelectionStrategy.log() Selected initial bridge Bridge[jid=jvbbrewery@internal-muc.meet.jitsi/jvb-1-75c75bd8b7-nvhl9, relayId=null, region=null, stress=0.00] with reported stress=0.0 for participantRegion=null using strategy SingleBridgeSelectionStrategy
Jicofo 2021-02-06 18:36:50.040 INFO: [342] org.jitsi.impl.protocol.xmpp.colibri.OperationSetColibriConferenceImpl.log() Conference created: org.jitsi.impl.protocol.xmpp.colibri.ColibriConferenceImpl@56a03ded
Jicofo 2021-02-06 18:36:50.040 INFO: [342] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Added participant jid= passiveemergenciessoarsufficiently2@muc.meet.jitsi/9c6fd92d, bridge=jvbbrewery@internal-muc.meet.jitsi/jvb-1-75c75bd8b7-nvhl9
Jicofo 2021-02-06 18:36:50.040 INFO: [342] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Region info, conference=119266: [[null, null]]
Jicofo 2021-02-06 18:36:50.040 INFO: [344] org.jitsi.jicofo.discovery.DiscoveryUtil.log() Doing feature discovery for passiveemergenciessoarsufficiently2@muc.meet.jitsi/9c6fd92d
Jicofo 2021-02-06 18:36:50.040 INFO: [342] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Added participant jid= passiveemergenciessoarsufficiently2@muc.meet.jitsi/83bcd2fe, bridge=jvbbrewery@internal-muc.meet.jitsi/jvb-1-75c75bd8b7-nvhl9
Jicofo 2021-02-06 18:36:50.040 INFO: [344] org.jitsi.jicofo.discovery.DiscoveryUtil.log() Successfully discovered features for passiveemergenciessoarsufficiently2@muc.meet.jitsi/9c6fd92d in 0
Jicofo 2021-02-06 18:36:50.040 INFO: [342] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Region info, conference=119266: [[null, null, null]]
Jicofo 2021-02-06 18:36:50.041 INFO: [344] org.jitsi.jicofo.AbstractChannelAllocator.log() Using jvbbrewery@internal-muc.meet.jitsi/jvb-1-75c75bd8b7-nvhl9 to allocate channels for: Participant[passiveemergenciessoarsufficiently2@muc.meet.jitsi/9c6fd92d]@1938637495
Jicofo 2021-02-06 18:36:50.041 INFO: [343] org.jitsi.jicofo.discovery.DiscoveryUtil.log() Doing feature discovery for passiveemergenciessoarsufficiently2@muc.meet.jitsi/83bcd2fe
Jicofo 2021-02-06 18:36:50.041 INFO: [343] org.jitsi.jicofo.discovery.DiscoveryUtil.log() Successfully discovered features for passiveemergenciessoarsufficiently2@muc.meet.jitsi/83bcd2fe in 0
Jicofo 2021-02-06 18:36:50.041 INFO: [343] org.jitsi.jicofo.AbstractChannelAllocator.log() Using jvbbrewery@internal-muc.meet.jitsi/jvb-1-75c75bd8b7-nvhl9 to allocate channels for: Participant[passiveemergenciessoarsufficiently2@muc.meet.jitsi/83bcd2fe]@1630508651
Jicofo 2021-02-06 18:36:50.052 INFO: [344] org.jitsi.jicofo.ParticipantChannelAllocator.log() Sending transport-replace to: passiveemergenciessoarsufficiently2@muc.meet.jitsi/9c6fd92d
Jicofo 2021-02-06 18:36:50.052 INFO: [344] org.jitsi.protocol.xmpp.AbstractOperationSetJingle.log() RE-INVITE PEER: passiveemergenciessoarsufficiently2@muc.meet.jitsi/9c6fd92d
Jicofo 2021-02-06 18:36:50.061 INFO: [343] org.jitsi.jicofo.ParticipantChannelAllocator.log() Sending transport-replace to: passiveemergenciessoarsufficiently2@muc.meet.jitsi/83bcd2fe
Jicofo 2021-02-06 18:36:50.061 INFO: [343] org.jitsi.protocol.xmpp.AbstractOperationSetJingle.log() RE-INVITE PEER: passiveemergenciessoarsufficiently2@muc.meet.jitsi/83bcd2fe
Jicofo 2021-02-06 18:36:50.240 INFO: [28] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Got transport-accept from: passiveemergenciessoarsufficiently2@muc.meet.jitsi/9c6fd92d
Jicofo 2021-02-06 18:36:50.240 INFO: [28] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Received session-accept from passiveemergenciessoarsufficiently2@muc.meet.jitsi/9c6fd92d with accepted sources:Sources{ }@1439959067
Jicofo 2021-02-06 18:36:50.324 INFO: [28] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Got transport-accept from: passiveemergenciessoarsufficiently2@muc.meet.jitsi/83bcd2fe
Jicofo 2021-02-06 18:36:50.324 INFO: [28] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Received session-accept from passiveemergenciessoarsufficiently2@muc.meet.jitsi/83bcd2fe with accepted sources:Sources{ }@455402446
Jicofo 2021-02-06 18:36:58.320 INFO: [28] org.jitsi.jicofo.xmpp.BaseBrewery.log() Added brewery instance: jvbbrewery@internal-muc.meet.jitsi/jvb-0-5869cdfdf9-v5xpp
Jicofo 2021-02-06 18:36:58.321 INFO: [28] org.jitsi.jicofo.bridge.BridgeSelector.log() Added new videobridge: Bridge[jid=jvbbrewery@internal-muc.meet.jitsi/jvb-0-5869cdfdf9-v5xpp, relayId=null, region=null, stress=0.00]
Jicofo 2021-02-06 18:36:58.321 INFO: [28] org.jitsi.jicofo.bridge.JvbDoctor.log() Scheduled health-check task for: jvbbrewery@internal-muc.meet.jitsi/jvb-0-5869cdfdf9-v5xpp

Additional information

Jitsi Version: stable-5390-3

My common environment for all components (web, jicofo, prosody, jvb):

JIBRI_BREWERY_MUC=jibribrewery
JIBRI_PENDING_TIMEOUT=90
JIBRI_RECORDER_USER=recorder
JIBRI_XMPP_USER=jibri
JICOFO_AUTH_USER=focus
JIGASI_BREWERY_MUC=jigasibrewery
JVB_AUTH_USER=jvb
JVB_BREWERY_MUC=jvbbrewery
JVB_PORT=10000
JVB_STUN_SERVERS=stun.l.google.com:19302,stun1.l.google.com:19302,stun2.l.google.com:19302
JVB_TCP_HARVESTER_DISABLED=true
JVB_TCP_PORT=4443
PUBLIC_URL=https://example.com
TZ=Europe/Berlin
XMPP_AUTH_DOMAIN=auth.meet.jitsi
XMPP_BOSH_URL_BASE=http://prosody:5280
XMPP_DOMAIN=meet.jitsi
XMPP_INTERNAL_MUC_DOMAIN=internal-muc.meet.jitsi
XMPP_MUC_DOMAIN=muc.meet.jitsi
XMPP_SERVER=prosody

Could this be related to the fact that I used the same JVB_AUTH_USER for all 3 jitsi-videobridge instances?

@haslersn We are seeing the exact same issue. Did you find out what the cause was? We are urgently trying to find out why this is happening.

We don’t have the issue anymore, but I don’t know exactly which change was the solution. Here is a (possibly incomplete) list of things we changed:

  • Set JVB_STUN_SERVERS= (i.e. empty)
  • Instead of 3 Deployments, use 1 StatefulSet jvb with 3 replicas. (The Pods are automatically named jvb-0, jvb-1, jvb-2.) A sketch of such a StatefulSet is included after this list.
  • Set MY_JVB_HOSTED_ZONE=jitsi-meet.example.com (where example.com is of course replaced by our domain)
  • Set MY_JVB_NAME to the Pod name. This is possible as follows:
                - name: MY_JVB_NAME
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.name
    
    
  • Set HOSTNAME to the Kubernetes Pod UID. This is required because the jvb Docker image uses $HOSTNAME as the MUC_NICKNAME, and the MUC_NICKNAME must be unique across different jitsi-videobridge instances; in particular, it must change whenever the jitsi-videobridge’s local IP changes, which is the case whenever the Pod is recreated.
                - name: HOSTNAME
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.uid
    
  • Mount the following /defaults/sip-communicator.properties:
    org.ice4j.ice.harvest.NAT_HARVESTER_LOCAL_ADDRESS={{ .Env.LOCAL_ADDRESS }}
    org.ice4j.ice.harvest.NAT_HARVESTER_PUBLIC_ADDRESS={{ .Env.MY_JVB_NAME }}.{{ .Env.MY_JVB_HOSTED_ZONE }}
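
For reference, here is a minimal sketch of what such a StatefulSet can look like. This is not our exact manifest; in particular the image tag, labels, serviceName, and the way LOCAL_ADDRESS is populated (here via the downward API from status.podIP) are assumptions for illustration:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: jvb
spec:
  serviceName: jvb                       # assumes a headless Service named jvb
  replicas: 3                            # Pods are automatically named jvb-0, jvb-1, jvb-2
  selector:
    matchLabels:
      app: jvb
  template:
    metadata:
      labels:
        app: jvb
    spec:
      containers:
        - name: jvb
          image: jitsi/jvb:stable-5390-3 # assumed image/tag
          env:
            - name: MY_JVB_HOSTED_ZONE
              value: jitsi-meet.example.com
            - name: MY_JVB_NAME          # the Pod name, e.g. jvb-0
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: HOSTNAME             # the Pod UID, used as MUC_NICKNAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.uid
            - name: LOCAL_ADDRESS        # assumption: the Pod IP, for NAT_HARVESTER_LOCAL_ADDRESS
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP

The /defaults/sip-communicator.properties from the last list item is then mounted into each Pod, e.g. from a ConfigMap (volume and volumeMount omitted above for brevity).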
    

We still have 3 separate Services: jvb-0, jvb-1, and jvb-2. Here’s our jvb-0 Service; the other Services are analogous.

apiVersion: v1
kind: Service
metadata:
  name: jvb-0
  annotations:
    external-dns.alpha.kubernetes.io/hostname: jvb-0.jitsi-meet.example.com.
spec:
  ports:
    - port: 10000
      protocol: UDP
  selector:
    statefulset.kubernetes.io/pod-name: jvb-0
  type: LoadBalancer
  externalTrafficPolicy: Local
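
With this, external-dns publishes jvb-0.jitsi-meet.example.com for the external IP of the LoadBalancer, which matches the NAT_HARVESTER_PUBLIC_ADDRESS the bridge advertises ({{ .Env.MY_JVB_NAME }}.{{ .Env.MY_JVB_HOSTED_ZONE }}). The selector on statefulset.kubernetes.io/pod-name pins each Service to exactly one Pod of the StatefulSet, and externalTrafficPolicy: Local preserves the client’s source IP.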

Do you guys have Octo enabled? We use the intra-region strategy and are also having this problem.

That is a very good question @acruz

@haslersn do you all use Octo? We are thinking that might be the issue.

Thanks so much for your help!!

Is anyone using this repo with the latest Docker images?

We don’t use Octo.

Having a similar issue here, but on vanilla Jitsi installed from apt.
We connected multiple JVBs, and whenever a JVB is restarted, conferences on other JVBs crash.
We also see “Couldn’t get OctoRelayService” errors in the Jicofo logs, even though we don’t have Octo enabled.

We are having the same issue on bare-metal JVBs with Jicofo 1.0-692-hf-1, no Octo. It is also discussed here: Shutting down a JVB crashes a conference on another JVB

This has now been fixed in the latest Jicofo: All conferences on a multi-JVB setup crash when a single JVB goes down · Issue #707 · jitsi/jicofo · GitHub