Strange issue in multi jvb setup after update

Hi,

Today we updated our Jitsi service to the current stable version 2.0.5390 from the Debian package repository. We run multiple JVB machines in our setup. After updating everything, we wanted to test whether the handover of conferences still works when one JVB machine fails, so we switched one JVB off. Suddenly we could no longer talk, and we had to reload the page to make it work again (Chromium 88). Before the update this worked perfectly fine without a reload. The issue also occurs when we restart a videobridge in the cluster that is empty.
We were also able to reproduce this issue on another Jitsi instance whose administrators we know.

Even stranger: when we perform the procedure described above in a conference with only Firefox clients, we can keep talking without any issue. If Chrome and Firefox users are mixed, the Chrome users cannot hear the Firefox users, even after a reload.

In the jicofo log we only saw this part that we found a bit strange:

Jicofo 2021-02-04 21:51:15.119 INFO: [31] org.jitsi.jicofo.xmpp.BaseBrewery.log() Removed brewery instance: jvbbrewery@internal.auth.meet.domain.com/4c7f353f-5a57-485c-aef1-1369a11a9cc1
Jicofo 2021-02-04 21:51:15.119 INFO: [31] org.jitsi.jicofo.xmpp.BaseBrewery.log() A bridge left the MUC: jvbbrewery@internal.auth.meet.domain.com/4c7f353f-5a57-485c-aef1-1369a11a9cc1
Jicofo 2021-02-04 21:51:15.119 INFO: [31] org.jitsi.jicofo.bridge.BridgeSelector.log() Removing JVB: jvbbrewery@internal.auth.meet.domain.com/4c7f353f-5a57-485c-aef1-1369a11a9cc1
Jicofo 2021-02-04 21:51:15.119 INFO: [31] org.jitsi.jicofo.bridge.JvbDoctor.log() Stopping health-check task for: jvbbrewery@internal.auth.meet.domain.com/4c7f353f-5a57-485c-aef1-1369a11a9cc1
Jicofo 2021-02-04 21:51:15.119 INFO: [38] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Creating an Octo participant for Bridge[jid=jvbbrewery@internal.auth.meet.domain.com/30042aeb-f0b6-45c7-b8c5-8cd2bc6b3fe3, relayId=null, region=null, stress=0.08] in JitsiMeetConferenceImpl[gid=39026, name=backup_chaos@conference.meet.domain.com]
Jicofo 2021-02-04 21:51:15.120 INFO: [446] org.jitsi.jicofo.AbstractChannelAllocator.log() Using jvbbrewery@internal.auth.meet.domain.com/30042aeb-f0b6-45c7-b8c5-8cd2bc6b3fe3 to allocate channels for: OctoParticipant[relays=[]]@568974789
Jicofo 2021-02-04 21:51:15.130 SEVERE: [446] org.jitsi.jicofo.AbstractChannelAllocator.log() jvbbrewery@internal.auth.meet.domain.com/30042aeb-f0b6-45c7-b8c5-8cd2bc6b3fe3 - failed to allocate channels, will consider the bridge faulty:
 XMPP error: <iq to='focus@auth.meet.domain.com/focus11159447552' from='jvbbrewery@internal.auth.meet.domain.com/30042aeb-f0b6-45c7-b8c5-8cd2bc6b3fe3' id='9jjGN-41445' type='error'><error type='cancel'><internal-server-error xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/><text xmlns='urn:ietf:params:xml:ns:xmpp-stanzas' xml:lang='en'>Couldn&apos;t get OctoRelayService</text></error></iq>
org.jitsi.protocol.xmpp.colibri.exception.ColibriException: XMPP error: <iq to='focus@auth.meet.domain.com/focus11159447552' from='jvbbrewery@internal.auth.meet.domain.com/30042aeb-f0b6-45c7-b8c5-8cd2bc6b3fe3' id='9jjGN-41445' type='error'><error type='cancel'><internal-server-error xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/><text xmlns='urn:ietf:params:xml:ns:xmpp-stanzas' xml:lang='en'>Couldn&apos;t get OctoRelayService</text></error></iq>
        at org.jitsi.impl.protocol.xmpp.colibri.ColibriConferenceImpl.maybeThrowOperationFailed(ColibriConferenceImpl.java:364)
        at org.jitsi.impl.protocol.xmpp.colibri.ColibriConferenceImpl.createColibriChannels(ColibriConferenceImpl.java:268)
        at org.jitsi.jicofo.OctoChannelAllocator.doAllocateChannels(OctoChannelAllocator.java:95)
        at org.jitsi.jicofo.AbstractChannelAllocator.allocateChannels(AbstractChannelAllocator.java:271)
        at org.jitsi.jicofo.AbstractChannelAllocator.doRun(AbstractChannelAllocator.java:190)
        at org.jitsi.jicofo.AbstractChannelAllocator.run(AbstractChannelAllocator.java:150)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
Jicofo 2021-02-04 21:51:15.130 SEVERE: [446] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() One of our bridges failed: jvbbrewery@internal.auth.meet.domain.com/30042aeb-f0b6-45c7-b8c5-8cd2bc6b3fe3
Jicofo 2021-02-04 21:51:15.130 INFO: [446] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Region info, conference=39026: [[null, null, null, null, null, null, null]]
Jicofo 2021-02-04 21:51:15.130 INFO: [446] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Region info, conference=39026: [[null, null, null, null, null, null]]
Jicofo 2021-02-04 21:51:15.130 INFO: [446] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Region info, conference=39026: [[null, null, null, null, null]]
Jicofo 2021-02-04 21:51:15.130 INFO: [446] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Region info, conference=39026: [[null, null, null, null]]
Jicofo 2021-02-04 21:51:15.130 INFO: [446] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Region info, conference=39026: [[null, null, null]]
Jicofo 2021-02-04 21:51:15.130 INFO: [446] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Region info, conference=39026: [[null, null]]
Jicofo 2021-02-04 21:51:15.130 INFO: [446] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Region info, conference=39026: [[null]]
Jicofo 2021-02-04 21:51:15.130 INFO: [446] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Region info, conference=39026: [[null]]
Jicofo 2021-02-04 21:51:15.131 WARNING: [446] org.jitsi.jicofo.AbstractParticipant.log() Canceling OctoChannelAllocator[BridgeSession[id=39026_85e5d0, bridge=Bridge[jid=jvbbrewery@internal.auth.meet.domain.com/30042aeb-f0b6-45c7-b8c5-8cd2bc6b3fe3, relayId=null, region=null, stress=0.08]]@1846754739, OctoParticipant[relays=[]]@568974789]@722581173
Jicofo 2021-02-04 21:51:15.131 INFO: [446] org.jitsi.jicofo.bridge.BridgeSelectionStrategy.log() Selected initial bridge Bridge[jid=jvbbrewery@internal.auth.meet.domain.com/59a6dea2-3077-4fc2-9982-4e1213910c85, relayId=null, region=null, stress=0.00] with reported stress=0.0 for participantRegion=null using strategy SingleBridgeSelectionStrategy
Jicofo 2021-02-04 21:51:15.131 INFO: [446] org.jitsi.impl.protocol.xmpp.colibri.OperationSetColibriConferenceImpl.log() Conference created: org.jitsi.impl.protocol.xmpp.colibri.ColibriConferenceImpl@1fbc241

Please let me know if you need any more information for troubleshooting.

Cheers
Eric

The most likely cause is octo being enabled in jicofo, but not in the bridge. They need to be kept in sync now, and we need to work on better defaults. The properties are jicofo.octo.enabled and videobridge.octo.enabled (in jicofo.conf and jvb.conf respectively).
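As a minimal sketch of what keeping the two in sync could look like, the fragment below sets only the two properties named above, written in the nested HOCON form. The file locations in the comments are the usual Debian package paths and are an assumption here; any other Octo settings (region, bind address and so on) are left out on purpose.

```hocon
# /etc/jitsi/jicofo/jicofo.conf (path assumed)
# Equivalent to the dotted form: jicofo.octo.enabled = true
jicofo {
  octo {
    enabled = true
  }
}

# /etc/jitsi/videobridge/jvb.conf (path assumed), on every bridge
# Equivalent to the dotted form: videobridge.octo.enabled = true
videobridge {
  octo {
    enabled = true
  }
}
```

Both values should agree: either true in jicofo and on all bridges, or false everywhere. The services need a restart to pick up the change.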

Boris

Thanks for your reply, @Boris_Grozev. I just enabled Octo in jicofo.conf and jvb.conf according to the reference.conf, but I’m still facing the same issue with the same log output. A hint that might help: I’ve been running these instances for quite some time now, and the jicofo.conf was almost empty, except for a Jigasi setting. Most of the settings are still configured in sip-communicator.properties. Could this be responsible for the issue? And is there a guide on updating the configs to the current standard?

@Boris_Grozev do you have another suggestion regarding this issue? We’re thinking about doing a completely fresh installation to get clean configs etc., but that would take a lot of time for our setup with 10 videobridges, so I wanted to make sure we test everything possible first.

Thanks in advance!

OK, we were able to locate the issue by setting up a test installation with two bridges. We configured Octo according to octo.md in the jitsi/jitsi-videobridge repository on GitHub (master branch), with one region for our complete setup.

However, we did this WITHOUT setting the Octo testing options in config.js for Jitsi Meet. Setting them left us with a broken frontend on the newest version.

Now the handover of conferences works fine again when a JVB goes offline (except for Firefox, as we know), and conferences running on other bridges are no longer affected when one bridge fails.

Can I enable Octo in the JVB with org.jitsi.videobridge.octo.enabled in sip-communicator.properties?