Octo not sharing load for single meeting (scaling setup)

Hi all

I am trying to set up Jitsi Meet and JVB using the DevOps scaling guide.

Our setup is on EC2 with:
Instance 1 (Jitsi meet server):

  • nginx
  • prosody
  • jicofo
  • jitsi-meet-web
  • jitsi-meet-prosody
  • jitsi-meet-web-config

Instance 2 (JVB):

  • jitsi-videobridge2

Instance 3 (JVB):

  • jitsi-videobridge2

I want a single meeting’s load to be shared between instance 2 and instance 3. I’ve manually stopped the JVB on instance 1. I followed the Octo setup here and set BRIDGE_SELECTION_STRATEGY to SplitBridgeSelectionStrategy.
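
For reference, the Jicofo side of that setting looks roughly like the sketch below. It assumes the legacy properties file at /etc/jitsi/jicofo/sip-communicator.properties on instance 1, and the full key prefix follows the Octo documentation; treat both the path and the prefix as assumptions if your Jicofo version is configured through jicofo.conf instead.

```properties
# Assumed location: /etc/jitsi/jicofo/sip-communicator.properties (instance 1)
# Ask Jicofo to place the participants of one conference on different bridges.
# The Octo documentation describes this strategy as intended for testing.
org.jitsi.jicofo.BridgeSelector.BRIDGE_SELECTION_STRATEGY=SplitBridgeSelectionStrategy
```

Jicofo reads this file at startup, so it has to be restarted after the change.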

I’m able to ping the Jitsi server from both videobridges. However, only one of the instances takes all the load. If I stop the first JVB, the second JVB takes all the load, and vice versa.

I’ve attached all the config files and logs.
masked_jvb2.log (24.3 KB)
masked_jvb1.log (60.5 KB)
jvb-sip-communicator.properties.txt (2.2 KB)
prosody.cfg.lua.txt (8.8 KB)
prosody.err.txt (1.1 KB)
prosody.log.txt (5.6 KB)
combinedconf.txt (5.5 KB)
jicofo.log.txt (16.7 KB)

I changed the sip-communicator.properties file on my videobridge as follows:

org.ice4j.ice.harvest.DISABLE_AWS_HARVESTER=true
org.ice4j.ice.harvest.STUN_MAPPING_HARVESTER_ADDRESSES=meet-jit-si-turnrelay.jitsi.net:443
org.jitsi.videobridge.ENABLE_STATISTICS=true
org.jitsi.videobridge.STATISTICS_TRANSPORT=muc
org.jitsi.videobridge.xmpp.user.shard-2.HOSTNAME=jitsi.example.com
org.jitsi.videobridge.xmpp.user.shard-2.DOMAIN=auth.jitsi.example.com
org.jitsi.videobridge.xmpp.user.shard-2.USERNAME=jvb
org.jitsi.videobridge.xmpp.user.shard-2.PASSWORD=JQcZQAbd
org.jitsi.videobridge.xmpp.user.shard-2.MUC_JIDS=JvbBrewery@internal.auth.jitsi.example.com
org.jitsi.videobridge.xmpp.user.shard-2.MUC_NICKNAME=jvb-2
org.jitsi.videobridge.xmpp.user.shard-2.DISABLE_CERTIFICATE_VERIFICATION=true
# the address to bind to locally
org.jitsi.videobridge.octo.BIND_ADDRESS=0.0.0.0
# the address to advertise (in case BIND_ADDRESS is not accessible)
org.jitsi.videobridge.octo.PUBLIC_ADDRESS=1.2.3.4
# the port to bind to
org.jitsi.videobridge.octo.BIND_PORT=4096
# the region that the jitsi-videobridge instance is in
org.jitsi.videobridge.REGION=region1

On the other JVB the file is the same, except that shard-2 is changed to shard-1.
Now when one JVB stops, the load moves to the other JVB. But when both are running, the load is not distributed. One of the JVBs shows only health checks, like below:

JVB 2021-06-09 07:03:09.134 INFO: [24] VideobridgeExpireThread.expire#140: Running expire()
JVB 2021-06-09 07:03:09.165 INFO: [25] HealthChecker.run#171: Performed a successful health check in PT0.000015304S. Sticky failure: false
JVB 2021-06-09 07:03:19.166 INFO: [25] HealthChecker.run#171: Performed a successful health check in PT0.000016627S. Sticky failure: false
JVB 2021-06-09 07:03:29.165 INFO: [25] HealthChecker.run#171: Performed a successful health check in PT0.00001586S. Sticky failure: false
JVB 2021-06-09 07:03:39.165 INFO: [25] HealthChecker.run#171: Performed a successful health check in PT0.000016321S. Sticky failure: false
JVB 2021-06-09 07:03:49.165 INFO: [25] HealthChecker.run#171: Performed a successful health check in PT0.000015615S. Sticky failure: false
JVB 2021-06-09 07:03:59.165 INFO: [25] HealthChecker.run#171: Performed a successful health check in PT0.00002242S. Sticky failure: false
JVB 2021-06-09 07:04:09.134 INFO: [24] VideobridgeExpireThread.expire#140: Running expire()
JVB 2021-06-09 07:04:09.165 INFO: [25] HealthChecker.run#171: Performed a successful health check in PT0.000014951S. Sticky failure: false
JVB 2021-06-09 07:04:19.165 INFO: [25] HealthChecker.run#171: Performed a successful health check in PT0.000016194S. Sticky failure: false
JVB 2021-06-09 07:04:29.165 INFO: [25] HealthChecker.run#171: Performed a successful health check in PT0.000016484S. Sticky failure: false
JVB 2021-06-09 07:04:39.166 INFO: [25] HealthChecker.run#171: Performed a successful health check in PT0.000015225S. Sticky failure: false
JVB 2021-06-09 07:04:49.166 INFO: [25] HealthChecker.run#171: Performed a successful health check in PT0.000016422S. Sticky failure: false
JVB 2021-06-09 07:04:59.165 INFO: [25] HealthChecker.run#171: Performed a successful health check in PT0.000016168S. Sticky failure: false
JVB 2021-06-09 07:05:09.134 INFO: [24] VideobridgeExpireThread.expire#140: Running expire()
JVB 2021-06-09 07:05:09.165 INFO: [25] HealthChecker.run#171: Performed a successful health check in PT0.000014986S. Sticky failure: false
JVB 2021-06-09 07:05:19.165 INFO: [25] HealthChecker.run#171: Performed a successful health check in PT0.000016073S. Sticky failure: false
JVB 2021-06-09 07:05:29.165 INFO: [25] HealthChecker.run#171: Performed a successful health check in PT0.000016679S. Sticky failure: false
JVB 2021-06-09 07:05:39.166 INFO: [25] HealthChecker.run#171: Performed a successful health check in PT0.000015748S. Sticky failure: false
JVB 2021-06-09 07:05:49.165 INFO: [25] HealthChecker.run#171: Performed a successful health check in PT0.000016066S. Sticky failure: false

The other one logs activity whenever a participant joins or leaves:

JVB 2021-06-09 07:03:56.373 WARNING: [79] [confId=bb52d059587c4e04 gid=52044 stats_id=Niko-SBo componentId=1 conf_name=superautosdisturbterribly@conference.jitsi.example.com ufrag=7divn1f7np9fdt name=stream-d2a92c6a epId=d2a92c6a local_ufrag=7divn1f7np9fdt] MergingDatagramSocket.doRemove#349: Removing the active socket. Won't be able to send until a new one is elected.
JVB 2021-06-09 07:03:56.374 INFO: [84] [confId=bb52d059587c4e04 gid=52044 stats_id=Niko-SBo componentId=1 conf_name=superautosdisturbterribly@conference.jitsi.example.com ufrag=7divn1f7np9fdt name=stream-d2a92c6a epId=d2a92c6a local_ufrag=7divn1f7np9fdt] MergingDatagramSocket.close#142: Closing.
JVB 2021-06-09 07:03:56.374 INFO: [68] [confId=bb52d059587c4e04 epId=d2a92c6a local_ufrag=7divn1f7np9fdt gid=52044 stats_id=Niko-SBo conf_name=superautosdisturbterribly@conference.jitsi.example.com] IceTransport.startReadingData#203: Socket closed, stopping reader
JVB 2021-06-09 07:03:56.374 INFO: [68] [confId=bb52d059587c4e04 epId=d2a92c6a local_ufrag=7divn1f7np9fdt gid=52044 stats_id=Niko-SBo conf_name=superautosdisturbterribly@conference.jitsi.example.com] IceTransport.startReadingData#215: No longer running, stopped reading packets
JVB 2021-06-09 07:03:56.374 INFO: [81] [confId=bb52d059587c4e04 gid=52044 stats_id=Niko-SBo componentId=1 conf_name=superautosdisturbterribly@conference.jitsi.example.com ufrag=7divn1f7np9fdt name=stream-d2a92c6a epId=d2a92c6a local_ufrag=7divn1f7np9fdt] MergingDatagramSocket$SocketContainer.runInReaderThread#770: Failed to receive: java.net.SocketException: Socket closed
JVB 2021-06-09 07:03:56.375 INFO: [84] [confId=bb52d059587c4e04 epId=d2a92c6a gid=52044 stats_id=Niko-SBo conf_name=superautosdisturbterribly@conference.jitsi.example.com] Endpoint.expire#1014: Expired.
JVB 2021-06-09 07:04:02.234 INFO: [25] HealthChecker.run#171: Performed a successful health check in PT0.000011914S. Sticky failure: false
JVB 2021-06-09 07:04:12.199 INFO: [22] VideobridgeExpireThread.expire#140: Running expire()
JVB 2021-06-09 07:04:12.199 INFO: [22] VideobridgeExpireThread.expire#146: Conference bb52d059587c4e04 should expire, expiring it
JVB 2021-06-09 07:04:12.199 INFO: [85] [confId=bb52d059587c4e04 gid=52044 conf_name=superautosdisturbterribly@conference.jitsi.example.com] Conference.expire#498: Expiring.
JVB 2021-06-09 07:04:12.199 INFO: [85] [confId=bb52d059587c4e04 gid=52044 conf_name=superautosdisturbterribly@conference.jitsi.example.com] EndpointConnectionStatusMonitor.stop#66: Stopped
JVB 2021-06-09 07:04:12.200 INFO: [85] [confId=bb52d059587c4e04 gid=52044 conf_name=superautosdisturbterribly@conference.jitsi.example.com] Conference.updateStatisticsOnExpire#560: expire_conf,duration=79,has_failed=false,has_partially_failed=false

Should I keep them as shard-1 and shard-2, or should it be shard in both places?
@Freddie @damencho could you please check whether the configs are right?