Jibri lost connection with multiple JVB

Hello,

Jicofo warns that no operational bridges are available:

java.lang.RuntimeException: No operational bridges available (total bridge count: 2)
	at org.jitsi.jicofo.health.JicofoHealthChecker.check(JicofoHealthChecker.java:148)
	at org.jitsi.jicofo.health.JicofoHealthChecker.performCheck(JicofoHealthChecker.java:112)
	at org.jitsi.health.HealthChecker.run(HealthChecker.kt:145)
	at org.jitsi.utils.concurrent.RecurringRunnableExecutor.run(RecurringRunnableExecutor.java:216)
	at org.jitsi.utils.concurrent.RecurringRunnableExecutor.runInThread(RecurringRunnableExecutor.java:292)
	at org.jitsi.utils.concurrent.RecurringRunnableExecutor.access$000(RecurringRunnableExecutor.java:36)
	at org.jitsi.utils.concurrent.RecurringRunnableExecutor$1.run(RecurringRunnableExecutor.java:328)

The bridges are up and their health check (/about/health) returns OK, but their logs show:

Jul 21, 2021 12:16:43 PM org.jitsi.utils.logging2.LoggerImpl log
WARNING: Cannot set presence extension: not connected.
Jul 21, 2021 12:16:46 PM org.jitsi.utils.logging2.LoggerImpl log
WARNING: Unable to find encoding matching packet! packet=RtpPacket: PT=97, Ssrc=675561896, SeqNum=11349, M=true, X=true, Ts=4285578558; sources=MediaSourceDesc 1699249704 has encodings:
  primary_ssrc=2468246899,secondary_ssrcs={},layers=
    subjective_quality=0,temporal_id=0,spatial_id=-1
    subjective_quality=1,temporal_id=1,spatial_id=-1
    subjective_quality=2,temporal_id=2,spatial_id=-1
  primary_ssrc=2890625794,secondary_ssrcs={},layers=
    subjective_quality=64,temporal_id=0,spatial_id=-1
    subjective_quality=65,temporal_id=1,spatial_id=-1
    subjective_quality=66,temporal_id=2,spatial_id=-1
  primary_ssrc=3444275179,secondary_ssrcs={},layers=
    subjective_quality=128,temporal_id=0,spatial_id=-1
    subjective_quality=129,temporal_id=1,spatial_id=-1
    subjective_quality=130,temporal_id=2,spatial_id=-1
Jul 21, 2021 12:16:48 PM org.jitsi.utils.logging2.LoggerImpl log
WARNING: Unable to find encoding matching packet! packet=RtpPacket: PT=97, Ssrc=675561896, SeqNum=11350, M=true, X=true, Ts=4285278228; sources=MediaSourceDesc 839677952 has encodings:
  primary_ssrc=2468246899,secondary_ssrcs={},layers=
    subjective_quality=0,temporal_id=0,spatial_id=-1
    subjective_quality=1,temporal_id=1,spatial_id=-1
    subjective_quality=2,temporal_id=2,spatial_id=-1
  primary_ssrc=2890625794,secondary_ssrcs={},layers=
    subjective_quality=64,temporal_id=0,spatial_id=-1
    subjective_quality=65,temporal_id=1,spatial_id=-1
    subjective_quality=66,temporal_id=2,spatial_id=-1
  primary_ssrc=3444275179,secondary_ssrcs={},layers=
    subjective_quality=128,temporal_id=0,spatial_id=-1
    subjective_quality=129,temporal_id=1,spatial_id=-1
    subjective_quality=130,temporal_id=2,spatial_id=-1
Jul 21, 2021 12:16:48 PM org.jitsi.utils.logging2.LoggerImpl log
WARNING: Cannot set presence extension: not connected.
Jul 21, 2021 12:16:51 PM org.jitsi.utils.logging2.LoggerImpl log
WARNING: Received request for a nonexistent endpoint: 9b16fb08(conference dd3628e3ce307a1f)
Jul 21, 2021 12:16:53 PM org.jitsi.utils.logging2.LoggerImpl log
WARNING: Cannot set presence extension: not connected.
Jul 21, 2021 12:16:58 PM org.jitsi.utils.logging2.LoggerImpl log
WARNING: Cannot set presence extension: not connected.
Jul 21, 2021 12:17:31 PM org.jitsi.utils.logging2.LoggerImpl log
WARNING: Unable to find encoding matching packet! packet=RtpPacket: PT=97, Ssrc=675561896, SeqNum=11351, M=true, X=true, Ts=4289438208; sources=MediaSourceDesc 931922323 has encodings:
  primary_ssrc=2468246899,secondary_ssrcs={},layers=
    subjective_quality=0,temporal_id=0,spatial_id=0
    subjective_quality=1,temporal_id=1,spatial_id=0
    subjective_quality=2,temporal_id=2,spatial_id=0
  primary_ssrc=2890625794,secondary_ssrcs={},layers=
    
  primary_ssrc=3444275179,secondary_ssrcs={},layers=

Only one of the bridges also logs:

org.jivesoftware.smack.SmackException$ConnectionException: The following addresses failed: '10.233.2.87:5222' failed because: /10.233.2.87 exception: java.net.SocketTimeoutException: connect timed out
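Since the exception above is a plain TCP connect timeout, one quick thing to try is opening a socket to the same address from inside the affected JVB pod. This is only a minimal sketch: the host and port are copied from the exception, while the helper name `can_connect` and the 5 second timeout are my own choices, not anything from Jitsi.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration; it only tests TCP reachability,
    not whether the XMPP handshake with Prosody would succeed.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and connect timeouts.
        return False
```

Calling `can_connect("10.233.2.87", 5222)` from the bridge that logs the exception, and again from a bridge that does not, would show whether the timeout is specific to one pod or to the network path to Prosody.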

Prosody seems healthy too. I added 2 new JVBs during the issue to keep Jicofo from crashing.

The issue started around Jul 21, 2021 12:15:00 PM. I already hit this issue months ago on a previous release but could not find the root cause. We are on stable-5963.
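To pin down the start time more precisely in each bridge log, something like the sketch below could find the first record containing a given message. It assumes the two-line layout seen in the excerpts above (a timestamp line followed by the message line) and an English locale for month names and AM/PM; `first_occurrence` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Matches the timestamp at the start of a record header line,
# e.g. "Jul 21, 2021 12:16:43 PM org.jitsi.utils.logging2.LoggerImpl log".
STAMP = re.compile(r"^([A-Z][a-z]{2} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) ")


def first_occurrence(lines, needle):
    """Return the timestamp of the first record whose text contains `needle`, or None."""
    current = None
    for line in lines:
        match = STAMP.match(line)
        if match:
            # Remember the timestamp of the record we are now inside.
            current = datetime.strptime(match.group(1), "%b %d, %Y %I:%M:%S %p")
        elif needle in line and current is not None:
            return current
    return None
```

For example, passing the lines of a JVB log file and the needle `"Cannot set presence extension: not connected."` gives the time that bridge first reported being disconnected, which can then be compared across bridges and against the Prosody and Jicofo logs.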

Full logs available here: WeTransfer

  • jvb 1 (jvb-fb6999cf6-jzdln with issue)
  • jvb 2 (jvb-fb6999cf6-rwkpn with issue)
  • jvb 3 (jvb-7d78fd84d5-xqs46, a bridge I created during the issue to keep Jicofo from crashing)
  • jvb 4 (another bridge I created during the issue to keep Jicofo from crashing; I did not export its log as it seems useless)
  • prosody
  • jicofo