Crashing browsers after multi-region JVB setup

Hello,

We had a conference last week with about 65 participants in a single room. Only a few users actually had their camera enabled.

Most users started complaining that their browsers were crashing (mostly Chrome), and indeed I confirmed that my own Chrome CPU usage was very high (100% or more).

I also noticed a very high number of requests going to BOSH http-bind. I don’t have a full analysis of the messages, but most of the ones I inspected manually were pings.
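To put a number on the BOSH traffic, the requests could be counted per minute from the nginx access log. This is only a minimal sketch: it assumes nginx's default "combined" log format and that BOSH is served under /http-bind, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Assumption: nginx "combined" format, e.g.
# 1.2.3.4 - - [01/May/2020:15:21:45 +0000] "POST /http-bind HTTP/1.1" 200 ...
LINE_RE = re.compile(r'\[(?P<ts>[^\]]+)\] "(?P<method>[A-Z]+) (?P<path>\S+)')


def bosh_requests_per_minute(lines, path_prefix="/http-bind"):
    """Count requests whose path starts with path_prefix, bucketed by minute."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if not m or not m.group("path").startswith(path_prefix):
            continue
        # Keep "dd/Mon/yyyy:HH:MM" and drop the seconds and time zone.
        minute = m.group("ts")[:17]
        counts[minute] += 1
    return counts
```

Feeding it the lines of the access log (for example `bosh_requests_per_minute(open(path))`, with `path` pointing at wherever the log lives on your system) and dividing the busiest minute by the number of participants would give a rough requests-per-client rate to compare against a healthy conference.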

At that time the setup was 1 signalling server (xmpp/jicofo/prosody/nginx) + frontend and 5 regions X 1 JVB (OCTO) + 4 Jibri in the same region as the signalling server. Websocket data channels were proxied by the signalling server, and XMPP went over BOSH (not websockets).

I didn’t find any obvious errors in the logs, but this is what I consider suspicious:


May 01 14:20:28 general error   Top-level error, please report:
/usr/lib/prosody/modules/mod_bosh.lua:483: attempt to index field '?' (a nil value)
May 01 14:20:28 general error   
stack traceback:
    /usr/lib/prosody/modules/mod_bosh.lua:483: in function </usr/lib/prosody/modules/mod_bosh.lua:473>
    (tail call): ?
    /usr/lib/prosody/util/timer.lua:51: in function '?'
    /usr/lib/prosody/net/server_select.lua:917: in function </usr/lib/prosody/net/server_select.lua:861>
    [C]: in function 'xpcall'
    /usr/bin/prosody:400: in function 'loop'
    /usr/bin/prosody:431: in main chunk
    [C]: ?

I'm not sure this is related at all, as the conference started at 15:00.

Jicofo 2020-05-01 15:21:45.726 INFO: [29] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Region info, conference=ffea93 octo_enabled= true: [[eu-central-1, eu-central-1, eu-central-1, eu-central-1, eu-central-1, eu-central-1, eu-central-1, eu-central-1, eu-central-1, eu-central-1, eu-central-1, eu-central-1, eu-central-1, eu-central-1, eu-central-1, eu-central-1, eu-central-1, eu-central-1, eu-central-1, ap-east-1, sa-east-1, sa-east-1, sa-east-1, ap-east-1, ap-east-1, ap-east-1, ap-east-1, ap-east-1, eu-central-1, ap-east-1, sa-east-1, eu-central-1, eu-central-1, eu-central-1]]

I wonder whether these are the regions of the participants. I hope this is not what OCTO “thinks” are the available JVBs.
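To see how the entries in that line are distributed, the list can be tallied per region. A minimal sketch, assuming the double-bracket layout shown in the log line above stays the same; `tally_regions` is just an illustrative helper, not something from Jitsi.

```python
import re
from collections import Counter


def tally_regions(log_line):
    """Return a Counter of region names found in a jicofo 'Region info' line."""
    # Grab everything between the double brackets: [[a, b, c]]
    m = re.search(r"\[\[(.*?)\]\]", log_line)
    if not m:
        return Counter()
    regions = [r.strip() for r in m.group(1).split(",") if r.strip()]
    return Counter(regions)
```

Comparing the total number of entries with the number of participants in the room at that timestamp would hint at whether the list is per participant or per bridge.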

Jicofo 2020-05-01 15:23:34.237 SEVERE: [29] org.jitsi.jicofo.SSRCValidator.log() Too many sources signalled by 98937f29 - dropping: ssrc=1574609343
Jicofo 2020-05-01 15:23:34.239 WARNING: [29] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Not sending source-add, notification would be empty

I have no real idea what this means, but the log level is SEVERE, so it could be related.
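To check whether these drops come from one endpoint or from many, the SEVERE lines can be grouped by endpoint ID. Again only a sketch: the regular expression is derived from the single log line quoted above and may not match other jicofo versions, and the function name is hypothetical.

```python
import re
from collections import Counter

# Pattern taken from the one "Too many sources signalled" line in my log.
DROP_RE = re.compile(
    r"Too many sources signalled by (?P<endpoint>\S+) - dropping: ssrc=(?P<ssrc>\d+)"
)


def dropped_sources_by_endpoint(lines):
    """Count 'Too many sources signalled' drops per endpoint ID."""
    counts = Counter()
    for line in lines:
        m = DROP_RE.search(line)
        if m:
            counts[m.group("endpoint")] += 1
    return counts
```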

Related packages on the signalling server:

ii  jitsi-meet                            2.0.4384-1              all                     WebRTC JavaScript video conferences
ii  jitsi-meet-prosody                    1.0.3969-1              all                     Prosody configuration for Jitsi Meet
un  jitsi-meet-tokens                     <none>                  <none>                  (no description available)
un  jitsi-meet-turnserver                 <none>                  <none>                  (no description available)
ii  jitsi-meet-web                        1.0.3969-1              all                     WebRTC JavaScript video conferences
ii  jitsi-meet-web-config                 1.0.3969-1              all                     Configuration for web serving of Jitsi Meet
un  jitsi-videobridge                     <none>                  <none>                  (no description available)
ii  jitsi-videobridge2                    2.1-164-gfdce823f-1     all                     WebRTC compatible Selective Forwarding Unit (SFU)

The package on the JVBs:

ii  jitsi-videobridge                     1126-1                  amd64                   WebRTC compatible Selective Forwarding Unit (SFU)

What I did to “resolve” the issue during the conference was to stop 4 of the regions, keep a single JVB, and turn off data channels. I can’t really be sure this helped, as we lost half of the participants (dropped to about 35), and after that it went relatively fine.

We started using the Jitsi platform a few weeks ago, and my experience and understanding of its details are still limited. We started with a single-host setup and didn’t have any issues until we moved to OCTO/multi-region (though the cause may be some other config change), so I wonder whether there is additional overhead on the client side when using OCTO.

Has anyone else experienced a similar issue with an OCTO setup and multiple JVBs?
Any hints as to what the issue could be?
Any general advice on recommended configuration to improve client-side performance?

P.S. I’m in the process of setting up and running jitsi-meet-torture to try to reproduce the problem.

Thanks in advance,
Dincho