Kicking out members on big calls

Hi all!
Recently, probably after updating jitsi-meet to version 2.0.8044-1 and jitsi-videobridge2 to version 2.2-61-g98c9f868-1, we have been having problems in calls with a large number of participants. So far it has happened in two calls with more than 200 participants each.
We have 5 bridges with Octo enabled. At the beginning of a call, just after a large number of users have connected, participants get kicked off one bridge and are redirected to the other bridges. This can be seen in the graph.

On the s-rc-jvb-02 bridge the number of participants drops sharply, on s-rc-jvb-01 it rises sharply, and on s-rc-jvb-03 you can see a dip and a rebound.
At the moment of failure, the microphones of the reconnected participants are unmuted, although the “Everyone starts muted” setting was turned on by the moderator.
The rest of the time the call went without a hitch. I don't see any notable errors in the logs, only participant timeout messages like these from s-rc-jvb-02:
JVB 2022-11-24 12:02:41.963 INFO: [34013] [confId=2da65b9f53922eed meeting_id=197fad47 epId=318e4095 stats_id=Kevon-Myi local_ufrag=7k5dj1gikdr8su ufrag=7k5dj1gikdr8su] ConnectivityCheckClient.processTimeout#881: timeout for pair: ~External gate IP~:16001/udp/srflx → ~Internal gate IP~:57420/udp/prflx (stream-318e4095.RTP), failing.
JVB 2022-11-24 12:02:43.508 INFO: [34013] [confId=2da65b9f53922eed meeting_id=197fad47 epId=ec04ad44 stats_id=Myrtle-Y5y local_ufrag=e1plt1gikdujl7 ufrag=e1plt1gikdujl7] ConnectivityCheckClient.processTimeout#881: timeout for pair: ~External gate IP~:16001/udp/srflx → ~Internal gate IP~:21740/udp/prflx (stream-ec04ad44.RTP), failing.

Here are the settings from jicofo.conf that might be important:
bridge {
max-bridge-packet-rate = 50000
average-participant-stress = 0.01
stress-threshold = 0.8
failure-reset-threshold = 1 minute
health-checks {
interval = 10 seconds
retry-delay = 5 seconds
brewery-jid = "
octo {
id = 15
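As context for why jicofo might drain a bridge with these settings: a bridge is considered overstressed once its stress passes stress-threshold, and average-participant-stress is the per-participant stress estimate. A back-of-the-envelope reading of those two numbers (my interpretation only, not jicofo's exact algorithm, since at runtime the bridge reports its own measured stress):

```python
# Back-of-the-envelope from the jicofo.conf values above. This is only an
# interpretation of the settings; at runtime the bridge reports its own
# measured stress, so real behavior can differ.
stress_threshold = 0.8             # bridge treated as overstressed above this
average_participant_stress = 0.01  # estimated stress per participant

participants_until_overstressed = stress_threshold / average_participant_stress
print(round(participants_until_overstressed))  # roughly 80 participants per bridge
```

If that reading holds, 200+ participants joining at once could quickly push a single bridge past the threshold, which would be consistent with the sudden redistribution visible in the graph.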

I understand that this is not enough information to fully diagnose the situation, but can you suggest what I can do to narrow down the problem?

Have you been monitoring the prosody process CPU usage?
What do you see in the jicofo logs about why it moves participants?

Hey, @damencho

Yes, we monitor all of our Jitsi infrastructure.
Here is a CPU utilization graph of the server running the Jitsi core (prosody, jicofo, nginx, and all the jitsi-meet packages).

As you can see, at the moment of the problem (12:08) CPU usage is normal, well under 20%.

Prosody is a single-threaded process, so you need to monitor that process specifically. Monitoring the overall CPU is very misleading: prosody can be pinning its core at 100% while a 4-core machine still shows only 25% overall.

You need to verify whether that is the case … if it is, you can switch to running two prosodies, one for the clients and one for the JVBs …
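One way to verify this is to measure the CPU time of the prosody process itself against a single core. A minimal sketch, assuming a Linux host with /proc mounted (the prosody pidfile path in the comment is an assumption; adjust for your setup):

```python
# Sketch: check whether a single-threaded process (e.g. prosody) is
# saturating one core, by sampling utime+stime from /proc/<pid>/stat.
import os
import time

CLK_TCK = os.sysconf("SC_CLK_TCK")  # kernel clock ticks per second

def cpu_ticks(pid):
    """Return utime + stime (in clock ticks) for a pid from /proc/<pid>/stat."""
    with open(f"/proc/{pid}/stat") as f:
        # split after the ")" so a process name containing spaces can't
        # shift the field positions
        fields = f.read().rsplit(")", 1)[1].split()
    # after the ")" split, utime and stime are fields 11 and 12 (0-based)
    return int(fields[11]) + int(fields[12])

def single_core_usage(pid, interval=1.0):
    """Percent of ONE core the process used over `interval` seconds."""
    before = cpu_ticks(pid)
    time.sleep(interval)
    after = cpu_ticks(pid)
    return 100.0 * (after - before) / (CLK_TCK * interval)

# Demo on this script's own process; for prosody, use its real pid, e.g.
# read it from /run/prosody/prosody.pid (path is an assumption).
print(f"{single_core_usage(os.getpid(), 0.5):.1f}% of one core")
```

A value hovering near 100% here, while overall host CPU looks low, would confirm the single-core bottleneck described above.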


Hey, @damencho
Thanks for the quick response.
Anton answered about the CPU load; we will be watching it more closely.
In the Jicofo logs, I found a large number of similar messages:
Jicofo 2022-11-24 12:06:29.697 SEVERE: [612] [ meeting_id=197fad47-751f-4a7c-8a69-be5a330bf9fd bridge=jvb-03] Colibri2Session$sendRequest$2.invoke#277: Received error response for updateParticipant, session failed: Unknown endpoint e4fe33fe
Jicofo 2022-11-24 12:06:29.705 SEVERE: [630] [ meeting_id=197fad47-751f-4a7c-8a69-be5a330bf9fd] ColibriV2SessionManager.updateParticipant#463: No ParticipantInfo for e4fe33fe
Jicofo 2022-11-24 12:06:29.706 WARNING: [612] [ meeting_id=197fad47-751f-4a7c-8a69-be5a330bf9fd participant=9bac745b] ParticipantInviteRunnable.lambda$doRun$0#193: Failed to convert ContentPacketExtension to Media:
Jicofo 2022-11-24 12:06:29.765 WARNING: [622] [ meeting_id=197fad47-751f-4a7c-8a69-be5a330bf9fd participant=37e8f5ef] Participant.setInviteRunnable#221: Canceling ParticipantInviteRunnable[Participant[]@2011638787]@1970859924
Jicofo 2022-11-24 12:06:29.766 WARNING: [37] [ meeting_id=197fad47-751f-4a7c-8a69-be5a330bf9fd participant=987dc829] Participant.sendQueuedRemoteSources#602: Can not signal remote sources, Jingle session not established.
Jicofo 2022-11-24 12:06:29.781 SEVERE: [622] [ meeting_id=197fad47-751f-4a7c-8a69-be5a330bf9fd] ColibriV2SessionManager.updateParticipant#463: No ParticipantInfo for 14c22d7c
Jicofo 2022-11-24 12:06:36.065 WARNING: [618] [ meeting_id=197fad47-751f-4a7c-8a69-be5a330bf9fd] JitsiMeetConferenceImpl.onSessionAcceptInternal#1275: No participant found for:
Jicofo 2022-11-24 12:06:36.246 WARNING: [618] [ meeting_id=197fad47-751f-4a7c-8a69-be5a330bf9fd] JitsiMeetConferenceImpl.onTransportInfo#1122: Failed to process transport-info, no session for:
Jicofo 2022-11-24 12:06:57.821 WARNING: [617] [ meeting_id=197fad47-751f-4a7c-8a69-be5a330bf9fd] JitsiMeetConferenceImpl.removeSources#1391: No sources or groups to be removed from e4fe33fe. The requested sources to remove: [audio=, video=, groups=]
Jicofo 2022-11-24 12:07:07.131 SEVERE: [631] [ meeting_id=197fad47-751f-4a7c-8a69-be5a330bf9fd] JitsiMeetConferenceImpl.onSessionAcceptInternal#1282: Reassigning jingle session for participant:
Jicofo 2022-11-24 12:07:37.535 SEVERE: [665] [ meeting_id=197fad47-751f-4a7c-8a69-be5a330bf9fd bridge=jvb-02] Colibri2Session$sendRequest$2.invoke#277: Received error response for updateParticipant, session failed: Unknown endpoint 2fc1c14d
Jicofo 2022-11-24 12:09:17.284 WARNING: [617] [ meeting_id=197fad47-751f-4a7c-8a69-be5a330bf9fd] JitsiMeetConferenceImpl.onMemberLeft#826: Participant not found for 9e886086. Terminated already or never started?

Do you see anything in the logs about the faulty JVB?
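One way to look for that is to filter the jicofo log down to bridge-related lines at WARNING/SEVERE level. A minimal sketch; the keyword choice and the Debian log path in the usage comment are assumptions, not an official list of jicofo messages:

```python
# Sketch: pull bridge-related WARNING/SEVERE lines out of a jicofo log,
# to help spot why jicofo moved participants off a bridge.
import re

BRIDGE = re.compile(r"bridge", re.IGNORECASE)   # keyword is a guess
SEVERITY = re.compile(r"\b(SEVERE|WARNING)\b")

def bridge_problems(lines):
    """Yield log lines that mention a bridge at WARNING or SEVERE level."""
    for line in lines:
        if SEVERITY.search(line) and BRIDGE.search(line):
            yield line.rstrip()

# Usage (log path is the Debian default, adjust if different):
# with open("/var/log/jitsi/jicofo.log") as f:
#     for hit in bridge_problems(f):
#         print(hit)
```

Running something like this over the window around 12:02-12:09 should surface whether jicofo logged a health-check failure or removal for the bridge that shed its participants.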