Random disconnects, Websocket confused

Hello,

I did a fresh install of Jitsi on Debian 10 (Buster), following the quickstart guide (…/handbook/docs/devops-guide/devops-guide-quickstart). I made some modifications to the interface and server (logo, lastn, authentication, …), but nothing technical. I use the default Debian Java 11 environment. Everything runs on a bare-metal server with 64 GB RAM and an AMD Ryzen 7 3700X.

We have the following problem: with more than 300 participants (which equals 15-20 conferences), people get randomly disconnected and are automatically reconnected after a delay. This didn't happen with an installation from mid-2019. The server load is low and so is the traffic (only the moderator has video and audio turned on), so there's plenty of processing power free.

I don't have the expertise to find the exact culprit, but I still tried to figure out a solution. During my research I stumbled upon websockets, which weren't activated in the old installation. I would expect the latest stable version to have websockets activated by default (I did a completely fresh installation of both Debian and Jitsi). Yet there are some differences between my configuration and some guides:

https://github.com/jitsi/jitsi-videobridge/blob/master/doc/web-sockets.md

My configuration looks like the following, and in the client config.js there's no openBridgeChannel:

videobridge {
    http-servers {
        public {
            port = 9090
        }
    }
    websockets {
        enabled = true
        domain = "jitsi.svlg-gaildorf.de:443"
        tls = true
    }
}

So the question is: what's correct? Is the guide just outdated, or is the default configuration wrong? For example, tls = true is set and websockets are activated, but no TLS port is given in http-servers.

If I start a conference, look at the JavaScript log in the browser console, and filter for websocket, I see “Stream resume enabled, but WebSockets are not enabled”. However, the following messages are “websocket channel opened”, … so it seems to work. Is this first “problem” related to the authentication?

Then there’s another thread about websockets:
https://community.jitsi.org/t/how-to-how-to-enable-websockets-xmpp-websocket-and-smacks-for-prosody/87920
which solves a similar issue.

But before I try these more advanced changes, I wanted to make sure that the initial config is right. Or do you have other suggestions to solve the random disconnects? My main problem is that it only happens with lots of participants.

A default clean installation now comes with websockets enabled. They are proxied through nginx, which does the TLS termination, and the certificates used are the ones from nginx.

Thank you. Ruling out websockets as the cause made it easier to focus on the actual problem.

The solution to my problem (maybe worth including in the self-hosted installation guide):
The default configuration of NGINX and the system limits of Debian were set too low.

In /var/log/nginx/error.log the following error occurred:
768 worker_connections are not enough
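
To see how often this alert shows up, and with which limit, something like the following could be used. It is only a minimal sketch: the function names are made up for illustration, the default path is the one mentioned above, and the pattern accepts the message with or without the underscore because it is quoted both ways.

```python
import re
from collections import Counter
from typing import Iterable

# Accepts "worker_connections" and "worker connections", since the alert
# may be quoted either way.
ALERT = re.compile(r"(\d+) worker[_ ]connections are not enough")


def count_alerts(lines: Iterable[str]) -> Counter:
    """Map each reported limit (e.g. 768) to the number of alerts seen."""
    hits = Counter()
    for line in lines:
        match = ALERT.search(line)
        if match:
            hits[int(match.group(1))] += 1
    return hits


def count_alerts_in_file(path: str = "/var/log/nginx/error.log") -> Counter:
    """Scan a log file line by line; reading it usually needs root."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_alerts(handle)
```

Calling count_alerts_in_file() as root returns a Counter; an empty one means the alert was not found in that file.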

Solution:
I have set it to 40000 now.
In /etc/nginx/nginx.conf, change the value in the events block to

worker_connections 40000;
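
A rough back-of-the-envelope estimate of why 768 can be too small at around 300 participants. This is only a sketch under assumptions that are not confirmed in this thread: each participant is assumed to hold two websockets through nginx (signaling and bridge channel), and each proxied socket is counted twice because nginx counts the client side and the upstream side against worker_connections. Ordinary HTTP requests are ignored, and the limit applies per worker process, so the real picture depends on how connections spread across workers.

```python
def estimated_nginx_connections(participants: int,
                                websockets_per_participant: int = 2,
                                connections_per_proxied_socket: int = 2) -> int:
    """Estimate concurrent nginx connections; both defaults are assumptions."""
    return (participants
            * websockets_per_participant
            * connections_per_proxied_socket)


# The limit reported in the error message above.
DEFAULT_WORKER_CONNECTIONS = 768

needed = estimated_nginx_connections(300)
print(needed, needed > DEFAULT_WORKER_CONNECTIONS)  # prints: 1200 True
```

Under these assumptions a single worker that ends up with most of the connections would exceed 768 well before 300 participants.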

I also checked ulimit -n, which reported 1024 instead of the 65000 I had set according to the install guide. I added the following to /etc/security/limits.conf:

*     hard  nofile    65000
root  hard  nofile    65000

To make these limits take effect, I had to add the following line to both /etc/pam.d/common-session and /etc/pam.d/common-session-interactive:

session required pam_limits.so
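
The limit shown in a login shell is not necessarily the one a running service got, so it may be worth checking the process itself. On Linux, /proc/<pid>/limits lists the limits per process. The following is a minimal sketch for reading the "Max open files" row; the function names are made up for illustration, and the PID of the nginx or videobridge process has to be looked up separately.

```python
from typing import Optional, Tuple


def parse_max_open_files(limits_text: str) -> Optional[Tuple[str, str]]:
    """Return (soft, hard) from the 'Max open files' row, or None if absent.

    Values stay strings because the kernel may print 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # Row layout: Max open files <soft> <hard> files
            return fields[3], fields[4]
    return None


def limits_for_pid(pid: int) -> Optional[Tuple[str, str]]:
    """Read the limits of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits", encoding="ascii") as handle:
        return parse_max_open_files(handle.read())
```

If the soft value returned for the service process is still 1024, the new limit has not reached that process.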

Now it runs fine with more than 300 participants.

Frank, did you actually try 300 participants or do you have a tool to benchmark the server?

It’s used by us teachers and we have about 300-380 participants. But keep in mind, the pupils have their video and audio turned off most of the time.

Interesting, what’s generating the dashboard?

Grafana.

Interesting, thanks.

I was looking at the limits on my servers:

root@eckental-jitsi-2:~# ulimit -Hn
1048576
root@eckental-jitsi-2:~# ulimit -Sn
1024

I think you might actually be decreasing the hard limit. ulimit -n displays the soft limit by default, so by setting the hard limit to 65000 you are lowering it from 1048576. Was this not the case for you?
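
To look at both values at once, a small sketch using Python's standard resource module (Unix only). It reports the limits of the Python process itself, which are inherited from the shell it was started in, so it should match ulimit -Sn and ulimit -Hn in that shell. The helper names are made up for illustration.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def fmt(value: int) -> str:
    """Render a limit the way ulimit does for the unlimited case."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


soft, hard = nofile_limits()
print(f"soft (ulimit -Sn): {fmt(soft)}")
print(f"hard (ulimit -Hn): {fmt(hard)}")
```

A soft value of 1024 next to a much larger hard value would mean only the soft limit is holding things back.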

Increased my worker_connections from 768 to 40000. We will see today. Yesterday I still had pupils dropping off the call; my daughter was one of them. She said she was just dropped from the meeting onto the meeting server's main page. Is that what happened with your drops as well?

Any update on this? We too are seeing random disconnects, even without load, with just a few people. Our case may be different: we stopped the problem months ago by switching from websockets to BOSH, but Jonathan's VP9 PR requires websockets, so now we're back on websockets and seeing disconnects again. Based on this thread, we'll look beyond websockets for some other interaction.