Prosody bosh timeout issue?

#1

Hi,

In our installation, we sometimes see that Prosody still considers long-gone clients to be in the room. This causes various issues. One of them is that the room is never freed, because the ghost clients are still counted as being in it. As a result, if another meeting is later held in the same room, all of the old history messages get dumped to the newly joined participants.

Not sure if we configured anything wrong; we were basically following the Jitsi instructions. The only related setting we can think of is bosh_max_inactivity, which we changed from the default of 60 to 30. (We are testing with the default value now and will report back.)
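
For reference, the change was just this one line in the Prosody config; roughly:

-- what we had changed; the default is 60 seconds
bosh_max_inactivity = 30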

We haven't found a reliable way to reproduce this; it seems to happen randomly. We tried force-killing a client, cutting off the network, etc., and in those cases we do see the participant time out after the 30 seconds we set.

We are using jitsi-meet 3548 on an Ubuntu 18.04 server. Prosody was installed together with jitsi-meet; its version is 0.10.0-1.

We have an older setup with jitsi-meet 3383 and Prosody 0.9.10-1. We don't recall seeing a similar issue on that server.

Any suggestions on how we can solve this or nail down the issue? Thanks in advance!

#2

This is very strange. All BOSH connections have a timeout, and if no request is made for some time (60 seconds by default), the connection is removed, the participant is dropped, and they leave the MUC. Are you sure the connected users that stay are using BOSH?

By the way, changing Prosody to use a 30-second timeout without changing the client is not a good idea; by default Strophe uses 60 seconds, and the mismatch will lead to problems.

#3

Thanks for your reply, Damian! It's strange indeed.

We don't have any clients other than BOSH ones, so they must be BOSH.

Thanks for pointing out the concern about Strophe. We are testing with the default 60-second timeout right now. But I think Prosody's timeout logic shouldn't rely on the client's setting; it should be pretty straightforward, i.e. if there has been no request from the client for xx seconds, the client is gone. Right?

Another change we made is connecting multiple JVBs to Jicofo, following the instructions below. As suggested, we left some spare slots for future JVBs. As a result, I can see a periodic error message in Prosody's log. Not sure if that can cause any problems. I believe Jitsi is doing the same so that you can scale JVBs up and down without changing Jicofo, correct?

The error message looks like this:
Mar 22 00:50:48 jvb8.meet.example.com:component warn Component not connected, bouncing error for:
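
For reference, the spare slots are just extra component declarations in the Prosody config, roughly like this (the hostnames and secret here are placeholders); the warning above comes from a slot that has no bridge connected yet:

-- spare component slots for bridges that are not connected yet
-- (placeholder hostnames and secret)
Component "jvb8.meet.example.com"
    component_secret = "changeme"

Component "jvb9.meet.example.com"
    component_secret = "changeme"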

#4

That is not correct. The client knows what the timeout is, let's say 60 seconds, so if there has been no packet for 58-59 seconds it will send something to keep the connection alive.

Those can be ignored and are normal.

We used to do that, but now we are using control MUCs that the JVBs connect to. For example, docker-jitsi is using those MUCs.
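
Roughly, on the Prosody side that means an internal MUC component that the bridges join, something like this (the hostname is just an example):

-- sketch of a control MUC component that the bridges can join
-- (example hostname; the bridges then enter a brewery room on it)
Component "internal.auth.meet.example.com" "muc"
    modules_enabled = { "ping"; }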

#5

What I meant is that if the server expects a packet at least every 30 s and the client only sends a packet every 60 s, then we should see the client kicked out even more quickly, not lingering on the server.

Going forward, are you switching to the control MUCs instead of components? What's the plan? Do we need to consider switching our deployment now as well?

I managed to reproduce this once by joining the meeting and then quickly hanging up. I saw the "The timeout for the confirmation about leaving the room expired." error in the console, from here.

I saw the comment below in the file; it seems related to the issue?

    // XXX Strophe is asynchronously sending by default. Unfortunately, that
    // means that there may not be enough time to send the unavailable
    // presence. Switching Strophe to synchronous sending is not much of an
    // option because it may lead to a noticeable delay in navigating away
    // from the current location. As a compromise, we will try to increase
    // the chances of sending the unavailable presence within the short time
    // span that we have upon unloading by invoking flush() on the
    // connection. We flush() once before sending/queuing the unavailable
    // presence in order to attempt to have the unavailable presence at the
    // top of the send queue. We flush() once more after sending/queuing the
    // unavailable presence in order to attempt to have it sent as soon as
    // possible.

#6

I think I found it, or at least one problem.

Our client has something like this. If leave() fails, the connection is never disconnected; as a result, if the browser tab is not closed, the connection remains open.

conference.room
  .leave()
  .then(() => {
    conference.connection.disconnect();
    resolve();
  })
  .catch(err => {
    console.error('leave room failed:', err);
  });

I'm going to change it to something like this. Does it look good to you?

conference.room
  .leave()
  .catch(err => {
    console.error('room leave err:', err);
  })
  .finally(() => {
    conference.connection.disconnect();
    resolve();
  });

#7

Even in that case, the timeout on the server will kick in and you will see the participant gone in no more than 60 seconds. But you were explaining that they stay forever.

#8

If I don't call conference.connection.disconnect(), Prosody considers the client still connected. I confirmed that with the user:list command in the Prosody admin console. (I waited a few minutes before checking.)

On the other hand, if I skip leave() and call disconnect() directly, that seems to be fine.

So it seems like disconnect() is the call that actually tears down the XMPP connection.
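
For reference, this is roughly how I checked it in the admin telnet console (the host name is just an example from our setup):

-- in the prosody admin telnet console
c2s:show()                     -- list client sessions, including BOSH ones
user:list("meet.example.com")  -- list connected users on the host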

#9

@damencho

I thought I fixed this, but it’s still happening.

This time I made sure all my clients were closed, so it is not the same issue I fixed previously.

I searched around and found this post, which has the same problem as mine.

#10

Can you tell me which Prosody version you use? What storage is configured for the MUCs in the Prosody config? Any other modules that you have added?

#11

Which version do you use in Jitsi's deployment? It does feel like a Prosody issue or a Prosody config issue.

We didn't add any modules other than those enabled in Jitsi's config: bosh, ping, etc.

How can I check the storage setting? We didn't change the storage setting specifically.

#12

storage = "null"

#13

We are using prosody trunk 747. You can easily check that by opening a conference in two tabs, opening the JS console, and filtering by "version".
That is strange… I will check 0.10 to look for some ideas…

#14

I was reading the documentation about Prosody storage. There is a "none" option besides "null". Do you know what the difference is?

I don't know if it's related, but the doc mentions something like "Prosody 0.10 and later" for internal and sql. It seems like at least something has changed since 0.10, which is the version we are using.

https://prosody.im/doc/storage

We still haven't found a reliable way to reproduce it…

#15

Oh, restarting Prosody does fix the problem. So it's likely not a storage problem?

#16

It should be something around the BOSH timeout.

#17

I just checked the 0.10 code, and I see:

-- The number of seconds a BOSH session should remain open with no requests
local bosh_max_inactivity = module:get_option_number("bosh_max_inactivity", 60);

So BOSH connections should be removed after 60 seconds of inactivity; I'm not sure what is happening such that either the connections or the participants are not removed. I would enable debug logging, reproduce it, and check the logs for anything interesting around 60 seconds after the participant's connection was cut.
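
A minimal way to enable debug logging is to adjust the log option in the Prosody config, for example (the file paths are the usual package defaults and may differ on your system):

-- example: enable debug logging in prosody.cfg.lua
log = {
    debug = "/var/log/prosody/prosody.log";
    error = "/var/log/prosody/prosody.err";
}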

#18

It seems like there was a major rewrite around MUC. Maybe it's better for us to upgrade to 0.11 as well? Plus, Jitsi's own deployment is already using trunk instead of 0.10.

I tried to upgrade my setup to 0.11.2. During the upgrade I chose to keep my current configuration, and now it doesn't work. The Prosody log shows an error:
Apr 04 20:39:27 conference.xyz.com:muc error Error restoring room abc@conference.xyz.com from storage: no data storage active

Can you share a config that works with the newer version of Prosody?

#19

Removing (commenting out) the storage = "null" line in the MUC config makes the server work again. But you mentioned that without this setting there might be other problems? @damencho

#20

So storage = "memory" is better; we already use it for docker-jitsi-meet.
I'm currently evaluating the changes that are needed for 0.11 and will probably push them to a branch tomorrow, but there isn't much for now.
Yeah, by the way, for docker-jitsi-meet we use Prosody 0.11.
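
So on 0.11 the conference component would look roughly like this (the hostname is a placeholder for your deployment):

-- example MUC component for Prosody 0.11: rooms are kept in memory only
Component "conference.meet.example.com" "muc"
    storage = "memory"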
