Inconsistent state of reserved conference

Hi,

We are running jitsi release version stable-7439 with reservation enabled in prosody.

We ran into a strange, inconsistent behaviour in jitsi meet. The steps to reproduce it are as follows:

  • Create a reservation and join the conference with 1 or more participants
  • Restart prosody in the middle of the conference. Jitsi meet UI disconnects all participants and shows a message that it is trying to connect
  • Jitsi meet pre join page appears when prosody is available again
  • Users try to rejoin the same conference but reservation system returns a 409 error that the conference already exists

The behaviour of the reservation system is valid in the sense that the conference is already present: it has not been deleted yet, because prosody got restarted before it could make the explicit DELETE /conference call to the reservation system.

What baffles us is prosody treating the conference as a new one without checking for any reservation after being restarted. Attached is a screenshot of the errors received in jitsi meet UI.

Is this the expected behaviour of prosody? Should it forget about any ongoing conferences when it is restarted?

PS: we have SQL storage enabled in prosody.

@shawn @damencho Any clues?

There is no such check on startup.
If you have configured a db, then for the main muc the storage is no longer memory?
I was thinking that prosody will restore the room from there… but I’m really not familiar with db and what problems it can cause if the state of those rooms is restored with no jicofo in it …

The DB does not store any data pertaining to reserved conferences when SQL module is enabled on prosody.

On the contrary, we enabled SQL storage on prosody because we thought the default in-memory storage was what caused this when prosody is restarted during an ongoing conference, but that does not seem to be the case.

Even with the default in-memory storage of prosody, we get the same error. Jicofo logs that it has lost its connection to the XMPP server.

Yep, that is normal behavior. In that case, we consider the shard as unhealthy and move people to a healthy one. But seems the reservation system doesn’t support resumes in that way … so you may consider how that can be handled and PRs are welcome.

Sorry, I'm AFK this week and cannot look into this in detail.

Off the top of my head, assuming reservation data is not preserved across reboots, then it is down to the reservation endpoints to return a 409 response along with a conflict_id. The plugin will then do a GET call using the conflict_id to retrieve the reservation details.
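To make the 409 flow described above concrete, here is a rough sketch in Python (not the plugin's actual Lua code). Only the 409 status, the `conflict_id` field and the follow-up GET come from this thread; the `post`/`get` helpers, the `/conference/<id>` path shape and the `name` parameter are assumptions for illustration.

```python
import json
from typing import Callable, Optional, Tuple

# (status_code, body_text) as returned by the hypothetical HTTP helpers.
Response = Tuple[int, str]


def resolve_reservation(
    room: str,
    post: Callable[[str, dict], Response],
    get: Callable[[str], Response],
) -> Optional[dict]:
    """Return reservation details, or None if the join should be rejected."""
    # Assumed request shape: create the reservation for this room name.
    status, body = post("/conference", {"name": room})
    if status == 200:
        return json.loads(body)
    if status == 409:
        # The conflict response must carry a conflict_id we can look up.
        try:
            conflict_id = json.loads(body)["conflict_id"]
        except (ValueError, KeyError, TypeError):
            # Body was not JSON, or had no conflict_id: cannot recover.
            return None
        status, body = get(f"/conference/{conflict_id}")
        if status == 200:
            return json.loads(body)
    return None
```

If the 409 body cannot be decoded or lacks `conflict_id`, this sketch rejects the join, which is one plausible way a "could not parse the response" style of failure could arise.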

Frankly, I’ve never fully tested that use case and merely ported over the behaviour from Jicofo. The plugin doesn’t do anything special with the 409 case compared to the 200 case, and it’s there mainly for backward compatibility for existing users. Perhaps we need to do more there if rooms are being preserved across restarts :man_shrugging:

From the error message in your screenshot, it looks like your endpoint is returning a 409 but the plugin could not parse the response. Any useful errors in the logs?

No errors in the prosody log. Prosody behaves as if everything is normal after restart.

Did your endpoint return the 409 response? What does the response payload look like?

The reservation API endpoint returned 409 with the response schema exactly as outlined in the jitsi reservation system document.

Well you seem to be hitting this error case:

You will need to debug why it is not able to decode the response.

I think this should be treated more like joining an existing conference, maybe? Prosody got restarted without officially killing the conference.

The conference no longer exists once prosody is restarted, all clients are reloading. So this is a new conference from the perspective of prosody, jicofo, and the clients.

Is this a valid use case from Jitsi's point of view, in terms of the conference?

Or should self-hosters implement their own logic if they want to go back to a previous conference when prosody gets restarted in the middle of an ongoing conference?

This is how it is designed to work. In case of serious problems the client reloads and joins back to the conference. It can be a new prosody because it restarted, or it can be a new shard because the one hosting the conference became unhealthy for some reason…

If it is a network blip a reconnection is triggered and the client reconnects to the existing conference. If it is in the 1-minute graceful period it will reuse the existing participant in the room, if not a new one will be added and the old one expires and is removed from the room/conference.
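As a toy model of the reconnection rule just described (not Jitsi's real implementation), the sketch below reuses a participant if it comes back within the grace period and otherwise drops the old one and adds a new one. The `Room` class, its method names and the explicit `now` argument are all made up for illustration; only the 1-minute period comes from the post above.

```python
from typing import Dict, Optional

GRACE_SECONDS = 60.0  # the "1-minute graceful period" mentioned above


class Room:
    def __init__(self) -> None:
        # participant id -> time its connection dropped (None while connected)
        self.dropped_at: Dict[str, Optional[float]] = {}

    def join(self, pid: str) -> None:
        self.dropped_at[pid] = None

    def drop(self, pid: str, now: float) -> None:
        self.dropped_at[pid] = now

    def reconnect(self, old_pid: str, new_pid: str, now: float) -> str:
        """Return the participant id the client ends up using."""
        dropped = self.dropped_at.get(old_pid)
        if dropped is not None and now - dropped <= GRACE_SECONDS:
            # Within the grace period: reuse the existing participant.
            self.dropped_at[old_pid] = None
            return old_pid
        # Too late (or unknown): old participant is removed, a new one is added.
        self.dropped_at.pop(old_pid, None)
        self.dropped_at[new_pid] = None
        return new_pid
```

Note that this only models the case where the room itself survives. After a prosody restart there is no room state left to reconnect to.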

How does this pair with reservation enabled? Should the reservation system be responsible for deciding whether the room to be allocated should be reused or not?

Sorry, I’m not familiar with the reservation system to be able to answer this.

Is there someone who can comment on the reservation system? The official docs don't shed much light on this edge case or on what the restrictions are.

@shawn can you chime in? :slight_smile:

The reservation plugin caches successful join states and only purges when the reservation is expired or if the room is destroyed. So in the case where users disconnect due to network glitch but prosody state is intact, then users should just rejoin the same room without any additional reservation API calls.

But in the case of prosody restarts and state is not persisted, or if everyone gets moved to a new prosody, then a rejoin would behave no different from a new join.
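A simplified illustration of why a restart looks like a brand new join: the sketch below keeps successful joins in memory and only calls the reservation API on a cache miss. It is written in Python for readability and is not the plugin's code. `ReservationCache`, `reserve` and the expiry timestamps are hypothetical names and shapes.

```python
from typing import Callable, Dict, Optional


class ReservationCache:
    def __init__(self, reserve: Callable[[str], Optional[float]]) -> None:
        # reserve(room) returns an expiry timestamp, or None if the API refuses.
        self._reserve = reserve
        self._expires_at: Dict[str, float] = {}  # in-memory only

    def allow_join(self, room: str, now: float) -> bool:
        expiry = self._expires_at.get(room)
        if expiry is not None and now < expiry:
            # Cached successful join: no reservation API call needed.
            return True
        # Expired or never seen: purge and ask the reservation API again.
        self._expires_at.pop(room, None)
        expiry = self._reserve(room)
        if expiry is None:
            return False
        self._expires_at[room] = expiry
        return True

    def room_destroyed(self, room: str) -> None:
        # Purge when the room is destroyed.
        self._expires_at.pop(room, None)
```

A restarted process would start with a fresh, empty `ReservationCache`, so the next join goes back to the reservation API, and in the scenario from this thread that call is the one that comes back with a 409.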

All the reservation plugin does is intercept the initial conference requests to jicofo and suppress them if the API checks fail.

It does not actually deal with actual room creation and assignment.