Since updating Videobridge from amd64 2.1-595-g3637fda4-1 to 2.1-607-g153f7e4e-1 we’re having trouble with statistics and scaling: the stats API counts conferences and participants up and up … and never lets them go again. If you look at our Grafana statistics and the attached files
stats-vbdh04.txt (2.5 KB)
stats-vbdh03.txt (2.5 KB)
stats-vbdh02.txt (2.5 KB)
stats-vbdh01.txt (2.6 KB)
you will find the conferences on our videobridges never seem to end - until we shut them down manually. As a result our scaling fails. Any ideas what is wrong?
Did you update the other components too (jicofo, jitsi-meet etc.)?
I can’t repro. I have an old stable (5390) and tested that the conference count decreased when closing a conference. I then upgraded to the latest stable (6826), which corresponds to your jvb version unless I’m mistaken, ran the same test, and saw the same result. Note that there are a few seconds of delay because the stats are cached.
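The test described above (watch the count, close a conference, see whether the count drops) can also be run against a series of saved stats samples. This is only a minimal sketch: it assumes each sample is a JSON document with numeric `conferences` and `participants` fields, as in the attached stats files, and the helper names `parse_counts` and `never_decreases` are made up for illustration.

```python
import json
from typing import List, Tuple


def parse_counts(stats_json: str) -> Tuple[int, int]:
    """Extract (conferences, participants) from one stats JSON document.

    Assumption: the document has numeric "conferences" and
    "participants" fields.
    """
    stats = json.loads(stats_json)
    return int(stats["conferences"]), int(stats["participants"])


def never_decreases(samples: List[int]) -> bool:
    """Return True if there are at least two samples and no sample is
    lower than the one before it, i.e. the counter never went down."""
    return len(samples) >= 2 and all(
        later >= earlier for earlier, later in zip(samples, samples[1:])
    )
```

If the samples span a period in which at least one conference definitely ended and `never_decreases` still returns True for the conference counts, the counter is probably stuck rather than just cached.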
Could it be because of the cache? Note that if you set the default statistics push time, with the new config (in jvb.conf) the interval duration is not given in milliseconds, but in seconds.
Or maybe it is the Octo bug in jicofo which was recently fixed; more info: JVB not recognizing completed conferences correctly · Issue #1806 · jitsi/jitsi-videobridge · GitHub
Oh, thank you all for the quick replies! Lots of good ideas to check now, but as we had everything updated and use Octo, I guess it is the bug @damencho referred to. Will report back as soon as I can verify.
That is fixed in the latest stable jicofo.
Aah, nope. Seems it’s back again, though we have
dpkg -l |grep jitsi
ii jitsi-meet 2.0.6865-2 all WebRTC JavaScript video conferences
ii jitsi-meet-prosody 1.0.5818-1 all Prosody configuration for Jitsi Meet
ii jitsi-meet-web 1.0.5818-1 all WebRTC JavaScript video conferences
ii jitsi-meet-web-config 1.0.5818-1 all Configuration for web serving of Jitsi Meet
ii jitsi-videobridge2 2.1-617-ga8b39c3f-1 all WebRTC compatible Selective Forwarding Unit (SFU)
and Octo enabled with SplitBridgeSelectionStrategy
Relevant stats like https://stats.adfc-intern.de/d/7QGvSR0Gz/basis-info?orgId=1&from=1643734344652&to=1643761737597
Err… did you consider the possibility that some users are neglecting to close their screen after the moderator leaves? If you don’t have stats at the Prosody level, you can use c2s:show() with the telnet interface, or something like this:
> for k,v in pairs(prosody.full_sessions) do; print(k.." "..v.ip..' '..os.date("%x %X",v.conntime)); end
Maybe some neglect it, but not that many, and usually not all night long and forever. And it didn’t happen before.
Anyway, it would be interesting to know what is happening at the Prosody level; this may not be a videobridge stats problem if the meetings are never closing.
Thank you for caring, @gpatel-fr
Here is a typical part of the prosody.log
Feb 02 20:14:35 mod_bosh info New BOSH session, assigned it sid '07f204fd-4be5-4774-b7c6-95170e497519'
Feb 02 20:14:36 bosh07f204fd-4be5-4774-b7c6-95170e497519 info Authenticated as fpabgsqzbwvyv9uc@meet.adfc-intern.de
Feb 02 20:14:40 bosh6782858d-5ded-4f65-aa16-a04dbc81df45 info BOSH client disconnected: session close
Feb 02 20:15:45 boshfa0d768e-d1f0-47d6-8f3c-24bc7c192ed8 info BOSH client disconnected: session close
Feb 02 20:16:38 bosh1b18a3f8-794c-4ab5-bccf-81d709a0df4e info BOSH client disconnected: session close
Feb 02 20:16:39 speakerstats.meet.adfc-intern.de:speakerstats_component warn A module has been configured that triggers external events.
Feb 02 20:16:39 speakerstats.meet.adfc-intern.de:speakerstats_component warn Implement this lib to trigger external events.
Feb 02 20:16:39 boshc55cf401-563c-456d-8e1b-682384885263 info BOSH client disconnected: session close
I will try to get more useful log information this evening. What’s the best way to get the relevant info? Set the log level to a higher value (which one)? Or would you recommend something else I can do without first installing
Your log shows a client connecting and disconnecting a few minutes later, nothing that really matters, it seems.
If you are thinking about the warning, it’s neither relevant nor important.
AFAIK to load a module like admin_telnet in Prosody, you need to restart it. But to enable debug logs you need a restart as well. Enabling the admin_telnet module is less taxing on performance (you just need to set up a firewall if you don’t have one already), and you can use the command I showed you in my previous post to see the connection state of Prosody.
Thank you for your patience.
I just enabled the telnet-interface and will check this evening’s conferences with c2s:show()
As you wish, but the second command is much more powerful.
Oh, great, I see. Thank you. I didn’t get that it really has to start with a >
It puts out a list of currently connected clients. But how do I know how long they have been in the session, and whether they are real or just being listed as zombies - and why? Are there more interesting values to query for?
It took me quite some time to understand it myself.
conntime gives the date and time the connection began.
This is an IP connection. Moreover, it’s an XMPP connection. I’d say that there is very probably a timeout.
You can enable the full Prosody debug log by setting stanza_debug in the main Prosody config file, but I’d be wary of using that on a production server. Besides, you have to make sense of the content.
There is probably a way to get at the meeting name(s) and even the participants’ nick(s), but I don’t know how :-/. It would probably need a 1000+ character command line.
Lua introspection capabilities are not so great.
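To answer "since when are they in the session" from the one-liner's output, the conntime column can be turned into a session age offline. A minimal sketch, assuming each output line has the form `jid ip date time`, that `os.date("%x %X")` printed `MM/DD/YY HH:MM:SS` (which depends on the server locale), and that the console may prefix lines with `| `; the function names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def parse_session_line(line: str) -> Tuple[str, str, datetime]:
    """Split one 'jid ip date time' line into (jid, ip, start time).

    Assumption: the date is MM/DD/YY and the time HH:MM:SS.
    """
    cleaned = line.strip().lstrip("| ")  # drop an optional console prefix
    jid, ip, date_part, time_part = cleaned.rsplit(" ", 3)
    started = datetime.strptime(f"{date_part} {time_part}", "%m/%d/%y %H:%M:%S")
    return jid, ip, started


def long_running(
    lines: List[str], now: datetime, max_age: timedelta
) -> List[Tuple[str, timedelta]]:
    """Return (jid, age) for every session older than max_age."""
    result = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines in the pasted output
        jid, _ip, started = parse_session_line(line)
        age = now - started
        if age > max_age:
            result.append((jid, age))
    return result
```

Sessions that show up here long after the evening's meetings ended would be the candidates to look at more closely; the sketch only reports age and cannot tell whether a session is a zombie.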
So here’s my file from yesterday. You can compare it to the Grafana statistics and you’ll find at least some of the conferences continuing… until I manually killed my additional videobridges.
log-telnet-1-anonym.txt (26.6 KB)
Do you have any custom Lua module?
Nothing. Should I?
No, you shouldn’t…
Some Lua modules can cause similar issues, but it’s fine since you don’t have any…