lastN Killing video streams

I have two custom lib-jitsi-meet based clients joining a channel. Neither of them has an AudioContext, so lib-jitsi-meet is not getting audio level data correctly.

The channelLastN has been set to -1, and I have even made sure that -1 is sent over the data channel by BridgeChannel.js.

I still see lastN messages coming in from the bridge. When I first publish my stream, the remote client sees:

2020-05-27T17:21:36.750Z [modules/RTC/BridgeChannel.js] <RTCDataChannel.channel.onmessage>:  Channel new last-n event:  [ 'd8c09767' ] {
  colibriClass: 'LastNEndpointsChangeEvent',
  lastNEndpoints: [ 'd8c09767' ],
  endpointsEnteringLastN: [ 'd8c09767' ],
  conferenceEndpoints: []
}

Everything streams and looks normal, but then the receiving client suddenly gets a lastN event with an empty endpoint list:

2020-05-27T17:21:44.200Z [modules/RTC/BridgeChannel.js] <RTCDataChannel.channel.onmessage>:  Channel new last-n event:  [] {
  colibriClass: 'LastNEndpointsChangeEvent',
  lastNEndpoints: [],
  endpointsEnteringLastN: [],
  conferenceEndpoints: [ 'd8c09767' ]
}
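
For anyone trying to catch this condition in their own client, here is a minimal sketch of a check for it. The field names are taken from the log output above; the helper names isEmptyLastN and notForwarded are hypothetical and not part of lib-jitsi-meet, and reading conferenceEndpoints as "endpoints currently in the conference" is an assumption based only on these logs.

```typescript
// Payload shape as it appears in the BridgeChannel.js log output.
interface LastNEndpointsChangeEvent {
    colibriClass: 'LastNEndpointsChangeEvent';
    lastNEndpoints: string[];
    endpointsEnteringLastN: string[];
    conferenceEndpoints: string[];
}

// Hypothetical helper: true when the event lists no endpoint in
// lastNEndpoints although conferenceEndpoints is not empty.
function isEmptyLastN(event: LastNEndpointsChangeEvent): boolean {
    return event.lastNEndpoints.length === 0
        && event.conferenceEndpoints.length > 0;
}

// Hypothetical helper: endpoints listed in conferenceEndpoints
// that are missing from lastNEndpoints.
function notForwarded(event: LastNEndpointsChangeEvent): string[] {
    return event.conferenceEndpoints.filter(
        id => event.lastNEndpoints.indexOf(id) === -1);
}

// The two events from the logs above.
const firstEvent: LastNEndpointsChangeEvent = {
    colibriClass: 'LastNEndpointsChangeEvent',
    lastNEndpoints: ['d8c09767'],
    endpointsEnteringLastN: ['d8c09767'],
    conferenceEndpoints: []
};
const secondEvent: LastNEndpointsChangeEvent = {
    colibriClass: 'LastNEndpointsChangeEvent',
    lastNEndpoints: [],
    endpointsEnteringLastN: [],
    conferenceEndpoints: ['d8c09767']
};

console.log(isEmptyLastN(firstEvent));  // false
console.log(isEmptyLastN(secondEvent)); // true
console.log(notForwarded(secondEvent)); // [ 'd8c09767' ]
```

Logging the result of such a check next to a timestamp makes it easier to line the stall up with the bridge logs.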

And as a result the track is removed:

2020-05-27T17:21:44.200Z [modules/connectivity/ParticipantConnectionStatus.js] <ParticipantConnectionStatusHandler.figureOutConnectionStatus>:  Figure out conn status for d8c09767, is video muted: false is active(jvb): true video track frozen: false p2p mode: false is in last N: false currentStatus => newStatus: active => active
2020-05-27T17:21:56.282Z [modules/RTC/TraceablePeerConnection.js] <TraceablePeerConnection../modules/RTC/TraceablePeerConnection.js.TraceablePeerConnection.removeRemoteTracks>:  TPC[1,p2p:false] removed remote tracks for d8c09767 count: 1
2020-05-27T17:21:56.282Z [modules/RTC/RTC.js] <RTC.removeRemoteTracks>:  Removed remote tracks for d8c09767 count: 1

Why am I still getting these lastN messages from the bridge when it has been set to disabled?

Is there some mechanism to fully disable lastN?

One additional datapoint:

When I join with my two custom clients, where AudioContext is missing so we aren’t getting dominant speaker events from them, eventually this lastN thing occurs where the video track stalls.

The moment I join with the standard jitsi-meet client from a browser, suddenly the video starts streaming again for my two custom clients.

Is it possible some of the logic for lastN is going south because it isn’t being updated by the custom clients with their audio level information?

Just to summarize:

Custom client joins without AudioContext:

2020-05-27T18:03:49.455Z [modules/RTC/BridgeChannel.js] <RTCDataChannel.channel.onmessage>:  Channel new last-n event:  [ '42a4ae40' ] {
  colibriClass: 'LastNEndpointsChangeEvent',
  lastNEndpoints: [ '42a4ae40' ],
  endpointsEnteringLastN: [ '42a4ae40' ],
  conferenceEndpoints: []
}

We eventually receive empty lastN, video stalls:

2020-05-27T18:03:56.846Z [modules/RTC/BridgeChannel.js] <RTCDataChannel.channel.onmessage>:  Channel new last-n event:  [] {
  colibriClass: 'LastNEndpointsChangeEvent',
  lastNEndpoints: [],
  endpointsEnteringLastN: [],
  conferenceEndpoints: [ '42a4ae40' ]
}

When the third client (the browser, which has an AudioContext) joins, we see:

2020-05-27T18:04:10.829Z [modules/statistics/RTPStatsCollector.js] <StatsCollector../modules/statistics/RTPStatsCollector.js.StatsCollector.processAudioLevelReport>:  725424180 not enough data
2020-05-27T18:04:10.829Z [modules/statistics/RTPStatsCollector.js] <StatsCollector../modules/statistics/RTPStatsCollector.js.StatsCollector.processAudioLevelReport>:  725424180 not enough data
2020-05-27T18:04:10.829Z [modules/statistics/RTPStatsCollector.js] <StatsCollector../modules/statistics/RTPStatsCollector.js.StatsCollector.processAudioLevelReport>:  1240881451 not enough data
2020-05-27T18:04:10.912Z [modules/RTC/BridgeChannel.js] <RTCDataChannel.channel.onmessage>:  Channel new dominant speaker event:  5c374023
2020-05-27T18:04:10.912Z [modules/RTC/BridgeChannel.js] <RTCDataChannel.channel.onmessage>:  Channel new last-n event:  [ '5c374023' ] {
  colibriClass: 'LastNEndpointsChangeEvent',
  lastNEndpoints: [ '5c374023' ],
  endpointsEnteringLastN: [ '5c374023' ],
  conferenceEndpoints: [ '5c374023', '42a4ae40' ]
}
2020-05-27T18:04:10.912Z [modules/connectivity/ParticipantConnectionStatus.js] <ParticipantConnectionStatusHandler._onLastNChanged>:  leaving/entering lastN [] [ '5c374023' ] 1590602650912
2020-05-27T18:04:10.913Z [modules/connectivity/ParticipantConnectionStatus.js] <ParticipantConnectionStatusHandler.figureOutConnectionStatus>:  Assuming connection active by JVB - no notification
2020-05-27T18:04:10.913Z [modules/connectivity/ParticipantConnectionStatus.js] <ParticipantConnectionStatusHandler.figureOutConnectionStatus>:  Figure out conn status for 5c374023, is video muted: false is active(jvb): true video track frozen: false p2p mode: false is in last N: true currentStatus => newStatus: active => active
2020-05-27T18:04:14.459Z [modules/RTC/BridgeChannel.js] <RTCDataChannel.channel.onmessage>:  SelectedUpdateEvent isSelected? true
2020-05-27T18:04:15.443Z [modules/RTC/BridgeChannel.js] <RTCDataChannel.channel.onmessage>:  Channel new last-n event:  [ '5c374023', '42a4ae40' ] {
  colibriClass: 'LastNEndpointsChangeEvent',
  lastNEndpoints: [ '5c374023', '42a4ae40' ],
  endpointsEnteringLastN: [ '42a4ae40' ],
  conferenceEndpoints: [ '5c374023', '42a4ae40' ]
}
2020-05-27T18:04:15.443Z [modules/connectivity/ParticipantConnectionStatus.js] <ParticipantConnectionStatusHandler._onLastNChanged>:  leaving/entering lastN [] [ '42a4ae40' ] 1590602655443
2020-05-27T18:04:15.443Z [modules/connectivity/ParticipantConnectionStatus.js] <ParticipantConnectionStatusHandler.figureOutConnectionStatus>:  Assuming connection active by JVB - no notification
2020-05-27T18:04:15.443Z [modules/connectivity/ParticipantConnectionStatus.js] <ParticipantConnectionStatusHandler.figureOutConnectionStatus>:  Figure out conn status for 42a4ae40, is video muted: false is active(jvb): true video track frozen: false p2p mode: false is in last N: true currentStatus => newStatus: active => active

And the video starts to stream again for the custom client.
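
The "leaving/entering lastN" lines in the log above can be reproduced by diffing two consecutive lastNEndpoints lists. This is only a sketch of that bookkeeping, not the code lib-jitsi-meet actually runs; the names LastNDiff and diffLastN are made up for illustration.

```typescript
// Result of comparing two consecutive lastNEndpoints lists.
interface LastNDiff {
    leaving: string[];
    entering: string[];
}

// Hypothetical helper: endpoints that dropped out of, or were added to,
// the list between the previous event and the current one.
function diffLastN(previous: string[], current: string[]): LastNDiff {
    return {
        leaving: previous.filter(id => current.indexOf(id) === -1),
        entering: current.filter(id => previous.indexOf(id) === -1)
    };
}

// The sequence of lastNEndpoints lists from the summary above.
const sequence: string[][] = [
    ['42a4ae40'],             // custom client joins
    [],                       // empty lastN, video stalls
    ['5c374023'],             // browser client joins
    ['5c374023', '42a4ae40']  // custom client is forwarded again
];

for (let i = 1; i < sequence.length; i++) {
    console.log(diffLastN(sequence[i - 1], sequence[i]));
}
```

The last two steps report the same entering lists as the log: '5c374023' first, then '42a4ae40'.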

I have checked the documentation on last-N: https://github.com/jitsi/jitsi-videobridge/blob/master/doc/last-n.md

First, when allocating video channels for a client on the Jitsi Videobridge, the "last-n" value of the video channel must >= 0. Secondly, a data channel must be established between the client and the Jitsi Videobridge. 

The above documentation makes it seem like it should be disabled entirely. I don’t meet either of those two requirements.

I have tried sending up dummy audio statistics data, and nothing I do seems to prevent the dreaded unceremonious transition from:

2020-05-27T21:25:19.571Z [modules/RTC/BridgeChannel.js] <RTCDataChannel.channel.onmessage>:  Channel new last-n event:  [ '7e14771b' ] {
  colibriClass: 'LastNEndpointsChangeEvent',
  lastNEndpoints: [ '7e14771b' ],
  endpointsEnteringLastN: [ '7e14771b' ],
  conferenceEndpoints: []
}

to

2020-05-27T21:25:59.394Z [modules/RTC/BridgeChannel.js] <RTCDataChannel.channel.onmessage>:  Channel new last-n event:  [] {
  colibriClass: 'LastNEndpointsChangeEvent',
  lastNEndpoints: [],
  endpointsEnteringLastN: [],
  conferenceEndpoints: [ '7e14771b' ]
}

This seems like it almost certainly has to be a videobridge bug of some kind.

The bridge logs have last-n set as -1 too:

<channel endpoint="6bc05b67" expire="60" id="b6f4aa3af8e6a483" initiator="true" channel-bundle-id="6bc05b67" last-n="-1" rtp-level-relay-type="translator">

Eventually the bridge will come along and fire a lastN that adds back in my endpoint.

Not even pinning the endpoint seems to resolve the issue.

If I use JitsiConference ‘pinParticipant’ I still eventually have the bridge come along and kill my stream.

Any help from anyone that can explain this wild behavior would be greatly appreciated.

This appears to be the same issue:

  • Simulcast disabled.
  • lastN disabled.
  • No Active Speaker events being sent.

Pinning endpoints doesn’t help, but apparently using ‘selectParticipants’ seems to make it happen far less often.

I still get the lastN with an empty set of candidates every so often, so maybe it’s timing related? Obviously this isn’t a fix of any kind, but could it hint at the cause?

Hi @Jason_Thomas,

You should interpret the lastNEndpoints array as the list of endpoints that the bridge was able to send given a) the bandwidth estimation of the network link from the bridge to your particular receiver and b) the specific constraints signaled by the receiver.

Since you are developing custom clients, it may be good to start by disabling adaptivity and to enable it again once you have a working client. You can disable adaptivity by setting

org.jitsi.videobridge.TRUST_BWE=false

in the /etc/jitsi/videobridge/sip-communicator.properties file.

If that doesn’t help, can you grab a webrtc-internals dump and post it here?

Cheers,
George

Hi @gpolitis,

Thanks for your reply!

I set that flag per the other issue last night, and it does indeed prevent the messages and stream shutdown as was indicated in the above linked issue.

Having looked at the BitrateController code in the bridge, I think you’re absolutely right: even though lastN is disabled, it will create a lastN group the size of the current conference, and then use the bitrate estimation to determine what to share.
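
To make that concrete, here is a deliberately simplified model of the idea. It is not the bridge's BitrateController; the greedy rule, the names CandidateStream and allocate, and the bitrate figures are all assumptions chosen for illustration. It only shows how a low bandwidth estimate can leave the forwarded list empty even when every endpoint is a candidate.

```typescript
// One video stream the bridge could forward to a receiver.
interface CandidateStream {
    endpointId: string;
    bitrateBps: number; // bitrate needed to forward this stream
}

// Toy allocation: walk the candidates in priority order and forward
// each stream that still fits into the remaining estimate.
function allocate(candidates: CandidateStream[], estimateBps: number): string[] {
    const forwarded: string[] = [];
    let remaining = estimateBps;
    for (const candidate of candidates) {
        if (candidate.bitrateBps <= remaining) {
            forwarded.push(candidate.endpointId);
            remaining -= candidate.bitrateBps;
        }
    }
    return forwarded;
}

// With lastN = -1 every endpoint is a candidate, but the estimate
// still decides what is actually sent. The numbers are invented.
const candidates: CandidateStream[] = [
    { endpointId: '42a4ae40', bitrateBps: 800000 }
];

console.log(allocate(candidates, 2000000)); // [ '42a4ae40' ]
console.log(allocate(candidates, 300000));  // []
```

In this model an estimate that is too low produces exactly the empty list seen in the logs, which fits with the messages stopping once TRUST_BWE is set to false.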

I think the origin has to be that the bitrate being reported by my clients (I see stats correctly reported when joining via a browser) maybe are not accurate?

Why is it though, that when a regular client joins, suddenly the bwe stats for the entire conference are determined to be good enough to support the video streams for all of the rest of the clients, and all of the streams start?

If the intent is to shut all video feeds off when the bitrate for the group is too low, and move to an audio only conference, this sounds like it should be something that is configurable without completely disabling bwe?

Might it be that lib-jitsi-meet is detecting my browser differently and handling the stats differently?

Hi @Jason_Thomas,

"I think the origin has to be that the bitrate being reported by my clients (I see stats correctly reported when joining via a browser) maybe are not accurate?"

Right. Bandwidth estimations/distributions should work regardless of your client, but you need a) the correct signaling that enables the necessary RTP header extensions and simulcast and b) you also need to let the bridge know of the UI layout that is displayed on the endpoint (who’s on-stage, etc) so that the bridge is able to determine what it needs to send to each participant.
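
As a rough sketch of point b), a custom client could report its layout through the selectParticipants call mentioned earlier in the thread. ConferenceLike below is a hypothetical stand-in for the real JitsiConference object, assuming selectParticipants accepts an array of participant IDs; LayoutReporter is an invented helper, and whether this is sufficient for the bridge is not something this thread confirms.

```typescript
// Hypothetical stand-in for the part of JitsiConference used here.
interface ConferenceLike {
    selectParticipants(participantIds: string[]): void;
}

// Invented helper: forwards the list of endpoints shown on screen and
// skips the call when the set has not changed since the last report.
class LayoutReporter {
    private conference: ConferenceLike;
    private lastSent: string | null = null;

    constructor(conference: ConferenceLike) {
        this.conference = conference;
    }

    // Call whenever the set of displayed endpoints changes.
    // Returns true if a message was sent.
    update(visibleEndpointIds: string[]): boolean {
        const key = visibleEndpointIds.slice().sort().join(',');
        if (key === this.lastSent) {
            return false; // same set as before, nothing to send
        }
        this.conference.selectParticipants(visibleEndpointIds);
        this.lastSent = key;
        return true;
    }
}
```

In a real client the object returned when joining the conference would be passed to the constructor, and update would be called from whatever code decides which remote videos are displayed.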

A good starting point would be to determine the simplest scenario that’s broken and carefully collect the webrtc-internals dump and JS console from all the endpoints.

"Why is it though, that when a regular client joins, suddenly the bwe stats for the entire conference are determined to be good enough to support the video streams for all of the rest of the clients, and all of the streams start?"

That’s a good question. I don’t have a good answer for it right now, so let’s keep it as a clue.

"If the intent is to shut all video feeds off when the bitrate for the group is too low, and move to an audio only conference, this sounds like it should be something that is configurable without completely disabling bwe?"

If you don’t have enough bandwidth for video, then the only viable solution is to go audio only. The real question, I think, is whether bwe is working and, if so, whether it is working correctly. I suggested disabling bwe as a troubleshooting step; normally you shouldn’t have to do this.

Cheers,
George