Thanks, Lyubomir. I've found that the bridge does fire updates to the
last-n notifications on the ENDPOINTS_CHANGED event (which should fire when
endpoints join or leave) so it certainly looks like the bridge has the
right handling. As I've been looking closer at the bridge, I'm starting to
wonder if this is due to the SCTP connection between the bridge and the
endpoint getting closed prematurely for some reason. I think this is
something you guys have said you had heard of and perhaps seen before?
From what I can see in my testing, it looks like the data channel is
getting expired. I've got a theory that I think I've been able to verify
in the logs. What I see happening is this:
1) Channels for content 'data' in the bridge are 'touched' based on the
consent freshness checks; it appears that these happen every 15 seconds.
2) Channels are typically set to a 10 second expiry (I believe that's the
default setting and what jitsi-meet uses; we're using it as well).
3) The VideobridgeExpireThread is set to check every 60 seconds.
4) With the times above, we can run into an issue where the
VideobridgeExpireThread can run during a period where the data channel
freshness check has not occurred within 10 seconds (since they only occur
every 15 seconds). This does not happen all the time; I had to try quite a
few times to get "lucky" with the timing to exhibit this. I've attached
some videobridge logs where this happens in the first check and the channel
gets expired. The data channel gets touched here:
2015-07-05 00:05:54.852 INFO: [30] org.jitsi.videobridge.Channel.info() BB:
touching data channel 9828d674424d074e for endpoint
d513fb7e-95d4-4902-83ad-d5dbe9f87fbd
And the VideobridgeExpireThread runs here:
2015-07-05 00:06:06.267 INFO: [13]
org.jitsi.videobridge.VideobridgeExpireThread.info() BB: Looking to expire
channel 9828d674424d074e content type data in conference 3e9d1b2b5d1abd4,
last activity time: 1438733154852, expiration time: 10000, current time:
1438733166267
Which is just outside the 10 second window, so the channel gets expired.
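To make the arithmetic explicit, here is a small check using only the values from the log line above. The variable names are mine, and the strict "greater than" comparison is an assumption about how the expire thread compares idle time against the timeout, not something I have confirmed in the bridge code.

```python
# Values copied from the VideobridgeExpireThread log line above.
last_activity_ms = 1438733154852   # "last activity time"
current_time_ms = 1438733166267    # "current time"
expire_ms = 10000                  # "expiration time"

# How long the data channel had gone without being touched.
idle_ms = current_time_ms - last_activity_ms

# Assumption: a channel is expired once its idle time exceeds the timeout.
overshoot_ms = idle_ms - expire_ms
is_expired = idle_ms > expire_ms

print(f"idle for {idle_ms} ms, past the timeout by {overshoot_ms} ms")
```

So the channel had been idle for a bit over 11.4 seconds when the expire thread looked at it.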
Since the freshness update interval divides evenly into a minute, the
touches either stay in sync with the VideobridgeExpireThread (the expire
thread will always see the channel as touched within the last 10 seconds)
or they don't (it will expire the channel right away), so that helps
contribute to the rarity of this, I think. However, I wonder if it's possible that either the
expire thread or the freshness check can be delayed or break its consistent
interval, meaning this could possibly occur later in a call as well.
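The timing argument can be sketched as a toy model. This is only an idealized illustration under the intervals I listed above (15 second touches, 10 second expiry, 60 second expire checks); the function names are hypothetical, there is no jitter in the model, and it does not reflect the bridge's actual implementation.

```python
TOUCH_INTERVAL_MS = 15_000   # consent freshness "touch" period (observed)
EXPIRE_TIMEOUT_MS = 10_000   # channel expiry setting
CHECK_INTERVAL_MS = 60_000   # VideobridgeExpireThread period


def age_at_check(first_touch_ms, check_ms):
    """Time since the most recent touch at or before check_ms,
    assuming touches repeat exactly every TOUCH_INTERVAL_MS."""
    if check_ms < first_touch_ms:
        raise ValueError("check happens before the first touch")
    return (check_ms - first_touch_ms) % TOUCH_INTERVAL_MS


def first_expiring_check(first_touch_ms, first_check_ms, checks=10):
    """Index of the first expire-thread run that would expire the channel,
    or None if none of the first `checks` runs would.
    Assumption: expiry means idle time strictly greater than the timeout."""
    for n in range(checks):
        check_ms = first_check_ms + n * CHECK_INTERVAL_MS
        if age_at_check(first_touch_ms, check_ms) > EXPIRE_TIMEOUT_MS:
            return n
    return None


# Same offset as in the attached log: the check runs 11.415 s after a touch.
print(first_expiring_check(0, 11_415))
# A check that lands 5 s after a touch stays 5 s after a touch on every run.
print(first_expiring_check(0, 5_000))
```

Because 60 seconds is an exact multiple of 15 seconds, the age seen by the expire thread is the same on every run in this model, which is why the channel is either expired on the first check or never. Any drift in either timer would change that age over time.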
Does this theory seem plausible? I'll try and do more testing to see if I
can get more data to back this up from my side as well.
-brian
expire_datachannel_right_away (29.8 KB)
On Tue, Aug 4, 2015 at 11:58 AM, Lyubomir Marinov <lyubomir.marinov@jitsi.org> wrote:
Hello, Brian!
Thank you very much for the feedback!
2015-08-03 21:15 GMT-05:00 Brian Baldino <brian@highfive.com>:
> I'm noticing some issues when using last-n on the videobridge. I have
> last-n set to 2 and am seeing the following (all clients join muted):
>
> 1) First client joins, consistently receives empty last-n notification
> message
> 2) Second client joins, first client appears to consistently receive a
> last-n notification containing the id of the second client
> 3) Third client joins, first client *mostly* appears to get a last-n
> message, but I have seen instances where no new message is sent. This
> causes a problem because the last-n message is the trigger to attach the
> video stream.
Our intent is to always send a message to the first client. If no such
message is sent by Videobridge, the behavior is a bug.
> 4) A fourth client joins, no new last-n message sent (none expected,
> client joins muted and client is already receiving 2 streams: client 2
> and client 3)
> 5) Either client 2 or client 3 leaves. The bridge correctly forwards a
> new video stream to client 1 (since one of its 2 last-n streams is now
> gone), but no last-n notification is received, so client 1 doesn't know
> which stream it has begun to receive data on (in this example obviously
> there's only one possibility, but that would not be true for a larger
> call).
Our intent is to always send notifications about changes to the list
of last n should an element leave the conference. If no such message
is sent by Videobridge, the behavior is a bug.
> My next step is to take a dive into the bridge and take a look as to why
> the automatic last-n messages cover the cases when streams are
> auto-forwarded to fill last-n initially, but not when there's a gap due
> to someone leaving. In the meantime, has anyone else seen this behavior?
Should you decide to go on and fix the issue, we'll gladly work with
you on integrating your contributions.
Best regards,
Lyubomir Marinov
_______________________________________________
dev mailing list
dev@jitsi.org
Unsubscribe instructions and other list options:
http://lists.jitsi.org/mailman/listinfo/dev