P2P always fails since upgrade to 2.0.6433

Hello beautiful Jitsi community! Big trouble here.

Background:

I have been using Jitsi for 2 years now on my own deployments: one dev server and one prod server.

My use case is 100% one-to-one, so having P2P work nearly all the time is crucial. For 2 years it was perfect: about 1 connection in 100 would fail and fall back to the JVB, which was fine. My bandwidth was safe. I upgraded both deployments a year ago and had no problems.

With Chrome’s infamous removal of Plan B, things began to break last week: upgrading to the latest stable was no longer optional.

I kept my old Jitsi deployment on one server, and upgraded to the new Jitsi on another one.

Both on Ubuntu 18.04

Old version:
jitsi-meet 2.0.5142-1
jitsi-meet-prosody 1.0.4466-1
jitsi-meet-tokens 1.0.4466-1
jitsi-meet-turnserver 1.0.4466-1
jitsi-meet-web 1.0.4466-1
jitsi-meet-web-config 1.0.4466-1
jitsi-videobridge2 2.1-376-g9f12bfe2-1

New version:
jitsi-meet 2.0.6433-1
jitsi-meet-prosody 1.0.5415-1
jitsi-meet-tokens 1.0.5415-1
jitsi-meet-turnserver 1.0.5415-1
jitsi-meet-web 1.0.5415-1
jitsi-meet-web-config 1.0.5415-1
jitsi-videobridge2 2.1-570-gb802be83-1

After a few hours of the usual struggling on my dev server, everything was working.

Or rather, it seemed to be working.

While monitoring my bandwidth, I soon discovered that the P2P connection was now ALWAYS failing. After further tests (and tinkering with the STUN and TURN settings), I noticed something even weirder: if I disconnected one participant and rejoined the room, P2P then worked ALL THE TIME. The nload screenshots from the old and new versions show it well:

Old version (expected behavior):
1b no traffic before conference
2b then a short burst of outgoing traffic, and a tiny rise of incoming traffic upon connection
6b then the conference is running, (almost) no traffic again because it’s P2P

New version (problematic behavior):
1a nothing before conference
2a then a short burst of outgoing traffic, and a tiny rise of incoming traffic upon connection
3a then the conference is running, and all traffic is relayed by the server: P2P failed, that’s my very problem
4a then I disconnect one participant for a few seconds: back to no traffic
5a this disconnected participant rejoins: short burst of traffic like 2a or 2b
6a the conference is running again, and P2P works this time: almost no traffic (expected behavior)

From here, if I disconnect and reconnect many times, P2P works with EVERY reconnection.

I observed this behavior in EACH of my 50+ tests so far, with clients on the same local network, on separate networks, behind NAT or not…
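For anyone who wants to reproduce this kind of measurement without nload, here is a minimal Python sketch that reads the interface byte counters from /proc/net/dev on a Linux server and turns two samples into rx/tx rates. The interface name eth0 and the 5 second interval are assumptions to adjust, and the function names are only illustrative. It cannot tell JVB traffic from other traffic, so it is only meaningful on a server with nothing else running.

```python
import time


def parse_net_dev(text):
    """Parse the contents of /proc/net/dev into {iface: (rx_bytes, tx_bytes)}."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def rates_kbps(before, after, seconds):
    """Return (rx_kbps, tx_kbps) between two (rx_bytes, tx_bytes) samples."""
    rx = (after[0] - before[0]) * 8 / 1000 / seconds
    tx = (after[1] - before[1]) * 8 / 1000 / seconds
    return rx, tx


def sample(iface="eth0", interval=5.0):
    """Average traffic on iface over interval seconds (iface is an assumption)."""
    with open("/proc/net/dev") as f:
        before = parse_net_dev(f.read())[iface]
    time.sleep(interval)
    with open("/proc/net/dev") as f:
        after = parse_net_dev(f.read())[iface]
    return rates_kbps(before, after, interval)
```

Calling sample() during a two-person call should return rates near 0 when P2P is really in use, and rates close to the participants’ stream bitrates when the server is relaying.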

All this shows that the new Jitsi version is perfectly able to establish a P2P connection, just never on the first attempt.

Here I am, kind of lost. Why is that happening? What does Jitsi do differently on the first connection versus subsequent ones?

Any help would be more than appreciated.

Can you confirm the problem with unstable (alpha.jitsi.net)?

I don’t have this problem on either meet.jit.si or alpha.jitsi.net.
Both show P2P working all the time.

Only on my deployment.

EDIT: I must say that the same problem may be happening on meet.jit.si or alpha.jitsi.net: they always show P2P in the connection info, but I often see the same on my deployment while my server is still relaying data. Since I cannot run a traffic test on a deployment I don’t host, I cannot be positive.

Running my test on the stable version (6433) again, I see a clear difference between P2P and JVB mode, even though the JVB is now involved even in P2P mode; I guess that is for statistics and bandwidth management.
Anyway, using something like
curl -s localhost:8080/debug/b8f222dc1d1dbd45 | jq ".conferences"
where b8f222dc1d1dbd45 is the ID of a conference (which you can get with curl -s localhost:8080/debug/), there is a very clear difference in the Node Outgoing statistics counters, which climb much faster when the JVB is handling the traffic (3-way vs 2-way). Do you see no difference on your instance?

Hey @gpatel-fr, thanks for your help.
I’m not sure I understood everything in your last post.

What I see for the Outgoing statistics after running curl -s localhost:8080/debug/my-current-conference-id | jq ".conferences" shows no real difference between P2P and JVB (or I need to be taught how to read this output).

If the question is whether I’m sure that P2P or JVB is being used on my instance: I am, thanks to global traffic monitoring on a server that I’m the only one to use. It’s JVB on the first connection and P2P if I reconnect. Of that I’m 100% sure.

Let’s say that on my tiny test server with very low bandwidth, the Node Outgoing statistics for a node in P2P show a few hundred packets after one minute, followed by a grand total of 0 for the following 2 minutes. In JVB mode (P2P disabled) I get around 5000 packets per minute. You should be able to tell the two cases apart.

You mean “packet_count”, not “num_output_packets” or any other, right?

Some Prosody stuff doesn’t run under Ubuntu Server 18.04 (you should upgrade to 20.04), if my memory serves me right. Maybe this behaviour is somehow rooted there.

In fact, any of these counters shows a similar difference between P2P and JVB, so it doesn’t matter.
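The comparison described above can be sketched in Python. This is only a rough illustration: it assumes the /debug/<conference-id> endpoint on localhost:8080 mentioned earlier returns JSON containing entries whose key starts with “Node Outgoing statistics” and that carry a “num_output_packets” counter. The function names are hypothetical, and the 1000 packets per minute threshold is an arbitrary value picked between the “few hundred” and “around 5000” figures quoted above.

```python
import json
import urllib.request


def count_output_packets(node):
    """Recursively sum num_output_packets over all 'Node Outgoing statistics' entries."""
    total = 0
    if isinstance(node, dict):
        for key, value in node.items():
            if key.startswith("Node Outgoing statistics") and isinstance(value, dict):
                total += value.get("num_output_packets") or 0
            else:
                total += count_output_packets(value)
    elif isinstance(node, list):
        for item in node:
            total += count_output_packets(item)
    return total


def looks_relayed(first_total, second_total, minutes, threshold_per_minute=1000):
    """True if the packet rate between two samples suggests the JVB relays media.

    The threshold is an assumption, not a documented Jitsi value.
    """
    return (second_total - first_total) / minutes >= threshold_per_minute


def fetch_debug(conference_id, base="http://localhost:8080/debug/"):
    """Fetch the debug JSON of one conference (URL layout assumed from the thread)."""
    with urllib.request.urlopen(base + conference_id) as resp:
        return json.load(resp)
```

Taking count_output_packets(fetch_debug(conference_id)) twice, a minute or two apart, and passing both totals to looks_relayed gives a crude yes/no answer. Since any of the counters behaves similarly, another one could be summed instead.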

FYI the title of your post is no longer accurate, latest stable is 6689 now.

Well. Out of despair, yesterday morning I decided to give the Ubuntu 18.04 → 20.04 upgrade a shot.
I ran a regular apt-get update + upgrade before upgrading the OS and, thanks to @gpatel-fr’s last comment, noticed in the logs that Jitsi had been updated to 6689.

Then, I don’t know why, I reflexively restarted nginx, prosody, jicofo, coturn, videobridge, etc. for the thousandth time in a few hours, and tried to launch a session. And guess what? Everything was working as expected again. And again, and again. P2P just works on the first attempt now.

:thinking:

It’s been 24+ hours now, tested in every way, it just works.

Thanks all, your advice led to a working setup, though we won’t know how, why or what exactly happened. That’s the mystery of life.

And thanks to amazing Jitsi team, you’re doing a GREAT job.

In code we trust


I did notice this strange behavior, though:
A session is running, P2P is active, and there is no traffic on the Jitsi server. Then I share my screen: the screen-share stream (the bitrate is the same) goes through the server instead of P2P, although the UI still shows the P2P indicator.
Then I stop sharing my screen: P2P is still shown, but the whole stream of the user who shared now goes through the server. This stops only upon leaving and rejoining the room.

I can live with that, but that still puzzles me a bit.

EDIT:

@gpatel-fr: I ran curl -s localhost:8080/debug/my-current-conference-id | jq ".conferences"
while the Jitsi Meet UI reported a P2P connection, yet I observed the full stream going through my server at the same time. :open_mouth:

"Node Outgoing statistics tracker 1128048031": {
  "num_input_packets": 0,
  "num_output_packets": 0,
  "num_discarded_packets": 0,
  "total_time_spent_ns": 0,
  "max_packet_process_time_ms": 0,
  "num_input_bytes": 0,
  "duration_ms": 0,
  "total_time_spent_ms": 0,
  "average_time_per_packet_ns": 0,
  "processing_throughput_mbps": null,
  "throughput_mbps": null
},

I’m slightly baffled, since all the counters in the data you posted are at zero, so I can’t understand what you mean by ‘full stream’. That said, I have never searched through all these counters to see whether some are specific to screen sharing.

It’s weird, actually.

Maybe a full example will explain it better:

  • User A ← p2p → User B: the conference is running, with no observable traffic on my server; 13 Mbps from A and 9.5 Mbps from B according to the GSM popover figures in the UI.
  • User A shares his screen: the GSM popover still indicates that P2P is on, but I see a constant 150 - 200 kbps load in nload on the server, where it was near 0.
  • User A stops sharing: the load on the server rises to 13 Mbps (the same as sent by User A’s camera).
  • User A mutes his camera: the load goes back to 0 kbps (8 kbps, to tell the truth).
  • User A unmutes his camera: the load rises again to User A’s stream bitrate (13 Mbps).
  • User A or B disconnects → rejoins: back to 0 load on the server.

I’m not even sure it goes through the JVB. But I’m sure that this exact amount of data sent by User A goes into and then out of my server.

It’s not a big problem, because my users typically don’t use screen sharing. But it will soon become one if they change their minds!

NOTE: enabling or disabling simulcast or layer suspension in mysite-config.js has no influence at all on this behavior.

And I noticed the same kind of “P2P connection”, with one participant’s stream going through the bridge anyway, in another case:

If I start the conference with A having no camera plugged in and then plug it in, all of A’s stream now goes through the server despite the P2P connection.

:sob:

EDIT: further tests later…

  • I verified that the stream, when it goes through the server, always does so through the JVB.

  • I also observed that with preferredCodec set to “H264” for both video and p2p, the incoming traffic to the server stays correlated with the stream bitrate, while the outgoing traffic rises slowly to the same level and then suddenly drops back to near 0. Meanwhile the conference keeps running smoothly, which proves it is a ghost stream (see screenshot; click the photo to see the connection info in the Jitsi UI, which showed xxxx.xxx(p2p) in the ‘more details’ section at that time):

  • With VP8, both incoming AND outgoing traffic stay consistently correlated with the resumed stream bitrate.

Same on two servers with same versions on both - 6689

Can nobody else reproduce that? It seems like a big waste of bandwidth, and that’s not a cheap resource these days!

I can reproduce your problem with Chrome and the latest unstable. Or at least I think I can: I don’t know exactly how you take your measurements, but I can see similar changes in transceiver/endpointConnectionStats in the debug monitoring when enabling screen sharing. If you create a GitHub issue, you could include that, since it may make it easier for the Jitsi devs to reproduce (don’t raise your hopes too much, though; your problem may be seen as ‘marginal’).
Sorry, I don’t think I will have much time for Jitsi problems on this forum until the end of the year, so good luck with it (that will also do as a reply to your private message).
FTR, I think it may be a problem with Chrome’s switch to unified plan, since I don’t see it with 2 PCs using Firefox (Firefox has always used unified plan).


@gpatel-fr: Thanks a lot for that test. I’m also quite sure it’s a unified plan problem.

I traced the problem to any media change: switching cameras through the UI produces the same result.

I can handle the overload for a little while; not everyone uses different cameras or shares their screen. Still, it’s a problem I’d love to see addressed, or it will become a big issue at some point. I’ll create an issue on GitHub (with low hopes and so on; I got your point and think the same: it’s marginal from a global Jitsi usage perspective).

Thanks again.