Endpoints were suspended due to insufficient bandwidth

Hi Jitsi Team,
After upgrading our JVB to jitsi-videobridge2 2.1-508-gb24f756c-1, we are seeing lots of complaints about cameras turning off during meetings, even on one-to-one calls.

The JVB logs for the same period show lots of “Endpoints were suspended due to insufficient bandwidth” messages.

We looked at the Jitsi bandwidth requirements for the JVB:
LD [180p] - 200kbps
SD [360p] - 500kbps
HD [720p] - 2500kbps
We are running two JVBs on EC2 t3.medium instances.

speedtest on the JVB EC2 instance:
Download: 1613.96 Mbit/s
Upload: 691.38 Mbit/s
I grepped all the logs for the last two weeks for the reported estimates (bwe=… bps), and here is the range:
Top values (bps):

3914335
3913947
3884043
3844058
3795956
…
2470728

Lowest value: 30000
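
For reference, a rough way to pull those estimates out of the logs and sort them (a minimal sketch, assuming the default log location /var/log/jitsi/jvb.log and the “bwe=… bps” wording shown in the log excerpts later in this thread):

# Extract every reported bandwidth estimate (bps) and show the highest values.
grep -o 'bwe=[0-9]*' /var/log/jitsi/jvb.log | cut -d= -f2 | sort -n | tail

# Same, but show the lowest values.
grep -o 'bwe=[0-9]*' /var/log/jitsi/jvb.log | cut -d= -f2 | sort -n | head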

Any suggestions on whether the JVB server is underpowered or the clients’ connection speed is poor?

The same clients didn’t have issues before, even though their Internet speed is the same.

Are you seeing the problem on Firefox? I mean, is the browser that sees the videos suspended with “Endpoints were suspended due to insufficient bandwidth” FF?

Hi @damencho, not FF. It happens on Chrome, Edge, and Safari.

Either it’s true, and it’s logical that your endpoints were suspended, or it’s false and the bwe can’t be trusted for some reason. AFAIK bandwidth is estimated by the clients, not the JVB, and it’s done based on the TCC and REMB parameters (at some point it was said on this forum that FF can do only REMB, not TCC). Are they enabled? AFAIK these estimations are passed through the JVB websockets; are they working for all your bridges?
As a quick test (it’s NOT a solution!), can you try to disable trusting the bwe on your bridges?

videobridge {
  http-servers {
    public {
      port = 9090
    }
  }
  websockets {
    enabled = true
    domain = "xx.xx.xx:443"
    tls = true
    server-id = 10.10.2.24
  }
  cc {
    # Do not act on the bandwidth estimations when allocating streams (diagnostic only).
    trust-bwe = false
  }
  health {
    interval = 3640000
  }
}
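
On a package-based install these settings normally live in /etc/jitsi/videobridge/jvb.conf; after editing, restart the bridge so the change takes effect:

sudo systemctl restart jitsi-videobridge2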

Will the new release of JVB fix this issue?

You may want to try the latest unstable; I think it fixes the insufficient bandwidth problem. I’m testing it and so far I haven’t encountered that problem.

jitsi-videobridge2 2.1-568-g9cbc8644-1
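
A minimal sketch of how one might pull in that specific build on a Debian-based install, assuming the Jitsi unstable apt repository is already configured (the version string is taken from the post above):

# Pin the exact unstable build mentioned above.
sudo apt-get update
sudo apt-get install jitsi-videobridge2=2.1-568-g9cbc8644-1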

I set up a 500 kbps down/up connection.
After applying trust-bwe = false on the JVB, the “Endpoints were suspended due to insufficient bandwidth” message is gone, but I’m seeing this now:

JVB 2021-09-30 22:48:35.588 WARNING: [109] [confId=5c5f2d47a48b8da0 epId=2f1e8d2d gid=3282 stats_id=Demetris-Pro conf_name=10284@ conference.xx.xx] TransportCcEngine.tccReceived#170: TCC packet contained received sequence numbers: 43755, 43773, 43790, 43828-43829, 43842. Couldn’t find packet detail for the seq nums: 43755. Latest seqNum was 44762, size is 1000. Latest RTT is 791.057977 ms.

JVB 2021-09-30 22:48:35.786 WARNING: [109] [confId=5c5f2d47a48b8da0 epId=2f1e8d2d gid=3282 stats_id=Demetris-Pro conf_name=10284@ conference.xx.xx] TransportCcEngine.tccReceived#170: TCC packet contained received sequence numbers: 43935-43936, 43943, 43967, 43971, 43976. Couldn’t find packet detail for the seq nums: 43935-43936, 43943. Latest seqNum was 44947, size is 1000. Latest RTT is 792.622657 ms.

JVB 2021-09-30 22:48:35.884 WARNING: [109] [confId=5c5f2d47a48b8da0 epId=2f1e8d2d gid=3282 stats_id=Demetris-Pro conf_name=10284@conference.xx.xx.xx] TransportCcEngine.tccReceived#170: TCC packet contained received sequence numbers: 44021, 44095, 44104, 44106. Couldn’t find packet detail for the seq nums: 44021. Latest seqNum was 45024, size is 1000. Latest RTT is 793.21543 ms.

And when this happens we get a black screen.

Those RTT values are very high, and those messages indicate TCC acks arriving so late that the JVB has already “forgotten” about the packets it sent. Either there is a serious connectivity issue somewhere between the JVB and the client, or there is another constraint such as insufficient processing power on the JVB side or the client side. The t3 range at AWS has quite low performance; what are your CPU usage stats like on the JVB server?

Lower than 40% max.

I’d recommend using tools on the server itself, as CW monitoring can miss short-term spikes. Also check CPU steal time, which is particularly important on low-end instances. Beyond that, as in my post above, if it’s not CPU utilisation on the server then it’s most likely either a connectivity issue between server and client or an overloaded client.
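
For example, a quick way to watch per-CPU utilisation and steal time on the bridge itself (assuming the sysstat package is installed for mpstat):

# Per-CPU stats refreshed every second; %steal is CPU time taken by the hypervisor,
# which matters a lot on burstable instances such as t3.
mpstat -P ALL 1

# Alternatively, plain top: the “st” value in the %Cpu(s) line is steal time.
top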

What do you mean by that? The only thing I can imagine is that you installed software on your client that caps the network bandwidth. I know of such tools on Linux but have never used them. If that is the case: at low bandwidth (200 kbps) Jitsi should mostly be fine (although at this level any spike generated by other processes on the client could get in the way), but at medium resolution (500 kbps) you will run into problems all the time, because there is absolutely no safety margin left, and the Internet and client OSes do not operate in the real-time domain. To work reliably you need considerable safety margins.
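
For reference, a cap like the 500 kbps test above is often set up on Linux with tc; a minimal sketch (the interface name eth0 and the exact figures are purely illustrative, and this is only for reproducing the test):

# Limit outgoing traffic on eth0 to roughly 500 kbps with a token bucket filter.
sudo tc qdisc add dev eth0 root tbf rate 500kbit burst 32kbit latency 400ms

# Remove the cap again.
sudo tc qdisc del dev eth0 root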

If your workload did not change at the same time as your upgrade (assuming you would have noticed), then either there is a problem in the Jitsi code of the version you installed, in which case upgrading to the latest or reverting to the previous version are both possible fixes (one uncertain, the other more certain but not attractive from a long-term point of view), or your system was already running close to its limit without you realizing it, and a lack of tuning for the new version triggered the problem. Migrating to a new version doesn’t reset all parameters; it tries to keep the old settings intact, and that behaviour can lead to problems. In that case, installing on a fresh system can sometimes solve it, since you get all the default options. It’s complicated, but real-time systems are hard to optimize, especially when the system has no wide operating safety margins.

Thanks all,
After upgrading the JVB, it looks like the issue got resolved.


Still seeing insufficient bandwidth problems in jvb.log, but no errors in prosody or nginx.

Jitsi Videobridge2 Version 2.1-570-gb802be83-1
Jitsi Version 5415
JVB instance c5n.2xlarge
JMS c5a.4xlarge.

Servers running Ubuntu 18.04.6 LTS

All chrome users.

Issues show up in the log with only 4-5 users, and it gets very bad at just 15 users.

Users are in India, east coast USA, and west coast USA. Hosted in East coast AWS (Virginia).

These are the only users on the entire server cluster, and it’s unusable.

I could maybe see this with the folks in India, but it doesn’t make sense that even my client IP (96.79.202.21) in the logs is claiming insufficient bandwidth when I have 100 Mbps down and 35 Mbps upstream.

Suggestions?
Thanks!

example jvb.log output:

JVB 2021-10-25 14:45:40.036 INFO: [568] [confId=5eb989d96964b30f gid=2292 stats_id=Gloria-tCu conf_name=vclass_384528@conference.uat-aws-ediolivemeet.myedio.com ufrag=ced121firt8gov epId=50882fdb local_ufrag=ced121firt8gov] ConnectivityCheckClient.processTimeout#874: timeout for pair: 3.209.52.119:10000/udp/srflx → 96.79.202.21:10808/udp/prflx (stream-50882fdb.RTP), failing.
JVB 2021-10-25 14:45:40.637 INFO: [64] [confId=5eb989d96964b30f epId=0bf4ab8b gid=2292 stats_id=Ana-xwM conf_name=vclass_384528@conference.uat-aws-ediolivemeet.myedio.com] BandwidthAllocator.allocate#326: Endpoints were suspended due to insufficient bandwidth (bwe=32837 bps): d5406b81
JVB 2021-10-25 14:45:43.036 INFO: [568] [confId=5eb989d96964b30f gid=2292 stats_id=Gloria-tCu conf_name=vclass_384528@conference.uat-aws-ediolivemeet.myedio.com ufrag=ced121firt8gov epId=50882fdb local_ufrag=ced121firt8gov] ConnectivityCheckClient.processTimeout#874: timeout for pair: 3.209.52.119:10000/udp/srflx → 96.79.202.21:10808/udp/prflx (stream-50882fdb.RTP), failing.
JVB 2021-10-25 14:45:43.081 INFO: [23] HealthChecker.run#171: Performed a successful health check in PT0S. Sticky failure: false
JVB 2021-10-25 14:45:46.036 INFO: [568] [confId=5eb989d96964b30f gid=2292 stats_id=Gloria-tCu conf_name=vclass_384528@conference.uat-aws-ediolivemeet.myedio.com ufrag=ced121firt8gov epId=50882fdb local_ufrag=ced121firt8gov] ConnectivityCheckClient.processTimeout#874: timeout for pair: 3.209.52.119:10000/udp/srflx → 96.79.202.21:10808/udp/prflx (stream-50882fdb.RTP), failing.
JVB 2021-10-25 14:45:46.172 INFO: [68] [confId=5eb989d96964b30f epId=0bf4ab8b gid=2292 stats_id=Ana-xwM conf_name=vclass_384528@conference.uat-aws-ediolivemeet.myedio.com] BandwidthAllocator.allocate#326: Endpoints were suspended due to insufficient bandwidth (bwe=38336 bps): d5406b81
JVB 2021-10-25 14:45:47.908 INFO: [68] [confId=5eb989d96964b30f epId=0bf4ab8b gid=2292 stats_id=Ana-xwM conf_name=vclass_384528@conference.uat-aws-ediolivemeet.myedio.com] BandwidthAllocator.allocate#326: Endpoints were suspended due to insufficient bandwidth (bwe=44426 bps): d5406b81
JVB 2021-10-25 14:45:49.036 INFO: [568] [confId=5eb989d96964b30f gid=2292 stats_id=Gloria-tCu conf_name=vclass_384528@conference.uat-aws-ediolivemeet.myedio.com ufrag=ced121firt8gov epId=50882fdb local_ufrag=ced121firt8gov] ConnectivityCheckClient.processTimeout#874: timeout for pair: 3.209.52.119:10000/udp/srflx → 96.79.202.21:10808/udp/prflx (stream-50882fdb.RTP), failing.
JVB 2021-10-25 14:45:49.677 INFO: [63] [confId=5eb989d96964b30f epId=0bf4ab8b gid=2292 stats_id=Ana-xwM conf_name=vclass_384528@conference.uat-aws-ediolivemeet.myedio.com] BandwidthAllocator.allocate#326: Endpoints were suspended due to insufficient bandwidth (bwe=51697 bps): d5406b81
JVB 2021-10-25 14:45:51.237 INFO: [68] [confId=5eb989d96964b30f epId=0bf4ab8b gid=2292 stats_id=Ana-xwM conf_name=vclass_384528@conference.uat-aws-ediolivemeet.myedio.com] BandwidthAllocator.allocate#326: Endpoints were suspended due to insufficient bandwidth (bwe=59615 bps): d5406b81
JVB 2021-10-25 14:45:52.036 INFO: [568] [confId=5eb989d96964b30f gid=2292 stats_id=Gloria-tCu conf_name=vclass_384528@conference.uat-aws-ediolivemeet.myedio.com ufrag=ced121firt8gov epId=50882fdb local_ufrag=ced121firt8gov] ConnectivityCheckClient.processTimeout#874: timeout for pair: 3.209.52.119:10000/udp/srflx → 96.79.202.21:10808/udp/prflx (stream-50882fdb.RTP), failing.
JVB 2021-10-25 14:45:53.082 INFO: [23] HealthChecker.run#171: Performed a successful health check in PT0S. Sticky failure: false
JVB 2021-10-25 14:45:53.187 INFO: [68] [confId=5eb989d96964b30f epId=0bf4ab8b gid=2292 stats_id=Ana-xwM conf_name=vclass_384528@conference.uat-aws-ediolivemeet.myedio.com] BandwidthAllocator.allocate#326: Endpoints were suspended due to insufficient bandwidth (bwe=69230 bps): d5406b81
JVB 2021-10-25 14:45:55.037 INFO: [568] [confId=5eb989d96964b30f gid=2292 stats_id=Gloria-tCu conf_name=vclass_384528@conference.uat-aws-ediolivemeet.myedio.com ufrag=ced121firt8gov epId=50882fdb local_ufrag=ced121firt8gov] ConnectivityCheckClient.processTimeout#874: timeout for pair: 3.209.52.119:10000/udp/srflx → 96.79.202.21:10808/udp/prflx (stream-50882fdb.RTP), failing.
JVB 2021-10-25 14:45:55.057 INFO: [70] [confId=5eb989d96964b30f epId=0bf4ab8b gid=2292 stats_id=Ana-xwM conf_name=vclass_384528@conference.uat-aws-ediolivemeet.myedio.com] BandwidthAllocator.allocate#326: Endpoints were suspended due to insufficient bandwidth (bwe=80141 bps): d5406b81
JVB 2021-10-25 14:45:56.937 INFO: [64] [confId=5eb989d96964b30f epId=0bf4ab8b gid=2292 stats_id=Ana-xwM conf_name=vclass_384528@conference.uat-aws-ediolivemeet.myedio.com] BandwidthAllocator.allocate#326: Endpoints were suspended due to insufficient bandwidth (bwe=92271 bps): d5406b81
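
Worth noting: the ConnectivityCheckClient timeouts in the excerpt above mean the ICE checks toward the client’s reflexive address are failing, which points at UDP packet loss rather than raw bandwidth. A quick check, assuming shell access to the bridge and the default single UDP media port 10000 (the client IP below is taken from the log above):

# Confirm the bridge is listening on UDP 10000.
sudo ss -lnup | grep 10000

# See whether the client’s packets actually reach the bridge.
sudo tcpdump -ni any udp port 10000 and host 96.79.202.21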