Dropping a packet because the queue is full

Hi,
Since the upgrade to jvb2, I noticed some strange drops in my bandwidth usage graph: it goes, for example, from more than 60-80 Mbps down to 0 Mbps for a few seconds. At first I thought it was only a glitch in my monitoring tools.
But then, by checking the real instantaneous bandwidth usage with bmon, I saw that the drops were real.
Checking the logs shows me that during these periods I get a lot of the following suspicious messages:

2020-04-15 18:34:30.140 INFOS: [58] org.ice4j.ice.harvest.AbstractUdpListener$MySocket.addBuffer: Dropping a packet because the queue is full. Remote address = /X.X.X.X:51853 ufrag=ea1s71e5v9s8l2
2020-04-15 18:34:30.141 INFOS: [58] org.ice4j.ice.harvest.AbstractUdpListener$MySocket.addBuffer: Dropping a packet because the queue is full. Remote address = /X.X.X.X:51853 ufrag=ea1s71e5v9s8l2
2020-04-15 18:34:30.141 INFOS: [58] org.ice4j.ice.harvest.AbstractUdpListener$MySocket.addBuffer: Dropping a packet because the queue is full. Remote address = /X.X.X.X:51853 ufrag=ea1s71e5v9s8l2
2020-04-15 18:34:30.141 INFOS: [58] org.ice4j.ice.harvest.AbstractUdpListener$MySocket.addBuffer: Dropping a packet because the queue is full. Remote address = /X.X.X.X:51853 ufrag=ea1s71e5v9s8l2
2020-04-15 18:34:30.141 INFOS: [58] org.ice4j.ice.harvest.AbstractUdpListener$MySocket.addBuffer: Dropping a packet because the queue is full. Remote address = /X.X.X.X:51853 ufrag=ea1s71e5v9s8l2
2020-04-15 18:34:30.142 INFOS: [58] org.ice4j.ice.harvest.AbstractUdpListener$MySocket.addBuffer: Dropping a packet because the queue is full. Remote address = /X.X.X.X:51853 ufrag=ea1s71e5v9s8l2
2020-04-15 18:34:30.142 INFOS: [58] org.ice4j.ice.harvest.AbstractUdpListener$MySocket.addBuffer: Dropping a packet because the queue is full. Remote address = /X.X.X.X:51853 ufrag=ea1s71e5v9s8l2

The kernel buffer sizes seem more or less correct to me:

net.core.rmem_max = 10485760
net.core.netdev_max_backlog = 100000
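
(For completeness, this is how I double-check that those are the values actually in effect, together with rmem_default, since I'm not sure the bridge requests a larger buffer on its own; this is just a verification step, not a fix:)

sysctl net.core.rmem_max net.core.rmem_default net.core.netdev_max_backlog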

Running Debian 10, with the current stable deb packages (jvb2, 2.1-169-ga28eb88e-1).
During the last dropping wave, the jvb was running 5 conferences with a total of 35 participants.
The previous drop happened with 54 participants in 4 conferences.

Checking the kernel buffers with watch -n1 "ss -lntu | grep 10000" shows that Recv-Q and Send-Q are at 0 most of the time, sometimes briefly around 2000, and rarely around 16k. But this is not exhaustive, and it's now the end of the day here, so the load is currently decreasing…
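
A perhaps more targeted command, assuming the bridge's single-port harvester is on UDP 10000, would be something along these lines (the -m flag adds the skmem counters, which show how much of the socket's receive buffer is actually in use):

watch -n1 "ss -uanm 'sport = :10000'"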

Do you have any idea what causes these drops, and what might be tuned to avoid them?

Thanks

Hi, have you found any more details on this? I'm seeing dropped packets on my jvb2 server too.

Thank you,

Kind regards,

Milan

Do you see a corresponding entry about dropped packets in your syslog (the default location should be /var/log/syslog)? If the kernel is dropping the packets, there should be some info about it in the syslog. Just as an example, this could be something like ip_conntrack: table full, dropping packet, in which case you should increase the size of the conntrack table.
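
Just as an illustration of the conntrack case (the value is an example, not a recommendation; pick something that fits your memory and traffic):

# check how full the table currently is
sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max
# raise the limit at runtime, then persist it in /etc/sysctl.conf
sysctl -w net.netfilter.nf_conntrack_max=262144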

I haven’t seen any drop in the logs today, despite the heavy load.
How ever it’s quite hard for me to say if the problem is definitively solved or if it’s going to re-appear tomorrow…

Changes I have made:

  • Upgraded JVB2 from 2.1-169-ga28eb88e-1 to 2.1-183-gdbddd169-1
  • Correctly configured a TURN server, and disabled TCP on the JVB
  • Used a lower default resolution (resolution: 480 instead of 720), see the sketch after this list
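
For the resolution change, this is roughly the relevant part of the Jitsi Meet config.js (a sketch; the min value and the exact constraint block are my own choices and may differ per installation):

resolution: 480,
constraints: {
    video: {
        height: {
            ideal: 480,
            max: 480,
            min: 240
        }
    }
},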

Hi! No, there are no such errors in syslog/dmesg, only in the jvb log.

Kind regards,

Milan

Hi, I think lowering the video resolution can help a lot with the stress on the server. I'm using websockets for the bridge channel and thus needed to disable turn in the nginx config. :frowning:

I see we have the same error messages in our logs. :slight_smile:

Kind regards,

Milan

I also use websockets, but my TURN server is running on a separate, dedicated IP address/FQDN, so it is not behind nginx.

How did you solve the problem with ALPN in nginx? I had to forward all (default) traffic to the web server instead of the TURN server to get websockets to work with Chrome; Firefox worked fine with the default setting.

Thank you,

Milan

I have the exact same issue with matching websocket traffic in nginx, and I didn't find any solution for it.
To simplify, let's say that my TURN server is not installed on the same server as Jitsi.
In fact it is on the same server, but using another public IP than the nginx one.
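
Concretely, it looks roughly like this (the IPs are placeholders and I'm assuming coturn; nginx simply stays bound to the other address):

# /etc/turnserver.conf (coturn) – bind TURN over TLS to the dedicated IP only
listening-ip=203.0.113.20
tls-listening-port=443

# nginx site config – the web vhost only listens on the other public IP
server {
    listen 203.0.113.10:443 ssl;
    server_name meet.example.com;
    # ... rest of the usual jitsi-meet vhost ...
}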

Thank you, I wanted to find an elegant solution in line with the whole Jitsi installation, some regexp that could catch the headers from Chrome, but yours is nice too. :slight_smile: Thank you!

Milan

Here the Recv-Q seems to be full:
udp UNCONN 2314624 0 [::ffff:144.76.nnn.mmm]:10000 *:*

The autotuning UDP buffer limits have been set in /etc/sysctl.conf to
net.ipv4.udp_mem = 764178 1018904 1528356
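
If I understand the kernel docs correctly, net.ipv4.udp_mem is a system-wide limit expressed in pages and does not raise the buffer of an individual socket; the per-socket receive buffer is bounded by net.core.rmem_max / rmem_default (and by whatever SO_RCVBUF the application requests). So the more relevant knobs might be something like (values are only an example):

net.core.rmem_default = 10485760
net.core.rmem_max = 10485760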

Well, some UDP packets always get lost in conferencing apps:

netstat -suna
IcmpMsg:
    InType3: 214013
    InType8: 7622
    InType11: 47
    OutType0: 7622
    OutType3: 116121
Udp:
    761451988 packets received
    1799949 packets to unknown port received
    1603365 packet receive errors
    821880595 packets sent
    1603365 receive buffer errors
    1045 send buffer errors
    IgnoredMulti: 1128
UdpLite:
IpExt:
    InBcastPkts: 1128
    InOctets: 515589319994
    OutOctets: 525556799798
    InBcastOctets: 367728
    InNoECTPkts: 989841317
    InECT1Pkts: 30
    InECT0Pkts: 1991733
    InCEPkts: 266

But why do I have so many packets in the receive queue? Users were reporting bad audio quality; video was fine.
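
One thing I notice in those counters: receive buffer errors (1603365) is exactly equal to packet receive errors, so essentially all of the receive errors seem to come from the socket buffer filling up rather than from checksum or unknown-port problems. To see whether the counter keeps growing during the bad-audio periods, I'm watching it with something like:

watch -n1 'netstat -suna | grep -E "receive (buffer )?errors"'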

Was there a screen share in the meeting with the bad audio? We know about such an issue, which is fixed in the latest unstable and will soon go out in the stable.