Conversation breaks after three minutes - mac check in GCM failed

Hi there,

just got jitsi running (as described here) in a docker container behind the traefik router/loadbalancer. During a test conversation I noticed this exception occurring every few seconds:

jvb_1      | Mar 29, 2020 9:35:56 AM org.jitsi.utils.logging.LoggerImpl log                                                                                                  
jvb_1      | WARNING: Failed to handle packet:                                                                                                                               
jvb_1      | java.lang.IllegalArgumentException: RTP packet header length 24 > length 22                                                                                     
jvb_1      |    at org.jitsi.rtp.rtp.RtpPacket.<init>(RtpPacket.kt:111)                                                                                                      
jvb_1      |    at org.jitsi.nlj.rtp.AudioRtpPacket.<init>(AudioRtpPacket.kt:24)                                                                                             
jvb_1      |    at org.jitsi.nlj.transform.node.RtpParser.transform(RtpParser.kt:44)                                                                                         
jvb_1      |    at org.jitsi.nlj.transform.node.TransformerNode.doProcessPacket(Node.kt:371)                                                                                 
jvb_1      |    at org.jitsi.nlj.transform.node.StatsKeepingNode.processPacket(Node.kt:201)                                                                                  
jvb_1      |    at org.jitsi.nlj.transform.node.ExclusivePathDemuxer.doProcessPacket(Node.kt:534)                                                                            
jvb_1      |    at org.jitsi.nlj.transform.node.StatsKeepingNode.processPacket(Node.kt:201)                                                                                  
jvb_1      |    at org.jitsi.nlj.transform.node.Node.next(Node.kt:106)                                                                                                       
jvb_1      |    at org.jitsi.nlj.transform.node.NeverDiscardNode.doProcessPacket(Node.kt:358)                                                                                
jvb_1      |    at org.jitsi.nlj.transform.node.StatsKeepingNode.processPacket(Node.kt:201)
jvb_1      |    at org.jitsi.nlj.RtpReceiverImpl.doProcessPacket(RtpReceiverImpl.kt:229)
jvb_1      |    at org.jitsi.nlj.StatsKeepingPacketHandler.processPacket(PacketHandler.kt:44)
jvb_1      |    at org.jitsi.nlj.RtpReceiverImpl.handleIncomingPacket(RtpReceiverImpl.kt:223)
jvb_1      |    at org.jitsi.nlj.RtpReceiverImpl.access$handleIncomingPacket(RtpReceiverImpl.kt:69)
jvb_1      |    at org.jitsi.nlj.RtpReceiverImpl$incomingPacketQueue$1.invoke(RtpReceiverImpl.kt:96)
jvb_1      |    at org.jitsi.nlj.RtpReceiverImpl$incomingPacketQueue$1.invoke(RtpReceiverImpl.kt:69)
jvb_1      |    at org.jitsi.nlj.util.PacketInfoQueue$sam$org_jitsi_utils_queue_PacketQueue_PacketHandler$0.handlePacket(PacketInfoQueue.kt)
jvb_1      |    at org.jitsi.utils.queue.PacketQueue$HandlerAdapter.handleItem(PacketQueue.java:571)
jvb_1      |    at org.jitsi.utils.queue.AsyncQueueHandler$1.run(AsyncQueueHandler.java:141)
jvb_1      |    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
jvb_1      |    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
jvb_1      |    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
jvb_1      |    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
jvb_1      |    at java.lang.Thread.run(Thread.java:748)

Then, the conversation just freezes and the server logs this:

jvb_1      | Mar 29, 2020 9:35:58 AM org.jitsi.utils.logging2.LoggerImpl log                                                                                                 
jvb_1      | INFO: Alert raised: level=2, description=20, message=null cause=null org.jitsi.nlj.dtls.TlsClientImpl.notifyAlertRaised(TlsClientImpl.kt:197)                   
jvb_1      | org.bouncycastle.tls.DTLSRecordLayer.raiseAlert(Unknown Source)
jvb_1      | org.bouncycastle.tls.DTLSRecordLayer.fail(Unknown Source)
jvb_1      | org.bouncycastle.tls.DTLSTransport.receive(Unknown Source)
jvb_1      | org.jitsi.nlj.dtls.DtlsStack.processIncomingProtocolData(DtlsStack.kt:169)
jvb_1      | org.jitsi.nlj.transform.node.incoming.ProtocolReceiver.transform(ProtocolReceiver.kt:29)
jvb_1      | org.jitsi.nlj.transform.node.MultipleOutputTransformerNode.doProcessPacket(Node.kt:460)
jvb_1      | org.jitsi.nlj.transform.node.StatsKeepingNode.processPacket(Node.kt:201)
jvb_1      | org.jitsi.nlj.transform.node.ExclusivePathDemuxer.doProcessPacket(Node.kt:534)
jvb_1      | org.jitsi.nlj.transform.node.StatsKeepingNode.processPacket(Node.kt:201)
jvb_1      | org.jitsi.videobridge.DtlsTransport.lambda$installIncomingPacketReader$3(DtlsTransport.java:447)
jvb_1      | java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
jvb_1      | java.util.concurrent.FutureTask.run(FutureTask.java:266)
jvb_1      | java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
jvb_1      | java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
jvb_1      | java.lang.Thread.run(Thread.java:748)
jvb_1      |
jvb_1      | Mar 29, 2020 9:35:58 AM org.jitsi.utils.logging2.LoggerImpl log
jvb_1      | WARNING: Stopping reader:
jvb_1      | org.bouncycastle.tls.TlsFatalAlert: bad_record_mac(20)
jvb_1      |    at org.bouncycastle.tls.crypto.impl.TlsAEADCipher.decodeCiphertext(Unknown Source)
jvb_1      |    at org.bouncycastle.tls.DTLSRecordLayer.receive(Unknown Source)
jvb_1      |    at org.bouncycastle.tls.DTLSTransport.receive(Unknown Source)
jvb_1      |    at org.jitsi.nlj.dtls.DtlsStack.processIncomingProtocolData(DtlsStack.kt:169)
jvb_1      |    at org.jitsi.nlj.transform.node.incoming.ProtocolReceiver.transform(ProtocolReceiver.kt:29)
jvb_1      |    at org.jitsi.nlj.transform.node.MultipleOutputTransformerNode.doProcessPacket(Node.kt:460)
jvb_1      |    at org.jitsi.nlj.transform.node.StatsKeepingNode.processPacket(Node.kt:201)
jvb_1      |    at org.jitsi.nlj.transform.node.ExclusivePathDemuxer.doProcessPacket(Node.kt:534)
jvb_1      |    at org.jitsi.nlj.transform.node.StatsKeepingNode.processPacket(Node.kt:201)
jvb_1      |    at org.jitsi.videobridge.DtlsTransport.lambda$installIncomingPacketReader$3(DtlsTransport.java:447)
jvb_1      |    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
jvb_1      |    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
jvb_1      |    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
jvb_1      |    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
jvb_1      |    at java.lang.Thread.run(Thread.java:748)
jvb_1      | Caused by: org.bouncycastle.crypto.RuntimeCryptoException: org.bouncycastle.crypto.InvalidCipherTextException: mac check in GCM failed
jvb_1      |    at org.bouncycastle.tls.crypto.impl.bc.BcTlsCrypto$AeadOperator.doFinal(Unknown Source)
jvb_1      |    ... 15 more

Any idea what causes this?

Thanks in advance,
j.

Hi folks,

would be really awesome to get some feedback on this issue. A solution would be even better :wink:. These are the versions of the software involved btw:

JVB 2.1.160-gc889cdd5
JiCoFo 1.0.540
Prosody version 0.11.5

Or should I file a bug report with gitlab?

Cheers and thanks,
j.

@mattvoss have you encountered something like I described here? I’d be grateful for a clue on where to start searching for the error …

Cheers,
j.

I haven’t. However, from what I have found running behind Traefik or any proxy causes all kinds of routing problems for jvb. This is because jvb determines what ip address it is broadcasting from. For me since my Traefik entry point was on a completely different server that ip was wrong if I proxied traffic using Traefik.

That’s why I finally gave up and exposed ports 10000 and 4443 via host networking. jvb just doesn’t like to be proxied at least not right now.

The few times I did get it working I also found Traefik’s udp proxing code to be slow. Makes sense, it was just added. Performance will improve over time.

My recommendation is to ditch proxying of udp. Too many things going on to make it work reliably. Since I did that I’ve got everything working without issue. Although I had to blow away all configurations to make everything work reliably.

closed

Your were right after all, sorry for my doubts. I did a test and routed udp port 10000 “around” traefik and everything is working fine now. We will have a first live test this afternoon. I will close this topic since all the described errors are gone now that traefik is out of the udp loop.
Still, I am convinced that with some debugging on the traefik side this issue will be resolved in the near future. I’ll follow up in the traefik forum :smile:.