[jitsi-users] SRTP problems when using mixed audio


#1

Hi,

I am using JVB with jicofo and mixed audio (see attached crude patch to
jicofo just FYI).

the good: it seems to be mostly working.

the bad: a percentage of calls (maybe 20%) are failing and the webrtc
android client is throwing (lots of) errors as follows:

E/libjingle( 6718): Failed to unprotect audio RTP packet: size=39,
seqnum=14646, SSRC=1150943895
W/libjingle( 6718): Failed to unprotect SRTP packet, err=10
E/libjingle( 6718): Failed to unprotect audio RTP packet: size=39,
seqnum=14647, SSRC=1150943895
W/libjingle( 6718): Failed to unprotect SRTP packet, err=10
E/libjingle( 6718): Failed to unprotect audio RTP packet: size=39,
seqnum=14648, SSRC=1150943895
W/libjingle( 6718): Failed to unprotect SRTP packet, err=10
E/libjingle( 6718): Failed to unprotect audio RTP packet: size=39,
seqnum=14649, SSRC=1150943895

it sounds vaguely like this:
https://bugs.chromium.org/p/webrtc/issues/detail?id=3563

however, I am not doing any unhold-type stuff here, so I'm wondering if it
could be relevant. I'm not really sure I follow the crux of what is being
discussed in the above bug, but it did seem potentially relevant since it
is the same warning on the client side and the server involved a webrtc
gateway.

also, not sure if it is significant but in my jvb.log I noticed this:
Unknown DTLS handshake message type: -10

another example is this:
Unknown DTLS handshake message type: -17

this seems strange, as if the parsing code is not finding the appropriate
magic in the protocol.

has anyone any ideas what could be going wrong here?

are there any known problems with using DTLS and mixed audio?

can anyone suggest anything I can do to debug this or provide any further
information which may be of use to developers?

Thanks,

jicofo.diff (1.6 KB)


#2

Hi Raoul,

err=10 indicates that the index is too old[0]. This could be caused by the bridge sending invalid RTP sequence numbers. You can check by taking a tcpdump when the issue occurs and looking for the SRTP packets for that SSRC. The RTP sequence numbers shouldn't contain any gaps.
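
A minimal sketch of the check suggested above, assuming the packets for the call have already been pulled out of the capture as raw UDP payloads. The function names `parse_rtp_header` and `find_sequence_breaks` are illustrative and not part of any Jitsi or libsrtp API. SRTP leaves the RTP header unencrypted, so the sequence number and SSRC can be read from the fixed 12-byte header without the keys.

```python
import struct


def parse_rtp_header(payload: bytes):
    """Return (sequence_number, ssrc) from the fixed 12-byte RTP header."""
    if len(payload) < 12:
        raise ValueError("too short to be an RTP packet")
    # Byte 0: V/P/X/CC, byte 1: M/PT, bytes 2-3: sequence number,
    # bytes 4-7: timestamp, bytes 8-11: SSRC (all in network byte order).
    _vpxcc, _mpt, seq, _timestamp, ssrc = struct.unpack("!BBHII", payload[:12])
    return seq, ssrc


def find_sequence_breaks(payloads):
    """Yield (ssrc, previous_seq, seq) wherever seq != previous_seq + 1 (mod 2**16)."""
    last_seq = {}
    for payload in payloads:
        seq, ssrc = parse_rtp_header(payload)
        prev = last_seq.get(ssrc)
        # A wrap from 65535 to 0 is normal and is not reported.
        if prev is not None and seq != (prev + 1) & 0xFFFF:
            yield ssrc, prev, seq
        last_seq[ssrc] = seq
```

Run against packets in the order the bridge sent them, any tuple this yields for the affected SSRC marks a gap or a backwards jump worth inspecting in Wireshark.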

also, not sure if it is significant but in my jvb.log I noticed this:
Unknown DTLS handshake message type: -10

another example is this:
Unknown DTLS handshake message type: -17

We have been observing these for some time, and while the exact reason is unknown, Lyubomir has confirmed that they don't cause any issues (apart from potentially using more UDP packets for the DTLS payload than necessary).
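
One possible reading of the negative values, offered as an assumption rather than a confirmed diagnosis: the bridge is written in Java, where `byte` is signed, so a type logged as -10 or -17 would correspond to the unsigned byte values 246 and 239. Neither appears among the handshake message types defined in RFC 5246 (TLS 1.2) and RFC 6347 (DTLS 1.2). The sketch below only illustrates that conversion; the helper names are illustrative.

```python
# Handshake types from RFC 5246 (TLS 1.2), plus hello_verify_request
# from RFC 6347 (DTLS 1.2).
KNOWN_HANDSHAKE_TYPES = {
    0: "hello_request",
    1: "client_hello",
    2: "server_hello",
    3: "hello_verify_request",
    11: "certificate",
    12: "server_key_exchange",
    13: "certificate_request",
    14: "server_hello_done",
    15: "certificate_verify",
    16: "client_key_exchange",
    20: "finished",
}


def signed_byte_to_unsigned(value: int) -> int:
    """Map a Java-style signed byte (-128..127) to its unsigned value (0..255)."""
    return value & 0xFF


def describe_handshake_type(logged_value: int) -> str:
    """Name the handshake type for a logged (possibly negative) byte value."""
    unsigned = signed_byte_to_unsigned(logged_value)
    return KNOWN_HANDSHAKE_TYPES.get(unsigned, f"unknown ({unsigned})")
```

This says nothing about why such bytes reach the handshake parser; it only shows that the logged values do not map to any defined message type.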

Regards,
Boris

[0] https://github.com/cisco/libsrtp/blob/master/include/srtp.h#L256

···

On 26/04/16 18:59, Raoul Duke wrote:


#3

Hi Boris,

I followed your suggestion and I believe I see something significant.

I replicated the problematic behavior while tcpdumping on the server.

in my clientside logs I see this error:
W/libjingle(23074): Failed to unprotect SRTP packet, err=10
E/libjingle(23074): Failed to unprotect audio RTP packet: size=109,
seqnum=32616, SSRC=2297436400

I then looked through the tcpdump in wireshark for the corresponding info
and found this snippet:

http://i.imgur.com/6WFjjPP.png

it looks to me as if the error in the client logs corresponds exactly to
this jump from seq 60076 in that SSRC to 32616. i.e. it seems to me that
libsrtp on the clientside is unhappy with the backwards jump in sequence
numbers.

does that seem like a valid analysis? if so that seems like unexpected
behavior from jvb, right?
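
To make the size of that jump concrete, here is a small sketch of the usual 16-bit sequence-number arithmetic. The helper names are illustrative. RTP sequence numbers wrap at 2^16, so the difference between two of them is conventionally read as a signed value. SRTP receivers keep a replay window, which RFC 3711 requires to cover at least 64 packets; the sketch uses that minimum, and the window size the Android client actually uses is an assumption here. The check is simplified and ignores the rollover counter that a real SRTP implementation also tracks.

```python
SEQ_MOD = 1 << 16


def seq_delta(previous: int, current: int) -> int:
    """Signed distance from previous to current, allowing for 16-bit wraparound."""
    delta = (current - previous) % SEQ_MOD
    return delta - SEQ_MOD if delta >= SEQ_MOD // 2 else delta


def is_behind_replay_window(highest_seen: int, current: int, window: int = 64) -> bool:
    """True if current lies further behind highest_seen than the window allows."""
    return seq_delta(highest_seen, current) <= -window


# The jump seen in the capture: 60076 followed by 32616.
print(seq_delta(60076, 32616))                # -27460
print(is_behind_replay_window(60076, 32616))  # True
```

A packet 27460 positions behind the highest sequence number seen is far outside any plausible replay window, which would be consistent with the client rejecting it as too old.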

I have attached the associated jvb.log (redacted ips / hosts). I am also
happy to share any other logs or captures with you that might help.

Note that it is a totally vanilla setup apart from the addition
of rtp-level-relay-type="mixer" to some of the XML attributes when creating
channels.
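
For readers unfamiliar with the attribute, a hedged illustration of what such a channel element might look like, built with Python's standard library. Only the attribute name and its value come from the message above. The namespace, the surrounding element names and the endpoint attribute are assumptions based on the COLIBRI protocol (XEP-0340) and are not taken from the attached patch; `build_audio_channel_request` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

COLIBRI_NS = "http://jitsi.org/protocol/colibri"  # assumed COLIBRI namespace


def build_audio_channel_request(endpoint_id: str, relay_type: str = "mixer") -> str:
    """Return a COLIBRI-style <conference/> with one audio channel, as a string."""
    conference = ET.Element(f"{{{COLIBRI_NS}}}conference")
    content = ET.SubElement(conference, f"{{{COLIBRI_NS}}}content", {"name": "audio"})
    ET.SubElement(content, f"{{{COLIBRI_NS}}}channel", {
        "endpoint": endpoint_id,  # hypothetical endpoint identifier
        "rtp-level-relay-type": relay_type,
    })
    return ET.tostring(conference, encoding="unicode")
```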

can you see anything I could be doing wrong here?

is there any extra logging that I could enable to pinpoint the issue
further?

All help / suggestions greatly appreciated,

Thanks.

jvb.log (152 KB)

···

On Wed, Apr 27, 2016 at 3:57 AM, Boris Grozev <boris@jitsi.org> wrote:


#4

Yes, that definitely looks wrong. No idea why it happens, though.

Boris

···

On 27/04/16 19:47, Raoul Duke wrote:


#5

PS - here is the jvb.log as an external link in case the mailing list
snarfs the attachment:

http://filebin.ca/2fG7hhJg3Hyh/jvb.log

···

On Thu, Apr 28, 2016 at 1:47 AM, Raoul Duke <rduke496@gmail.com> wrote:


#6

Hi Boris,

Thanks Boris.

can you suggest a suitable way I can proceed to get some help from the
community? is raising a github "issue" the appropriate next step? are
there any relevant things I could capture to shed more light on the
subject?

Best Regards,
RD

···

On Thu, Apr 28, 2016 at 2:29 AM, Boris Grozev <boris@jitsi.org> wrote:


#7

I went ahead and raised a github issue:

https://github.com/jitsi/jitsi-videobridge/issues/230

my latest comment (
https://github.com/jitsi/jitsi-videobridge/issues/230#issuecomment-215595404)
has a succinct section of FINER-level logging from jvb.log which I think might
shed further light on the issue. the snippet shows the sequence numbers
before and after the backwards jump and may point to factors which are
causing it.

It would be much appreciated if someone experienced with the code could
have a quick look at the aforementioned log and give me any feedback or
pointers to things which might be causing the problem or other things I can
dig into.

Thanks.

···

On Thu, Apr 28, 2016 at 3:33 AM, Raoul Duke <rduke496@gmail.com> wrote:
