[jitsi-dev] [libjitsi] OpenSSL HMAC-SHA1 crashes Jitsi Videobridge (#35)


#1

Boris Grozev has reported a crash in Jitsi Videobridge which appears to be caused by the OpenSSL HMAC-SHA1 JNI implementation: http://pastebin.com/Em2SrZkU

---
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/libjitsi/issues/35


#2

Got the same error using the last commit of Jitsi Videobridge:

[thread 140532624729856 also had an error]#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007fd0088ba28a, pid=2027, tid=140532641572608
#
# JRE version: OpenJDK Runtime Environment (7.0_65-b32) (build 1.7.0_65-b32)
# Java VM: OpenJDK 64-Bit Server VM (24.65-b04 mixed mode linux-amd64 compressed oops)
# Derivative: IcedTea 2.5.2
# Distribution: Ubuntu 14.04 LTS, package 7u65-2.5.2-3~14.04
# Problematic frame:
# C [libcrypto.so.1.0.0+0xea28a]# [ timer expired, abort... ]

/usr/share/jitsi-videobridge/jvb.sh: line 32: 2027 Aborted (core dumped) LD_LIBRARY_PATH=$libs java -Xmx$VIDEOBRIDGE_MAX_MEMORY $VIDEOBRIDGE_DEBUG_OPTIONS -XX

Regards,

Zalmoxisus

_______________________________________________
dev mailing list
dev@jitsi.org
Unsubscribe instructions and other list options:
http://lists.jitsi.org/mailman/listinfo/dev


#3

Closed #35.

---
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/libjitsi/issues/35#event-268440433


#4

Presumed fixed with https://github.com/jitsi/jitsi-videobridge/commit/473f24edee95b6d0432f13d51b5b500a0a22c0bb and https://github.com/jitsi/libjitsi/commit/6065f503917fa9696aee117d11309963899e80a7

---
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/libjitsi/issues/35#issuecomment-87663737


#5

Update: In my case it occurs only when useRtcpMux is enabled.


#6

Does that mean that you have a way to reproduce it, or that you see it relatively often? AFAIK we only saw this once (indeed with rtcp-mux enabled).

Boris


#7

Unfortunately, I do not see how to reproduce it. When we have many
participants (more than 10), it usually occurs once an hour. With
fewer participants, it can occur once a day.

Maybe it is somehow affected when users reconnect using the same jid
(we do not use anonymousdomain). Four months ago I reported another
issue for Jitsi Videobridge with useRtcpMux enabled and
non-anonymousdomain [1]. Paweldomas fixed it [2], but it still
sometimes threw the exception, so I stopped using rtcp-mux.
Last week I started using rtcp-mux again because it is required to
support Firefox [3], and I got these errors.

[1] http://markmail.org/message/j5css5w6tpghd5sp#query:+page:1+mid:vnor7mm3tmsjiklp+state:results
[2] https://github.com/jitsi/jitsi-videobridge/commit/0edc4009bbbd1b3aeaa08136b08eb9f454e4b410
[3] https://github.com/jitsi/jitsi-meet/commit/05bbfda5bb6b54ff78dec93ede3d5054ad49843a

Regards,
Zalmoxisus


#8

I was wrong. I just got this crash with useRtcpMux disabled:

#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f43dbd103c7, pid=24638, tid=139929327531776
#
# JRE version: OpenJDK Runtime Environment (7.0_65-b32) (build 1.7.0_65-b32)
# Java VM: OpenJDK 64-Bit Server VM (24.65-b04 mixed mode linux-amd64 compressed oops)
# Derivative: IcedTea 2.5.2
# Distribution: Ubuntu 14.04 LTS, package 7u65-2.5.2-3~14.04
# Problematic frame:
# C [libcrypto.so.1.0.0+0xea3c7] EVP_MD_CTX_cleanup+0xd7
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /tmp/hs_err_pid24638.log
#
# If you would like to submit a bug report, please include
# instructions on how to reproduce the bug and visit:
# http://icedtea.classpath.org/bugzilla
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

Gak, chk->snd_count:30 >= max:30 - send abort

/usr/share/jitsi-videobridge/jvb.sh: line 32: 24638 Aborted (core dumped) LD_LIBRARY_PATH=$libs java -Xmx$VIDEOBRIDGE_MAX_MEMORY $VIDEOBRIDGE_DEBUG_OPTIONS -XX:-HeapDumpOnOutOfMemoryError -Djava.library.path=$libs -Djava.util.logging.config.file=$logging_config -cp $cp $mainClass $@

Regards,
Zalmoxisus


#9

Maybe this is not an issue at all, but just in case.

OpenSSL is not thread-safe by default
(http://wiki.openssl.org/index.php/Libcrypto_API#Thread_Safety)
Could it be that threading issues arise when sufficiently many users are
active?

Just a quick thought ..

Danny


#10

That's interesting!

Our native OpenSSL HMAC_CTXs are each tied to a BaseSRTPCryptoContext, and the authenticatePacketHMAC() method is not thread-safe regardless of whether we use OpenSSL's HMAC or not [1]. I think this could explain the crash, if we somehow end up using a single BaseSRTPCryptoContext in more than one thread.

For different users we definitely use different contexts. Without bundle and without rtcp-mux we have a different DtlsPacketTransformer for each thread, so different contexts again. We always use different contexts for sending and receiving. DataChannels don't use SRTP, so they shouldn't be an issue. Ditto for the actual DTLS connect threads.

The only thing I see left is when bundle and rtcp-mux are enabled. In this case we have 4 threads ({audio, video} x {RTP, RTCP}) that share a DtlsTransformEngine. So far I don't see how we could end up using the same BaseSRTPCryptoContext in more than one thread, but it's complex code and I haven't looked very carefully.
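To illustrate the hazard being discussed, here is a minimal sketch, not the actual libjitsi code. It assumes a single stateful HMAC context that might be reached from several threads and guards the whole update/finalize sequence with a lock. The class name `GuardedHmac` and its method are hypothetical, and the JDK's `HmacSHA1` stands in for the OpenSSL-backed HMAC_CTX. The tag covers the packet followed by the 32-bit rollover counter, as SRTP does.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;

/** Hypothetical wrapper: one HMAC-SHA1 context that is safe to share across threads. */
final class GuardedHmac {
    // A Mac instance is stateful, so it must never be used by two threads at once.
    private final Mac mac;

    GuardedHmac(byte[] key) throws GeneralSecurityException {
        mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(key, "HmacSHA1"));
    }

    /**
     * Computes the tag over packet[off, off + len) followed by the
     * big-endian rollover counter. The method is synchronized so that the
     * update/update/doFinal sequence cannot interleave with another caller.
     */
    synchronized byte[] authenticate(byte[] packet, int off, int len, int roc) {
        mac.update(packet, off, len);
        mac.update(new byte[] {
            (byte) (roc >>> 24), (byte) (roc >>> 16), (byte) (roc >>> 8), (byte) roc
        });
        return mac.doFinal(); // doFinal() also resets the Mac for the next packet
    }
}
```

Without the `synchronized` keyword, two threads calling `authenticate` on the same instance could mix their updates and produce wrong tags. With a native context behind JNI, the same interleaving could corrupt native memory instead, which would be consistent with a SIGSEGV rather than a Java exception.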

Michael, if you remove libjnopenssl.so, the bridge will fall back to using the BouncyCastle HMAC implementation. If the problem is OpenSSL-specific, this should resolve it. If the problem is something like the speculation above, this should still help, because you will have single Java threads dying instead of the whole JVM. It might also give us some useful information.
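The fallback described above might look roughly like the following. This is an illustrative sketch under stated assumptions, not the libjitsi loader: the `HmacBackend` class and its methods are hypothetical, and the JDK's built-in `HmacSHA1` stands in for the BouncyCastle implementation.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;

/** Hypothetical backend selector: prefer a JNI library, else use pure Java. */
final class HmacBackend {
    /** Returns true only if the named JNI library can be loaded. */
    static boolean nativeAvailable(String libName) {
        try {
            // On Linux, "jnopenssl" would resolve to libjnopenssl.so.
            System.loadLibrary(libName);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            // Library removed or not loadable: the caller uses the Java path.
            return false;
        }
    }

    /** Names the backend that would be used; the native path is a placeholder. */
    static String choose(String libName) {
        return nativeAvailable(libName) ? "native" : "java";
    }

    /** Pure-Java HMAC-SHA1, available on every JVM. */
    static byte[] javaHmacSha1(byte[] key, byte[] data) throws GeneralSecurityException {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(key, "HmacSHA1"));
        return mac.doFinal(data);
    }
}
```

With the shared library deleted, `System.loadLibrary` throws `UnsatisfiedLinkError`, so `choose` returns `"java"` and all HMAC work stays inside the JVM, where a failure surfaces as an exception on one thread rather than a native crash.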

Regards,
Boris

[1] https://github.com/jitsi/libjitsi/blob/master/src/org/jitsi/impl/neomedia/transform/srtp/BaseSRTPCryptoContext.java#L261


#11

Boris, thank you for the advice. In addition to disabling RtcpMux in
jitsi-meet, I also commented out lines 582-586 of [1], and that
helped. For more than 24 hours I have had no errors (as I said, it
usually crashed every hour).

Regards,
Zalmoxisus

[1] https://github.com/jitsi/jitsi-meet/blob/master/modules/xmpp/strophe.emuc.js#L582


#12

I was seeing this pretty consistently (on the order of 8 or 9 out of 10
times) with our own Jitsi client that we've begun building. Disabling
OpenSSL and going to BouncyCastle got rid of the issue, so from what I've
seen it seems likely that it's OpenSSL-related (I have also seen
ArrayOutOfBounds issues with BouncyCastle, but they don't appear to be
catastrophic). I'm on a slightly older build (~Feb 16th) but could try
updating and/or a patch if you have something you want to try.

-brian


#13

Hello Brian,

I was seeing this pretty consistently (on the order of 8 or 9 out of 10
times) with our own Jitsi client that we've begun building. Disabling
OpenSSL and going to BouncyCastle got rid of the issue, so from what I've
seen it seems likely that it's OpenSSL-related (I have also seen
ArrayOutOfBounds issues with BouncyCastle, but they don't appear to be
catastrophic).

Could you please share these? They might be quite useful for pinpointing the problem.

Regards,
Boris

···

On 23/03/15 21:13, Brian Baldino wrote:

I'm on a slightly older build (~Feb 16th) but could try
updating and/or a patch if you have something you want to try.

-brian

On Thu, Mar 12, 2015 at 3:05 PM, Michael Diordiev <zalmoxisus@gmail.com > <mailto:zalmoxisus@gmail.com>> wrote:

    Boris, thank you for the advance. In addition to disabling RtcpMux in
    jitsi-meet, I also commented the lines 582-586 from [1], and that
    helped. For more than 24 hours I got no errors (as I said, usually it
    crashed each hour).

    Regards,
    Zalmoxisus

    [1]
    https://github.com/jitsi/jitsi-meet/blob/master/modules/xmpp/strophe.emuc.js#L582

    On Thu, Mar 12, 2015 at 11:35 PM, Boris Grozev <boris@jitsi.org > <mailto:boris@jitsi.org>> wrote:
     > On 12/03/15 20:20, Danny van Heumen wrote:
     >>
     >> Maybe this is not an issue at all, but just in case.
     >>
     >> OpenSSL is not thread-safe by default
     >> (http://wiki.openssl.org/index.php/Libcrypto_API#Thread_Safety)
     >> Could it be that threading issues arise when sufficiently many
    users are
     >> active?
     >>
     >> Just a quick thought ..
     >
     > That's interesting!
     >
     > Our native OpenSSL HMAC_CTXs are each tied to a BaseSRTPCryptoContext.
     > And the authenticatePacketHMAC() method is not thread-safe regardless
     > of whether we use OpenSSL's HMAC or not[1]. I think this could explain
     > the crash, if we somehow end up using a single BaseSRTPCryptoContext in
     > more than one thread.
     >
     > For different users we definitely use different contexts. Without
     > bundle and without rtcp-mux we have a different DtlsPacketTransformer
     > for each thread, so different contexts again. We always use different
     > contexts for sending and receiving. DataChannels don't use SRTP, so
     > they shouldn't be an issue. Ditto for the actual DTLS connect threads.
     >
     > The only thing I see left is when bundle and rtcp-mux are enabled. In
     > this case we have 4 threads ({audio, video} x {RTP, RTCP}) that share a
     > DtlsTransformEngine. So far I don't see how we could end up using the
     > same BaseSRTPCryptoContext in more than one thread, but it's complex
     > code and I haven't looked very carefully.
     >
     > Michael, if you remove libjnopenssl.so, the bridge will fall back to
     > using the BouncyCastle HMAC implementation. If the problem is
     > OpenSSL-specific, this should resolve it. If the problem is something
     > like the speculation above, this should still help, because you will
     > have single Java threads dying instead of the whole JVM. It might also
     > give us some useful information.
     >
     > Regards,
     > Boris
     >
     > [1]
     >
     > https://github.com/jitsi/libjitsi/blob/master/src/org/jitsi/impl/neomedia/transform/srtp/BaseSRTPCryptoContext.java#L261
     >
     >>
     >> Danny
     >>
     >> On 11-03-15 14:16, Michael Diordiev wrote:
     >>>
     >>> I wasn't right. Just got this exception with useRtcpMux disabled:
     >>>
     >>> #
     >>> # A fatal error has been detected by the Java Runtime Environment:
     >>> #
     >>> # SIGSEGV (0xb) at pc=0x00007f43dbd103c7, pid=24638, tid=139929327531776
     >>> #
     >>> # JRE version: OpenJDK Runtime Environment (7.0_65-b32) (build 1.7.0_65-b32)
     >>> # Java VM: OpenJDK 64-Bit Server VM (24.65-b04 mixed mode linux-amd64 compressed oops)
     >>> # Derivative: IcedTea 2.5.2
     >>> # Distribution: Ubuntu 14.04 LTS, package 7u65-2.5.2-3~14.04
     >>> # Problematic frame:
     >>> # C [libcrypto.so.1.0.0+0xea3c7] EVP_MD_CTX_cleanup+0xd7
     >>> #
     >>> # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
     >>> #
     >>> # An error report file with more information is saved as:
     >>> # /tmp/hs_err_pid24638.log
     >>> #
     >>> # If you would like to submit a bug report, please include
     >>> # instructions on how to reproduce the bug and visit:
     >>> # http://icedtea.classpath.org/bugzilla
     >>> # The crash happened outside the Java Virtual Machine in native code.
     >>> # See problematic frame for where to report the bug.
     >>> #
     >>>
     >>> Gak, chk->snd_count:30 >= max:30 - send abort
     >>>
     >>> /usr/share/jitsi-videobridge/jvb.sh: line 32: 24638 Aborted (core dumped) LD_LIBRARY_PATH=$libs java -Xmx$VIDEOBRIDGE_MAX_MEMORY $VIDEOBRIDGE_DEBUG_OPTIONS -XX:-HeapDumpOnOutOfMemoryError -Djava.library.path=$libs -Djava.util.logging.config.file=$logging_config -cp $cp $mainClass $@
     >>>
     >>> Regards,
     >>> Zalmoxisus
     >>>
     >>> On Wed, Mar 11, 2015 at 3:06 PM, Michael Diordiev <zalmoxisus@gmail.com> wrote:
     >>>>
     >>>> Unfortunately, I do not see how to reproduce it. When we have many
     >>>> participants (more than 10), it usually occurs once an hour. If there
     >>>> are not so many participants, it can occur once a day.
     >>>>
     >>>> Maybe it is somehow affected when users reconnect using the same jid
     >>>> (we do not use anonymousdomain). Four months ago I reported another
     >>>> issue for jitsi videobridge with useRtcpMux enabled and
     >>>> non-anonymousdomain [1]. Paweldomas fixed it [2], but it still
     >>>> sometimes threw the exception, so I just gave up on using rtcp-mux.
     >>>> Last week I started using rtcp-mux again because it is required to
     >>>> support Firefox [3], and I got these errors.
     >>>>
     >>>> [1] http://markmail.org/message/j5css5w6tpghd5sp#query:+page:1+mid:vnor7mm3tmsjiklp+state:results
     >>>> [2] https://github.com/jitsi/jitsi-videobridge/commit/0edc4009bbbd1b3aeaa08136b08eb9f454e4b410
     >>>> [3] https://github.com/jitsi/jitsi-meet/commit/05bbfda5bb6b54ff78dec93ede3d5054ad49843a
     >>>>
     >>>> Regards,
     >>>> Zalmoxisus
     >>>>
     >>>> On Wed, Mar 11, 2015 at 2:36 PM, Boris Grozev <boris@jitsi.org> wrote:
     >>>>>
     >>>>> On 11/03/15 12:57, Michael Diordiev wrote:
     >>>>>>
     >>>>>> Update: In my case it occurs only when useRtcpMux is enabled.
     >>>>>
     >>>>> Does that mean that you have a way to reproduce it, or that you see it
     >>>>> relatively often? AFAIK we only saw this once (indeed with rtcpmux
     >>>>> enabled).
     >>>>>
     >>>>> Boris
     >>>>>


#14

Here's what I see in stdout:

hs_err_pid24793.log (101 KB)

···

#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x0000000000000000, pid=24793, tid=140230720898816
#
# JRE version: OpenJDK Runtime Environment (7.0_65-b32) (build 1.7.0_65-b32)
# Java VM: OpenJDK 64-Bit Server VM (24.65-b04 mixed mode linux-amd64 compressed oops)
# Derivative: IcedTea 2.5.3
# Distribution: Ubuntu 14.04 LTS, package 7u71-2.5.3-0ubuntu0.14.04.1
# Problematic frame:
# C 0x0000000000000000
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/maryam/jitsi/jitsi-videobridge/hs_err_pid24793.log
#
# If you would like to submit a bug report, please include
# instructions on how to reproduce the bug and visit:
# http://icedtea.classpath.org/bugzilla
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
./jvb.sh: line 32: 24793 Aborted (core dumped) LD_LIBRARY_PATH=$libs java -Xmx$VIDEOBRIDGE_MAX_MEMORY $VIDEOBRIDGE_DEBUG_OPTIONS -XX:-HeapDumpOnOutOfMemoryError -Djava.library.path=$libs -Djava.util.logging.config.file=$logging_config -cp $cp $mainClass $@

Attached the log. Core dump won't do you much good (I've got some local
changes) but I'll try and get it repro'd with a vanilla jitsi version.
Hopefully these are a good start.



#15

Maybe another data point:
When we saw these crashes, we were using a client that was sending h264.
After switching to bouncy castle, we no longer saw that crash but continued
to see ArrayOutOfBoundsException for video packets, which we're fairly
certain we've tracked down to packet size. We're doing some
experimenting with sending smaller packets with h264 to see if we can
verify that doing so eliminates the out of bounds issue (we'd also like to
try larger packets with VP8 but their packetization may make that
difficult). I also want to try a Firefox client connecting to jitsi using
h264 to see if that replicates the issue as well, but haven't gotten to
that yet (firefox nightly, which jitsi needs, doesn't seem to be very
stable).
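
For context on the switch to BouncyCastle mentioned above: earlier in this
thread it was suggested that removing libjnopenssl.so makes the bridge fall
back to a pure-Java HMAC. The following is only a minimal sketch of that kind
of fallback, not libjitsi's actual code. The class and method names are
hypothetical, and the JDK's own HmacSHA1 provider stands in for BouncyCastle
so that the example stays self-contained.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper: prefer a native HMAC, fall back to a pure-Java one. */
final class HmacFallbackSketch {
    /** Returns true if the named JNI library can be loaded, false otherwise. */
    static boolean nativeAvailable(String libName) {
        try {
            // On Linux, "jnopenssl" would be resolved to libjnopenssl.so.
            System.loadLibrary(libName);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false; // library removed or not loadable
        }
    }

    /** Pure-Java HMAC-SHA1 (JDK provider here, standing in for BouncyCastle). */
    static Mac pureJavaHmac(byte[] key) throws GeneralSecurityException {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(key, "HmacSHA1"));
        return mac;
    }

    /** Names the implementation that would be chosen for this library name. */
    static String chooseImplementation(String libName) {
        return nativeAvailable(libName) ? "native" : "pure-java";
    }
}
```

With the native library absent, chooseImplementation() reports "pure-java",
and a failure in that path surfaces as a Java exception in one thread rather
than as a SIGSEGV that takes down the whole JVM.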

I've pasted some of the array out of bounds stacks we've seen below.

java.lang.ArrayIndexOutOfBoundsException: 80
        at org.bouncycastle.crypto.digests.SHA1Digest.processWord(Unknown Source)
        at org.bouncycastle.crypto.digests.GeneralDigest.update(Unknown Source)
        at org.bouncycastle.crypto.macs.HMac.update(Unknown Source)
        at org.jitsi.impl.neomedia.transform.srtp.BaseSRTPCryptoContext.authenticatePacketHMAC(BaseSRTPCryptoContext.java:263)
        at org.jitsi.impl.neomedia.transform.srtp.SRTPCryptoContext.authenticatePacket(SRTPCryptoContext.java:246)
        at org.jitsi.impl.neomedia.transform.srtp.SRTPCryptoContext.reverseTransformPacket(SRTPCryptoContext.java:594)
        at org.jitsi.impl.neomedia.transform.srtp.SRTPTransformer.reverseTransform(SRTPTransformer.java:189)
        at org.jitsi.impl.neomedia.transform.dtls.DtlsPacketTransformer.reverseTransform(DtlsPacketTransformer.java:797)
        at org.jitsi.impl.neomedia.transform.SinglePacketTransformer.reverseTransform(SinglePacketTransformer.java:129)
        at org.jitsi.impl.neomedia.transform.TransformEngineChain$PacketTransformerChain.reverseTransform(TransformEngineChain.java:258)
        at org.jitsi.impl.neomedia.transform.TransformInputStream.createRawPacket(TransformInputStream.java:75)
        at org.jitsi.impl.neomedia.RTPConnectorInputStream.runInReceiveThread(RTPConnectorInputStream.java:830)
        at org.jitsi.impl.neomedia.RTPConnectorInputStream.access$000(RTPConnectorInputStream.java:33)
        at org.jitsi.impl.neomedia.RTPConnectorInputStream$3.run(RTPConnectorInputStream.java:630)
17:03:34.254 SEVERE: [85] org.jitsi.impl.neomedia.transform.SinglePacketTransformer.error() Failed to reverse-transform RawPacket(s)!
java.lang.ArrayIndexOutOfBoundsException: 80
        at org.bouncycastle.crypto.digests.SHA1Digest.processWord(Unknown Source)
        at org.bouncycastle.crypto.digests.GeneralDigest.update(Unknown Source)
        at org.bouncycastle.crypto.macs.HMac.update(Unknown Source)
        at org.jitsi.impl.neomedia.transform.srtp.BaseSRTPCryptoContext.authenticatePacketHMAC(BaseSRTPCryptoContext.java:263)
        at org.jitsi.impl.neomedia.transform.srtp.SRTPCryptoContext.authenticatePacket(SRTPCryptoContext.java:246)
        at org.jitsi.impl.neomedia.transform.srtp.SRTPCryptoContext.reverseTransformPacket(SRTPCryptoContext.java:594)
        at org.jitsi.impl.neomedia.transform.srtp.SRTPTransformer.reverseTransform(SRTPTransformer.java:189)
        at org.jitsi.impl.neomedia.transform.dtls.DtlsPacketTransformer.reverseTransform(DtlsPacketTransformer.java:797)
        at org.jitsi.impl.neomedia.transform.SinglePacketTransformer.reverseTransform(SinglePacketTransformer.java:129)
        at org.jitsi.impl.neomedia.transform.TransformEngineChain$PacketTransformerChain.reverseTransform(TransformEngineChain.java:258)
        at org.jitsi.impl.neomedia.transform.TransformInputStream.createRawPacket(TransformInputStream.java:75)
        at org.jitsi.impl.neomedia.RTPConnectorInputStream.runInReceiveThread(RTPConnectorInputStream.java:830)
        at org.jitsi.impl.neomedia.RTPConnectorInputStream.access$000(RTPConnectorInputStream.java:33)
        at org.jitsi.impl.neomedia.RTPConnectorInputStream$3.run(RTPConnectorInputStream.java:630)
17:03:34.254 SEVERE: [94] util.UtilActivator.uncaughtException().108 An uncaught exception occurred in thread=Thread[org.jitsi.impl.neomedia.RTPConnectorInputStream.receiveThread,6,main] and message was: 80
java.lang.ArrayIndexOutOfBoundsException: 80
        at org.bouncycastle.crypto.digests.SHA1Digest.processWord(Unknown Source)
        at org.bouncycastle.crypto.digests.GeneralDigest.update(Unknown Source)
        at org.bouncycastle.crypto.macs.HMac.update(Unknown Source)
        at org.jitsi.impl.neomedia.transform.srtp.BaseSRTPCryptoContext.authenticatePacketHMAC(BaseSRTPCryptoContext.java:263)
        at org.jitsi.impl.neomedia.transform.srtp.SRTPCryptoContext.authenticatePacket(SRTPCryptoContext.java:246)
        at org.jitsi.impl.neomedia.transform.srtp.SRTPCryptoContext.reverseTransformPacket(SRTPCryptoContext.java:594)
        at org.jitsi.impl.neomedia.transform.srtp.SRTPTransformer.reverseTransform(SRTPTransformer.java:189)
        at org.jitsi.impl.neomedia.transform.dtls.DtlsPacketTransformer.reverseTransform(DtlsPacketTransformer.java:797)
        at org.jitsi.impl.neomedia.transform.SinglePacketTransformer.reverseTransform(SinglePacketTransformer.java:129)
        at org.jitsi.impl.neomedia.transform.TransformEngineChain$PacketTransformerChain.reverseTransform(TransformEngineChain.java:258)
        at org.jitsi.impl.neomedia.transform.TransformInputStream.createRawPacket(TransformInputStream.java:75)
        at org.jitsi.impl.neomedia.RTPConnectorInputStream.runInReceiveThread(RTPConnectorInputStream.java:830)
        at org.jitsi.impl.neomedia.RTPConnectorInputStream.access$000(RTPConnectorInputStream.java:33)
        at org.jitsi.impl.neomedia.RTPConnectorInputStream$3.run(RTPConnectorInputStream.java:630)
17:03:34.255 SEVERE: [85] util.UtilActivator.uncaughtException().108 An uncaught exception occurred in thread=Thread[org.jitsi.impl.neomedia.RTPConnectorInputStream.receiveThread,6,main] and message was: 80
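
In the log above, threads [85] and [94] report the same exception within a
millisecond of each other. One hypothesis raised earlier in this thread is
that a single BaseSRTPCryptoContext could end up being used from more than
one thread. If that were the case, serializing access to the MAC would be one
way to test the idea. The sketch below is an assumption-laden illustration,
not libjitsi's implementation: GuardedHmacContext is a hypothetical class,
and the JDK's HmacSHA1 stands in for both the OpenSSL and the BouncyCastle
HMAC. Feeding the packet bytes followed by the 32-bit roll-over counter
follows the SRTP authentication scheme in RFC 3711.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical stand-in for a crypto context that owns one non-thread-safe MAC. */
final class GuardedHmacContext {
    private final Mac mac;

    GuardedHmacContext(byte[] key) throws GeneralSecurityException {
        mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(key, "HmacSHA1"));
    }

    /**
     * Computes HMAC-SHA1 over the packet bytes followed by the roll-over
     * counter. The method is synchronized so that two threads can never
     * interleave their update()/doFinal() calls on the shared Mac.
     */
    synchronized byte[] authenticate(byte[] packet, int offset, int length, int roc) {
        mac.update(packet, offset, length);
        mac.update(ByteBuffer.allocate(4).putInt(roc).array());
        return mac.doFinal(); // doFinal() also resets the Mac for the next packet
    }
}
```

If the exceptions disappeared with a guard like this in place, that would
point towards shared use of one context rather than towards packet size,
though it would not by itself show where the sharing happens.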

···

On Tue, Mar 24, 2015 at 11:11 AM, Brian Baldino <brian@highfive.com> wrote:

Here's what I see in stdout:'

#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x0000000000000000, pid=24793, tid=140230720898816
#
# JRE version: OpenJDK Runtime Environment (7.0_65-b32) (build
1.7.0_65-b32)
# Java VM: OpenJDK 64-Bit Server VM (24.65-b04 mixed mode linux-amd64
compressed oops)
# Derivative: IcedTea 2.5.3
# Distribution: Ubuntu 14.04 LTS, package 7u71-2.5.3-0ubuntu0.14.04.1
# Problematic frame:
# C 0x0000000000000000
#
# Failed to write core dump. Core dumps have been disabled. To enable core
dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/maryam/jitsi/jitsi-videobridge/hs_err_pid24793.log
#
# If you would like to submit a bug report, please include
# instructions on how to reproduce the bug and visit:
# http://icedtea.classpath.org/bugzilla
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
./jvb.sh: line 32: 24793 Aborted (core dumped)
LD_LIBRARY_PATH=$libs java -Xmx$VIDEOBRIDGE_MAX_MEMORY
$VIDEOBRIDGE_DEBUG_OPTIONS -XX:-HeapDumpOnOutOfMemoryError
-Djava.library.path=$libs -Djava.util.logging.config.file=$logging_config
-cp $cp $mainClass $@

Attached the log. Core dump won't do you much good (I've got some local
changes) but I'll try and get it repro'd with a vanilla jitsi version.
Hopefully these are a good start.

On Tue, Mar 24, 2015 at 1:18 AM, Boris Grozev <boris@jitsi.org> wrote:

Hello Brian,

On 23/03/15 21:13, Brian Baldino wrote:

I was seeing this pretty consistently (on the order of 8 or 9 out of 10
times) with our own Jitsi client that we've begun building. Disabling
OpenSSL and going to BouncyCastle got rid of the issue so from what I've
seen it seems likely that it's OpenSSL-related (have also seen
ArrayOutOfBounds issues with BouncyCastle, but don't appear to be
catastrophic).

Could you please share these? They might be quite useful for pinpointing
the problem.

Regards,
Boris

  I'm on a slightly older build (~Feb 16th) but could try

updating and/or a patch if you have something you want to try.

-brian

On Thu, Mar 12, 2015 at 3:05 PM, Michael Diordiev <zalmoxisus@gmail.com >>> <mailto:zalmoxisus@gmail.com>> wrote:

    Boris, thank you for the advance. In addition to disabling RtcpMux in
    jitsi-meet, I also commented the lines 582-586 from [1], and that
    helped. For more than 24 hours I got no errors (as I said, usually it
    crashed each hour).

    Regards,
    Zalmoxisus

    [1]
    https://github.com/jitsi/jitsi-meet/blob/master/
modules/xmpp/strophe.emuc.js#L582

    On Thu, Mar 12, 2015 at 11:35 PM, Boris Grozev <boris@jitsi.org >>> <mailto:boris@jitsi.org>> wrote:
     > On 12/03/15 20:20, Danny van Heumen wrote:
     >>
     >> Maybe this is not an issue at all, but just in case.
     >>
     >> OpenSSL is not thread-safe by default
     >> (http://wiki.openssl.org/index.php/Libcrypto_API#Thread_Safety)
     >> Could it be that threading issues arise when sufficiently many
    users are
     >> active?
     >>
     >> Just a quick thought ..
     >
     >
     > That's interesting!
     >
     > Our native OpenSSL HMAC_CTXs are each tied to a
    BaseSRTPCryptoContext. And
     > the authenticatePacketHMAC() method is not thread safe regardless
    of whether
     > we use OpenSSL's HMAC or not[1]. I think this could explain the
    crash, if we
     > somehow end-up using a single BaseSRTPCryptoContext in more than
    one thread.
     >
     > For different users we definitely use different contexts. Without
    bundle and
     > without rtcp-mux we have a different DtlsPacketTransformer for
    each thread,
     > so different contexts again. We always use different contexts for
    sending
     > and receiving. DataChannels don't use SRTP, so they shouldn't be
    an issue.
     > Ditto for the actual DTLS connect threads.
     >
     > The only thing I see left is when bundle and rtcp-mux are
    enabled. In this
     > case we have 4 threads ({audio, video} x {RTP, RTCP}) that share a
     > DtlsTransforEngine. So far I don't see how we could end up using
    the same
     > BaseSRTPCryptoContext in more than one thread, but it's complex
    code and I
     > haven't looked very carefully.
     >
     >
     >
     > Michael, if you remove libjnopenssl.so, the bridge will fallback
    to using
     > the BouncyCastle HMAC implementation. If the problem is
    OpenSSL-specific,
     > this should resolve it. If the problem is something like the
    speculation
     > above, this should still help, because you will have single java
    threads
     > dieing instead of the whole jvm. It might also give us some useful
     > information.
     >
     >
     >
     > Regards,
     > Boris
     >
     >
     > [1]
     >
    https://github.com/jitsi/libjitsi/blob/master/src/org/
jitsi/impl/neomedia/transform/srtp/BaseSRTPCryptoContext.java#L261
     >
     >>
     >> Danny
     >>
     >>
     >>
     >> On 11-03-15 14:16, Michael Diordiev wrote:
     >>>
     >>> I wasn't right. Just got this exception with useRtcpMux
disabled:
     >>>
     >>>
     >>> #
     >>>
     >>> # A fatal error has been detected by the Java Runtime
Environment:
     >>>
     >>> #
     >>>
     >>> # SIGSEGV (0xb) at pc=0x00007f43dbd103c7, pid=24638,
    tid=139929327531776
     >>>
     >>> #
     >>>
     >>> # JRE version: OpenJDK Runtime Environment (7.0_65-b32) (build
     >>> 1.7.0_65-b32)
     >>>
     >>> # Java VM: OpenJDK 64-Bit Server VM (24.65-b04 mixed mode
    linux-amd64
     >>> compressed oops)
     >>>
     >>> # Derivative: IcedTea 2.5.2
     >>>
     >>> # Distribution: Ubuntu 14.04 LTS, package 7u65-2.5.2-3~14.04
     >>>
     >>> # Problematic frame:
     >>>
     >>> # C [libcrypto.so.1.0.0+0xea3c7] EVP_MD_CTX_cleanup+0xd7
     >>>
     >>> #
     >>>
     >>> # Failed to write core dump. Core dumps have been disabled. To
    enable
     >>> core dumping, try "ulimit -c unlimited" before starting Java
again
     >>>
     >>> #
     >>>
     >>> # An error report file with more information is saved as:
     >>>
     >>> # /tmp/hs_err_pid24638.log
     >>>
     >>> #
     >>>
     >>> # If you would like to submit a bug report, please include
     >>>
     >>> # instructions on how to reproduce the bug and visit:
     >>>
     >>> # http://icedtea.classpath.org/bugzilla
     >>>
     >>> # The crash happened outside the Java Virtual Machine in native
    code.
     >>>
     >>> # See problematic frame for where to report the bug.
     >>>
     >>> #
     >>>
     >>> Gak, chk->snd_count:30 >= max:30 - send abort
     >>>
     >>> /usr/share/jitsi-videobridge/jvb.sh: line 32: 24638 Aborted
     >>> (core dumped) LD_LIBRARY_PATH=$libs java
     >>> -Xmx$VIDEOBRIDGE_MAX_MEMORY $VIDEOBRIDGE_DEBUG_OPTIONS
     >>> -XX:-HeapDumpOnOutOfMemoryError -Djava.library.path=$libs
     >>> -Djava.util.logging.config.file=$logging_config -cp $cp
    $mainClass $@
     >>>
     >>> Regards,
     >>> Zalmoxisus
     >>>
     >>> On Wed, Mar 11, 2015 at 3:06 PM, Michael Diordiev <zalmoxisus@gmail.com> wrote:
     >>>>
     >>>> Unfortunately, I do not see how to reproduce it. When we have many
     >>>> participants (more than 10), it usually occurs once an hour. If there
     >>>> are not so many participants, it can occur once a day.
     >>>>
     >>>> Maybe it is somehow affected when users reconnect using the same jid
     >>>> (we do not use anonymousdomain). Four months ago I reported another
     >>>> issue for jitsi videobridge with useRtcpMux enabled and
     >>>> non-anonymousdomain [1]. Paweldomas fixed it [2], but it still
     >>>> sometimes threw the exception, so I gave up on using rtcpmux.
     >>>> Last week I started using rtcpmux again because it is required to
     >>>> support firefox [3], and I got these errors.
     >>>>
     >>>> [1] http://markmail.org/message/j5css5w6tpghd5sp#query:+page:1+mid:vnor7mm3tmsjiklp+state:results
     >>>> [2] https://github.com/jitsi/jitsi-videobridge/commit/0edc4009bbbd1b3aeaa08136b08eb9f454e4b410
     >>>> [3] https://github.com/jitsi/jitsi-meet/commit/05bbfda5bb6b54ff78dec93ede3d5054ad49843a
     >>>>
     >>>> Regards,
     >>>> Zalmoxisus
     >>>>
     >>>> On Wed, Mar 11, 2015 at 2:36 PM, Boris Grozev <boris@jitsi.org> wrote:
     >>>>>
     >>>>> On 11/03/15 12:57, Michael Diordiev wrote:
     >>>>>>
     >>>>>> Update: In my case it occurs only when useRtcpMux is enabled.
     >>>>>
     >>>>>
     >>>>> Does that mean that you have a way to reproduce it, or that you see it
     >>>>> relatively often? AFAIK we only saw this once (indeed with rtcpmux
     >>>>> enabled).
     >>>>>
     >>>>>
     >>>>> Boris


#16

Hello,

> Maybe another data point:
> When we saw these crashes, we were using a client that was sending
> h264. After switching to bouncy castle, we no longer saw that crash but
> continued to see ArrayOutOfBoundsException for video packets, which
> we're fairly certain we've tracked down due to packet size. We're doing
> some experimenting with sending smaller packets with h264 to see if we
> can verify that doing so eliminates the out of bounds issue (we'd also
> like to try larger packets with VP8 but their packetization may make
> that difficult). I also want to try a Firefox client connecting to
> jitsi using h264 to see if that replicates the issue as well, but
> haven't gotten to that yet (firefox nightly, which jitsi needs, doesn't
> seem to be very stable).
>
> I've pasted some of the array out of bounds stacks we've seen below.

Thanks, this is very useful!

In your setup, are you using bundle (e.g. COLIBRI channel-bundles)?

I think that this is a synchronization issue. I see (very rarely) one BaseSRTPCryptoContext being used by two different threads, and I think this explains both the OpenSSL and BC cases. But I don't yet know what triggers it. I don't think codec or packet size is a problem in this case.

There are two potential workarounds:
1. Disable bundle (from config.js in the case of Jitsi Meet). Note that this breaks TCP support.
2. Make BaseSRTPCryptoContext#authenticatePacketHMAC() synchronized

Of course once we figure out exactly what causes it, it will be fixed.
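
For illustration, workaround 2 could look roughly like the sketch below. This is not the libjitsi source: ToyCryptoContext, its constructor and the method parameters are hypothetical stand-ins for BaseSRTPCryptoContext, and the JDK's javax.crypto.Mac stands in for the OpenSSL or BouncyCastle HMAC. The only point is that the shared, stateful HMAC object is touched while holding the instance lock.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical stand-in for BaseSRTPCryptoContext; not the libjitsi class.
class ToyCryptoContext {
    // One stateful HMAC object shared by every caller of this context.
    private final Mac mac;

    ToyCryptoContext(byte[] authKey) {
        try {
            mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(authKey, "HmacSHA1"));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // synchronized: only one thread at a time may feed bytes into the
    // shared Mac, so the updates for two packets can never interleave.
    synchronized byte[] authenticatePacketHMAC(byte[] packet, int rocIn) {
        mac.update(packet);
        // Append the 32-bit rollover counter, most significant byte first.
        mac.update((byte) (rocIn >>> 24));
        mac.update((byte) (rocIn >>> 16));
        mac.update((byte) (rocIn >>> 8));
        mac.update((byte) rocIn);
        // doFinal() returns the tag and resets the Mac for the next packet.
        return mac.doFinal();
    }
}
```

The lock only serializes access. It does not explain why two threads reach the same context in the first place, which is the part that still needs to be tracked down.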

Boris

···

On 25/03/15 01:13, Brian Baldino wrote:


#17

Thanks for the response, Boris. I think we've got further proof that this
is what's going on here.

First of all, I agree the packet size seemed to be a red herring. But I
was able to build bouncycastle from source and add some prints, and can
verify that there are multiple threads accessing the same digest instance.
The eventual cause of the exception is here (in
bc-java-r1rv51/core/src/main/java/org/bouncycastle/crypto/digests/SHA1Digest.java):

protected void processWord(
        byte[] in,
        int inOff)
    {
        // Note: Inlined for performance
        // X[xOff] = Pack.bigEndianToInt(in, inOff);
        int n = in[ inOff] << 24;
        n |= (in[++inOff] & 0xff) << 16;
        n |= (in[++inOff] & 0xff) << 8;
        n |= (in[++inOff] & 0xff);
        if (xOff == 80) {
          System.out.println("processWord accessing array[" + xOff + "]");
        }
        X[xOff] = n;
        ++xOff;
        System.out.println("Thread " + Thread.currentThread().getId()
            + ", instance " + this.hashCode()
            + " processWord: (above processBlock) Setting xOff to " + xOff);
        System.out.println("xOff == 16? " + (xOff == 16));
        //if (++xOff == 16)
        if (xOff == 16)
        {
            System.out.println("Thread " + Thread.currentThread().getId()
                + ", instance " + this.hashCode() + " processing block");
            processBlock();
        }
    }

I can see (from my prints added above) that the same instance gets accessed
from multiple threads, which causes xOff to not be equal to 16 when it does
the check for processBlock, so it never ends up calling processBlock (which
resets it) and then just keeps increasing until it eventually hits array
out of bounds.
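
To make the interleaving concrete, here is a small deterministic model of it. It is only a sketch: ToyWordBuffer and InterleavingDemo are made-up names, not BouncyCastle code, and the two "threads" are simulated by calling the two halves of processWord in the problematic order by hand. The 80-entry array and the reset of xOff to 0 are assumed to mirror what SHA1Digest does.

```java
// Toy model of the word buffer in a SHA-1 style digest; NOT BouncyCastle code.
class ToyWordBuffer {
    final int[] X = new int[80];
    int xOff = 0;
    int blocksProcessed = 0;

    // First half of processWord: store the word and advance the offset.
    void storeWord(int n) {
        X[xOff] = n;
        ++xOff;
    }

    // Second half of processWord: flush only when exactly 16 words are buffered.
    void checkBlock() {
        if (xOff == 16) {
            blocksProcessed++;
            xOff = 0; // stands in for the reset done at the end of processBlock()
        }
    }
}

class InterleavingDemo {
    // Returns the value of xOff at the moment the out-of-bounds write happens.
    static int run() {
        ToyWordBuffer d = new ToyWordBuffer();
        // 15 words arrive from one thread: store and check run back to back.
        for (int i = 0; i < 15; i++) {
            d.storeWord(i);
            d.checkBlock();
        }
        // Two threads now interleave: both store before either one checks.
        d.storeWord(100); // thread A: xOff 15 -> 16
        d.storeWord(200); // thread B: xOff 16 -> 17
        d.checkBlock();   // thread A sees 17, not 16: no flush
        d.checkBlock();   // thread B sees 17, not 16: no flush
        // From here on even a single thread never sees xOff == 16 again.
        try {
            while (true) {
                d.storeWord(0);
                d.checkBlock();
            }
        } catch (ArrayIndexOutOfBoundsException e) {
            return d.xOff;
        }
    }

    public static void main(String[] args) {
        System.out.println("out of bounds at xOff = " + run());
    }
}
```

One missed check is enough: once xOff has stepped past 16 nothing resets it, so the failure shows up many words later and in whichever thread happens to be writing at that point.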

About to try now with making BaseSRTPCryptoContext#authenticatePacketHMAC
synchronized.

-brian

···

On Wed, Mar 25, 2015 at 3:03 AM, Boris Grozev <boris@jitsi.org> wrote:



#18

I am about to push a more proper fix (pretty much the same synchronization added, with additional changes to videobridge to prevent this accidental access from multiple threads). I just want to run a few quick tests first.

Boris

···

On 25/03/15 18:26, Brian Baldino wrote:



#19

Hey Boris,
Wanted to let you know that making authenticatePacketHMAC synchronized
seems so far to fix the issues with both openssl and bouncycastle.
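
For anyone who wants to check the same thing outside the bridge, a stress test along these lines may help. It is a sketch under assumptions: SharedMacStress and countMismatches are made-up names, and the JDK's javax.crypto.Mac is used in place of the libjitsi HMAC code. Several threads compute the tag of the same packet through one shared Mac and every result is compared with a reference tag computed on a single thread.

```java
import java.security.GeneralSecurityException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical test helper; not part of libjitsi or the videobridge.
class SharedMacStress {
    private final Mac mac;

    SharedMacStress(byte[] key) {
        try {
            mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(key, "HmacSHA1"));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // The lock under test: all updates for one packet happen atomically.
    synchronized byte[] tag(byte[] packet) {
        // Feed the packet byte by byte so each call makes many state updates.
        for (byte b : packet) {
            mac.update(b);
        }
        return mac.doFinal();
    }

    // Runs the given number of threads against one shared instance and
    // returns how many tags differed from the single-threaded reference.
    static int countMismatches(int threads, int packetsPerThread) {
        byte[] key = new byte[20];
        byte[] packet = new byte[1200];
        Arrays.fill(packet, (byte) 0x5a);
        byte[] expected = new SharedMacStress(key).tag(packet);
        SharedMacStress shared = new SharedMacStress(key);

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                results.add(pool.submit(() -> {
                    int bad = 0;
                    for (int i = 0; i < packetsPerThread; i++) {
                        if (!Arrays.equals(expected, shared.tag(packet))) {
                            bad++;
                        }
                    }
                    return bad;
                }));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With the synchronized keyword in place, countMismatches(4, 200) should report no mismatches. Removing it would be expected to produce wrong tags or exceptions sooner or later, though a race like this is not guaranteed to show up on every run.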

···

On Wed, Mar 25, 2015 at 10:33 AM, Boris Grozev <boris@jitsi.org> wrote:



#20

Thanks for confirming!

Videobridge build 486 (and the latest github master) contains a fix. Please let us know if you experience the problem with it.

Regards,
Boris

···

On 25/03/15 21:35, Brian Baldino wrote:
