[jitsi-users] Jitsi-Meet Recovery during network outage


#1

I'm currently testing on my own infrastructure, and I'm curious what the
current methodology is for recovering a connection after a long network
outage.

I am testing by physically unplugging my network connection, or by
turning off my wireless adapter when using Wi-Fi.

Usually everything seems to recover well except for Strophe.

Strophe seems to break down intermittently, and subsequent http-bosh
requests after a network outage come back as:

<body xmlns:stream='http://etherx.jabber.org/streams' type='terminate'
condition='improper-addressing' xmlns='http://jabber.org/protocol/httpbind'>

I'm using chat to test sending XMPP chat messages.

Looking in Prosody's http-bosh module, this improper-addressing error seems
to be produced when there is no valid 'to' field.

However, the request does contain:
<message to='roomName@conference.my.video.bridge' type='groupchat'
xmlns='jabber:client'><body>Test</body></message>

which seems to have a valid 'to' field.
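
One thing that may explain the mismatch, stated here as an assumption rather
than something confirmed in this thread: a BOSH connection manager checks the
'to' attribute of the outer <body/> wrapper, not the 'to' of the <message/>
inside it, and a <body/> that carries no sid is treated as a request to create
a new session. A minimal sketch of that classification follows.
classifyBoshBody and BoshBodyAttrs are hypothetical names for illustration and
are not part of Strophe or Prosody.

```typescript
// Attributes of the outer BOSH <body/> wrapper, not of the stanzas inside it.
interface BoshBodyAttrs {
  sid?: string;
  rid?: string;
  to?: string;
}

type Diagnosis =
  | "in-session"
  | "session-create"
  | "session-create-without-to";

// Hypothetical helper: classifies a request the way a connection manager is
// ASSUMED to see it. An empty or missing sid counts as "no session", so the
// request is handled as session creation and needs 'to' on the <body/>.
function classifyBoshBody(attrs: BoshBodyAttrs): Diagnosis {
  if (attrs.sid) {
    return "in-session";
  }
  return attrs.to ? "session-create" : "session-create-without-to";
}
```

Under this assumption, a request sent after the client has lost its sid would
fall into the last category even though the <message/> it carries is addressed
correctly.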

I also no longer see subsequent http-bind attempts, or the normal presence
messages being sent, after a network loss.

Looking at Strophe, it appears that there should be a Connection Manager
that handles disconnecting/reconnecting and manages all of this.

My version of Meet is from July of this year, so it's possible I have
out-of-date connection recovery code, but I seem to remember it handling
this a lot better even a year or more ago.

I think it may have more to do with the fact that I'm using external-auth
and locking rooms using Prosody.

Any help or thoughts would be greatly appreciated.

--
- Jason Thomas
   http://jasonthom.as


#2

I've spent a lot more time looking into this now, and it seems like there
are multiple issues with the Strophe connection handler that cause it to
get into an unrecoverable state.

The main issue seems to be the strophe.ping.js module in XMPP, which should
only fail after 30+ seconds of ping failures but seems to call disconnect
prematurely, at least in my setup.

This leaves the rid/sid of the Strophe connection empty, and causes
connection re-establishment to fail.
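
For comparison, here is a minimal sketch of the intended behaviour, where a
disconnect is only triggered once ping failures have persisted for a grace
period. PingFailureTracker and its method names are hypothetical and are not
the strophe.ping.js API; the 30-second default simply mirrors the figure
mentioned above.

```typescript
// Hypothetical tracker: reports that a disconnect is warranted only after
// ping failures have continued for at least graceMs without any success.
class PingFailureTracker {
  private firstFailureAt: number | null = null;
  private readonly graceMs: number;

  constructor(graceMs: number = 30000) {
    this.graceMs = graceMs;
  }

  // A successful ping ends the current failure streak.
  recordSuccess(): void {
    this.firstFailureAt = null;
  }

  // Returns true once the current failure streak is at least graceMs old.
  // The caller passes the current time so the logic stays easy to test.
  recordFailure(nowMs: number): boolean {
    if (this.firstFailureAt === null) {
      this.firstFailureAt = nowMs;
    }
    return nowMs - this.firstFailureAt >= this.graceMs;
  }
}
```

With a tracker like this, a single failed ping right after the network drops
would not be enough to tear down the connection and discard its rid/sid.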

The other seems to be in Strophe itself: if there is a pending XHR request
at a particular time during the disconnect, it fires an exception in
req.xhr.send that seems to break the request chain and cause future
requests not to be fired.
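
One possible workaround, sketched under the assumption that the breakage comes
from an uncaught exception thrown by a send call: wrap each send so that a
failure is reported but does not stop later requests from being processed.
drainQueue and SendFn are hypothetical names, and this is not Strophe's
internal request handling.

```typescript
// A queued request, reduced to "a function that sends it".
type SendFn = () => void;

// Hypothetical helper: runs every queued send in order. An exception from one
// send is passed to onError instead of aborting the rest of the queue.
// Returns the number of sends that completed without throwing.
function drainQueue(queue: SendFn[], onError: (err: unknown) => void): number {
  let sent = 0;
  while (queue.length > 0) {
    const send = queue.shift()!;
    try {
      send();
      sent++;
    } catch (err) {
      onError(err);
    }
  }
  return sent;
}
```

Whether a failed request should be retried or dropped is a separate decision;
the sketch only shows the chain surviving the exception.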

I'm trying to figure out workarounds, or maybe modifications to Strophe to
fix these issues.

Any help would be greatly appreciated.

On Mon, Nov 14, 2016 at 11:18 AM, Jason Thomas <mail@jasonthom.as> wrote:

--
- Jason Thomas
   http://jasonthom.as


#3

Hi,

There were several changes over the past few months in the way we handle
disconnects, so you had better update to the latest jitsi-meet and
lib-jitsi-meet and test with that.

Regards
damencho

On Tue, Nov 15, 2016 at 5:03 PM, Jason Thomas <mail@jasonthom.as> wrote:

_______________________________________________
users mailing list
users@jitsi.org
Unsubscribe instructions and other list options:
http://lists.jitsi.org/mailman/listinfo/users