[jitsi-dev] Deadlock between Jabber ProtocolProvider and ReconnectPlugin


#1

Hey

The Jabber plugin ran into a deadlock this morning after returning from standby (attachment: jabber_reconnectplugin_deadlock.tdump, 38.5 KB):

Found one Java-level deadlock:
=============================
"Thread-636":
  waiting to lock monitor 0x1856587c (object 0x093a4438, a net.java.sip.communicator.impl.protocol.jabber.ProtocolProviderServiceJabberImpl),
  which is held by "Reconnect timer"
"Reconnect timer":
  waiting to lock monitor 0x18047abc (object 0x0937dce0, a net.java.sip.communicator.plugin.reconnectplugin.ReconnectPluginActivator),
  which is held by "Thread-11"
"Thread-11":
  waiting to lock monitor 0x1857a97c (object 0x093afdb8, a java.lang.Object),
  which is held by "Reconnect timer"

Attached is the full thread dump that the excerpt above is taken from (a plain-text file from VisualVM). My first naive guess is that ProtocolProviderServiceJabberImpl.connectAndLogin (~line 378) doesn't need to be synchronized, since it uses its own locks internally anyway.
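
For anyone who wants to see the shape of this cycle in isolation, here is a minimal sketch. It is not Jitsi code: the three lock objects are hypothetical stand-ins for the monitors named in the dump, the class and helper names are invented, and only the thread names are copied from the excerpt. It uses the standard ThreadMXBean API to ask the JVM which threads it considers deadlocked.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.CountDownLatch;

final class DeadlockSketch {

    /** Builds the lock cycle from the dump and returns the thread names the JVM reports. */
    static Set<String> reproduce() {
        // Hypothetical stand-ins for the three monitors named in the dump.
        final Object provider = new Object();   // ProtocolProviderServiceJabberImpl
        final Object activator = new Object();  // ReconnectPluginActivator
        final Object plainLock = new Object();  // the java.lang.Object monitor
        final CountDownLatch firstLocksHeld = new CountDownLatch(2);

        // Holds provider and plainLock, then wants activator.
        start("Reconnect timer", () -> {
            synchronized (provider) {
                synchronized (plainLock) {
                    firstLocksHeld.countDown();
                    await(firstLocksHeld);
                    synchronized (activator) { /* never reached */ }
                }
            }
        });
        // Holds activator, then wants plainLock: this closes the cycle.
        start("Thread-11", () -> {
            synchronized (activator) {
                firstLocksHeld.countDown();
                await(firstLocksHeld);
                synchronized (plainLock) { /* never reached */ }
            }
        });
        // Not part of the cycle itself, but stuck behind it on the provider monitor.
        start("Thread-636", () -> {
            await(firstLocksHeld);
            synchronized (provider) { /* never reached */ }
        });

        // Poll for up to about five seconds until the JVM reports the deadlock.
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Set<String> names = new TreeSet<>();
        for (int i = 0; i < 100 && names.isEmpty(); i++) {
            long[] ids = mx.findMonitorDeadlockedThreads(); // null when no deadlock
            if (ids == null) {
                pause(50);
                continue;
            }
            for (ThreadInfo info : mx.getThreadInfo(ids)) {
                if (info != null) {
                    names.add(info.getThreadName());
                }
            }
        }
        return names;
    }

    private static void start(String name, Runnable body) {
        Thread t = new Thread(body, name);
        t.setDaemon(true); // let the JVM exit despite the stuck threads
        t.start();
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("Deadlocked threads: " + reproduce());
    }
}
```

The actual cycle is only between "Reconnect timer" and "Thread-11"; "Thread-636" is a bystander waiting on a monitor held by one of them. Whether the bystander shows up in the reported set may depend on the JVM, so the sketch makes no claim about that.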

Regards,
Ingo


#2

Hi Ingo,

Thanks for the report. There actually seem to be more problems here
than it first appears :) We discussed this offline with Lubomir,
because a while ago we were fixing another deadlock there. The problem
seems to be the synchronization on 'this' in the connectAndLogin method;
on top of that, all the event firing must happen outside the
synchronized block on initializationLock. That means a lot of changes
and a lot to think through, because the Jabber login process involves
some disconnects related to accepting or denying certificates.
Anyway, I'll be thinking about these changes in the coming days.
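
To make the "fire events outside the lock" idea concrete, here is a rough sketch under stated assumptions. It is not the real ProtocolProviderServiceJabberImpl: only the names connectAndLogin and initializationLock come from this thread, while the class, the listener type, and the state strings are invented for illustration. The point is just the pattern: change state while holding the lock, remember what to announce, and call listeners only after the lock is released.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

final class ProviderSketch {
    private final Object initializationLock = new Object();
    // Copy-on-write list: safe to iterate without holding any lock.
    private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();
    private String state = "UNREGISTERED"; // guarded by initializationLock

    void addListener(Consumer<String> listener) {
        listeners.add(listener);
    }

    /** Lets a listener check whether it is being called under the lock. */
    boolean holdsInitializationLock() {
        return Thread.holdsLock(initializationLock);
    }

    // Deliberately not 'synchronized': only the internal lock is used.
    void connectAndLogin() {
        String newState;
        synchronized (initializationLock) {
            state = "REGISTERED";   // mutate state under the lock
            newState = state;       // remember what to announce
        }
        // The lock is released here, so a listener that takes other locks
        // (a reconnect plugin, for example) cannot complete a cycle through
        // initializationLock.
        for (Consumer<String> listener : listeners) {
            listener.accept(newState);
        }
    }

    public static void main(String[] args) {
        ProviderSketch provider = new ProviderSketch();
        provider.addListener(s -> System.out.println(
            "state=" + s + ", lock held in callback=" + provider.holdsInitializationLock()));
        provider.connectAndLogin();
    }
}
```

The trade-off is that listeners may observe an event slightly after the state has changed again, so anything that needs the exact state at firing time has to carry it in the event itself, as newState does here.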

Thanks for reporting it and for the useful thread dump :)
damencho

On Wed, Mar 30, 2011 at 10:46 AM, Bauersachs Ingo <ingo.bauersachs@fhnw.ch> wrote:


#3

> Thanks for the report. There actually seem to be more problems here
> than it first appears :) We discussed this offline with Lubomir,
> because a while ago we were fixing another deadlock there. The problem
> seems to be the synchronization on 'this' in the connectAndLogin method;

That is what I meant when I said connectAndLogin doesn't need to be synchronized :)

> on top of that, all the event firing must happen outside the
> synchronized block on initializationLock. That means a lot of changes
> and a lot to think through, because the Jabber login process involves
> some disconnects related to accepting or denying certificates.

Yup. I didn't look very deeply, but the explanation makes sense.

> Anyway, I'll be thinking about these changes in the coming days.
>
> Thanks for reporting it and for the useful thread dump :)
> damencho

You may be able to reproduce it this way:
- Start Jitsi and connect
- Put the computer into standby
- Wake it up: Jitsi doesn't reconnect (but is not yet deadlocked)
  - The log file contains many lines like: impl.netaddr.NetworkAddressManagerServiceImpl.getLocalHost().154 Failed to get localhost
  - and, just before the connection dies:
    impl.protocol.jabber.OperationSetBasicTelephonyJabberImpl.registrationStateChanged().95 Jingle : ON
    impl.protocol.jabber.ProtocolProviderServiceJabberImpl.connectionClosedOnError().1440 connectionClosedOnError no more data available - expected end tag </stream:stream> to close start tag <stream:stream> from line 1, parser stopped on END_TAG seen ...</stream:features> ... @1:336
- From the GUI, set the global status to online (there was no activity on the Jabber protocol)
- Wait a few moments while it fails to connect
- Set the global status to offline to abort connecting -> deadlock

Regards and thanks,
Ingo