[jitsi-dev] Re: Race condition



We saw this too. Are you using a "0" conversation number?

We are thinking about the best way to improve the STUN filtering, but in the
meantime you should be able to get by with a conversation number that has
binary 10 in its two leftmost bits.
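
A minimal sketch of that workaround, assuming a 32-bit conversation number. Every STUN message starts with two zero bits, so a conversation number whose top bits are binary 10 cannot be mistaken for one on that basis. The class and method names below are hypothetical helpers, not part of ice4j.

```java
/** Hypothetical helper for illustration only; not an ice4j class. */
class ConversationNumbers {

    /** Forces the two most significant bits of a 32-bit conversation number to binary 10. */
    static long withNonStunPrefix(long conv) {
        // Keep the low 30 bits, then set bit 31 and leave bit 30 cleared.
        return (conv & 0x3FFFFFFFL) | 0x80000000L;
    }

    /** True if the top two bits of the 32-bit value are 00, as in every STUN message. */
    static boolean hasStunLikePrefix(long conv) {
        return ((conv >>> 30) & 0x3L) == 0L;
    }

    public static void main(String[] args) {
        long original = 0L; // the CONV=0 value seen in the log
        long adjusted = withNonStunPrefix(original);
        System.out.printf("0x%08X -> 0x%08X%n", original, adjusted);
        System.out.println("original looks like STUN: " + hasStunLikePrefix(original));
        System.out.println("adjusted looks like STUN: " + hasStunLikePrefix(adjusted));
    }
}
```

Both peers would have to agree on the adjusted value, since the conversation number identifies the PseudoTCP connection on each side.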

Emil

--sent from my mobile


On Aug 10, 2012 10:10 AM, "Carl Hasselskog" <carl@degoo.com> wrote:

Hi,
I've now tried using selectedPair.getLocalCandidate().getDatagramSocket()
to get the DatagramSocket and not calling iceAgent.free(). Unfortunately I
get timeouts in Accept and Connect in PseudoTCPSocket (see attached log).
To me it looks like the TCP_SYN_SENT is being directed to StunStack instead
of the PseudoTCPSocket. I base that on this part of the log:

10:02:57.697 Fin: org.ice4j.pseudotcp.PseudoTCPBase.Connect() State:
TCP_SYN_SENT
10:02:57.698 Finaste: org.ice4j.pseudotcp.PseudoTCPBase.queue() enqueued
send segment seq: 0 len: 4
10:02:57.700 Fin: org.ice4j.pseudotcp.PseudoTCPBase.attemptSend() [cwnd:
360 nWindow: 4 nInFlight: 0 nAvailable: 4 nQueued: 4 nEmpty: 92156
ssthresh: 61440]
10:02:57.700 Finaste: org.ice4j.pseudotcp.PseudoTCPBase.attemptSend()
TRANSMIT SEGMENT seq: 0 len: 4
10:02:57.700 Fin: org.ice4j.pseudotcp.PseudoTCPBase.packet() <--
<CONV=0><FLG=2><SEQ=0:4><ACK=0><WND=61440><SCALE=0><TS=261014052><TSR=0><LEN=4>
10:02:57.701 Finaste: org.ice4j.pseudotcp.PseudoTCPBase.TcpWritePacket()
write packet to network 28
10:02:57.701 Finaste: org.ice4j.stack.Connector.run() received datagram
10:02:57.705 Finaste: org.ice4j.pseudotcp.PseudoTCPBase.attemptSend()
nAvailable == 0: quit
10:02:57.705 Finaste: org.ice4j.stack.MessageQueue.add() Adding raw
message to queue.
10:02:57.706 Finaste: org.ice4j.stack.MessageProcessor.run() Dispatching a
StunMessageEvent.
10:02:57.706 Finaste: org.ice4j.stack.StunStack.handleMessageEvent()
Received a message on /2001:6b0:1:1041:d843:8fd0:ab52:d8f4:3031/udp of
type:0
10:02:57.706 Finaste: org.ice4j.stack.StunStack.handleMessageEvent()
parsing request
10:02:57.707 Finaste: org.ice4j.stack.StunStack.handleMessageEvent()
existing transaction not found
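
The log is consistent with the segment being classified by its leading bits. The sketch below is a simplified illustration, assuming that the conversation number occupies the first four bytes of each segment in network byte order (as the field order in the log suggests) and that the filter looks only at the two leading bits. The real ice4j filter may apply further checks; `LeadingBitsDemux` and its methods are hypothetical names.

```java
import java.nio.ByteBuffer;

/** Hypothetical demultiplexing check for illustration; not the ice4j filter. */
class LeadingBitsDemux {

    /** Treats a datagram as STUN-like when the two most significant bits of byte 0 are 00. */
    static boolean looksLikeStun(byte[] datagram) {
        return datagram.length > 0 && (datagram[0] & 0xC0) == 0;
    }

    /** Serializes a conversation number the way a segment header would begin (big-endian). */
    static byte[] headerStartingWith(int conv) {
        return ByteBuffer.allocate(4).putInt(conv).array();
    }

    public static void main(String[] args) {
        byte[] convZero = headerStartingWith(0);          // CONV=0, as in the log
        byte[] convHigh = headerStartingWith(0x80000000); // top bits set to binary 10
        System.out.println("CONV=0 routed to STUN stack: " + looksLikeStun(convZero));
        System.out.println("CONV=0x80000000 routed to STUN stack: " + looksLikeStun(convHigh));
    }
}
```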

Any idea on what might be causing this?

Regards
Carl

-----Original Message-----
From: Emil Ivov [mailto:emcho@jitsi.org]
Sent: 9 August 2012 18:01
To: dev@jitsi.java.net
Subject: [jitsi-dev] Re: Race condition

Hey Carl,

On Thu, Aug 9, 2012 at 12:58 PM, Carl Hasselskog <carl@degoo.com> wrote:
> Hi,
>
> I think I've run into a race condition when trying to create a
> PseudoTCP-stream. I've created XMPP-signalling and both the local and
> the remote Agent successfully enter both COMPLETED and TERMINATED
> state. The problem occurs when I try to create the PseudoTCP-streams
> after entering TERMINATED state. I've tried to create it the same way
> as it is done in IcePseudoTcp.

Unfortunately that example does not do it properly. Sorry about that.
Pawel is currently working on correcting it (and all the bugs he's
stumbling upon while doing so ;) ). The thing is that, in order to use
ice4j, one has to use the sockets that it creates. ice4j also creates a
master socket and a stun filtering socket. It then uses the stun socket so
that it can continue ICE processing while the application continues
exchanging data.

> I.e. by calling agent.free() and then creating new
> DatagramSocket-instances using the transport addresses in the
> candidate pair.

You should only call free() once you are done with your session and you
are not going to use the sockets any more.

I believe the above addresses all your comments below and your next
mail.

Could you please let me know if I missed something?

Thanks,
Emil

> When I try to do this it works sometimes and sometimes not, which
> leads me to believe that there's some concurrency issue.
>
> I haven't found the exact problem but my theory is that there's a race
> condition between the call to free() and the creation of the new
> DatagramSocket. Something like this:
>
> 1. The remote peer enters TERMINATED state and calls free(), creates a
> new DatagramSocket and then calls Connect on the new PseudoTcpSocket.
>
> 2. The local peer hasn't called free() yet and therefore receives the
> TCP_SYN_SENT from the remote peer on the DatagramSocket instance used
> by the connectivity checker.
>
> 3. The local peer enters TERMINATED state, closes the old
> DatagramSocket (through free), creates a new one and calls Accept on
> the new PseudoTcpSocket.
>
> 4. Both peers time out because the TCP_SYN_SENT has been lost.
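
The sequence above can be reproduced in miniature with plain `java.net` sockets, without ice4j. This is only a sketch of the suspected mechanism, under the assumption that the old socket is closed and a fresh one is bound to the same port: a datagram that arrives before the rebind is queued on the old socket and never reaches the new one. `RebindRaceDemo` is a hypothetical name and "SYN" is just a placeholder payload.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo of the close-then-rebind race; uses only JDK classes. */
class RebindRaceDemo {

    /** Returns true if a datagram sent before the rebind never reaches the new socket. */
    static boolean earlyDatagramIsLost() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try {
            // Stands in for the socket still owned by the connectivity checker.
            DatagramSocket oldSocket = new DatagramSocket(0, loopback);
            int port = oldSocket.getLocalPort();

            // The remote peer is already "connecting" and sends its first segment.
            try (DatagramSocket sender = new DatagramSocket(0, loopback)) {
                byte[] syn = "SYN".getBytes(StandardCharsets.US_ASCII);
                sender.send(new DatagramPacket(syn, syn.length, loopback, port));
            }

            // The local peer now closes the old socket and binds a new one on the same port.
            oldSocket.close();
            try (DatagramSocket newSocket = new DatagramSocket(port, loopback)) {
                newSocket.setSoTimeout(500);
                byte[] buf = new byte[16];
                newSocket.receive(new DatagramPacket(buf, buf.length));
                return false; // the datagram survived the rebind
            } catch (SocketTimeoutException expected) {
                return true; // the datagram was dropped with the old socket
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("early datagram lost: " + earlyDatagramIsLost());
    }
}
```

Keeping a single socket open for the whole session avoids this window entirely, which is why sleeping before the rebind only makes the failure less likely rather than removing it.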
>
>
>
> I've attached a log-file for one of the execution attempts that failed.
>
>
>
> When I sleep for 1s between the call to free() and the creation of the
> PseudoTcpSocket it works almost all the time, which strengthens the
> theory that there's a race condition. However, this feels like a very
> hackish way of solving it.
>
>
>
> I've also tried not calling free() and just trying to re-use
> selectedPair.getLocalCandidate().getDatagramSocket() but that doesn't
> work either. My theory is that PseudoTCPSocket then mixes the stun messages
> with the TCP_SYN_SENT somehow (I guess there must be a reason why
> IcePseudoTcp calls free() and then creates a new DatagramSocket).
>
>
>
> This is just a guess from my side and I might very well be wrong. My
> questions are:
>
> 1. Could my theory be correct or have I missed something fundamental
> about how ICE works?
>
> 2. How would you recommend solving this? I need to somehow
> synchronize the creation of the PseudoTcpSocket.
>
>
>
> Thanks in advance!
>
>
>
> Kind Regards
>
> Carl Hasselskog
>
> Degoo Backup AB
>
> carl@degoo.com
>
> Phone: +46 73 070 1821
>
> http://degoo.com
>
> http://twitter.com/#!/CarlHasselskog
>
>

--
Emil Ivov, Ph.D. 67000 Strasbourg,
Project Lead France
Jitsi
emcho@jitsi.org PHONE: +33.1.77.62.43.30
http://jitsi.org FAX: +33.1.77.62.47.31