[jitsi-dev] RTC over TCP - some thoughts, a heads up, and a proposal


#1

Hi all,

during the implementation and integration of SRTCP I stumbled over
some modifications that were checked in on July 20th and that enable
usage of RTP over TCP. In my not so humble opinion that is a bad idea
and is prone to fail or to generate serious problems.

UDP is a datagram transport:
- if the receiver reads data then it gets the whole datagram (data)
  in _one_ read - in the same way the sender sent it.
- it returns the correct length of the data that was sent by the
  sender in _one_ write operation.

With UDP you either receive all the data that the other party sent
in a datagram at once or you receive nothing; also, the order of
datagrams is not guaranteed in UDP.

In contrast to this, TCP is a stream and thus behaves differently:

Using TCP the client may send 200 bytes and then, some time later, another
200 bytes, and so on. Due to circumstances that the client cannot control
the TCP implementation may decide to send:
- 2 times 200 bytes (as one would expect)
- it may also combine the data of both write operations and send 400 bytes
  at once, and the receiver can then read 400 bytes in one go, not two times
  200 bytes
- it may send 270 and 130 bytes, etc.

How a TCP implementation sends data depends on the implementation of TCP,
available resources, etc. Such scenarios may happen, for example:

- if the time between the two write operations is short (may happen in
  Jitsi due to some JMF timing)
- some resources are not available to send the first bunch of data
  immediately, so the data is delayed, for example in WLAN environments
  where we have seen "bursts" of several UDP packets in a very short
  timeframe. The TCP implementation may decide to combine these and optimize
  the send stream
- many other circumstances where TCP may decide to send more or less data

Fact: reading from a TCP stream does not guarantee that you read the data in
the same way the sender sent it - in terms of packet/message size.
That's one big difference between a stream and a datagram transport.

Have a look at: http://en.wikipedia.org/wiki/User_Datagram_Protocol
and scroll down to the chapter "Comparison of UDP and TCP"
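
Just to illustrate the difference: with UDP one receive gives you the whole
datagram, whereas with TCP the receiver has to loop until it has collected
the number of bytes it expects. A minimal, untested sketch of such a read
loop in Java (the helper name is mine, not Jitsi code):

    import java.io.IOException;
    import java.io.InputStream;

    /**
     * Reads exactly <length> bytes from a TCP stream. A single read() may
     * return fewer bytes than the sender wrote in one write(), so we have
     * to loop until the expected amount has arrived.
     */
    static void readFully(InputStream in, byte[] buf, int length)
        throws IOException
    {
        int offset = 0;
        while (offset < length)
        {
            int n = in.read(buf, offset, length - offset);
            if (n < 0)
                throw new IOException("stream closed before " + length + " bytes were read");
            offset += n;
        }
    }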

== Implications for RTP and RTCP ==

(S)RTP packets do not carry a length field (RFC 3550, chap 5.1). RTP
determines the length implicitly because it uses UDP and UDP informs the
receiver about the length of the datagram packet. A streaming transport
does not support this feature because it was not designed for that.

(S)RTCP carries length fields: one for each part of a compound RTCP packet.
RTCP has no indication of how many parts a compound packet contains,
thus a TCP receiver cannot really decide when to stop reading. In SRTCP
only the very first length field is usable - all others are encrypted :-).
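
For illustration, this is roughly how one would walk the parts of a plain
(unencrypted) compound RTCP packet - and it only works when the total length
of the compound packet is already known, which is exactly what a UDP receive
reports and a TCP stream does not (untested sketch, names are mine):

    /**
     * Walks the parts of an unencrypted compound RTCP packet. The length
     * field of each part (RFC 3550, 6.4.1) counts 32-bit words minus one,
     * so the part size in bytes is (length + 1) * 4. Knowing where the
     * compound packet ends still requires the total length.
     */
    static void walkCompoundRtcp(byte[] buf, int totalLength)
    {
        int offset = 0;
        while (offset + 4 <= totalLength)
        {
            int packetType = buf[offset + 1] & 0xff;
            int lengthField = ((buf[offset + 2] & 0xff) << 8) | (buf[offset + 3] & 0xff);
            int partSize = (lengthField + 1) * 4;
            System.out.println("RTCP part: type=" + packetType + ", size=" + partSize);
            offset += partSize;
        }
    }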

RFC 3550 uses UDP as its transport; refer to chapter 2 and its subchapters.

== Proposal ==
I would ask to remove the TCP support for RTP to avoid problems that
may cause non-deterministic errors and wrong behaviour.

Best regards,
Werner


#2

Skype works over TCP, so let the user decide what to use. A user might be forced to use suboptimal settings in their environment. I don't see a reason to remove something that works... does it?



#3

Hi Emil,

thanks for the clarification. However, could somebody check the implementation
of RTPConnectorTCPInputStream?

I've checked RTPConnectorTCPOutputStream after the clarification - and I see
some code that performs framing, but no counterpart at the receiving end.

In RTPConnectorTCPInputStream I can only see one read and no check whether the
received length matches the expected length. What am I missing in the
implementation of receivePacket? The standard InputStream does not
perform such a check AFAIK - am I missing some "framing" here?

Best regards,
Werner

Code snippet from RTPConnectorTCPInputStream:

    /**
     * Receive packet.
     *
     * @param p packet for receiving
     * @throws IOException if something goes wrong during receiving
     */
    protected void receivePacket(DatagramPacket p)
        throws IOException
    {
        int len = -1;
        byte data[] = null;

        try
        {
            data = p.getData();
            InputStream stream = socket.getInputStream();

            //stream.skip(2);
            len = stream.read(data);
        }
        catch(Exception e)
        {
            logger.info("problem read: " + e);
        }

        if(len > 0)
        {
            p.setData(data);
            p.setLength(len);
            p.setAddress(socket.getInetAddress());
            p.setPort(socket.getPort());
        }
        else
        {
            throw new IOException("Failed to read on TCP socket");
        }
    }
}

On 11.10.2011 at 21:31, Emil Ivov wrote:

Hey Werner,

The immediate reason we introduced TCP support for RTP was mostly GTalk compatibility. Other than that however, while it is indeed suboptimal for VoIP, TCP can save the day in a number of situations where UDP would fail (the most common reason being administrative prohibition). Note that we only use TCP with ICE and only in cases where UDP has failed. Our tests have shown that when we do fall back to TCP, call quality tends to be quite acceptable.

As Aleksandar mentioned, Skype is another popular implementation of RTP over TCP.

The fragmentation issues that you talk about would indeed apply which is why the IETF defined a framing mechanism for RTP in RFC4571:

http://tools.ietf.org/html/rfc4571

RTP packets sent by Jitsi over TCP use such framing.

Hope this clears it out,

Emil

--sent from my mobile



#4

Hi Werner,

On 11/10/11 at 20:40, Werner Dittmann wrote:


Using TCP the client may send 200 bytes and then, sometimes later another
200 bytes, and so on. Due to cirumstances that the client cannot control
the TCP implementation may decide to send:
- 2 times 200 bytes (as one would expect)
- it may also combine the data of both write operations and send 400 bytes
   at once, and the receiver can read 400 bytes at once, not two times
   200 bytes
- it may send 270 and 130, etc.

Yes, it is true if the TCP Nagle algorithm is used. But Jitsi, as well as Google's libjingle, explicitly disables it (socket option TCP_NODELAY). So in your example, without Nagle, the stack will always send two times 200 bytes.

According to http://en.wikipedia.org/wiki/Nagle's_algorithm#Interactions_with_real-time_systems, it is better to disable the Nagle algorithm for real-time applications.
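
Disabling it is just a socket option; a minimal sketch, assuming a plain java.net.Socket (host, port and helper name are only placeholders):

    import java.io.IOException;
    import java.net.Socket;

    static Socket openMediaSocket(String host, int port)
        throws IOException
    {
        Socket socket = new Socket(host, port);
        // Disable the Nagle algorithm: small writes go out immediately instead
        // of being delayed in the hope of coalescing them with later data.
        socket.setTcpNoDelay(true);
        return socket;
    }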


--
Seb



#5

Hey

In RTPConnectorTCPInputStream I can only see one read and no check whether the
received length matches the expected length. What am I missing in the
implementation of receivePacket? The standard InputStream does not
perform such a check AFAIK - am I missing some "framing" here?

I don't see any framing either, and the commented-out stream.skip(2) shows that the length prefix of RFC 4571 was once ignored, but is now delivered as part of the packet. IMHO it should look something like this (completely untested!):

    protected void receivePacket(DatagramPacket p)
        throws IOException
    {
        byte data[] = null;

        try
        {
            data = p.getData();
            InputStream stream = socket.getInputStream();

            // RFC 4571 framing: a 16-bit big-endian length prefix per packet.
            int hi = stream.read();
            int lo = stream.read();
            if(hi < 0 || lo < 0)
                throw new IOException("end of stream while reading frame length");
            int packetLen = (hi << 8) | lo;

            // A single read may return fewer bytes than requested, so loop
            // until the whole framed packet has arrived.
            int offset = 0;
            while(offset < packetLen)
            {
                int len = stream.read(data, offset, packetLen - offset);
                if(len < 0)
                    throw new IOException("unable to read expected number of bytes");
                offset += len;
            }

            p.setData(data);
            p.setLength(packetLen);
            p.setAddress(socket.getInetAddress());
            p.setPort(socket.getPort());
        }
        catch(IOException e)
        {
            logger.info("problem read: " + e);
            throw e;
        }
    }
}

Ingo


#6

Hi Werner,

After a quick read of the source, it is ice4j's DelegatingSocket that effectively does the add/remove of the TCP framing. I cannot remember exactly why it is in ice4j rather than in Jitsi; I need to check more on this. Also, there is no additional verification of the received buffer length (mainly because Jitsi and Gmail/libjingle use TCP_NODELAY); we will add additional checks.

Regards,


--
Seb



#7

2011-10-12, 09:04(+02), Sebastien Vincent:
[...]

Yes, it is true if the TCP Nagle algorithm is used. But Jitsi, as well as
Google's libjingle, explicitly disables it (socket option TCP_NODELAY).
So in your example, without Nagle, the stack will always send two times
200 bytes.

[...]

TCP_NODELAY prevents the delay, not the accumulation of data into
one segment before sending (for instance in congestion cases, during
retransmissions, or when the peer's receive window is full).

See also SCTP for a streaming transport protocol that supports
framing and unordered delivery (and multiple streams within an
association). SCTP was designed for SIGTRAN (SS7 over IP), so I
suppose it could be well suited for parts of Jitsi as well (it
now comes with authentication mechanisms too). It would
probably not help in traversing firewalls though.

It still isn't well suited for RTP, though, because it implements
the same kind of congestion control and avoidance as TCP and
guarantees delivery (hence unwanted retransmissions, delays...).

Router QoS also tends to minimise delay for UDP traffic while
guaranteeing throughput for TCP, which makes UDP more appropriate
for VoIP.


--
Stephane


#8

Hi,

I checked the code, and if we do not remove the TCP framing header in ice4j, we will confuse the STUN filter (as the first bytes do not match the STUN pattern) and we will end up with ICE check timeouts.

I think protocols (other than RTP) that could be used in a session negotiated by ICE should use TCP framing too. What do you think? Or should we add the TCP framing back when we pass the data to the application?
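
For context, the demultiplexing the filter does is essentially the following check (a rough sketch per RFC 5389, not the actual ice4j code): a STUN message starts with two zero bits and carries the magic cookie 0x2112A442 in bytes 4-7, so a leftover 2-byte RFC 4571 prefix shifts the cookie and makes the test fail.

    /**
     * Sketch of the STUN check: the first two bits of a STUN message are
     * zero and bytes 4..7 hold the magic cookie 0x2112A442. With an RFC 4571
     * length prefix still in front of the data the cookie is shifted by two
     * bytes and the check fails.
     */
    static boolean looksLikeStun(byte[] buf, int offset, int length)
    {
        if(length < 20)
            return false; // shorter than a STUN header
        if((buf[offset] & 0xC0) != 0)
            return false; // first two bits must be zero
        return (buf[offset + 4] & 0xff) == 0x21
            && (buf[offset + 5] & 0xff) == 0x12
            && (buf[offset + 6] & 0xff) == 0xA4
            && (buf[offset + 7] & 0xff) == 0x42;
    }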

Regards,


--
Seb



#9

Hi Seb,

On 12.10.2011 at 11:09, Sebastien Vincent wrote:

Hi Werner,

After a quick read of the source, it is ice4j's DelegatingSocket that effectively does the add/remove of the TCP framing.

Well, if ice4j's DelegatingSocket adds framing then this would be done twice IMHO,
because RTPConnectorTCPOutputStream adds framing too.

Does ice4j's DelegatingSocket handle RTP only? If yes, then it may add framing;
but if this class is a generic socket class, then framing should be done by the
protocol handler, not at the socket level.

I cannot remember exactly why it is in ice4j rather than in Jitsi; I need to
check more on this. Also, there is no additional verification of the received
buffer length (mainly because Jitsi and Gmail/libjingle use TCP_NODELAY); we
will add additional checks.

Jitsi and some others may use TCP_NODELAY (but see the comments elsewhere in
this thread about its actual effect) - however, what happens if other clients
don't do this? I'm not a TCP expert, so I don't know whether this setting is
propagated to the other peer of the TCP connection.

Regards,
--
Seb

Best regards,
Werner

<SNIP --- SNAP>
...



#10

BTW it seems that TCP transport with Gmail is broken. I am currently looking at the problem.


--
Seb



#11

Hey

I think protocols (other than RTP) that could be used in a session
negotiated by ICE should use TCP framing too. What do you think? Or
should we add the TCP framing back when we pass the data to the application?

I know neither STUN nor ICE, but whatever we do, it should be symmetric in terms of what Jitsi's InputStream and OutputStream do. Currently the OutputStream adds the length prefix, but the InputStream doesn't care about it.
As far as I can see, the length prefix is currently added twice: once in Jitsi in RTPConnectorTCPOutputStream and a second time in ice4j. The removal, however, only happens in ice4j.

And the length-prefix should not just be ignored with .skip(2). It should be interpreted and then the application should read the exact number of bytes specified. If there are more bytes in the stream: fine, let another read take care of them. If there are less: wait until they arrive.

Ingo


#12

Hi Werner,

On 12/10/11 at 16:35, Werner Dittmann wrote:

Hi Seb,

Hi Werner,

After a quick read of the source, it is ice4j's DelegatingSocket that effectively does the add/remove of the TCP framing.

Well, if ice4j's DelegatingSocket adds framing then this would be done twice IMHO,
because RTPConnectorTCPOutputStream adds framing too.

As I said in the previous mail, it is ignored by ice4j, and I wonder why I kept the code in RTPConnector. I will probably remove it for the moment so it does not confuse anyone.

Does ice4j DelegatingSocket handle RTP only? if yes - then it may add framing,
but if this class is a generic socket class then framing should be done by the
protocol handler, not the socket level.

As I already said, libjingle and libnice also add this framing in their TCP socket classes. Again, for the moment, if we don't remove the framing in ice4j, no STUN packets will be parsed, because the STUN data will be prefixed by 2 bytes and will be classified as non-STUN by the filter. To be precise, it is ice4j that reads the data first, then filters it to handle STUN messages and forwards non-STUN messages to the application (in our case RTPConnectorTCPInputStream).



#13

Hi Ingo,

On 12/10/11 at 15:17, Bauersachs Ingo wrote:

Hey

I think protocols (other than RTP) that could be used in a session
negotiated by ICE should use TCP framing too. What do you think? Or
should we add the TCP framing back when we pass the data to the application?

I know neither STUN nor ICE, but whatever we do, it should be symmetric in terms of what Jitsi's InputStream and OutputStream do. Currently the OutputStream adds the length prefix, but the InputStream doesn't care about it.
As far as I can see, the length prefix is currently added twice: once in Jitsi in RTPConnectorTCPOutputStream and a second time in ice4j. The removal, however, only happens in ice4j.

Yes, but in fact the TCP framing added by RTPConnectorTCPOutputStream is ignored. I will remove that code from RTPConnector* to make this clear.

I just want to add that both libjingle's and libnice's TCP socket objects add/remove the framing header when they send/receive data.

And the length-prefix should not just be ignored with .skip(2). It should be interpreted and then the application should read the exact number of bytes specified. If there are more bytes in the stream: fine, let another read take care of them. If there are less: wait until they arrive.

OK.



#14

Hi again,

In the ice-tcp draft (http://www.ietf.org/id/draft-ietf-mmusic-ice-tcp-15.txt), section 3, we can find the following text:

"

    ICE requires an agent to demultiplex STUN and application layer
    traffic, since they appear on the same port. This demultiplexing is
    described in [RFC5245], and is done using the magic cookie and other
    fields of the message. Stream-oriented transports introduce another
    wrinkle, since they require a way to frame the connection so that the
    application and STUN packets can be extracted in order to determine
    which is which. For this reason, TCP media streams utilizing ICE use
    the basic framing provided in RFC 4571 [RFC4571], even if the
    application layer protocol is not RTP.

"

This confirms why libnice and libjingle add/remove the framing header at socket level (send()/recv()), and I think it is safe for us to move all handling of the TCP framing into ice4j and keep RTPConnector* out of it.
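
To make it concrete, the framing itself is trivial - a 16-bit big-endian length prefix in front of each packet, written on send and consumed on receive. A minimal, untested sketch (the helper names are mine, not the actual ice4j code):

    import java.io.DataInputStream;
    import java.io.DataOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;

    /** Sends one packet with the RFC 4571 two-byte length prefix. */
    static void sendFramed(OutputStream out, byte[] packet, int offset, int length)
        throws IOException
    {
        DataOutputStream dout = new DataOutputStream(out);
        dout.writeShort(length);            // 16-bit big-endian length prefix
        dout.write(packet, offset, length); // the packet itself
        dout.flush();
    }

    /** Reads one framed packet; blocks until the whole packet has arrived. */
    static byte[] receiveFramed(InputStream in)
        throws IOException
    {
        DataInputStream din = new DataInputStream(in);
        int length = din.readUnsignedShort(); // the RFC 4571 length prefix
        byte[] packet = new byte[length];
        din.readFully(packet);                // loops internally until done
        return packet;
    }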

What do you think?

Regards,


--
Seb
