[jitsi-dev] jitsi videobridge


#1

Hello Mircea,

[snip]
RECV:
<iq id="1IR3D-62" to="jitsi-videobridge.ezuce.ro" type="get" from="200@ezuce.ro/jitsi-3ivre23">
<conference xmlns="http://jitsi.org/protocol/colibri" id="f61ddeb9070eee43">
<content name="video">
<channel>
<payload-type id="105" name="H264" clockrate="90000">
<parameter name="profile-level-id" value="4DE01f"/>
<parameter name="packetization-mode" value="1"/>
<parameter name="imageattr" value="send * recv [x=[0-1366],y=[0-768]]"/>
</payload-type>
<payload-type id="99" name="H264" clockrate="90000">
<parameter name="profile-level-id" value="4DE01f"/>
<parameter name="imageattr" value="send * recv [x=[0-1366],y=[0-768]]"/>
</payload-type></channel>
<channel>
<payload-type id="105" name="H264" clockrate="90000">
<parameter name="profile-level-id" value="4DE01f"/>
<parameter name="packetization-mode" value="1"/>
<parameter name="imageattr" value="send * recv [x=[0-1366],y=[0-768]]"/>
</payload-type>
<payload-type id="99" name="H264" clockrate="90000">
<parameter name="profile-level-id" value="4DE01f"/>
<parameter name="imageattr" value="send * recv [x=[0-1366],y=[0-768]]"/>
</payload-type>
</channel>
</content>
</conference>
</iq>

I assume that this stanza is constructed somewhere in Jabber protocol implementation in jitsi code, right?

Yes, see CallJabberImpl#createColibriChannels().

Based on this stanza, the videobridge tells client 200 where to send the stream: the hostname and the RTP and RTCP ports.
SENT:
<iq type="result" id="1IR3D-62" from="jitsi-videobridge.ezuce.ro" to="200@ezuce.ro/jitsi-3ivre23">
<conference xmlns="http://jitsi.org/protocol/colibri" id="f61ddeb9070eee43">
<content name="video"><channel id="e95cccaee985e2d3" host="192.168.7.100" rtpport="10036" rtcpport="10037" expire="60"/>
<channel id="d9700b2ae0e2a5e8" host="192.168.7.100" rtpport="10038" rtcpport="10039" expire="60"/>
</content>
</conference>
</iq>

Yep. The videobridge creates a conference with two channels in it. It will relay the media received on each channel to the other. One channel will be used by the client on 200, the other will be used by 201.
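
For anyone writing their own client, here is a minimal Python sketch of pulling the channel details out of a result stanza like the one above. The function name and the dictionary keys are made up for illustration; only the element and attribute names come from the stanza itself.

```python
import xml.etree.ElementTree as ET

COLIBRI_NS = "http://jitsi.org/protocol/colibri"

# The result stanza from the exchange above, copied as-is.
RESULT_IQ = """\
<iq type="result" id="1IR3D-62" from="jitsi-videobridge.ezuce.ro" to="200@ezuce.ro/jitsi-3ivre23">
<conference xmlns="http://jitsi.org/protocol/colibri" id="f61ddeb9070eee43">
<content name="video"><channel id="e95cccaee985e2d3" host="192.168.7.100" rtpport="10036" rtcpport="10037" expire="60"/>
<channel id="d9700b2ae0e2a5e8" host="192.168.7.100" rtpport="10038" rtcpport="10039" expire="60"/>
</content>
</conference>
</iq>
"""


def parse_channels(iq_xml):
    """Return (conference_id, list of channel dicts) from a Colibri result IQ."""
    root = ET.fromstring(iq_xml)
    conference = root.find(f"{{{COLIBRI_NS}}}conference")
    channels = []
    for content in conference.findall(f"{{{COLIBRI_NS}}}content"):
        for ch in content.findall(f"{{{COLIBRI_NS}}}channel"):
            channels.append({
                "content": content.get("name"),
                "id": ch.get("id"),
                "host": ch.get("host"),
                "rtp_port": int(ch.get("rtpport")),
                "rtcp_port": int(ch.get("rtcpport")),
            })
    return conference.get("id"), channels


conference_id, channels = parse_channels(RESULT_IQ)
for ch in channels:
    print(ch["id"], ch["host"], ch["rtp_port"], ch["rtcp_port"])
```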

SENT:
<iq type="set" id="PYBjr-12" from="jitsi-videobridge.ezuce.ro" to="200@ezuce.ro/jitsi-3ivre23">
<conference xmlns="http://jitsi.org/protocol/colibri" id="f61ddeb9070eee43">
<content name="video">
<channel id="e95cccaee985e2d3" host="192.168.7.100" rtpport="10036" rtcpport="10037" expire="60">
<ssrc>-2087326191</ssrc>
</channel>
</content>
</conference>
</iq>

Here the videobridge notifies the organizer of the conference (200) that it has received media with a new SSRC on one of the conference's channels. Jitsi will use this information to match the streams it receives to the participants in the conference ("aha! There is media with SSRC 1234 on channel C1, and this is the channel I gave to 201, so SSRC 1234 comes from 201"). It needs this, because it will receive multiple streams from the videobridge multiplexed on a single port.
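
The bookkeeping described above can be sketched roughly as follows. This is not Jitsi's code; the class and method names are hypothetical, and which participant was given channel e95cccaee985e2d3 is assumed here purely for the example. The sketch also assumes that the negative SSRC in the stanza is a signed 32-bit rendering of the unsigned 32-bit SSRC carried in RTP.

```python
def to_unsigned_ssrc(value):
    """Map a signed 32-bit SSRC (as in the stanza) to its unsigned form."""
    return value & 0xFFFFFFFF


class SsrcMap:
    """Hypothetical helper: tracks channel -> participant and SSRC -> participant."""

    def __init__(self):
        self.channel_owner = {}
        self.ssrc_owner = {}

    def assign_channel(self, channel_id, participant):
        # Called when the organizer hands a channel to a participant.
        self.channel_owner[channel_id] = participant

    def on_ssrc_notification(self, channel_id, ssrc):
        # Called when the videobridge reports a new SSRC on a channel.
        participant = self.channel_owner.get(channel_id)
        if participant is not None:
            self.ssrc_owner[to_unsigned_ssrc(ssrc)] = participant
        return participant

    def owner_of(self, ssrc):
        return self.ssrc_owner.get(to_unsigned_ssrc(ssrc))


ssrc_map = SsrcMap()
ssrc_map.assign_channel("e95cccaee985e2d3", "201")  # assumed assignment
ssrc_map.on_ssrc_notification("e95cccaee985e2d3", -2087326191)
print(ssrc_map.owner_of(-2087326191))
```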

Looking at the videobridge code, I realized that all the mechanics reside in Channel.java, where a RECVONLY stream is created based on the above exchange.

So far so good, but I wasn't able to figure out how the Jitsi videobridge broadcasts the stream to the other user (201 in my case).
I noticed a Jingle Nodes session being established between users 200 and 201 when the call is initiated, but I can't figure out how it fits into the overall videobridge picture. With the videobridge in place, I don't think we should need any Jingle Nodes support.

At the moment the videobridge uses latching -- it waits until it receives a packet on the socket for a given channel, and then associates the channel with the packet's source. See org.jitsi.videobridge.Channel#acceptDataInputStreamDatagramPacket()

To make this work, Jitsi clients always send some packets when they open a stream (even if they are only prepared to receive media). See TransportManager#sendHolePunchPacket()
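
As a rough illustration of latching, here is a toy Python model. It is not the videobridge's actual code: the class and method names are made up, and two behaviours are assumptions for the sake of the example, namely that packets from a source other than the latched one are dropped and that empty hole-punch packets are used only for latching and are not forwarded.

```python
class LatchingChannel:
    """Toy channel that remembers the source of the first packet it sees."""

    def __init__(self, channel_id):
        self.channel_id = channel_id
        self.remote = None  # (ip, port) once latched

    def accept(self, source):
        if self.remote is None:
            self.remote = source  # latch onto the first sender
            return True
        # Assumption: anything from a different source is ignored.
        return source == self.remote


class ToyConference:
    """Relays each accepted packet to every other latched channel."""

    def __init__(self, channel_ids):
        self.channels = {cid: LatchingChannel(cid) for cid in channel_ids}

    def on_packet(self, channel_id, source, payload):
        """Return the list of (destination, payload) pairs to send out."""
        channel = self.channels[channel_id]
        if not channel.accept(source):
            return []
        if not payload:
            return []  # assumption: empty packets only latch
        return [
            (other.remote, payload)
            for other in self.channels.values()
            if other is not channel and other.remote is not None
        ]


conf = ToyConference(["c1", "c2"])
conf.on_packet("c2", ("10.0.0.2", 5000), b"")            # receiver latches
print(conf.on_packet("c1", ("10.0.0.1", 4000), b"rtp"))  # relayed to c2's source
```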

When Jitsi creates a videobridge conference, it makes Jingle sessions with each participant (just as in a "regular" conference hosted by the client). The difference is that it substitutes the addresses obtained from the videobridge in the Jingle transport candidates, so that the media goes through the videobridge.

This is not strictly needed -- you can use the videobridge channels in other ways. You just need to communicate the IP addresses and ports to the other side and have it send some initial packets.

Regards,
Boris

···

On 5/20/13 11:59 AM, Mircea Carasel wrote:


#2

Hi Boris,

Thanks for the clarifications.

There are a few other things I would like to ask in order to fully
understand the workflow.
Is it possible to define a template, a specification of how the packet
exchange should be done in order to share the screen, for example?

I want to write my own routines to do that, and not reuse Jitsi
sessions.
Let's assume that we have 3 IM users (200, 201, 202) and we want to
accomplish the following scenario:

-user 200 is logged in IM (Openfire XMPP server for example)
-user 201 is logged in IM (Openfire XMPP server for example)
-user 202 is logged in IM

User 200 would like to share his screen with users 201 and 202, using the
Jitsi videobridge as an RTP relay layer.
What should the packet exchange look like?
Here is how I think it should happen; please correct me if I am
wrong:

1. User 200 sends an IQ packet to the Jitsi videobridge.
This packet should contain 3 channels. How should I construct the
payload-type and parameter elements? H264 is the video encoder and
clockrate=90000; can I use these values as is?
What is profile-level-id? I assume that the imageattr parameter
describes the screen properties.
2. The videobridge sends back to user 200 the host and port information
for where to send the stream, by assigning a channel to user 200.
3. The other users (201 and 202) will send packets to the videobridge.
What will these packets look like? Based on these packets, the videobridge
assigns the other two channels to users 201 and 202.
The part I don't understand is: shouldn't the videobridge communicate
back to clients 201 and 202 where to listen for media (IP and
ports)?

Thanks again,
Mircea

···

On Mon, May 20, 2013 at 2:02 PM, Boris Grozev <boris@jitsi.org> wrote:



#3

Hello and sorry for the long delay

Hi Boris,

Thanks for the clarifications.

There are a few other things I would like to ask in order to
fully understand the workflow.
Is it possible to define a template, a specification of how the
packet exchange should be done in order to share the screen, for example?

I want to write my own routines to do that, and not reuse Jitsi
sessions.
Let's assume that we have 3 IM users (200, 201, 202) and we want to
accomplish the following scenario:

-user 200 is logged in IM (Openfire XMPP server for example)
-user 201 is logged in IM (Openfire XMPP server for example)
-user 202 is logged in IM

User 200 would like to share his screen with users 201 and 202, using
the Jitsi videobridge as an RTP relay layer.
What should the packet exchange look like?
Here is how I think it should happen; please correct me if I am
wrong:

1. User 200 sends an IQ packet to the Jitsi videobridge.
This packet should contain 3 channels. How should I construct the
payload-type and parameter elements? H264 is the video encoder and
clockrate=90000; can I use these values as is?
What is profile-level-id? I assume that the imageattr parameter
describes the screen properties.

See http://xmpp.org/extensions/xep-0167.html#format for the meaning of these attributes.

Currently, for video, the videobridge only uses these fields to determine whether to change the RTP payload type field. I think that in a simple application you could ignore them. Please check back and refer to the upcoming specification for details and/or corrections.

2. The videobridge sends back to user 200 the host and port information
for where to send the stream, by assigning a channel to user 200.

It will assign a "conference" to 200. Then 200 (and only 200) will be able to add more channels to this conference, expire channels on it, etc.

It will assign 3 channels to the conference and will provide information about them (host and port numbers for each) to user 200. It is up to user 200 to decide what to do with them.
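
As a starting point, the conference payload asking for three channels might be assembled like this. It is only a sketch: the function name is made up, the single H264 payload type is copied from the request earlier in the thread, and leaving out the id attribute when no conference exists yet is an assumption to verify against the specification. The result still has to be wrapped in an IQ addressed to the videobridge by whatever XMPP library you use.

```python
import xml.etree.ElementTree as ET

COLIBRI_NS = "http://jitsi.org/protocol/colibri"


def build_conference_request(num_channels, conference_id=None):
    """Build a Colibri <conference/> payload requesting num_channels video channels."""
    # Writing xmlns as a plain attribute makes ElementTree emit a default namespace.
    conference = ET.Element("conference", {"xmlns": COLIBRI_NS})
    if conference_id is not None:
        # Assumption: the id is only sent when addressing an existing conference.
        conference.set("id", conference_id)
    content = ET.SubElement(conference, "content", {"name": "video"})
    for _ in range(num_channels):
        channel = ET.SubElement(content, "channel")
        # Payload type copied from the request shown earlier in the thread.
        ET.SubElement(
            channel,
            "payload-type",
            {"id": "105", "name": "H264", "clockrate": "90000"},
        )
    return ET.tostring(conference, encoding="unicode")


print(build_conference_request(3))
```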

3. The other users (201 and 202) will send packets to the videobridge.
What will these packets look like? Based on these packets, the
videobridge assigns the other two channels to users 201 and 202.
The part I don't understand is: shouldn't the videobridge
communicate back to clients 201 and 202 where to listen for
media (IP and ports)?

The videobridge will only communicate with a single entity, 200 in your case. 200 will have to send the allocated ports to 201 and 202 in some other way (Jitsi uses Jingle).
The packets from 201 and 202 to the videobridge can be just empty UDP datagrams.

So, after step 2 above something like this should happen:
a. 200 starts streaming media on one of the channels.
b. 200 communicates the host/ports of the other two channels to 201 and 202.
c. 201 sends an empty packet to the host/port it received from 200. On receipt of this packet the videobridge "locks in" and starts to relay 200's stream to 201.
d. Ditto with 202.

The way that 200 communicates with 201 and 202 is intentionally not specified.
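
Step c can be sketched in a few lines of Python. The function name is made up, and the host and port in the usage comment are just the second channel from the earlier exchange. The socket is returned rather than closed because the videobridge associates the channel with the packet's source, so the relayed media should arrive on that same socket.

```python
import socket


def send_latch_packet(host, port, local_port=0):
    """Send one empty UDP datagram and return the socket it was sent from."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", local_port))     # 0 lets the OS pick a free local port
    sock.sendto(b"", (host, port))  # empty datagram, as described above
    return sock


# Example usage with the RTP port of the second channel from the exchange above:
#   rtp_sock = send_latch_packet("192.168.7.100", 10038)
#   data, addr = rtp_sock.recvfrom(2048)  # relayed packets arrive here
```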

Regards,
Boris

···

On Tue May 21 12:10:48 2013, Mircea Carasel wrote:


#4

Thank you so much Boris, this is really helpful for me. I think now I
understand the workflow and I am good to go.
Thanks,
Mircea
