We'd been noticing some A/V sync issues and had previously thought they
were due to a recently rolled-out Chrome bug
(<https://code.google.com/p/webrtc/issues/detail?id=4667>). After reading
through the discussion in that bug (mainly about how the stack relies on
the msid to match up audio and video streams for sync), I started looking
through the Jitsi Meet code and realized that, since it uses two separate
MediaStream objects, we don't get matching msids for free. So even with
the Chrome bug fixed, Chrome would not make any effort to sync things up.

Emil and I were talking about this and wondering whether it was the msids
that needed to match or whether matching cnames would be sufficient, so I
did some testing.

I added some logging in Call::ConfigureSync. This is where it pairs up
audio and video streams to configure sync, so a message there gave me a
way to tell whether or not sync was enabled when trying different
configurations.

Using the sdp-munge demo (here
<https://doubango.org/webrtc/samples/web/content/munge-sdp/>) I tried
manually changing the msids and cnames of the audio and video streams.
When I changed the audio stream's cname (so that it no longer matched the
video stream's) but left the msids alone, sync still got configured. When
I changed the msid but left the cname alone (so it still matched the video
stream's cname), sync did not get configured.
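
For anyone who wants to check their own SDP, here is a minimal Python
sketch that pulls the cname, msid and mslabel out of the a=ssrc lines of
each media section and compares audio against video. The function name,
the sample SDP, and the assumption of exactly one audio and one video
section carrying a=ssrc attributes are illustrative and not taken from the
demo.

```python
import re

def ssrc_attributes(sdp):
    """Return {media_kind: {attribute: value}} read from a=ssrc lines."""
    result = {}
    kind = None
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            # "m=audio 9 RTP/SAVPF 111" -> "audio"
            kind = line[2:].split(" ", 1)[0]
            result[kind] = {}
        elif kind is not None:
            match = re.match(r"a=ssrc:\d+ (cname|msid|mslabel):(.+)", line)
            if match:
                # Keep the value from the first SSRC seen in each section.
                result[kind].setdefault(match.group(1), match.group(2))
    return result

# Hypothetical SDP fragment: same cname, different stream ids.
EXAMPLE_SDP = """\
v=0
m=audio 9 RTP/SAVPF 111
a=ssrc:1111 cname:shared-cname
a=ssrc:1111 msid:audio-stream audio-track
a=ssrc:1111 mslabel:audio-stream
m=video 9 RTP/SAVPF 100
a=ssrc:2222 cname:shared-cname
a=ssrc:2222 msid:video-stream video-track
a=ssrc:2222 mslabel:video-stream
"""

attrs = ssrc_attributes(EXAMPLE_SDP)
same_cname = attrs["audio"]["cname"] == attrs["video"]["cname"]
same_mslabel = attrs["audio"]["mslabel"] == attrs["video"]["mslabel"]
print(same_cname, same_mslabel)  # prints: True False
```

In the sample, the cnames match but the stream ids do not, which is the
case where I did not see sync get configured.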

I then tweaked the sdp-munge demo so that it acquired the audio and video
as two separate MediaStreams (so the msids didn't match by default) and
manually munged the msids to match. With that change, I did see it
trigger the sync configuration logic in Chrome.
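
The munging step could look something like the Python sketch below. It is
only an illustration: align_audio_msid_to_video is a made-up helper, it
assumes one audio and one video section whose a=ssrc lines carry msid and
mslabel, and keeping the video stream id (rather than the audio one) is an
arbitrary choice. In the browser the same string rewrite would be done on
the SDP before it is applied.

```python
import re

def align_audio_msid_to_video(sdp):
    """Rewrite the audio section's stream id to match the video section's."""
    lines = sdp.splitlines()

    # Pass 1: find the stream id (mslabel) used by the video section.
    kind, video_stream = None, None
    for line in lines:
        if line.startswith("m="):
            kind = line[2:].split(" ", 1)[0]
        elif kind == "video" and video_stream is None:
            found = re.match(r"a=ssrc:\d+ mslabel:(\S+)", line)
            if found:
                video_stream = found.group(1)
    if video_stream is None:
        return sdp  # no video stream id to align against

    # Pass 2: replace the stream id on the audio msid and mslabel lines.
    def swap(match):
        return match.group(1) + video_stream

    kind, out = None, []
    for line in lines:
        if line.startswith("m="):
            kind = line[2:].split(" ", 1)[0]
        elif kind == "audio":
            line = re.sub(r"^(a=ssrc:\d+ mslabel:)\S+", swap, line)
            # The track id after the stream id on the msid line is kept.
            line = re.sub(r"^(a=ssrc:\d+ msid:)\S+", swap, line)
        out.append(line)
    return "\r\n".join(out) + "\r\n"

# Hypothetical SDP fragment with separate audio and video MediaStreams.
EXAMPLE_SDP = "\r\n".join([
    "v=0",
    "m=audio 9 RTP/SAVPF 111",
    "a=ssrc:1111 cname:shared-cname",
    "a=ssrc:1111 msid:audio-stream audio-track",
    "a=ssrc:1111 mslabel:audio-stream",
    "m=video 9 RTP/SAVPF 100",
    "a=ssrc:2222 cname:shared-cname",
    "a=ssrc:2222 msid:video-stream video-track",
    "a=ssrc:2222 mslabel:video-stream",
]) + "\r\n"

munged = align_audio_msid_to_video(EXAMPLE_SDP)
```

After the rewrite the audio lines read "a=ssrc:1111 msid:video-stream
audio-track" and "a=ssrc:1111 mslabel:video-stream"; the cname and the
track id are left untouched.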

So it looks like we don't need to use a single MediaStream, but we will
need to munge the SDP to make the msids match. This matches what we see in
the code:
1) CreateTracksFromSsrcInfos in webrtcsdp.cc extracts the mslabel and puts
it in 'sync_label' inside the StreamParams.
2) The sync_label inside StreamParams is used to fill out the sync_group
var in the VideoReceiveStream::Config (in webrtcvideoengine2.cc &
3) Call.cc uses the sync_group values to associate audio and video streams
to configure sync (this is where my print is).
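
To make the last step concrete, here is a simplified Python model of
pairing streams by sync_group. It is not the code in Call::ConfigureSync:
the ReceiveStream class and pair_for_sync function are hypothetical, and
it assumes that streams with an empty sync_group are never paired.

```python
from dataclasses import dataclass

@dataclass
class ReceiveStream:
    kind: str        # "audio" or "video"
    ssrc: int
    sync_group: str  # filled from the stream's sync label; "" if none

def pair_for_sync(streams):
    """Return (audio_ssrc, video_ssrc) pairs that share a sync_group."""
    audio_by_group = {}
    for stream in streams:
        if stream.kind == "audio" and stream.sync_group:
            # Keep the first audio stream seen for each group.
            audio_by_group.setdefault(stream.sync_group, stream)

    pairs = []
    for stream in streams:
        if stream.kind == "video" and stream.sync_group in audio_by_group:
            audio = audio_by_group[stream.sync_group]
            pairs.append((audio.ssrc, stream.ssrc))
    return pairs

# Two separate MediaStreams: the labels differ, so nothing is paired.
separate = [
    ReceiveStream("audio", 1111, "audio-stream"),
    ReceiveStream("video", 2222, "video-stream"),
]
# After munging the audio label to match the video label.
aligned = [
    ReceiveStream("audio", 1111, "video-stream"),
    ReceiveStream("video", 2222, "video-stream"),
]

print(pair_for_sync(separate))  # prints: []
print(pair_for_sync(aligned))   # prints: [(1111, 2222)]
```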