[jitsi-dev] [jitsi/jitsi-videobridge] Implements LS hack. (#277)


#1

Implements a hack to mitigate https://bugs.chromium.org/p/chromium/issues/detail?id=403710. The hack sends black VP8 key frames so that audio playback is unstuck at the client.
You can view, comment on, or merge this pull request online at:

  https://github.com/jitsi/jitsi-videobridge/pull/277

-- Commit Summary --

  * Implements LS hack.

-- File Changes --

    M src/main/java/org/jitsi/videobridge/AudioChannel.java (15)
    M src/main/java/org/jitsi/videobridge/Endpoint.java (15)
    A src/main/java/org/jitsi/videobridge/LipSyncHack.java (423)
    M src/main/java/org/jitsi/videobridge/VideoChannel.java (3)

-- Patch Links --

https://github.com/jitsi/jitsi-videobridge/pull/277.patch
https://github.com/jitsi/jitsi-videobridge/pull/277.diff

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/jitsi-videobridge/pull/277


#2

is there a way to disable this?

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/jitsi-videobridge/pull/277#issuecomment-235674074


#3

I will make this configurable, thanks @brianh5 !

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/jitsi-videobridge/pull/277#issuecomment-235674406


#4

thank YOU :slight_smile:

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/jitsi-videobridge/pull/277#issuecomment-235674730


#5

@gpolitis pushed 1 commit.

d72e3f9 Makes the LS hack disabled by default.

···


You are receiving this because you are subscribed to this thread.
View it on GitHub:
https://github.com/jitsi/jitsi-videobridge/pull/277/files/64324a9471b7331689713b558f2b668104a50c52..d72e3f93d335a3be489db8fff0c7fb8d843fd2c1


#6

@@ -536,8 +536,13 @@ boolean rtpTranslatorWillWrite(
         // XXX(gp) we could potentially move this into a TransformEngine.
         boolean accept = lastNController.isForwarded(source);

- getEndpoint().getLipSyncHack().onRTPTranslatorWillWriteVideo(
- accept, data, buffer, offset, length, this);
+ LipSyncHack h = getEndpoint().getLipSyncHack();

Can we rename this to lipSyncHack?

···


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/jitsi-videobridge/pull/277/files/64324a9471b7331689713b558f2b668104a50c52..d72e3f93d335a3be489db8fff0c7fb8d843fd2c1#r72512051


#7

+ 0};
+
+ /**
+ * A constant defining the maximum number of black key frames to send.
+ */
+ private static final int MAX_KEY_FRAMES = 20;
+
+ /**
+ * The rate (in ms) at which we are to send black key frames.
+ */
+ private static final long KEY_FRAME_RATE_MS = 33;
+
+ /**
+ * Wait for media for WAIT_MS before sending frames.
+ */
+ private final long WAIT_MS = 1000;

This could be static.

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/jitsi-videobridge/pull/277/files/64324a9471b7331689713b558f2b668104a50c52#r72512101


#8

+ = RawPacket.getSSRCAsLong(buffer, offset, length);
+
+ // In order to minimize the synchronization overhead, we process
+ // only the first data packet of a given RTP stream.
+ //
+ // XXX No synchronization is required to r/w the acceptedAudioSSRCs
+ // because in the current architecture this method is called by a single
+ // thread at the time.
+ if (acceptedAudioSSRCs.contains(acceptedAudioSSRC))
+ {
+ // We've already triggered the hack for this audio stream and its
+ // associated video streams.
+ return;
+ }
+
+ acceptedAudioSSRCs.add(acceptedAudioSSRC);

Would it make sense to move this below the sanity checks (e.g. right before the for loop), in order to avoid any race condition with a video channel being created and added to its Endpoint? Or is that for the case where an endpoint only has an audio channel?

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/jitsi-videobridge/pull/277/files/64324a9471b7331689713b558f2b668104a50c52#r72513368


#9

+ }
+
+ List<RtpChannel> sourceVideoChannels
+ = source.getEndpoint().getChannels(MediaType.VIDEO);
+ if (sourceVideoChannels == null || sourceVideoChannels.size() == 0)
+ {
+ return;
+ }
+
+ VideoChannel sourceVC = (VideoChannel) sourceVideoChannels.get(0);
+ if (sourceVC == null)
+ {
+ return;
+ }
+
+ // FIXME this will include rtx ssrcs and whatnot.

If I am reading this correctly, this injects keyframes for each known video SSRC, with each audio packet from an SSRC it hasn't seen before. Is there a reason that the audio SSRC is taken into account and not just the source audio Channel?

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/jitsi-videobridge/pull/277/files/64324a9471b7331689713b558f2b668104a50c52#r72514465


#10

+ {
+ synchronized (injectState)
+ {
+ if (!injectState.active
+ || injectState.numOfKeyframesSent++ >= MAX_KEY_FRAMES)
+ {
+ scheduledFuture.cancel(true);
+ return;
+ }
+
+ MediaStream mediaStream = injectState.target.get();
+ if (mediaStream == null || !mediaStream.isStarted())
+ {
+ logger.debug("Waiting for the media stream to become" +
+ "available.");
+ return;

Here numOfKeyframesSent has already been incremented, although we didn't send a keyframe. Is that on purpose?

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/jitsi-videobridge/pull/277/files/64324a9471b7331689713b558f2b668104a50c52#r72515519


#11

+
+ MediaStream mediaStream = injectState.target.get();
+ if (mediaStream == null || !mediaStream.isStarted())
+ {
+ logger.debug("Waiting for the media stream to become" +
+ "available.");
+ return;
+ }
+
+ try
+ {
+ // FIXME maybe grab from the write pool and copy the array?
+ byte[] buf = KEY_FRAME_BUFFER.clone();
+ RawPacket keyframe = new RawPacket(buf, 0, buf.length);
+
+ long timestamp = injectState.numOfKeyframesSent * 3000;

Is "3000" meant to approximate 90000 / KEY_FRAME_RATE_MS?

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/jitsi-videobridge/pull/277/files/64324a9471b7331689713b558f2b668104a50c52#r72516011


#12

+import java.lang.ref.*;
+import java.util.*;
+import java.util.concurrent.*;
+
+/**
+ * Implements a hack for
+ * https://bugs.chromium.org/p/chromium/issues/detail?id=403710. The hack
+ * injects black video key frames to unstuck the playback of audio for composite
+ * media streams.
+ *
+ * @author George Politis
+ */
+public class LipSyncHack
+{
+ /**
+ * A byte array holding a black key frame.

Is that just the VP8 frame? Or a frame encapsulated in RTP?

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/jitsi-videobridge/pull/277/files/64324a9471b7331689713b558f2b668104a50c52#r72516740


#13

+ "available.");
+ return;
+ }
+
+ try
+ {
+ // FIXME maybe grab from the write pool and copy the array?
+ byte[] buf = KEY_FRAME_BUFFER.clone();
+ RawPacket keyframe = new RawPacket(buf, 0, buf.length);
+
+ long timestamp = injectState.numOfKeyframesSent * 3000;
+ keyframe.setSSRC(injectState.ssrc.intValue());
+ keyframe.setSequenceNumber(
+ injectState.numOfKeyframesSent);
+ keyframe.setTimestamp(timestamp);
+

Assuming KEY_FRAME_BUFFER contains space for the RTP header (and VP8 descriptor), how does the Payload Type value get set? If it is hard-coded in KEY_FRAME_BUFFER, then it's "52", or am I missing something?

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/jitsi-videobridge/pull/277/files/64324a9471b7331689713b558f2b668104a50c52#r72517434


#14

+ {
+ logger.debug("Waiting for the media stream to become" +
+ "available.");
+ return;
+ }
+
+ try
+ {
+ // FIXME maybe grab from the write pool and copy the array?
+ byte[] buf = KEY_FRAME_BUFFER.clone();
+ RawPacket keyframe = new RawPacket(buf, 0, buf.length);
+
+ long timestamp = injectState.numOfKeyframesSent * 3000;
+ keyframe.setSSRC(injectState.ssrc.intValue());
+ keyframe.setSequenceNumber(
+ injectState.numOfKeyframesSent);

This will probably affect the sequence number space used for video packets, if video is enabled later. It should start with a randomly chosen value per the RTP specs.

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/jitsi-videobridge/pull/277/files/64324a9471b7331689713b558f2b668104a50c52#r72517804


#15

+ logger.debug("Waiting for the media stream to become" +
+ "available.");
+ return;
+ }
+
+ try
+ {
+ // FIXME maybe grab from the write pool and copy the array?
+ byte[] buf = KEY_FRAME_BUFFER.clone();
+ RawPacket keyframe = new RawPacket(buf, 0, buf.length);
+
+ long timestamp = injectState.numOfKeyframesSent * 3000;
+ keyframe.setSSRC(injectState.ssrc.intValue());
+ keyframe.setSequenceNumber(
+ injectState.numOfKeyframesSent);
+ keyframe.setTimestamp(timestamp);

Same here

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/jitsi-videobridge/pull/277/files/64324a9471b7331689713b558f2b668104a50c52#r72517889


#16

+ return;
+ }
+
+ try
+ {
+ // FIXME maybe grab from the write pool and copy the array?
+ byte[] buf = KEY_FRAME_BUFFER.clone();
+ RawPacket keyframe = new RawPacket(buf, 0, buf.length);
+
+ long timestamp = injectState.numOfKeyframesSent * 3000;
+ keyframe.setSSRC(injectState.ssrc.intValue());
+ keyframe.setSequenceNumber(
+ injectState.numOfKeyframesSent);
+ keyframe.setTimestamp(timestamp);
+
+ logger.debug("Injecting black key frame ssrc=" + injectState.ssrc

Check for isDebugEnabled()?

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/jitsi-videobridge/pull/277/files/64324a9471b7331689713b558f2b668104a50c52#r72517989


#17

+ if (acceptedVideoSSRCs.contains(acceptedVideoSSRC))
+ {
+ return;
+ }
+
+ acceptedVideoSSRCs.add(acceptedVideoSSRC);
+
+ InjectState state;
+ synchronized (states)
+ {
+ state = states.get(acceptedVideoSSRC);
+ if (state == null)
+ {
+ // The hack has never been triggered for this stream.
+ states.put(acceptedVideoSSRC, new InjectState(acceptedVideoSSRC,
+ ((RtpChannel) target).getStream(), false));

There's two casts of 'target'. Maybe cast once in a local variable?

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/jitsi-videobridge/pull/277/files/64324a9471b7331689713b558f2b668104a50c52#r72518707


#18

+ }
+
+ List<RtpChannel> sourceVideoChannels
+ = source.getEndpoint().getChannels(MediaType.VIDEO);
+ if (sourceVideoChannels == null || sourceVideoChannels.size() == 0)
+ {
+ return;
+ }
+
+ VideoChannel sourceVC = (VideoChannel) sourceVideoChannels.get(0);
+ if (sourceVC == null)
+ {
+ return;
+ }
+
+ // FIXME this will include rtx ssrcs and whatnot.

We create a state object for a given video SSRC only once. So, however many audio SSRCs we receive from the a given audio channel, the SSRCs from the associated video channel are added only once.

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/jitsi-videobridge/pull/277/files/64324a9471b7331689713b558f2b668104a50c52#r72644203


#19

+ "available.");
+ return;
+ }
+
+ try
+ {
+ // FIXME maybe grab from the write pool and copy the array?
+ byte[] buf = KEY_FRAME_BUFFER.clone();
+ RawPacket keyframe = new RawPacket(buf, 0, buf.length);
+
+ long timestamp = injectState.numOfKeyframesSent * 3000;
+ keyframe.setSSRC(injectState.ssrc.intValue());
+ keyframe.setSequenceNumber(
+ injectState.numOfKeyframesSent);
+ keyframe.setTimestamp(timestamp);
+

The KEY_FRAME_BUFFER contains the full UDP payload -> the full RTP packet. The payload type is the `KEY_FRAME_BUFFER[1] & 7f` which is equal to 100.

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/jitsi-videobridge/pull/277/files/64324a9471b7331689713b558f2b668104a50c52#r72655653


#20

+ {
+ logger.debug("Waiting for the media stream to become" +
+ "available.");
+ return;
+ }
+
+ try
+ {
+ // FIXME maybe grab from the write pool and copy the array?
+ byte[] buf = KEY_FRAME_BUFFER.clone();
+ RawPacket keyframe = new RawPacket(buf, 0, buf.length);
+
+ long timestamp = injectState.numOfKeyframesSent * 3000;
+ keyframe.setSSRC(injectState.ssrc.intValue());
+ keyframe.setSequenceNumber(
+ injectState.numOfKeyframesSent);

If video is enabled later, the sequence numbers of the (real) video packets will be re-written (as well as their timestamps). Shouldn't that solve the problem?

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/jitsi-videobridge/pull/277/files/64324a9471b7331689713b558f2b668104a50c52#r72656370