[jitsi-dev] [jitsi/libjitsi] Add keyframe detection logic for H264 and pipeline refactoring around that (#183)


#1

Keyframe detection and the utility function in libjitsi was centered around VP8, we added the keyframe detection for H264 and generalized the utility function to handle the codec type based on the packet and negotiated codec
You can view, comment on, or merge this pull request online at:

  https://github.com/jitsi/libjitsi/pull/183

-- Commit Summary --

  * Add keyframe detection logic for H264 and some pipeline refactoring around that

-- File Changes --

    M src/org/jitsi/impl/neomedia/codec/video/h264/DePacketizer.java (119)
    M src/org/jitsi/impl/neomedia/codec/video/vp8/DePacketizer.java (2)
    D src/org/jitsi/impl/neomedia/codec/video/vp8/Utils.java (126)
    M src/org/jitsi/impl/neomedia/rtp/translator/RTCPFeedbackMessageSender.java (33)
    M src/org/jitsi/impl/neomedia/transform/rewriting/SsrcGroupRewriter.java (27)
    M src/org/jitsi/impl/neomedia/transform/rewriting/SsrcRewritingEngine.java (1)
    A src/org/jitsi/impl/neomedia/transform/rewriting/Utils.java (221)
    M src/org/jitsi/service/neomedia/codec/Constants.java (2)

-- Patch Links --

https://github.com/jitsi/libjitsi/pull/183.patch
https://github.com/jitsi/libjitsi/pull/183.diff

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/libjitsi/pull/183


#2

This is really great @mdaneshi, I will have a look.

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/libjitsi/pull/183#issuecomment-236996887


#3

@@ -519,6 +537,105 @@ else if (nal_unit_type == 28) // FU-A Fragmentation unit (FU)
     }

     /**
+ * Returns true if the buffer contains a H264 key frame at offset
+ * <tt>offset</tt>.
+ *
+ * @param buff the byte buffer to check
+ * @param off the offset in the byte buffer where the actuall data starts
+ * @param len the length of the data in the byte buffer
+ * @return true if the buffer contains a H264 key frame at offset
+ * <tt>offset</tt>.
+ */
+ public boolean isKeyFrame(byte[] buff, int off, int len)
+ {
+ if (buff == null || len < 1 || buff.length < len)

@mdaneshi, do you think it would make sense to have something like this in every static methods that you wrote?

    if (buf == null || Math.min(buf.length, len) < off + 1 /* required array length for this method to work correctly */)
    {
        return false;
    }

I know this is not a common practice currently in libjitsi and we sometimes are a little sloppy when we access arrays, but I believe we should change this and write more defensive code whenever possible.

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/libjitsi/pull/183/files/d12c171d8e57d45492037a8687009705cb750076#r73911779


#4

+ // Skip the StapA header (StapA nal type + length).
+ if (len <= kStapAHeaderSize)
+ {
+ logger.error("StapA header truncated.");
+ return false;
+ }
+ if (!verifyStapANaluLengths(buff, naluStart, naluLength)) {
+ logger.error("StapA packet with incorrect NALU packet lengths.");
+ return false;
+ }
+ nalType = buff[off + kStapAHeaderSize] & kTypeMask;
+ }
+ return (nalType == kIdr || nalType == kSps || nalType == kPps || nalType == kSei);
+ }
+
+ private static boolean verifyStapANaluLengths(byte[] data, int offset, int lengthRemaining)

This line (and a few others) exceed our 80 chars limit. If you're using IntelliJ Idea you can configure it to show a gray column at the 80 chars limit (I believe the option is called right margin). You can do the same thing with ViM (and most likely any other editor)

    set colorcolumn=80,100

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/libjitsi/pull/183/files/d12c171d8e57d45492037a8687009705cb750076#r73914096


#5

+ * Copyright @ 2015 Atlassian Pty Ltd
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.jitsi.impl.neomedia.transform.rewriting;

Maybe move this in the `org.jitsi.impl.neomedia.codec.video` package?

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/libjitsi/pull/183/files/d12c171d8e57d45492037a8687009705cb750076#r73916292


#6

@@ -82,7 +82,7 @@
     /**
      * The H264 constant.
      */
- public static final String H264 = "h264";
+ public static final String H264 = "H264";

Since we're checking for case insensitive equality, is this really needed?

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/libjitsi/pull/183/files/d12c171d8e57d45492037a8687009705cb750076#r73917771


#7

+ return false;
+ }
+
+ /**
+ * Utility method that determines whether or not a packet is a key frame.
+ *
+ * @param buf the buffer that holds the RTP payload.
+ * @param off the offset in the buff where the RTP payload is found.
+ * @param len then length of the RTP payload in the buffer.
+ * @param redPT The RED payload type.
+ * @param vp8PT The VP8 payload type.
+ * @return true if the packet is a VP8 key frame, false otherwise.
+ * @return true if the packet is a VP8 key frame, false otherwise.
+ */
+ public static boolean isVP8KeyFrame(
+ byte[] buf, int off, int len, Byte redPT, Byte vp8PT)

Since `isVP8KeyFrame` and `isH264KeyFrame` are almost identical, would it make sense to factor out the part that finds the buffer, offset, length of the payload and then call the appropriate depacketizer?

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/libjitsi/pull/183/files/d12c171d8e57d45492037a8687009705cb750076#r73920200


#8

@@ -414,11 +415,25 @@ private void maybeSwitchActiveRewriter(final RawPacket pkt)
      */
     boolean isKeyFrame(RawPacket pkt)
     {
- int sourceSSRC = pkt.getSSRC();
- Byte redPT = ssrcRewritingEngine.ssrc2red.get(sourceSSRC);
- byte vp8PT = 0x64;
-
- return Utils.isKeyFrame(pkt, redPT, vp8PT);
+ byte pt = pkt.getPayloadType();
+ Byte redPT = null, codecPT = null;
+ String codecType = null;
+ for (Map.Entry<Byte, MediaFormat> entry :

We have this logic twice - the dynamic payload types loop that detects a keyframe -. Once here and once in the RTCPFeedbackMessageSender class. Maybe we could move it inside the MediaStreamImpl?

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/libjitsi/pull/183/files/d12c171d8e57d45492037a8687009705cb750076#r73922286


#9

@@ -519,6 +537,105 @@ else if (nal_unit_type == 28) // FU-A Fragmentation unit (FU)
     }

     /**
+ * Returns true if the buffer contains a H264 key frame at offset
+ * <tt>offset</tt>.
+ *
+ * @param buff the byte buffer to check
+ * @param off the offset in the byte buffer where the actuall data starts
+ * @param len the length of the data in the byte buffer
+ * @return true if the buffer contains a H264 key frame at offset
+ * <tt>offset</tt>.
+ */
+ public boolean isKeyFrame(byte[] buff, int off, int len)
+ {
+ if (buff == null || len < 1 || buff.length < len)

@gpolitis sounds good to me.

···

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/libjitsi/pull/183/files/d12c171d8e57d45492037a8687009705cb750076#r74333815


#10

@@ -519,6 +537,105 @@ else if (nal_unit_type == 28) // FU-A Fragmentation unit (FU)
     }

     /**
+ * Returns true if the buffer contains a H264 key frame at offset
+ * <tt>offset</tt>.
+ *
+ * @param buff the byte buffer to check
+ * @param off the offset in the byte buffer where the actuall data starts
+ * @param len the length of the data in the byte buffer
+ * @return true if the buffer contains a H264 key frame at offset
+ * <tt>offset</tt>.
+ */
+ public boolean isKeyFrame(byte[] buff, int off, int len)
+ {
+ if (buff == null || len < 1 || buff.length < len)

Actually @bgrozev noticed that the condition I gave you is wrong, it should look like this

    if (buf == null || buf.length < off + Math.max(len, 1 /* required array length for this method to work correctly */))
    {
        return false;
    }

···

--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/libjitsi/pull/183/files/d12c171d8e57d45492037a8687009705cb750076#r74334281


#11

+ * Copyright @ 2015 Atlassian Pty Ltd
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.jitsi.impl.neomedia.transform.rewriting;

sure, makes sense.

···

--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/libjitsi/pull/183/files/d12c171d8e57d45492037a8687009705cb750076#r74337234


#12

@@ -414,11 +415,25 @@ private void maybeSwitchActiveRewriter(final RawPacket pkt)
      */
     boolean isKeyFrame(RawPacket pkt)
     {
- int sourceSSRC = pkt.getSSRC();
- Byte redPT = ssrcRewritingEngine.ssrc2red.get(sourceSSRC);
- byte vp8PT = 0x64;
-
- return Utils.isKeyFrame(pkt, redPT, vp8PT);
+ byte pt = pkt.getPayloadType();
+ Byte redPT = null, codecPT = null;
+ String codecType = null;
+ for (Map.Entry<Byte, MediaFormat> entry :

I don't think I am following your concern here, since we don't have the codecType, we have to look it up anywhere we are looking for keyframe, these two places don't share a common path to reuse the logic, does that make sense?

···

--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/libjitsi/pull/183/files/d12c171d8e57d45492037a8687009705cb750076#r74341473


#13

+ return false;
+ }
+
+ /**
+ * Utility method that determines whether or not a packet is a key frame.
+ *
+ * @param buf the buffer that holds the RTP payload.
+ * @param off the offset in the buff where the RTP payload is found.
+ * @param len then length of the RTP payload in the buffer.
+ * @param redPT The RED payload type.
+ * @param vp8PT The VP8 payload type.
+ * @return true if the packet is a VP8 key frame, false otherwise.
+ * @return true if the packet is a VP8 key frame, false otherwise.
+ */
+ public static boolean isVP8KeyFrame(
+ byte[] buf, int off, int len, Byte redPT, Byte vp8PT)

yes, good idea.

···

--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/libjitsi/pull/183/files/d12c171d8e57d45492037a8687009705cb750076#r74341491


#14

@mdaneshi pushed 1 commit.

f208006 Address PR comments

···


You are receiving this because you are subscribed to this thread.
View it on GitHub:
https://github.com/jitsi/libjitsi/pull/183/files/d12c171d8e57d45492037a8687009705cb750076..f208006e988bbe987be4e10871868b608ecfcb3c


#15

@@ -414,11 +415,25 @@ private void maybeSwitchActiveRewriter(final RawPacket pkt)
      */
     boolean isKeyFrame(RawPacket pkt)
     {
- int sourceSSRC = pkt.getSSRC();
- Byte redPT = ssrcRewritingEngine.ssrc2red.get(sourceSSRC);
- byte vp8PT = 0x64;
-
- return Utils.isKeyFrame(pkt, redPT, vp8PT);
+ byte pt = pkt.getPayloadType();
+ Byte redPT = null, codecPT = null;
+ String codecType = null;
+ for (Map.Entry<Byte, MediaFormat> entry :

@mdaneshi thank you for the updated PR. Could you please have a look at https://github.com/parlaylabs/libjitsi/pull/1 and see if that makes sense? I haven't tested it, but it sketches what I've tried to explain in my previous comment.

···

--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/libjitsi/pull/183/files/d12c171d8e57d45492037a8687009705cb750076#r74981141


#16

Hi, thanks for your contribution!
If you haven't already done so, could you please make sure you sign our CLA (https://jitsi.org/icla for individuals and https://jitsi.org/ccla for corporations)? We would unfortunately be unable to merge your patch unless we have that piece :(.

···

--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/libjitsi/pull/183#issuecomment-240540818


#17

@mdaneshi pushed 1 commit.

86ab325 Refactor based on George's suggestion

···


You are receiving this because you are subscribed to this thread.
View it on GitHub:
https://github.com/jitsi/libjitsi/pull/183/files/f208006e988bbe987be4e10871868b608ecfcb3c..86ab32533f1a8702dae861d341631ccfa622b5b5


#18

+ payloadType
+ = RawPacket.getPayloadType(buf, off, len);
+ }
+ }
+ catch (ArrayIndexOutOfBoundsException e)
+ {
+ // While ideally we should check the bounds everywhere and not
+ // attempt to access the packet's buffer at invalid indexes, there
+ // are too many places where it could inadvertently happen. It's
+ // safer to return a default value of 'false' from this utility
+ // method than to risk killing a thread which doesn't expect this.
+ logger.warn("Caught an exception while checking for keyframe:", e);
+ return false;
+ }
+
+ if (vp8PT != null && payloadType == vp8PT)

I think we should probably move this if block inside the try-catch. I hate it that we have to catch array boundaries errors like this, we have done a sloppy job, but that's what we have right now.

···


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/libjitsi/pull/183/files/f208006e988bbe987be4e10871868b608ecfcb3c..86ab32533f1a8702dae861d341631ccfa622b5b5#r75512314


#19

+ h264PT = payloadType;
+ }
+ }
+
+ int payloadOff, payloadLen;
+ byte payloadType;
+
+ try {
+ // XXX this will not work correctly when RTX gets enabled!
+ if (redPT != null
+ && redPT == RawPacket.getPayloadType(buf, off, len))
+ {
+ REDBlock block = REDBlockIterator
+ .getPrimaryBlock(buf,
+ RawPacket.getPayloadOffset(buf, off, len),
+ RawPacket.getPayloadLength(buf, off, len));

Can we include this comment and adapt the code accordingly:

    // XXX There's RawPacket#getPayloadLength() but the implementation
    // includes pkt.paddingSize at the time of this writing and
    // we do not know whether that's going to stay that way.

In my opinion, eventually `RawPacket` has to go away. It needs to be split into different static classes and its methods need to do proper input validation. But again, that's what we have right now and we need to take that into consideration.

···


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/libjitsi/pull/183/files/f208006e988bbe987be4e10871868b608ecfcb3c..86ab32533f1a8702dae861d341631ccfa622b5b5#r75512837


#20

+ if (redPT != null
+ && redPT == RawPacket.getPayloadType(buf, off, len))
+ {
+ REDBlock block = REDBlockIterator
+ .getPrimaryBlock(buf,
+ RawPacket.getPayloadOffset(buf, off, len),
+ RawPacket.getPayloadLength(buf, off, len));
+
+ payloadOff = block.getOffset();
+ payloadLen = block.getLength();
+ payloadType = block.getPayloadType();
+ }
+ else
+ {
+ payloadOff = RawPacket.getPayloadOffset(buf, off, len);
+ payloadLen = RawPacket.getPayloadLength(buf, off, len);

Same as above.

···


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jitsi/libjitsi/pull/183/files/f208006e988bbe987be4e10871868b608ecfcb3c..86ab32533f1a8702dae861d341631ccfa622b5b5#r75512878