Problems implementing RtpTranslation in Jigasi for TranscriptionGateway

jigasi

#1

As I had mentioned on the last Jitsi Community call, we are working on using RtpTranslation in Jigasi (instead of the mixer) for the TranscriptionGateway. Part of our goal here is to use the RecorderRtpImpl, which requires an RtpTranslator construction parameter.

The problem we are having is that the mixer still seems to be getting used, but it’s not clear to me exactly where in the code we need to change things so that we use the translator. I’ve been looking at the SipGatewaySession for reference, and it looks like it is doing the following:

```java
if (destination == null)
{
    call.setConference(incomingCall.getConference());
    // moved the code in ensureSsrcRewriter into AbstractGatewaySession
    // so that it could also be used by TranscriptionGatewaySession
    ensureSsrcRewriter(incomingCall);
}
```

Do we need to do something similar in the TranscriptionGatewaySession (rather than creating a new instance of MediaAwareCallConference with an overridden getMediaDevice method)? Or do we still need to create a new instance of MediaAwareCallConference, but somehow override getRtpTranslator?
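For concreteness, the second option we were considering would look roughly like this. This is only a sketch of the question, not working code: the `getRTPTranslator(MediaType)` override point is assumed from reading `MediaAwareCallConference`, and `ourTranslator` is a placeholder name for whatever `RTPTranslator` we would also pass to the `RecorderRtpImpl` constructor.

```java
// Hypothetical sketch (inside onConferenceCallStarted): keep the custom
// MediaAwareCallConference, but also hand out our own translator.
// getRTPTranslator(MediaType) and "ourTranslator" are assumptions of
// ours, not verified against the code we are running.
jvbConferenceCall.setConference(new MediaAwareCallConference()
{
    @Override
    public RTPTranslator getRTPTranslator(MediaType mediaType)
    {
        if (MediaType.AUDIO.equals(mediaType))
        {
            // The same RTPTranslator instance we would pass to
            // RecorderRtpImpl (placeholder name).
            return ourTranslator;
        }
        return super.getRTPTranslator(mediaType);
    }
});
```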

Currently, we are getting repeated NullPointerExceptions like the following (when using RecorderRtpImpl):
```
2018-11-13 20:52:41.536 SEVERE: [138277] org.jitsi.impl.neomedia.rtp.translator.OutputDataStreamImpl.log() Failed to translate RTP packet
java.lang.NullPointerException
    at org.jitsi.impl.neomedia.transform.fec.FECTransformEngine.getPrimarySsrc(FECTransformEngine.java:121)
    at org.jitsi.impl.neomedia.transform.fec.FECTransformEngine.reverseTransform(FECTransformEngine.java:152)
    at org.jitsi.impl.neomedia.transform.TransformEngineChain$PacketTransformerChain.reverseTransform(TransformEngineChain.java:381)
    at org.jitsi.impl.neomedia.recording.AudioRecorderRtpImpl$RTPConnectorImpl$OutputDataStreamImpl.write(AudioRecorderRtpImpl.java:1290)
    at org.jitsi.impl.neomedia.recording.AudioRecorderRtpImpl$RTPConnectorImpl$OutputDataStreamImpl.write(AudioRecorderRtpImpl.java:1261)
    at org.jitsi.impl.neomedia.rtp.translator.OutputDataStreamImpl.doWrite(OutputDataStreamImpl.java:262)
    at org.jitsi.impl.neomedia.rtp.translator.OutputDataStreamImpl.run(OutputDataStreamImpl.java:421)
    at java.lang.Thread.run(Thread.java:748)
```

I am guessing, however, that this is probably related to the RtpTranslator not being properly integrated.

My last question relates to getting the audio samples (for sending to transcription). Is the right approach to extend the SilenceEffect used in the RecorderRtpImpl, overriding process() in order to get access to the audio Buffer?
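To make the question concrete, here is a rough sketch of what I had in mind. I am assuming SilenceEffect follows the standard JMF Effect contract (`int process(Buffer in, Buffer out)` returning `BUFFER_PROCESSED_OK`); the subclass and the `SampleSink` callback below are hypothetical names of mine, not anything in the codebase.

```java
import javax.media.Buffer;

// Hypothetical subclass; SilenceEffect and the JMF-style process()
// signature are assumptions from reading the recorder code.
public class TappingSilenceEffect extends SilenceEffect
{
    // Hypothetical callback of ours for shipping PCM to transcription.
    public interface SampleSink
    {
        void accept(byte[] data, int offset, int length);
    }

    private final SampleSink sink;

    public TappingSilenceEffect(SampleSink sink)
    {
        this.sink = sink;
    }

    @Override
    public int process(Buffer inBuffer, Buffer outBuffer)
    {
        int result = super.process(inBuffer, outBuffer);

        // After silence filling, outBuffer should contain a contiguous
        // stream of linear PCM samples, which is what we would want to
        // forward to the transcription service.
        if (result == BUFFER_PROCESSED_OK
            && outBuffer.getData() instanceof byte[])
        {
            sink.accept(
                (byte[]) outBuffer.getData(),
                outBuffer.getOffset(),
                outBuffer.getLength());
        }
        return result;
    }
}
```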

Thanks for all your help!


#2

@damencho any suggestions on this? Thanks!


#3

I was looking at your post, but I needed time to look at the code. In order to use the translator currently (for SIP), you only need to enable a few properties:

```properties
net.java.sip.communicator.impl.protocol.sip.acc1403273890647.USE_TRANSLATOR_IN_CONFERENCE=true
org.jitsi.jigasi.xmpp.acc.USE_TRANSLATOR_IN_CONFERENCE=true
net.java.sip.communicator.impl.neomedia.audioSystem.audiosilence.captureDevice_list=["AudioSilenceCaptureDevice:noTransferData"]
```
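These properties normally live in Jigasi's sip-communicator.properties file (the usual Jigasi setup; verify the location against your own deployment). Note that the `acc1403273890647` segment is the ID of one particular configured SIP account, so it differs per installation; the generic form is:

```properties
# In Jigasi's sip-communicator.properties. Replace <sip-account-id>
# with the ID of your configured SIP account (it appears as
# acc1403273890647 in the example above).
net.java.sip.communicator.impl.protocol.sip.<sip-account-id>.USE_TRANSLATOR_IN_CONFERENCE=true
org.jitsi.jigasi.xmpp.acc.USE_TRANSLATOR_IN_CONFERENCE=true
net.java.sip.communicator.impl.neomedia.audioSystem.audiosilence.captureDevice_list=["AudioSilenceCaptureDevice:noTransferData"]
```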

So the actual code handling this is:

When the translator is not used, we create a mixer here:

And here is where the translator is created:

Hope this helps and is what you are looking for.


#4

Hi Damian,

Thanks for your help with this! I did enable those relevant configuration properties to ensure RtpTranslation is enabled. The part that I'm a bit confused about is whether anything else needs to be changed or added on the TranscriptionGateway side. For the SipGateway/Session, it seems like there are two Call entities (I assume representing the SIP callee and the Jitsi conference, respectively). For the TranscriptionGateway/Session, however, a new MediaAwareCallConference is created within the onConferenceCallStarted(Call jvbConferenceCall) method:

```java
// We create a MediaAwareCallConference whose MediaDevice
// will get all of the audio and video packets
jvbConferenceCall.setConference(new MediaAwareCallConference()
{
    @Override
    public MediaDevice getDefaultDevice(MediaType mediaType,
                                        MediaUseCase useCase)
    {
        if (MediaType.AUDIO.equals(mediaType))
        {
            return transcriber.getMediaDevice();
        }
        // FIXME: 18/07/17 what to do with video?
        // will cause an exception when mediaType == VIDEO
        return super.getDefaultDevice(mediaType, useCase);
    }
});
```

I understand that the extended MediaAwareCallConference overrides getDefaultDevice in order to return the TranscribingAudioMixerDevice instantiated in the Transcriber (which is then used to set the ReceiveStreamBufferListener in order to get audio buffer data from each Participant to send to the transcription service). However, if I'm using the RtpTranslator instead of the AudioMixerDevice, then it's not clear to me whether I still need to extend MediaAwareCallConference, and if so, what Device I should use/return from the getDefaultDevice method.

Also, I noticed that the SipGatewaySession calls the following code in onConferenceCallInvited(Call incomingCall):

```java
boolean useTranslator = incomingCall.getProtocolProvider()
    .getAccountID().getAccountPropertyBoolean(
        ProtocolProviderFactory.USE_TRANSLATOR_IN_CONFERENCE,
        false);
CallPeer peer = incomingCall.getCallPeers().next();
// if use translator is enabled add a ssrc rewriter
boolean ssrcRewriterAdded = false;
if (useTranslator && !addSsrcRewriter(peer))
{
    peer.addCallPeerListener(new CallPeerAdapter()
    {
        @Override
        public void peerStateChanged(CallPeerChangeEvent evt)
        {
            CallPeer peer = evt.getSourceCallPeer();
            CallPeerState peerState = peer.getState();

            if (CallPeerState.CONNECTED.equals(peerState))
            {
                peer.removeCallPeerListener(this);
                addSsrcRewriter(peer);
            }
        }
    });
}
```

So, I was wondering whether I needed to call addSsrcRewriter (which eventually invokes the SsrcRewriter TransformerAdapter) in the TranscriptionGatewaySession as well, once the ConferenceCall is invited or started.

I tried calling addSsrcRewriter within the TranscriptionGatewaySession, but I’m not sure whether I’m invoking it on the correct call or in the correct place. So, if you have any suggestions, or if you can clarify any of the above questions, I would really appreciate it!

Thanks again for all your help!


#5

Hi @damencho, just a quick followup on this. I tried adding the SsrcRewriter in the TranscriptionGatewaySession, but it didn't seem to make a difference. The reason I thought it might be necessary is that a comment in the SipGatewaySession mentions that the SsrcRewriter should be used when translation is enabled.

Overall, it just seems that I’m missing some integration piece which is required to ensure that the RtpTranslator is properly connected to the Call/CallPeer/Conference.

Thanks again for all your help!


#6

I just noticed the following error in the logs (but this seems to happen before starting up the RecorderRtpImpl):

```
2018-11-15 20:43:33.267 INFO: [154] net.sf.fmj.media.Log.info() Starting RTPSourceStream.
2018-11-15 20:43:33.347 SEVERE: [165] net.sf.fmj.media.Log.error() Unable to handle format: LINEAR, 48000.0 Hz, 16-bit, Mono, LittleEndian, Signed
2018-11-15 20:43:33.347 SEVERE: [165] net.sf.fmj.media.Log.error() Failed to prefetch: net.sf.fmj.media.ProcessEngine@eb0ad90
2018-11-15 20:43:33.357 SEVERE: [164] net.sf.fmj.media.Log.error() Error: Unable to prefetch net.sf.fmj.media.ProcessEngine@eb0ad90
```

Could this be part of the problem? @Boris_Grozev

Thanks!