Developing a custom Jigasi transcriber

I’m developing a custom transcriber for Deepgram and I’m running into trouble. You can see the code in this gist: DeepgramTranscriptionService.java · GitHub.

I’m able to get the transcriber to join the room and things seem to start off well; however, I’m not getting any transcriptions. This is what I see in the logs:

2023-02-09 00:06:09.989 INFO: [67] CallControl.handleDialIq#201: [ctx=1675901169931675052271] Got dial request fromnumber -> jitsi_meet_transcribe room: asleepslogansdeprivesternly@conference.omoiomoi.org
2023-02-09 00:06:10.429 INFO: [67] JvbConference.start#454: [ctx=1675901169931675052271] Starting JVB conference room: asleepslogansdeprivesternly@conference.omoiomoi.org
2023-02-09 00:06:10.439 INFO: [67] JvbConference.createAccountPropertiesForCallId#1583: [ctx=1675901169931675052271] Using bosh url:https://omoiomoi.org/http-bind?room=asleepslogansdeprivesternly
2023-02-09 00:06:10.465 INFO: [67] JvbConference.setXmppProvider#590: [ctx=1675901169931675052271] Using ProtocolProviderServiceJabberImpl(Jabber:1725fdf4@omoiomoi.org/1725fdf4)
2023-02-09 00:06:10.541 INFO: [70] org.igniterealtime.jbosh.BOSHClient.init: Starting with 1 request processors
2023-02-09 00:06:10.817 INFO: [70] net.java.sip.communicator.impl.protocol.jabber.OperationSetBasicTelephonyJabberImpl.registrationStateChanged: Jingle : ON
2023-02-09 00:06:10.822 INFO: [70] JvbConference.registrationStateChanged#652: [ctx=1675901169931675052271] Registering XMPP.
2023-02-09 00:06:10.918 INFO: [70] net.java.sip.communicator.impl.protocol.jabber.ProtocolProviderServiceJabberImpl$JabberConnectionListener.authenticated: Authenticated: false
2023-02-09 00:06:10.923 INFO: [70] JvbConference.joinConferenceRoom#734: [ctx=1675901169931675052271] Joining JVB conference room: asleepslogansdeprivesternly@conference.omoiomoi.org
2023-02-09 00:06:11.077 INFO: [56] net.java.sip.communicator.impl.protocol.jabber.ChatRoomJabberImpl$MemberListener.joined: asleepslogansdeprivesternly@conference.omoiomoi.org/focus has joined the asleepslogansdeprivesternly@conference.omoiomoi.org chat room.
2023-02-09 00:06:11.773 INFO: [56] net.java.sip.communicator.impl.protocol.jabber.ChatRoomJabberImpl$MemberListener.joined: asleepslogansdeprivesternly@conference.omoiomoi.org/71ea4b8b has joined the asleepslogansdeprivesternly@conference.omoiomoi.org chat room.
2023-02-09 00:06:11.847 INFO: [56] net.java.sip.communicator.impl.protocol.jabber.ChatRoomJabberImpl$MemberListener.joined: asleepslogansdeprivesternly@conference.omoiomoi.org/1725fdf4 has joined the asleepslogansdeprivesternly@conference.omoiomoi.org chat room.
2023-02-09 00:06:11.862 SEVERE: [70] JvbConference.registrationStateChanged#641: [ctx=1675901169931675052271] Registered bosh sid: 1ad0aa0a-85fa-49fa-beb6-9170e1c1c4a8
2023-02-09 00:06:12.297 INFO: [79] net.java.sip.communicator.impl.protocol.jabber.IceUdpTransportManager.createIceAgent: End gathering harvester within 666 ms
2023-02-09 00:06:12.939 INFO: [79] net.java.sip.communicator.impl.protocol.jabber.CallPeerMediaHandlerJabberImpl.harvestCandidates: End candidate harvest within 131 ms
2023-02-09 00:06:12.960 INFO: [79] JvbConference$JvbCallListener.incomingCallReceived#1342: [ctx=1675901169931675052271] Got invite from focus
2023-02-09 00:06:13.146 INFO: [120] TranscriptionGatewaySession$1.getDefaultDevice#153: Transcriber: Media Device Audio
2023-02-09 00:06:13.317 INFO: [120] net.java.sip.communicator.service.protocol.media.MediaHandler.registerDynamicPTsWithStream: Dynamic PT map: 126=rtpmap:-1 telephone-event/8000; 111=rtpmap:-1 opus/48000/2 fmtp:useinbandfec=1;minptime=10; 103=rtpmap:-1 unknown/90000;
2023-02-09 00:06:13.318 INFO: [120] net.java.sip.communicator.service.protocol.media.MediaHandler.registerDynamicPTsWithStream: PT overrides [103->104 ]
2023-02-09 00:06:13.374 INFO: [120] net.java.sip.communicator.service.protocol.media.CallPeerMediaHandler.start: Starting
2023-02-09 00:06:13.848 INFO: [134] JitsiOpenSslProvider.<clinit>#52: jitsisrtp successfully loaded for OpenSSL 3
2023-02-09 00:06:14.060 INFO: [120] JvbConference$JvbCallChangeListener.callStateChanged#1439: [ctx=1675901169931675052271] JVB conference call IN_PROGRESS.
2023-02-09 00:06:15.607 INFO: [134] Aes.benchmark#367: AES benchmark (of execution times expressed in nanoseconds): OpenSSL 814, SunJCE 8826, BouncyCastle 17922 for AES/CTR/NoPadding
2023-02-09 00:06:15.607 INFO: [134] Aes.createCipher#433: Will employ AES implemented by OpenSSL for AES/CTR/NoPadding.
2023-02-09 00:06:16.318 INFO: [180] DeepgramTranscriptionService$DeepgramWebsocketStreamingSession.sendRequest#188: sendRequest bytes: 48000
2023-02-09 00:06:16.815 INFO: [180] DeepgramTranscriptionService$DeepgramWebsocketStreamingSession.sendRequest#188: sendRequest bytes: 48000
2023-02-09 00:06:17.315 INFO: [180] DeepgramTranscriptionService$DeepgramWebsocketStreamingSession.sendRequest#188: sendRequest bytes: 48000
2023-02-09 00:06:17.816 INFO: [180] DeepgramTranscriptionService$DeepgramWebsocketStreamingSession.sendRequest#188: sendRequest bytes: 48000
2023-02-09 00:06:17.872 INFO: [116] DeepgramTranscriptionService$DeepgramWebsocketStreamingSession.onMessage#161: asleepslogansdeprivesternly/71ea4b8b Received response: {"transaction_key":"deprecated","request_id":"23989363-018b-4099-acfe-e8502f69bbcb","sha256":"ef9fb74236796b255470ee6ba93bb784abd9270d9a7614bd6b661ba0153461e2","created":"2023-02-09T00:06:13.131Z","duration":0.0,"channels":0}

Every time onMessage is called, Jigasi stops sending audio.

Any ideas?

Maybe enable packet logging so you can inspect the XMPP signalling in the pcap and see whether anything looks suspicious there.

I sorted it out by kicking up the logging level. I noticed this:

2023-02-09 02:02:32.090 FINE: [84] org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession.closeConnection: closeConnection() {1011=SERVER_ERROR,NET-0001} WSCoreSession@1595f2f0{CLIENT,WebSocketSessionState@619f783a{CLOSED,i=NO-OP,o=NO-OP,c={1011=SERVER_ERROR,NET-0001}},[wss://api.deepgram.com:443/v1/listen?language=en&interim_results=true,null,true.[]],af=true,i/o=4096/4096,fs=65536}->JettyWebSocketFrameHandler@12f04174[org.jitsi.jigasi.transcription.DeepgramTranscriptionService$DeepgramWebsocketStreamingSession]
2023-02-09 02:02:32.090 FINE: [84] org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession.abort: abort(): WSCoreSession@1595f2f0{CLIENT,WebSocketSessionState@619f783a{CLOSED,i=NO-OP,o=NO-OP,c={1011=SERVER_ERROR,NET-0001}},[wss://api.deepgram.com:443/v1/listen?language=en&interim_results=true,null,true.[]],af=true,i/o=4096/4096,fs=65536}->JettyWebSocketFrameHandler@12f04174[org.jitsi.jigasi.transcription.DeepgramTranscriptionService$DeepgramWebsocketStreamingSession]

This suggested to me that Deepgram was closing the stream (WebSocket close code 1011, SERVER_ERROR). After I added the encoding and sample rate to the request URL as query parameters, it works.
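
For anyone hitting the same thing, here is a minimal sketch of building the request URL with those two parameters. The values are assumptions, not taken from the gist: `linear16` and `48000` are a guess based on the log above, where 48000-byte chunks go out roughly every 500 ms, which would match 16-bit mono PCM at 48 kHz. The class and method names are made up for illustration; adjust the values to whatever audio format Jigasi actually hands to your transcriber.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Jigasi or the original gist.
public class DeepgramUrlSketch {
    // Endpoint as it appears in the Jetty log above (default wss port omitted).
    static final String BASE_URL = "wss://api.deepgram.com/v1/listen";

    // Builds the streaming URL, telling Deepgram how to interpret the raw audio.
    static String buildListenUrl(String encoding, int sampleRate) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("language", "en");
        params.put("interim_results", "true");
        params.put("encoding", encoding);                        // assumed value: "linear16"
        params.put("sample_rate", Integer.toString(sampleRate)); // assumed value: 48000

        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return BASE_URL + "?" + query;
    }

    public static void main(String[] args) {
        System.out.println(buildListenUrl("linear16", 48000));
    }
}
```

Without these parameters Deepgram has no way to know the format of headerless audio, which would be consistent with the `"duration":0.0,"channels":0` response and the 1011 close seen in the logs.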