Jigasi: open source alternative to Google Speech-to-Text

Hi developers,

I was trying out Jitsi Meet with the transcriber in Jigasi and thought of using an open source alternative to the Google Speech-to-Text API because of the costs. Is anyone already working on that? If not, which engine do you think is the best fit (Mozilla DeepSpeech, Kaldi or CMU Sphinx), and how much time do you think it would take to implement?

I saw that @Nik_V worked on implementing IBM Watson’s solution. Any advice you can give me, Nik?

Best regards



Nik’s first implementation was based on CMU Sphinx [0] (note that this is not compatible with Jigasi), and it did not perform well. My understanding is that the Sphinx project is no longer active; its website actually suggests using Kaldi [1].

No one from the Jitsi team is working on supporting DeepSpeech or Kaldi right now, and I don’t know enough about them to make a recommendation.



[0] https://github.com/jitsi/Sphinx4-HTTP-server
[1] https://cmusphinx.github.io/

@Boris_Grozev thank you very much for replying. I managed to get in contact with Nik by email, and he gave me some tips in case I start working on this.

@filani_jitsi can you share whatever information you have, so we keep it in the history :)?
And if you want, share your progress …

Yeah sure. For whoever wants to start this implementation, this is what Nik said:

I would say the implementation would involve two major steps:

  1. Implement a library for Mozilla DeepSpeech, or find an existing one, which is capable of:
    * running on a GPU instance
    * offering a form of authentication
    * receiving a continuous stream of audio
    * using VAD (voice activity detection) to filter out non-usable audio
    * responding with a continuous stream of transcriptions
  2. Implement a TranscriptionService [1] in Jigasi [2] which connects to the DeepSpeech server described in 1)
    [1] = https://github.com/nikvaessen/jigasi/blob/master/src/main/java/org/jitsi/jigasi/transcription/TranscriptionService.java
    [2] = https://github.com/nikvaessen/jigasi

I don’t know whether existing open software for the tasks in 1) exists. If 1) is trivially solved, it shouldn’t be that much work.
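To make the streaming part of step 1 a bit more concrete, here is a minimal sketch of my own (not from Nik, and not using any DeepSpeech or Jigasi API). It assumes 16-bit little-endian mono PCM frames, uses a very simple energy threshold as a stand-in for a real VAD, and takes a hypothetical `recognize` callback where the actual speech engine would be plugged in. The function names, threshold and frame sizes are all assumptions for illustration only.

```python
import math
import struct
from typing import Callable, Iterable, Iterator, List

# Assumed audio format: 16 kHz, 16-bit little-endian mono PCM, 20 ms frames.
SAMPLE_RATE = 16000
FRAME_MS = 20
SAMPLES_PER_FRAME = SAMPLE_RATE * FRAME_MS // 1000


def frame_rms(frame: bytes) -> float:
    """Root-mean-square amplitude of one PCM frame."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack("<%dh" % count, frame[:count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def speech_segments(frames: Iterable[bytes],
                    threshold: float = 500.0,
                    max_silence_frames: int = 10) -> Iterator[bytes]:
    """Energy-threshold stand-in for a VAD.

    Collects frames whose RMS is at or above `threshold` and emits them as
    one segment once `max_silence_frames` quiet frames in a row follow.
    Quiet frames themselves are dropped.
    """
    voiced: List[bytes] = []
    silence = 0
    for frame in frames:
        if frame_rms(frame) >= threshold:
            voiced.append(frame)
            silence = 0
        elif voiced:
            silence += 1
            if silence >= max_silence_frames:
                yield b"".join(voiced)
                voiced, silence = [], 0
    if voiced:
        # Flush whatever speech is left when the stream ends.
        yield b"".join(voiced)


def transcribe_stream(frames: Iterable[bytes],
                      recognize: Callable[[bytes], str]) -> Iterator[str]:
    """Yield one transcription per detected speech segment.

    `recognize` is a hypothetical callback wrapping the real engine.
    """
    for segment in speech_segments(frames):
        text = recognize(segment)
        if text:
            yield text


if __name__ == "__main__":
    loud = struct.pack("<%dh" % SAMPLES_PER_FRAME, *([1000] * SAMPLES_PER_FRAME))
    quiet = bytes(2 * SAMPLES_PER_FRAME)
    audio = [quiet] * 3 + [loud] * 5 + [quiet] * 12 + [loud] * 2
    # Fake recognizer that only reports the segment size.
    for line in transcribe_stream(audio, lambda seg: "%d bytes" % len(seg)):
        print(line)
```

Because everything is a generator, results come out as the audio comes in, which is the "continuous stream in, continuous stream out" shape Nik describes. Authentication, GPU inference and the network protocol between Jigasi and the server are left out entirely.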

If I ever start working on this implementation, I will share my progress, but right now I don’t think I have enough experience to do this. So these are the tips that Nik gave me. Thank you @Nik_V
