Simultaneous interpretation


We have some clients who currently use the “simultaneous interpretation” feature of Zoom to handle audio translation in their webinars.

They are moderately satisfied with this solution and really need an open source one. Do you think it would be possible to integrate this kind of “all-in-one” conference feature into Jitsi?

Has anyone already tested something like this, or does anyone have ideas?

Thx !

Jigasi supports captions using the Google speech-to-text service, and it also supports translating these captions; there is just no UI implemented for that.
As for translating the audio, I'm not sure, but I suppose it is doable… the problem is that this adds delay and will de-sync the translation from the real audio quite a bit.

Thanks for your answer, Damencho. The purpose here is not to translate the audio to text but to have real translators in the Jitsi room, and to allow each participant to choose which one they want to hear.

Basically, if there is a webinar between one Polish speaker and one Spanish speaker, and I want to offer English and French translations, I will also have in the Jitsi room:

  • 1 Polish-to-French translator
  • 1 Spanish-to-French translator
  • 1 Polish-to-English translator
  • 1 Spanish-to-English translator

The purpose here is to allow each participant to “activate” the French or English translation by clicking on a “translate” button, which will turn on the audio of both translators for the chosen language (and reduce the volume of the Polish and Spanish speakers).
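
To make the expected behaviour concrete, here is a minimal sketch of the volume rule such a “translate” button could apply for one listener. It is only an illustration of the logic, not Jitsi code: the `Participant` record, the `role` labels, the `volume_plan` function and the `floor_volume` value are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Participant:
    id: str
    role: str  # "speaker" or "interpreter" (hypothetical labels)
    target_language: Optional[str] = None  # language an interpreter translates into


def volume_plan(
    participants: List[Participant],
    chosen_language: Optional[str],
    floor_volume: float = 0.2,  # assumed "reduced" level for the original speakers
) -> Dict[str, float]:
    """Return {participant id: volume between 0 and 1} for one listener."""
    plan = {}
    for p in participants:
        if p.role == "interpreter":
            # Only the interpreters of the chosen language are audible.
            wanted = chosen_language is not None and p.target_language == chosen_language
            plan[p.id] = 1.0 if wanted else 0.0
        else:
            # Original speakers are lowered, not muted, while a translation is active.
            plan[p.id] = 1.0 if chosen_language is None else floor_volume
    return plan


room = [
    Participant("polish-speaker", "speaker"),
    Participant("spanish-speaker", "speaker"),
    Participant("pl-fr", "interpreter", "fr"),
    Participant("es-fr", "interpreter", "fr"),
    Participant("pl-en", "interpreter", "en"),
    Participant("es-en", "interpreter", "en"),
]
plan = volume_plan(room, "fr")
```

With `"fr"` selected, both French translators play at full volume, the English ones are silent, and the two speakers drop to the floor level. Passing `None` gives back the untranslated room.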

I hope my request is clearer now :slight_smile: !

Anyone ?

Our organization is also very interested in a simultaneous interpretation option in Jitsi. We have used the interpretation tool a lot in Zoom. But since Zoom is a US company, it is not possible to use it for communications with some countries (for example, we work with labour organizations in Cuba, but Zoom cannot be used there due to the embargo), and in any case we prefer to use open software. If anyone has managed to develop a similar tool for Jitsi, we’d love to hear about it.

This can be done manually using the volume slider for each user. If there are only 4 active speakers in a room, it's not a bad option…

Yes… but no :slight_smile: !

The purpose is not to find a workaround for one simple case with the existing features, but to think about a general solution that responds to any need.

For example, if I run a webinar with 5 experts who each speak a different language (say Polish, Russian, French, Italian and Greek), then for each translation language I must have:

  • 1 Polish-to-target-language translator
  • 1 Russian-to-target-language translator
  • 1 French-to-target-language translator
  • 1 Italian-to-target-language translator
  • 1 Greek-to-target-language translator

(Note: if the target language is Polish, there is no need for a “Polish-to-Polish” translator, so there will be only 4 translators.)

So if I want to offer Polish, Russian, French, Italian, Greek and also English translations, I must have 25 translators (this is how it works during international meetings).
(Note: 25 = 4 translators for each of the 5 expert languages, i.e. 20, + 5 translators for English.)
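
The count can be checked with a few lines of code. This is just a sketch of the arithmetic, assuming one translator per (source language, target language) pair; the function name `translators_needed` is made up for the example.

```python
from typing import Iterable


def translators_needed(speaker_languages: Iterable[str], target_languages: Iterable[str]) -> int:
    """Count one translator per (source, target) pair, skipping source == target."""
    sources = set(speaker_languages)
    return sum(
        1
        for target in set(target_languages)
        for source in sources
        if source != target  # no "Polish-to-Polish" translator
    )


experts = ["polish", "russian", "french", "italian", "greek"]
total = translators_needed(experts, experts + ["english"])  # 5 * 4 + 5 = 25
small = translators_needed(["polish", "spanish"], ["french", "english"])  # 2 * 2 = 4
```

The same rule gives 4 translators for the earlier Polish/Spanish webinar with French and English translations.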

Under these conditions, I can’t deal with manual volume sliders in a Jitsi room with 30+ people…
(Note: and “30+” only covers the experts, translators and administrators of the webinar. What about the xxx viewers?)

Since it’s possible to do this manually, it can be automated for larger groups.

It would probably need a customized login panel that creates a JWT containing the language info for each participant, plus some UI customizations that manage volumes according to the JWT context.
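
As a rough sketch of the token side of that idea, the code below builds and reads an HS256 JWT using only the Python standard library. The `aud`, `iss`, `sub`, `room`, `exp` and `context.user` claims follow the usual layout of Jitsi tokens, but the `language` field inside `context.user` is an assumption added for this example, and the app id, domain and secret are placeholders. A real deployment would more likely use an established JWT library.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without "=" padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def make_token(secret: str, room: str, name: str, language: str, ttl: int = 3600) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {
        "aud": "my_app_id",         # placeholder app id
        "iss": "my_app_id",         # placeholder issuer
        "sub": "meet.example.com",  # placeholder domain
        "room": room,
        "exp": int(time.time()) + ttl,
        # "language" is a hypothetical custom field for this use case.
        "context": {"user": {"name": name, "language": language}},
    }
    signing_input = (
        _b64url(json.dumps(header).encode()) + "." + _b64url(json.dumps(payload).encode())
    )
    signature = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def read_language(token: str, secret: str) -> str:
    """Verify the signature, then return the participant's chosen language."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(signing_input.split(".")[1]))
    return payload["context"]["user"]["language"]


token = make_token("s3cret", "webinar", "Ana", "fr")
language = read_language(token, "s3cret")
```

The UI customization would then read the language from the token context and apply the matching volume settings for that participant.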