Jigasi problem with Unicode characters (master branch vs nightly version)


I cloned and built Jigasi from the master branch. In transcriptions, it writes Unicode characters like this (in both the text and JSON files):
<10:02:51 AM> a: ??? ??? ??? ???

When I install the Jigasi nightly version with apt (I think it is version 1.1-177), it works fine.

How can I clone the nightly version's code from git?


The latest from unstable is the latest from master.

So, is there any config option for the encoding of the transcription text? I checked it multiple times with Vosk.

On latest master (the subtitles show the text correctly, but the text file has something like this):
On jigasi/unstable,now 1.1-177-g1f1cb4e-1 amd64 [installed]:
<1:13:12 PM> aaa: یک دو سه

If it's only the saved text file, I guess this is a bug: Jigasi is not using UTF-8 when it saves the transcriptions to the text file …
The problem may be that the FileWriter in the code picks up a default encoding from the OS that is not UTF-8. I see you can pass -Dfile.encoding=UTF-8, so you could try adding that in /etc/jitsi/jigasi/config and see whether it fixes it for you.
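To check whether the JVM default really is the culprit, a small diagnostic along these lines may help. It is only a sketch: the class and method names are made up for illustration and are not part of Jigasi. It prints the default charset that no-charset APIs such as `new FileWriter(String)` fall back to (on JDKs before 18 this comes from the OS locale or `file.encoding`) and checks whether that charset can represent the Persian sample text. When a charset cannot encode a character, Java writers substitute `?`, which would match the output in the transcript file.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetCheck {
    // Hypothetical helper: true if every character of text can be encoded in cs.
    static boolean canEncode(Charset cs, String text) {
        return cs.newEncoder().canEncode(text);
    }

    // The charset used by APIs that are not given an explicit one.
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        String sample = "یک دو سه";
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + defaultCharsetName());
        System.out.println("default can encode sample: "
                + canEncode(Charset.defaultCharset(), sample));
        // For comparison: ASCII cannot represent the sample, UTF-8 can.
        System.out.println("US-ASCII can encode sample: "
                + canEncode(StandardCharsets.US_ASCII, sample));
        System.out.println("UTF-8 can encode sample:    "
                + canEncode(StandardCharsets.UTF_8, sample));
    }
}
```

Running it once as-is and once with `-Dfile.encoding=UTF-8`, in the same environment Jigasi runs in, should show whether the flag changes the default.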

I added -Dfile.encoding=UTF8 in /etc/services.d/jigasi/run and now I am happy :slight_smile:

So this is a bug in the Jigasi code: it should be handled there so that it always writes UTF-8 files. Any PRs are welcome :slight_smile:
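For anyone picking this up, the general shape of such a fix could look like the sketch below. It is not Jigasi's actual code: `Utf8TranscriptWriter`, `writeTranscript`, and `roundTrip` are hypothetical names, and the real change would go wherever the transcript files are opened. The idea is simply to pass `StandardCharsets.UTF_8` explicitly instead of relying on the JVM default.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8TranscriptWriter {
    // Hypothetical helper: writes one line, always encoded as UTF-8,
    // regardless of file.encoding or the OS locale.
    static void writeTranscript(Path file, String line) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            writer.write(line);
            writer.newLine();
        }
    }

    // Writes a line to a temp file and reads it back as UTF-8.
    static String roundTrip(String line) {
        try {
            Path tmp = Files.createTempFile("transcript", ".txt");
            try {
                writeTranscript(tmp, line);
                return Files.readAllLines(tmp, StandardCharsets.UTF_8).get(0);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("<1:13:12 PM> aaa: یک دو سه"));
    }
}
```

With an explicit charset on the writer, the `-Dfile.encoding` workaround should no longer be needed for the saved files.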