[sip-comm-dev] [patch] issue 330 - internationalization in resource files


#1

Hello all,

I just stumbled upon issue 330 - internationalization in resource files. Since
i18n in Java programs is kind of a hobby for me, I thought I could help out.

The attached patch changes the ant build file to pass all **/*.properties*
files through native2ascii, which converts from an arbitrary character
encoding to ASCII with properties format escapes. I chose UTF-8 as character
encoding, since it should cover everything we need. You can also pass the
language to use in the ant run task.

Also included are two example translations for German and Japanese (with
only one or two translated entries). For fun, run
    ant -Duser.language=de rebuild run
to use the German translation, or
    ant -Duser.language=ja rebuild run
for Japanese (note the title bar; you need to have a Japanese font
installed).
(The user.language system property is usually set by Java based on the
operating system language setting.)

I hope this is useful.

Regards
Michael Koch

sip-communicator-issue-330.patch (5.21 KB)


#2

Hello Michael,

(First, please accept my apologies for the lag!)

Internationalization is a very important issue that we definitely need
to be handling in the near future. We started a discussion with Jean
some time ago:

https://sip-communicator.dev.java.net/servlets/BrowseList?list=dev&by=thread&from=771948

We never brought it to a conclusion, but we will have to get back to it
in order to resolve the localization issues.

Now if I understand your patch correctly, with your build.xml
modifications we will be able to produce one installation package per
language. I am afraid that this is not exactly the way we had planned
to go. The way our discussion started with Jean, we were hoping to have
all language packs shipping with every version, and leave it up to the
user to decide and change the language they use. This is how the izPack
installers do it for example.

I hope we'll be able to get back to this soon.

We'd still be glad to have your German translation though! :-)

Cheers
Emil

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@sip-communicator.dev.java.net
For additional commands, e-mail: dev-help@sip-communicator.dev.java.net


#3

Hello Emil!

> (First, please accept my apologies for the lag!)

No problem at all.

> Now if I understand your patch correctly, with your build.xml
> modifications we will be able to produce one installation package per
> language. I am afraid that this is not exactly the way we had planned
> to go.

It seems my explanation wasn't clear enough, so you got me wrong. The patch
includes the translations in the normal Jar files, no change to the
installer is needed. Let me try to explain again.

With the way ResourceBundle.getBundle handles properties files, all you have
to do to create, say, a German translation is to copy messages.properties to
messages_de.properties and replace all English text with the German
translations. The bundle loader will load the properties files by language
and country codes from specific to general (..._de_DE.properties ->
..._de.properties -> ...properties). All keys which are not found in the
specific file will be looked up in the general file, so if something is not
translated, the original text is used. Since the build.xml already packs all
properties files into the jar file, the translations are automatically
included without any changes.
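
A minimal sketch of the lookup described above. The bundle name messages and the keys greeting and farewell are made up for illustration. The two properties files are written to a temporary directory and loaded through a URLClassLoader only to keep the example self-contained; in SIP Communicator they would simply sit inside the JAR.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.ResourceBundle;

final class BundleFallbackDemo {

    /** Looks up a key for the given locale in a throwaway "messages" bundle family. */
    static String lookup(String key, Locale locale) {
        try {
            // Temporary directory standing in for the JAR contents (left for the OS to clean up).
            Path dir = Files.createTempDirectory("i18n-demo");

            // Base (English) file: holds every key.
            Files.write(dir.resolve("messages.properties"),
                    "greeting=Hello\nfarewell=Goodbye\n".getBytes(StandardCharsets.ISO_8859_1));

            // German file: only one key is translated.
            Files.write(dir.resolve("messages_de.properties"),
                    "greeting=Hallo\n".getBytes(StandardCharsets.ISO_8859_1));

            try (URLClassLoader loader = new URLClassLoader(new URL[] { dir.toUri().toURL() })) {
                // getBundle tries messages_de_DE, then messages_de, then messages,
                // and chains the bundles it finds so that missing keys fall through.
                ResourceBundle bundle = ResourceBundle.getBundle("messages", locale, loader);
                return bundle.getString(key);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lookup("greeting", Locale.GERMANY)); // Hallo
        System.out.println(lookup("farewell", Locale.GERMANY)); // Goodbye (untranslated, from the base file)
    }
}
```

Asking for Locale.GERMANY finds no messages_de_DE.properties, so greeting comes from messages_de.properties and the untranslated farewell falls through to messages.properties.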

The only problem with this is that properties files must be encoded as ISO
8859-1, which does not work for Japanese characters (for example). The
properties format, however, allows embedding characters not in
ISO 8859-1 as Unicode escapes. The native2ascii ant task (which is a wrapper
for the native2ascii JDK tool) will convert properties files with an
arbitrary encoding to ISO 8859-1 with Unicode escapes. In my patch, I have
used this task so that the translated properties files can be saved as UTF-8
(which all modern editors should understand) and are converted on the fly to
the correct encoding when the Jar files are built.
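
To illustrate what the conversion step produces, here is a rough sketch of the escaping (not the actual native2ascii implementation). It assumes that every character outside 7-bit ASCII is written as a \uXXXX escape of its UTF-16 code unit; the class and method names are hypothetical. java.util.Properties decodes such escapes again when loading.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class UnicodeEscapeDemo {

    /** Replaces every char above 0x7F with a backslash-u escape of its UTF-16 code unit. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7F) {
                // Four lowercase hex digits, zero padded.
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    /** Parses one "key=value" line the way java.util.Properties would and returns the value. */
    static String load(String line, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(line));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // German "Gruesse" with u-umlaut and sharp s, built from code points
        // so that this source file itself stays plain ASCII.
        String german = "Gr" + (char) 0xFC + (char) 0xDF + "e";
        String escaped = escape(german);
        System.out.println(escaped);
        // Properties turns the escapes back into the original characters.
        System.out.println(german.equals(load("greeting=" + escaped, "greeting"))); // true
    }
}
```

For the input Grüße the escaped form is Gr\u00fc\u00dfe, and loading greeting=Gr\u00fc\u00dfe through Properties yields Grüße again.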

> We'd still be glad to have your German translation though! :-)

Since we are not using the GUI part of SIP-Communicator here, I don't think
I can do this on company time ;-) Perhaps I can find the time to do it at
home (since I have provided the patch, I should set an example) :-)

Regards
Michael Koch


#4

Hi Michael,

Koch Michael wrote:

> It seems my explanation wasn't clear enough, so you got me wrong. The patch
> includes the translations in the normal Jar files, no change to the
> installer is needed. Let me try to explain again.

Yes I had misunderstood indeed. My bad, sorry!

Anyway, I got it now and have committed and acked your contribution! Thanks!

I've only left the two translations out for now. We are waiting for
Collab.NET to migrate our CVS repository to SVN. They had specifically
requested to know what encodings are currently used in the repo and I
told them we only had standard ASCII. I don't know whether it would mess
something up during the transition, but I guess it's wiser to wait and
commit after we're settled on SVN.

> With the way ResourceBundle.getBundle handles property files, all you have
> to do to create, say, a German translation is to copy messages.properties to
> messages_de.properties and replace all English text by the German
> translations.

This reminds me that right now we have many of the properties files
scattered all over the place so the whole translation procedure might be
a bit difficult for people that don't know the project well. We'd
probably have to make an effort and either change our resource loading
policy or extract all these files and make them available somewhere so
that volunteers could easily translate them. I think I'd prefer the
latter. We'd probably also need to accompany these with a short wiki
entry in "Developer Documentation" explaining how to do the translation.

Your comments in the various posts and in messages.properties are pretty
much what we need. Are you interested in authoring a short translation
manual on our wiki? Let me know if so and I'll create a user for you.

Another thing that we need to do is figure out how to handle language
selection in a user-friendly way. It is probably a good idea to have our
platform-specific installers (rpm, deb, exe, dmg) set the user.language
property the same way you do (isn't Java supposed to be doing this
automatically, btw?). Another nice thing to have (but probably a bit
tricky to implement) would be a configuration form in the UI that
would allow the user to override system-wide settings (hope we'll have
the time to do this one of these days). Other suggestions?

Thanks again for the contrib!

Emil



#5

Hi Emil!

> Anyway, I got it now and have committed and acked your contribution!
> Thanks!

I'm happy I could help.

> We are waiting for Collab.NET to migrate our CVS repository to SVN.

YES! I switched from CVS to SVN three years ago, and going back to CVS
was such a pain ;-)

> This reminds me that right now we have many of the properties files
> scattered all over the place so the whole translation procedure might be
> a bit difficult for people that don't know the project well. We'd
> probably have to make an effort and either change our resource loading
> policy or extract all these files and make them available somewhere so
> that volunteers could easily translate them. I think I'd prefer the
> latter.

I'd also say that the latter approach would be nicer, but are you thinking
about somewhere else in the sources or in the installation directory? If
they were placed as separate files in the installation directory,
contributors could edit the translations without having to care about the
source and build process. The drawback would be that there would be more
work for the installer authors, since the directory layout and classpath
setting would become more complex. Speaking of classpath, since the resource
files are looked up through the class loader, and the OSGi framework has its
own classloader, I don't know if it would even be possible to load resources
which are not in the bundle JAR, making my previous ideas moot.

Stuffing the properties files in a separate directory in the source tree
would be no problem, since the build.xml could be changed so that the
properties files are put into the correct JAR files. This would mean
increasing the complexity of the build file, however, and could become a
maintenance problem in my opinion.

Perhaps it would be best to ask would-be contributors if they can work with
the existing layout and change it according to their needs.

> We'd probably also need to accompany these with a short wiki
> entry in "Developer Documentation" explaining how to do the
> translation.
>
> Your comments in the various posts and in messages.properties are pretty
> much what we need. Are you interested in authoring a short translation
> manual on our wiki? Let me know if so and I'll create a user for you.

Yes, I'd be happy to do that.

> Another thing that we need to do is figure out how to handle language
> selection in a user-friendly way. It is probably a good idea to have our
> platform-specific installers (rpm, deb, exe, dmg) set the user.language
> property the same way you do (isn't Java supposed to be doing this
> automatically, btw?). Another nice thing to have (but probably a bit
> tricky to implement) would be a configuration form in the UI that
> would allow the user to override system-wide settings (hope we'll have
> the time to do this one of these days). Other suggestions?

The user.language... properties are indeed set automatically by the Java
runtime based on the operating system settings. They are normal system
properties which can be queried with System.getProperties() and are used to
initialize the default locale (see Locale.getDefault()). (At least
this is how it works with the Sun JVM.) I added the override to the
build.xml so you can easily test SIP-Communicator with languages other than
your system language.

Changing the language based on user selection would mean changing the
default locale with Locale.setDefault(). The ResourceBundle uses the
default locale to select the translation bundle to load if no language is
specified explicitly. This would have to be done before the SIP Communicator
resources are loaded. My personal feeling, however, is that per-application
language settings are somewhat superfluous (except for debugging) and that
using the system language is OK.
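
A sketch of how such an override could look, assuming a language code (for example "de") has been read from some user setting. The class and method names are hypothetical and not part of SIP Communicator. The call has to happen before the first ResourceBundle.getBundle call that relies on the default locale.

```java
import java.util.Locale;

final class LanguageOverrideDemo {

    /**
     * Sets the JVM default locale from a user-chosen language code, or leaves the
     * system-derived default untouched when no choice was stored.
     * Returns the locale that is in effect afterwards.
     */
    static Locale applyLanguageChoice(String languageCode) {
        if (languageCode != null && !languageCode.isEmpty()) {
            Locale.setDefault(Locale.forLanguageTag(languageCode));
        }
        return Locale.getDefault();
    }

    public static void main(String[] args) {
        // What the runtime derived from the operating system (or from -Duser.language=...).
        System.out.println("user.language    = " + System.getProperty("user.language"));
        System.out.println("default locale   = " + Locale.getDefault());

        // Hypothetical user setting, passed here as the first command-line argument.
        String choice = args.length > 0 ? args[0] : null;
        System.out.println("effective locale = " + applyLanguageChoice(choice));
    }
}
```

Run without arguments, this leaves the system language in place; run with an argument such as ja, it switches the default locale for everything loaded afterwards.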

Regards
Michael Koch