[sip-comm-dev] History - special chars problem


#1

Hi Damian,

I was working on the gui for multi user chat history these days and I found some interesting problems in the HistoryService. The error which pointed me to the HistoryService was the following:

17:58:36.369 SEVERE: impl.history.HistoryImpl.isValidXML().363 not valid xml

As it was only appearing for the IRC protocol I easily found out that it was coming from the fact that IRC channels are containing # in their names. After some further investigation in the History service I noticed that in the HistoryID class there are two methods for creating history id-s: createFromID(String[] id) and createFromRawID(String[] rawid). The javadoc of the second one tells that this method could create a valid id from any String array passed as a parameter. However it's not used at all. All id-s in the MessageHistory or in the CallHisotory are created by createFromID and thus they are not checked for invalid characters (like the # one).

I have tried replacing the createFromID in getHistory and getHistoryForMultiChat in the MessageHistoryServiceImpl with createFromRawID and the same in the call history service and it works. You have just to fix in addition a little bug in the HistoryID.hasSpecialChat method at line: 238 - it should be hasSpecialChar = true;

Actually the difference between the two methods is that in the createFromRawID we adjust the Strings, so that they contain only "good" characters (XML friendly). It appears that it's just what we need.

The only problem I see is that some of the ID-s created with createFromRawID may be different from the existing ones, thus loosing the associated history records. What do you think?

Yana

···

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@sip-communicator.dev.java.net
For additional commands, e-mail: dev-help@sip-communicator.dev.java.net


#2

Hi Yana, Damian, everyone,

I must admit that the names of the functions createFromID and createFromRawID are badly chosen (my fault).

A reminder: ID = String[].

The difference between the two is the following:
1) createFromID - no processing whatsoever of the ID. It's used when you already know that your ID is "good".
2) createFromRawID - processes the passed ID. If there are "faults" in the ID, they are corrected.

"Good" and "faults" are terms depending on the concrete implementation. For example, an implementation using a DB to store the history would probably have different constraints from another one using XML.
The current implementation uses directory names to reflect the hierarchical structure of the ID + it keep the data in XML files. Thus, in order for an ID to be valid, it should contain only letters, digits + a couple of other characters. Slashes, backslashes, &, <, >, etc. cannot be present in the ID.

The intended usage was the following :
When you create a new history (for a given ID) you use createFromRawID. If you have "bad" in it characters, they are "corrected" and everything works like a charm. If, however, you have stored somewhere the already corrected ID, you don't want to give it to the same function, as it could potentially "correct" it again and you would end up with a different ID. In fact, at the time I wrote this I was struggling to find a nice way to achieve C++'s "friend" classes, but to no avail.

So, to resume:
A) Use createFromRawID when you create new databases.
B) Use createFromID when you have stored an existing HistoryID (which was created in A)).
C) Think of better names of the two functions.

The problem Yana is pointing out is serious - until now only createFromID was used to create new History data tables. That means, that if we simply start using createFromRawID, some of the IDs may not map to the old ones. For example, createFromID(["test@test.com"]) would leave the ID unchanged, while createFromRawID(["test@test.com"]) would see the "@" as a bad character and would return something like: "test_test.com%123933".

It is 100% sure that we have to start using createFromRawID.

I see two workarounds of the problem:
1) Change the history service, mark it as ver 1.1, and add a mechanism to migrate history from ver 1.0 to the new version upon installation.

2) Fine-tune the function that says which characters are "good" and which are "bad". The logic behind this is the following: until now, we didn't have any exceptions -> the HistoryIDs that were created were all valid directory names AND valid XML identifiers. So, if we fine-tune the function, we can keep 100% backward compatibility. Currently, the function is pretty rough - it keeps only letters, digits and "_". We may create a list of the currently possible IDs and see what characters should be added as "valid" ones - for example ICQ identifiers consist only of digits -> OK!, Jabber and SIP accounts are of the form SOME@THING.BLAH -> have to add "@" as a valid character, and so on.

I'd go with the second solution, as it seems the easiest thing to do AND is not a compromise with the quality of the code (does someone know of a filesystem that doesn't support the "@" character in filenames?).

These are my thoughts on the issue. What do you think about that?

Alex

···

Le 19 sept. 07 à 19:03, Yana Stamcheva a écrit :

Hi Damian,

I was working on the gui for multi user chat history these days and I found some interesting problems in the HistoryService. The error which pointed me to the HistoryService was the following:

17:58:36.369 SEVERE: impl.history.HistoryImpl.isValidXML().363 not valid xml

As it was only appearing for the IRC protocol I easily found out that it was coming from the fact that IRC channels are containing # in their names. After some further investigation in the History service I noticed that in the HistoryID class there are two methods for creating history id-s: createFromID(String[] id) and createFromRawID(String[] rawid). The javadoc of the second one tells that this method could create a valid id from any String array passed as a parameter. However it's not used at all. All id-s in the MessageHistory or in the CallHisotory are created by createFromID and thus they are not checked for invalid characters (like the # one).

I have tried replacing the createFromID in getHistory and getHistoryForMultiChat in the MessageHistoryServiceImpl with createFromRawID and the same in the call history service and it works. You have just to fix in addition a little bug in the HistoryID.hasSpecialChat method at line: 238 - it should be hasSpecialChar = true;

Actually the difference between the two methods is that in the createFromRawID we adjust the Strings, so that they contain only "good" characters (XML friendly). It appears that it's just what we need.

The only problem I see is that some of the ID-s created with createFromRawID may be different from the existing ones, thus loosing the associated history records. What do you think?

Yana

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@sip-communicator.dev.java.net
For additional commands, e-mail: dev-help@sip-communicator.dev.java.net

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@sip-communicator.dev.java.net
For additional commands, e-mail: dev-help@sip-communicator.dev.java.net