Thanks. I meant not the text content but the chars: does it
contain CDATA, or is it HTML-formatted?
Is it possible to connect somewhere and reproduce the issue myself?
That would be the easiest way to track it down.
On Tue, Feb 11, 2014 at 8:28 PM, Danny van Heumen <danny@dannyvanheumen.nl> wrote:
Hi damencho,
The message is basically: "#bitcoin: Beware of scams! Scammers are sending
users private messages with bitcoin-stealing malware and offers to trade. We
are unable to stop them, so you must protect yourself. NEVER download or run
programs from strangers! When in doubt, ask the ops.".
(In case the html formatting doesn't come through, "you must protect
yourself" is in bold.)
The IRC control char to indicate bold formatting is 0x02, and a second 0x02
ends the bold formatting. I suspect that the closing and reopening of CDATA
may be due to the HTML numeric character reference. I think that CDATA is
literal text, so a numeric character reference can only be placed outside a
CDATA section for it to be interpreted. I haven't dug deep enough to see
whether we explicitly open a new CDATA section, or whether this happens
inside some (third-party) library.
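
As a rough illustration of the 0x02 toggle described above, here is a minimal
Java sketch that turns IRC bold markers into HTML bold tags. It is not the
actual Jitsi code: the class name IrcBold and the method toHtml are made up
for this example, and it only handles bold (no colors or other IRC control
codes).

```java
final class IrcBold {
    // IRC bold toggle control character (0x02).
    private static final char BOLD = '\u0002';

    /**
     * Converts IRC bold toggles to HTML bold tags and escapes the
     * characters that have special meaning in HTML.
     */
    static String toHtml(String irc) {
        StringBuilder out = new StringBuilder(irc.length() + 16);
        boolean bold = false;
        for (int i = 0; i < irc.length(); i++) {
            char c = irc.charAt(i);
            if (c == BOLD) {
                // The first 0x02 opens bold, the next one closes it.
                out.append(bold ? "</b>" : "<b>");
                bold = !bold;
            } else if (c == '<') {
                out.append("&lt;");
            } else if (c == '>') {
                out.append("&gt;");
            } else if (c == '&') {
                out.append("&amp;");
            } else {
                out.append(c);
            }
        }
        if (bold) {
            // Close a bold run that the sender never ended.
            out.append("</b>");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHtml("so \u0002you must protect yourself\u0002. NEVER download"));
    }
}
```

With this approach the 0x02 char never reaches the history file, so no
numeric character reference is needed for it in the first place.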
Kind regards,
Danny
On 02/11/2014 08:07 AM, Damian Minkov wrote:
Hi,
The message part of the record looks strange. It contains 3 CDATA
sections, while I think it is supposed to have only one.
Can you confirm the exact message coming from that bot?
Regards
damencho
On Tue, Feb 11, 2014 at 1:01 AM, Danny van Heumen <danny@dannyvanheumen.nl> wrote:
Hi damencho,
See the attached jitsi.log file. I may have misunderstood from the error
message that the escaped char was already stored. (I didn't find it in
the XML file.) It does however truncate the log. Once I got that error,
the existing XML file was just an empty 'history' tag. I think that
isn't supposed to happen.
I get this message immediately after I get a private message from the
user (gribble), when, I guess, the message log is first loaded and this
"malformed" message is received.
I can help with testing if needed. I only need to disable parsing the
message in order to get this raw formatting code. (And the response is
from a bot, so very predictable.)
Kind regards,
Danny
On 02/10/2014 08:12 AM, Damian Minkov wrote:
Hey,
can you send me a fragment of such a broken history XML file, so I can
take a look? I think we already escape some chars.
Thanks
damencho
On Sun, Feb 9, 2014 at 5:53 PM, Danny van Heumen <danny@dannyvanheumen.nl> wrote:
Hi,
The way message history is stored in Jitsi currently, it is possible to
corrupt the message history file. Also, when the history file gets
corrupted, the file gets truncated, because the XML is invalid and therefore
isn't parsed correctly and preserved. The root cause for this is that there
are still some chars that are invalid, even as a numeric character reference
(the &#123; form). See
http://en.wikipedia.org/wiki/Character_encodings_in_HTML#Illegal_characters
for a list of illegal characters.
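
To make this concrete, here is a minimal Java sketch showing that a numeric
character reference to a control char is rejected by an XML parser. It
assumes the history file is XML 1.0 and is read with the JDK's default
parser; the class name IllegalCharRefDemo and the method isWellFormed are
made up for this example and are not part of Jitsi.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class IllegalCharRefDemo {
    /** Returns true if the given text parses as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Thrown for any well-formedness error, including illegal chars.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A reference to a printable char.
        System.out.println(isWellFormed("<msg>&#123;</msg>"));
        // A reference to the IRC bold control char 0x02.
        System.out.println(isWellFormed("<msg>&#2;</msg>"));
        // The same control char placed raw inside a CDATA section.
        System.out.println(isWellFormed("<msg><![CDATA[a\u0002b]]></msg>"));
    }
}
```

Only the first document should be accepted. Escaping alone therefore does
not help for these chars: they have to be removed or converted before the
record is written.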
I encountered this by accident as IRC has some control codes in the range
0-31, which is an illegal range in HTML and XML. I currently drop these
control characters, and once I get HTML formatting set up I will convert
them to actual formatting codes.
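
Dropping the control characters could look roughly like the Java sketch
below. It is only an illustration, not the code currently in Jitsi: the
class name XmlCharFilter and both method names are made up, and the allowed
ranges are taken from the Char production of XML 1.0 (which does permit tab,
line feed and carriage return, unlike the rest of the 0-31 range).

```java
final class XmlCharFilter {
    /** Returns true if the code point is a legal XML 1.0 character. */
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the text with every character that XML 1.0 forbids removed. */
    static String stripIllegal(String text) {
        StringBuilder out = new StringBuilder(text.length());
        // Iterate by code point so that surrogate pairs stay intact.
        text.codePoints()
                .filter(XmlCharFilter::isXml10Char)
                .forEach(out::appendCodePoint);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripIllegal("so \u0002you must protect yourself\u0002."));
    }
}
```

Applying a filter like this at the point where history records are written
(rather than in each protocol) would also cover other protocols that can
deliver such chars.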
The actual problem arises when Jitsi is started the next time after having
received such a message with illegal chars, and you "get in contact" with
the history file. For example, by again chatting with the same contact that
sent you the illegal character previously. While the history file is being
opened, a parser exception is thrown and history is truncated.
I believe this should be fixed in the general history processor. (Somewhere
around MessageHistoryServiceImpl)
Danny
_______________________________________________
dev mailing list
dev@jitsi.org
Unsubscribe instructions and other list options:
http://lists.jitsi.org/mailman/listinfo/dev