[jitsi-dev] Workings of ChatConversation methods processMessage, formatMessage, processLinksAndHTMLChars (escaping contact names)


#1

Hi all,

I have encountered an issue that relates closely to how ChatConversation
processes messages. It started with the following interesting
discovery: '#<a%20href="www.google.com">Test</a>' is actually a valid
IRC channel name. It turns out that all but a handful of characters
are allowed in IRC channel names.
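
To illustrate how permissive this is, here is a rough sketch of a channel name check. It assumes the restrictions as I read them in RFC 2812, section 1.3 (a prefix of '#', '&', '+' or '!', at most 50 characters, no space, no control-G, no comma); I additionally reject NUL, CR and LF since those cannot appear in an IRC parameter. The class and method names are made up for this example and are not part of Jitsi, and individual servers may well be stricter.

```java
// Hypothetical helper, not part of Jitsi: checks a channel name against
// the restrictions described in RFC 2812, section 1.3.
final class IrcChannelNames {
    private IrcChannelNames() {
    }

    static boolean isValid(String name) {
        // Channel names are limited to 50 characters, prefix included.
        if (name == null || name.isEmpty() || name.length() > 50) {
            return false;
        }
        // The first character must be one of the channel prefixes.
        if ("#&+!".indexOf(name.charAt(0)) < 0) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            char c = name.charAt(i);
            // Space, comma and control-G (BEL) are forbidden by the RFC.
            // NUL, CR and LF are rejected here as an extra assumption.
            if (c == ' ' || c == ',' || c == '\u0007'
                    || c == '\0' || c == '\r' || c == '\n') {
                return false;
            }
        }
        return true;
    }
}
```

With this check, the channel name above passes, because '<', '>', '"' and '%' are not among the excluded characters.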

The code in Jitsi does not handle this case correctly. On top of
that, I found several related problems:

1. Chat room/contact names are not escaped (and there is no method for
requesting an "HTML-friendly" version of the name).
2. In ChatConversation.processMessage: STATUS_MESSAGE, ACTION_MESSAGE
and ERROR_MESSAGE do not escape message content at all.
3. Additionally, ERROR_MESSAGE also displays a message title, and this
isn't escaped either.
4. HTML styles, such as <H4>, are lost after <plaintext> is opened for
the first time (because of the use of the HTML tag <plaintext>?). For
this reason I cannot use the existing formatting/escaping
infrastructure, as styling would be lost almost immediately.
5. I have not confirmed this in other use cases, but it looks like
formatMessage() closes the plaintext tag (</plaintext>) without it ever
being opened. This may be due to incorrect use on my part. (Also, it
ends with an open <plaintext> tag which is not closed as part of the

In the original situation, as a result of the above issues, a mix of HTML
and the original plain text message is inserted into the chat conversation
as HTML.
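
To make the problem concrete, here is a minimal sketch of the kind of escaping that is missing for names, titles and message content. The class HtmlEscaper and its escape method are hypothetical names for this example; I am not aware of an existing Jitsi method with this contract.

```java
// Hypothetical helper, not existing Jitsi code: replaces the characters
// that have a special meaning in HTML with character references.
final class HtmlEscaper {
    private HtmlEscaper() {
    }

    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '"':
                    out.append("&quot;");
                    break;
                case '\'':
                    out.append("&#39;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }
}
```

Calling HtmlEscaper.escape on the channel name '#<a%20href="www.google.com">Test</a>' yields '#&lt;a%20href=&quot;www.google.com&quot;&gt;Test&lt;/a&gt;', which is displayed literally instead of being rendered as a link.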

I would like to change this behavior, and I believe the fix should be
fairly straightforward to implement: I suggest constructing an HTML
message and, every time a "text/plain" message (part) is to be added,
escaping any HTML special characters before appending it. The exception
is when we also need to parse parts of the plain text in order to inject
hyperlinks when a URL is discovered. But that is just one more use case
that we should incorporate into the message formatting process. I would
like to avoid <plaintext> altogether.
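
A rough sketch of what I have in mind, under some assumptions: the class PlainTextToHtml, its methods and the URL pattern are all made up for this example and are not the existing ChatConversation code. The idea is to walk through the plain text, escape everything between URLs, and wrap each URL (escaped as well) in an anchor tag.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not existing Jitsi code: converts a plain text
// message part to HTML without relying on the <plaintext> tag.
final class PlainTextToHtml {
    // Deliberately simple pattern (an assumption for this sketch): it only
    // recognizes http/https URLs and stops at whitespace, '<', '>' or '"'.
    private static final Pattern URL =
        Pattern.compile("https?://[^\\s<>\"]+");

    private PlainTextToHtml() {
    }

    // Escapes HTML special characters; '&' must be replaced first.
    static String escape(String text) {
        return text.replace("&", "&amp;")
            .replace("<", "&lt;")
            .replace(">", "&gt;")
            .replace("\"", "&quot;")
            .replace("'", "&#39;");
    }

    static String format(String plain) {
        StringBuilder html = new StringBuilder();
        Matcher m = URL.matcher(plain);
        int last = 0;
        while (m.find()) {
            // Text before the URL is escaped and appended as-is.
            html.append(escape(plain.substring(last, m.start())));
            // The URL itself is escaped too, for both href and link text.
            String url = escape(m.group());
            html.append("<a href=\"").append(url).append("\">")
                .append(url).append("</a>");
            last = m.end();
        }
        // Remaining text after the last URL.
        html.append(escape(plain.substring(last)));
        return html.toString();
    }
}
```

For example, format("see http://jitsi.org <b>now</b>") gives 'see <a href="http://jitsi.org">http://jitsi.org</a> &lt;b&gt;now&lt;/b&gt;'. A known limitation of this simple pattern is that trailing punctuation directly after a URL becomes part of the link.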

I am curious what you think about this before I actually start
making changes. I am aware that this is quite a crucial bit of code,
given that it formats (all?) messages that are displayed in the chat
window. Maybe I have overlooked an obvious solution? Do you see any
problems with my suggestion? (Patches are welcome :wink:)

Thanks in advance,
Danny


#2

Hi all,

I am currently looking into the processing and formatting code described
below. I am not confident that I understand it completely, so I'd really
like some feedback on a few questions and remarks here.
Please correct me where I am wrong.

1. It looks like every message eventually gets turned into a chunk of
HTML text, which represents a header and a message, with attributes to
guide styling and to enable searching.
2. So far, I have been able to find support for two different types of
messages: HTML and plain text.
3. Input in HTML format is mainly (if not only) supported for the
message (body) itself.
4. I noticed there is a replacement service too. Is this the same
replacement service that would replace a YouTube link with a video
thumbnail and such, or is it for another purpose?

Thanks in advance,
Danny

On 08/03/2014 08:53 PM, Danny van Heumen wrote:

