[sip-comm-dev] About data detection project idea in GSoC 2008


I am a prospective GSoC 2008 student and am interested in SIP
Communicator's project idea of detecting data in IM conversations.

The Apple Mail example you gave was helpful, but I don't use Apple products
or a Mac, so I am not familiar with it.

From the snapshot, though, it appears that the main goal would be to parse
the whole text and find keywords like "Address", "Phone no" or "E-mail".
That task on its own seems easy.

My idea is to add some artificial intelligence to it. For example, it
should automatically detect strings such as a place name (e.g. India) and
then search for the complete address around it.

The approach could be: search for names of countries, states or known
places (landmarks), and then extract the text in the IM conversation that
is in proximity to that place.

For example, if we find something like:

" Flat no-12 ,ABC Colony,J M Street ,Pune, India"

We can first identify that India is a known country. In an address it
would occur at the end (just before the PIN or ZIP code), so we can go on
extracting strings until keywords like "Street", "Village" or "Town"
appear, and treat the result as an address.


We can go on parsing until India is reached and then go backwards for as
long as the text is comma-separated, because commas are used when writing
addresses.

The gist of my approach above is to find the START and END of the string
that is an address, and then apply functionality to it.
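
To make the START/END idea concrete, here is a rough Java sketch. It is only
an illustration under my own assumptions: the class name AddressSketch, the
tiny COUNTRIES list, the MAX_PARTS limit and the rule "a sentence boundary
marks the START" are all hypothetical and are not part of SIP Communicator.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Locale;
import java.util.Set;

final class AddressSketch {
    // Hypothetical, tiny gazetteer; a real one would need far more place names.
    static final Set<String> COUNTRIES = Set.of("india", "france", "germany");
    // Assumed upper bound on the number of comma-separated parts in an address.
    static final int MAX_PARTS = 6;

    /** Returns the guessed address that ends at a known country, or null. */
    static String findAddress(String message) {
        String[] segments = message.split(",");
        for (int end = segments.length - 1; end >= 1; end--) {
            // END: a segment whose first sentence is exactly a known country.
            String country = segments[end].trim().split("[.!?;:]", 2)[0].trim();
            if (!COUNTRIES.contains(country.toLowerCase(Locale.ROOT))) {
                continue;
            }
            Deque<String> parts = new ArrayDeque<>();
            parts.addFirst(country);
            // START: walk backwards over the comma-separated segments.
            for (int i = end - 1; i >= 0 && parts.size() < MAX_PARTS; i--) {
                String[] sentences = segments[i].trim().split("[.!?:]\\s+");
                String tail = sentences[sentences.length - 1].trim();
                if (tail.isEmpty()) {
                    break;
                }
                parts.addFirst(tail);
                if (sentences.length > 1) {
                    break; // a sentence boundary marks the start of the address
                }
            }
            if (parts.size() > 1) {
                return String.join(", ", parts);
            }
        }
        return null;
    }
}
```

Calling AddressSketch.findAddress on the example above would give back the
five comma-separated parts from "Flat no-12" to "India". Abbreviations with
a period (such as "St. ") would confuse the START rule, so this would need
refinement.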

Phone numbers and e-mail addresses, on the other hand, follow a particular
standard (e.g. 10-digit numbers prefixed with +91 or starting with 9) and
can quite simply be detected by parsing the text. The same is not true for
addresses.
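
As a rough illustration of the phone and e-mail case, the patterns could
look like the Java sketch below. The class name ContactPatternSketch and
both regular expressions are my own assumptions: the phone pattern only
covers the format described above, and the e-mail pattern is a deliberately
loose approximation, not a full validator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ContactPatternSketch {
    // Assumed format: "+91" followed by 10 digits, or 10 digits starting with 9.
    static final Pattern PHONE =
        Pattern.compile("(?<![\\d+])(?:\\+91[\\s-]?\\d{10}|9\\d{9})(?!\\d)");

    // Loose approximation of an e-mail address: local part, '@', dotted domain.
    static final Pattern EMAIL =
        Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    /** Returns every substring of the message that matches the pattern. */
    static List<String> findAll(Pattern pattern, String message) {
        List<String> hits = new ArrayList<>();
        Matcher matcher = pattern.matcher(message);
        while (matcher.find()) {
            hits.add(matcher.group());
        }
        return hits;
    }
}
```

Matcher.start() and Matcher.end() would give the positions of each hit in
the message, which is what the UI would need to highlight it.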

I was also thinking of using a JavaBean component for the detected data.

For example, clicking on the detected data (the address) should put it in
an editable text box so the user can confirm it is correct before mapping
it or adding it to the contact list. Right-clicking on the detected data
should display the tasks available for it (add to contacts, show on a map,
etc.).
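
A minimal sketch of what such a bean might look like is below. Everything
in it is hypothetical: the class name DetectedData, the Kind values, the
property names and the action labels are placeholders I made up, and the
choice of actions per kind is only a guess. The UI (text box, right-click
menu) would listen for property changes and read getAvailableActions().

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.io.Serializable;
import java.util.List;

class DetectedData implements Serializable {
    private static final long serialVersionUID = 1L;

    public enum Kind { ADDRESS, PHONE, EMAIL }

    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private Kind kind;
    private String text = "";
    private boolean confirmed;

    public DetectedData() { } // JavaBeans need a public no-argument constructor

    public Kind getKind() { return kind; }
    public void setKind(Kind kind) {
        Kind old = this.kind;
        this.kind = kind;
        changes.firePropertyChange("kind", old, kind);
    }

    public String getText() { return text; }
    public void setText(String text) { // called when the user edits the text box
        String old = this.text;
        this.text = text;
        changes.firePropertyChange("text", old, text);
    }

    public boolean isConfirmed() { return confirmed; }
    public void setConfirmed(boolean confirmed) {
        boolean old = this.confirmed;
        this.confirmed = confirmed;
        changes.firePropertyChange("confirmed", old, confirmed);
    }

    /** Labels for the right-click menu; assumed, not taken from the project. */
    public List<String> getAvailableActions() {
        if (kind == null) {
            return List.of();
        }
        if (kind == Kind.ADDRESS) {
            return List.of("Add to contacts", "Show on map");
        }
        return List.of("Add to contacts");
    }

    public void addPropertyChangeListener(PropertyChangeListener listener) {
        changes.addPropertyChangeListener(listener);
    }
    public void removePropertyChangeListener(PropertyChangeListener listener) {
        changes.removePropertyChangeListener(listener);
    }
}
```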

Please review my ideas and tell me whether I am on the right path or
whether something else is expected.


Rana Vishal Singh