One question that got me thinking during the interviews with Google was, “Do you have any experience in building an NLP tool, like a tagger or a parser, for the Indonesian language?”, and my answer was, “Well, ehem, not yet.” I wonder why…
That’s why, during the last Christmas/New Year break *while waiting for the result of the interviews*, I decided to do something for the Indonesian language :”>. Actually, it is almost the same thing I had already done for Italian… building an automatic extraction system for Indonesian temporal expressions!
Extraction means recognizing time expressions in a given text, then normalizing their values. For example, if today’s date is March 25, 2015 (2015-03-25), then when the system finds dua hari yang lalu [two days ago], the value will be normalized as 2015-03-23. I called the system: IndoTimex!
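To make the recognize-then-normalize idea concrete, here is a tiny Python sketch of my own. It is not how IndoTimex is actually implemented (see the paper for that). It only handles the single pattern "<number word> hari yang lalu", and the names `NUMBER_WORDS`, `PATTERN` and `extract_timex` are made up for this illustration.

```python
import re
from datetime import date, timedelta

# Hypothetical lookup of Indonesian number words (illustrative, not exhaustive).
NUMBER_WORDS = {
    "satu": 1, "dua": 2, "tiga": 3, "empat": 4, "lima": 5,
    "enam": 6, "tujuh": 7, "delapan": 8, "sembilan": 9, "sepuluh": 10,
}

# Recognition step: match "<number word> hari yang lalu" ("<n> days ago").
PATTERN = re.compile(
    r"\b(" + "|".join(NUMBER_WORDS) + r")\s+hari\s+yang\s+lalu\b",
    re.IGNORECASE,
)

def extract_timex(text, today):
    """Return (expression, normalized ISO date) pairs found in text."""
    results = []
    for match in PATTERN.finditer(text):
        days = NUMBER_WORDS[match.group(1).lower()]
        # Normalization step: subtract the number of days from the reference date.
        value = today - timedelta(days=days)
        results.append((match.group(0), value.isoformat()))
    return results

print(extract_timex("Dia tiba dua hari yang lalu.", date(2015, 3, 25)))
# [('dua hari yang lalu', '2015-03-23')]
```

A real system needs many more patterns (weeks, months, explicit dates, durations and so on), but the two steps stay the same: find the expression, then compute its value relative to a reference date.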
The online demo of IndoTimex is available here.
The complete system, implemented in Python, is available (for download) here.
And… since there was a conference deadline around that time, PACLING 2015 *which will be held in Bali! :D*, I submitted a paper about it and it got accepted. So, to know more about the technology behind the system, please read the paper here.
If everything goes well, I will soon visit Bali (and definitely, also home) for a vacation with my family, oh, and also for the conference ;). This is what we call, in an Indonesian proverb, “sambil menyelam minum air” [drinking water while diving, i.e. killing two birds with one stone], hohoho…