Category: Study

Keep Pushing (the Boundary)

I was re-organizing my bookmarks the other day, which were just a collection of links that I found interesting… at some point in my life. Some of them can’t even be opened anymore.

Anyway, I found this one that is actually quite relevant to my situation now:

Matt Might, a professor in Computer Science at the University of Utah, created The Illustrated Guide to a Ph.D. to explain what a Ph.D. is to new and aspiring graduate students. [Matt has licensed the guide for sharing with special terms under the Creative Commons license.]

It reminds me how small my ‘contribution to knowledge’ is, and not to forget the big picture. So yeah… let’s keep on pushing! 🙂

So… What’s up?

It has been 5 months since the last time I wrote here. What has happened since then? A lot!

First and foremost, I’ve finally finished my PhD! 😀 This was probably the main reason why I’d been neglecting my blog for so long. For the first three months of this year I was basically in a frenzy of thesis writing, experiments, paper writing and… job hunting. After the thesis was submitted on March 25, the madness continued with preparations for the PhD defense: making presentation slides, hosting my sister and cousin (who came to give support on D-day, thanks a lot, Dita and Dhika! :*) and arranging graduation parties. Finally, the D(efense) day was April 12, on which I was declared Dr. Paramita Paramita! 🙂

This was followed by another frenzy of post-doc position hunting, sending my CV here and there, and a series of interviews (onsite and online). Considering the options for the future also gave me a headache. Anyway, I’ve picked the one that I believe to be the best for now, let’s see…

Life was definitely much more relaxed in May, but… my domain paramitopia.com expired, and the website where I registered this domain doesn’t offer an easy way to pay the renewal fee :(. So, I will definitely change the domain registrar. In the meantime, I bought another domain, paramitamirza.com, which sounds more professional ;). I plan to host my academic/professional profile (publications, research activities, research output, etc.) on this domain, along with my blog. For now, only the blog is set up, but at least I can start writing blog posts again.

And now it’s June already!

I will try my best to fill this blog again with the things that happened during that 5-month hiatus. Stay tuned! 😉

IndoTimex for Indonesian Temporal Expressions

One question that got me thinking during the interviews with Google was, “Do you have any experience in building an NLP tool, like a tagger or a parser, for the Indonesian language?”, and my answer was, “Well, ehem, not yet.” I wonder why…

That’s why, during the last Christmas/New Year break *while waiting for the result of the interviews*, I decided to do something for the Indonesian language :”>. Actually, it is almost the same thing I already did for Italian… building an automatic extraction system for Indonesian temporal expressions!

Extraction means recognizing time expressions in a given text, then normalizing their values. For example, if today’s date is March 25, 2015 (2015-03-25), then when the system finds dua hari yang lalu [two days ago], the value will be normalized to 2015-03-23. I called the system: IndoTimex!
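
To make the example above concrete, here is a minimal sketch of the recognize-then-normalize idea in Python. It is not how IndoTimex is actually implemented; the pattern, the small number-word table, and the function name normalize_relative are all assumptions made up for illustration, and they only cover expressions of the form "<number> hari/minggu yang lalu" [<number> days/weeks ago].

```python
import re
from datetime import date, timedelta

# Hypothetical, deliberately tiny lexicon (illustration only).
NUMBER_WORDS = {
    "satu": 1, "dua": 2, "tiga": 3, "empat": 4, "lima": 5,
    "enam": 6, "tujuh": 7, "delapan": 8, "sembilan": 9, "sepuluh": 10,
}
UNIT_DAYS = {"hari": 1, "minggu": 7}  # day, week

# Recognition step: "<quantity> <unit> yang lalu" ("... ago").
PATTERN = re.compile(r"\b(\w+) (hari|minggu) yang lalu\b", re.IGNORECASE)


def normalize_relative(text, reference):
    """Return (expression, ISO date) pairs for '... yang lalu' expressions.

    `reference` is the date the text was written, used as the anchor.
    """
    results = []
    for match in PATTERN.finditer(text):
        quantity_word = match.group(1).lower()
        unit = match.group(2).lower()
        if quantity_word.isdigit():
            quantity = int(quantity_word)
        elif quantity_word in NUMBER_WORDS:
            quantity = NUMBER_WORDS[quantity_word]
        else:
            continue  # unknown quantity word: skip rather than guess
        # Normalization step: subtract the offset from the reference date.
        value = reference - timedelta(days=quantity * UNIT_DAYS[unit])
        results.append((match.group(0), value.isoformat()))
    return results


print(normalize_relative("Dia tiba dua hari yang lalu.", date(2015, 3, 25)))
```

With the reference date set to 2015-03-25, the expression dua hari yang lalu comes out as 2015-03-23, matching the example above. A real system needs far more than this, e.g. absolute dates, durations, and many more lexical variants.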

Of A Very Shiny Crystal Ball Called Google (Part 2)


Zürichsee, from Arboretum, a small park at the very top of the lake

Exactly one day after I published this blog post here, a recruiter from Google contacted me, saying that there is an open position fitting my profile:

Analytical Linguist, NLU in Machine Intelligence (Thai and Indonesian)

Understanding natural language is at the core of Google’s technologies. The Natural Language Understanding (NLU) team in Google Research guides, builds, and innovates methodologies around semantic analysis and representation, syntactic parsing and realization, morphology and lexicon development. Our work directly impacts Conversational Search in Google Now, the Knowledge Graph, and Google Translate, as well as other Machine Intelligence research.

As an Analytical Linguist, you will collaborate with Researchers and Engineers in NLU/Machine Intelligence to achieve high quality data that improves our ability to understand and generate natural language systems. To this end, you will also be managing a team of junior linguists and vendors to derive linguistic databases as well as propose direction for approaches to language specific problems.

Target languages: Thai and Indonesian

My first reaction was… amazement. How come, right after I posted a piece about my epic failure at the previous interview?? 😀 What amazed me even more was that this opportunity answered my question back then, quoted exactly as it is: “what else could be the reason to work at Google other than having something ‘shiny’ in your CV? :p”

Main NLP/CL 2015 Conference Deadlines

It’s a bit late, I know *I was caught up with some of these deadlines*, but here is the list of deadlines for the main Natural Language Processing (NLP) or Computational Linguistics (CL) conferences in 2015. I also put the conferences’ important dates in Google Calendar, and made them publicly available at the following URLs:

The calendar’s timezone is GMT+01:00, since I’m in Italy. I couldn’t find a way to make the timezone adjustable according to your own calendar. If you have an idea please let me know.

How to subscribe to a Google public calendar? See here.

| Conference | Submission Date | Notification Date | Conference Date | Location |
| --- | --- | --- | --- | --- |
| NAACL 2015 | Dec 5, 2014 | Feb 20, 2015 | Jun 1-3, 2015 | Denver, Colorado |
| SIGIR 2015 (long paper) | Jan 28, 2015 (Jan 21, 2015) | Apr 20, 2015 | Aug 9-13, 2015 | Santiago, Chile |
| CICLing 2015 | Feb 1, 2015 (Jan 25, 2015) | | Apr 14-20, 2015 | Cairo, Egypt |
| IJCAI 2015 | Feb 12, 2015 (Feb 8, 2015) | Apr 16, 2015 | Jul 25-31, 2015 | Buenos Aires, Argentina |
| SIGIR 2015 (short paper) | Feb 18, 2015 | Apr 20, 2015 | Aug 9-13, 2015 | Santiago, Chile |
| ACL-IJCNLP 2015 (long paper) | Feb 27, 2015 | Apr 23, 2015 | Jul 26-31, 2015 | Beijing, China |
| *SEM 2015 | Mar 6, 2015 | Mar 30, 2015 | Jun 4-5, 2015 | Denver, Colorado |
| EAMT 2015 | Mar 7, 2015 | Mar 31, 2015 | May 11-13, 2015 | Antalya, Turkey |
| Interspeech 2015 | Mar 20, 2015 | Jun 1, 2015 | Sep 6-10, 2015 | Dresden, Germany |
| ACL-IJCNLP 2015 (short paper) | Apr 30, 2015 | Jun 8, 2015 | Jul 26-31, 2015 | Beijing, China |
| SIGDIAL 2015 | Apr 30, 2015 | Jun 12, 2015 | Sep 2-4, 2015 | Prague, Czech Republic |
| RANLP 2015 | May 4, 2015 (Apr 27, 2015) | Jun 22, 2015 | Sep 7-9, 2015 | Hissar, Bulgaria |
| CoNLL 2015 | May 4, 2015 | Jun 15, 2015 | Jul 30-31, 2015 | Beijing, China |
| EMNLP 2015 (long paper) | May 31, 2015 | Jul 24, 2015 | Sep 19-21, 2015 | Lisbon, Portugal |
| EMNLP 2015 (short paper) | Jun 15, 2015 | Jul 24, 2015 | Sep 19-21, 2015 | Lisbon, Portugal |

Of A Very Shiny Crystal Ball Called Google

Imagine this feeling. You have this very shiny crystal ball in your hands, and you can’t wait to put it safely on a shelf, so you can show it off to everyone. However, it’s reaaally heavy, and it takes a lot of effort and care to place it safely on the shelf. Just when you think you’re really close, you somehow let it go. You stumble, or your hands just give up because it’s too heavy. It then falls and breaks into pieces.

That’s exactly what I felt last weekend.

When someone from Google contacted me, asking whether I would be interested in doing a summer internship there, working with him, I was ecstatic. I met this guy at a conference, and luckily he was quite interested in my research. I’ve been trying to get an internship position there *well, actually anywhere, big companies are preferable though ;)* since last year, but without knowing anyone who could recommend me, it’s hardly possible to even get an interview.

Reading the email, I was jumping around like crazy. It was my boyfriend who brought me back down to earth, saying this didn’t mean that I would get the internship. He was right. The guy confirmed it: I still had to go through the regular protocol, including passing the dreaded technical interview. So this was like a ticket, not for the internship itself, but for a chance to get one.

I got two phone interviews scheduled three weeks after that, 45 minutes each. The first one was purely technical, live coding on Google Docs. The second one was more focused on the research I am doing. The first interview was more terrifying than the second one. Talking about my own research was easier than solving a random coding problem, no matter how basic it is :p. Before the interviews I was studying like crazy, recalling all the basics about data structures and algorithms, practicing on Topcoder, reading all the tips and tricks. On the day of the first interview, I thought I was ready, or maybe more like I was trying to convince myself that I was ready. It turned out no, I was certainly not.

The Story of Tim and Casey

Tim and Casey have been best buddies since high school. One day, after years, they meet up in a bar for some drinks, sharing their work life stories, cursing at their bosses ;). They’re working at a company named McRels Inc. It turns out that their jobs are very similar. Both of them are working with text, news to be exact. Tim’s main responsibility is reading news, then ordering every event happening in the news on a timeline, guessing whether an event comes after or before another event. Casey started his current job just recently. His job is to decide whether there is causality between two events.

Tim: “So, when I am given a text Typhoon Haiyan struck the eastern Philippines on Friday, killing thousands of people, I should be able to guess that the ‘struck’ event happens before the ‘killing’ event.”
Casey: “Aaah, I see… For me, I must decide whether the ‘struck’ event caused the ‘killing’ event or not.”

Casey’s job seems to be easier since the decision is binary: yes or no (well, also to decide which one is the cause and which one is the effect), but it’s actually much more difficult than Tim’s. One reason is that, unlike Tim, Casey doesn’t have enough resources to learn how to decide on the causality. Moreover, the concept of causality is more abstract than temporal ordering.

Tim is very lucky, because he could participate in a challenge on guessing the event ordering. As we know, competition can lead people to perform their best—that is, it can improve their quality of performance<fn>Paramita Mirza and Sara Tonelli. 2014. Classifying Temporal Relations with Simple Features. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, pages 308–317, Gothenburg, Sweden, April. Association for Computational Linguistics.</fn>. Unfortunately, Casey doesn’t get that chance.

First of all, Casey needs to build resources for learning how to decide on the causality between events, so he hires minions to do that. He can only afford two minions since he’s low on budget.

The minions are not so smart, so he needs to set up guidelines for them to annotate causal information in text<fn>Paramita Mirza, Rachele Sprugnoli, Sara Tonelli and Manuela Speranza. 2014. Annotating causality in the TempEval-3 corpus. In Proceedings of the EACL 2014 Workshop on Computational Approaches to Causality in Language (CAtoCL), pages 10–19, Gothenburg, Sweden, April. Association for Computational Linguistics.</fn>. Tim offers to let him use his available resources, a collection of texts already annotated with all the events they contain. Having the resources annotated with causal information, Casey can finally learn how to identify causality between events in text<fn>Paramita Mirza and Sara Tonelli. 2014. An Analysis of Causality between Events and its Relation to Temporal Information. (to appear) in Proceedings of the 25th International Conference on Computational Linguistics, Dublin, Ireland.</fn>.

Since, in theory, causality has a temporal constraint (the cause happens before the effect), Tim and Casey have an idea to cooperate in order to improve their learning abilities. The idea still needs some discussion, involving more meetings in a bar… with a lot of drinks, I suppose :).

P.S.: in case you don’t get the metaphors, this story is basically my PhD topic <fn>Paramita Mirza. 2014. Extracting Temporal and Causal Relations between Events. In Proceedings of the ACL 2014 Student Research Workshop, pages 10–17, Baltimore, MD, United States, June.</fn>, where Tim and Casey are automatic systems for extracting temporal and causal relations, respectively. And the two minions are actually me and my advisor :D.
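
For the curious, the temporal constraint that Tim and Casey want to exploit (the cause happens before the effect) can be sketched as a tiny consistency check in Python. This is only an illustration of the constraint, not the method from the papers above; the label names BEFORE, AFTER, CAUSE and CAUSED_BY and the function is_consistent are hypothetical choices made for this sketch.

```python
from typing import Optional

# Hypothetical label sets for an event pair (e1, e2), for illustration only.
BEFORE, AFTER = "BEFORE", "AFTER"        # Tim's output: e1 BEFORE/AFTER e2
CAUSE, CAUSED_BY = "CAUSE", "CAUSED_BY"  # Casey's output: e1 CAUSE/CAUSED_BY e2


def is_consistent(temporal: Optional[str], causal: Optional[str]) -> bool:
    """Check that a causal label does not contradict a temporal label.

    None means the corresponding system made no decision for the pair,
    in which case there is nothing to contradict.
    """
    if temporal is None or causal is None:
        return True
    if causal == CAUSE:       # e1 causes e2, so e1 must come before e2
        return temporal == BEFORE
    if causal == CAUSED_BY:   # e2 causes e1, so e1 must come after e2
        return temporal == AFTER
    raise ValueError(f"unknown causal label: {causal}")


# "struck" (e1) and "killing" (e2) from the typhoon example.
print(is_consistent(BEFORE, CAUSE))  # Tim and Casey agree
print(is_consistent(AFTER, CAUSE))   # Tim and Casey contradict each other
```

A check like this could flag pairs where the two systems disagree, which is one simple way they might help each other.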

Updates on PhD Life

picture is taken from PhD Comics

In order to be consistent with ‘one blogpost per month’ this year *damn, I was more productive back then…*, I will just write some updates on my PhD life, which is going quite well. I’m halfway through my second year now, and I can say.. I’m back on track, after being a bit lost during the first year :).

Top NLP/CL Conferences and Journals

picture is taken from here

I got this tip from the Research Methodology course that I took last year, which is mandatory for first-year students in my PhD program: one of the first steps in PhD study is knowing your community, i.e. the conferences and journals for your field, where people with the same interests gather and share their knowledge and research.

For the Natural Language Processing or Computational Linguistics field, Google Scholar computes a ranking of publications (including conferences, journals and workshops) whose published papers are cited the most *on the assumption that good papers are usually cited a lot*, which looks like the following: