Monthly Archives: August 2012

Ways To Say Goodbye (You Died)

[youtube]http://www.youtube.com/watch?v=GSBFehvLJDc[/youtube]

Okay, just a quick post. I’m in the middle of composing the ‘best’ part of my thesis now, trying to give it… the ‘soul’ that will make people who read it say, “This is the best thesis I’ve ever read!”. It’s really hard, no kidding, because what will most probably happen is that they will just read the first paragraph and say, “What the hell is she talking about?”.

Back to the topic… I’m so into this song now, 50 Ways to Say Goodbye by Train. The Mexican theme and tune at the very beginning (and also in the interlude) of the song hooked me from the start ;). Not to mention the funny lyrics, I love them!

Gotta go.. and back to work. Ciao ciao!

French & Italian Grade to ECTS

After receiving my transcript for my first year of university in France (with the ECTS grade and the equivalent Italian grade also provided), I want to correct the French grading system conversion that I posted before. One thing that I can say for sure (I think) is that the conversion depends on the university.

The French grading system is based on a 20-point scale. What I heard is that you have to get at least 10 points to be considered as ‘passed’.

French Grade ECTS Grade
17 – 20 A
14 – 16.99 B
12 – 13.99 C
10 – 11.99 D

This conversion table is deduced from my transcript *yep, my transcript covers quite a range ;p*. I’m not sure about grades below 10, or about the lower bound of A, which is either 16 or 17. I don’t have a 16, so I can’t tell.
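As a rough sketch, the table above could be written as a small Python function. The function name `french_to_ects` and the choice to return `None` below 10 are assumptions made for illustration; the thresholds come only from my own transcript, so another university may well use different ones.

```python
def french_to_ects(grade):
    """Map a French grade (20-point scale) to an ECTS letter.

    Thresholds follow the table deduced from one transcript.
    The lower bound of A is assumed to be 17 here; it might be 16.
    """
    if not 0 <= grade <= 20:
        raise ValueError("French grades run from 0 to 20")
    if grade >= 17:
        return "A"
    if grade >= 14:
        return "B"
    if grade >= 12:
        return "C"
    if grade >= 10:
        return "D"
    # Below 10: the transcript gives no information, so no letter is guessed.
    return None


print(french_to_ects(13.5))
```

Returning `None` instead of a failing letter keeps the function honest about what the table does not cover.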

The Italian grading system, on the other hand, uses a 30-point scale. 30/30 is the highest grade; it is sometimes given “cum laude”, with honours, when performance is considered exceptional. 18/30 is the lowest passing grade. Grades from 1/30 to 17/30 are fail, and are not registered on transcripts.

Italian Grade ECTS Grade
30/30 cum laude A
29/30 – 30/30 B
26/30 – 28/30 C
21/30 – 25/30 D
18/30 – 20/30 E

I got this information from the website of the university where I’m studying now. Crazy, huh? ^^; When I first found out, I was like, “What the hell??”, and then started studying like crazy. I mean, imagine that I get 27 on an exam: I get the impression that I did quite well and should be satisfied with that. But then I see the conversion table… argh, my confidence just dropped.
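To make the 27 example concrete, here is a minimal sketch of the Italian table as a Python function. The name `italian_to_ects` and the `cum_laude` flag are assumptions made for illustration, and the table is the one from my university, so it may not hold elsewhere.

```python
def italian_to_ects(grade, cum_laude=False):
    """Map an Italian grade (30-point scale) to an ECTS letter."""
    if not 1 <= grade <= 30:
        raise ValueError("Italian grades run from 1 to 30")
    if cum_laude and grade != 30:
        raise ValueError("cum laude is only given together with 30/30")
    if grade < 18:
        # Failing grades are not registered on transcripts.
        return None
    if cum_laude:
        return "A"
    if grade >= 29:
        return "B"
    if grade >= 26:
        return "C"
    if grade >= 21:
        return "D"
    return "E"


print(italian_to_ects(27))
```

So a 27, which feels like a good result, lands in the middle of the ECTS range.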

Anyway, in the end I don’t really care about what my transcript looks like, because IMHO a transcript doesn’t really define the quality of your studies 🙂

Cheers!

Damn You, Chisel!

Now that I’ve calmed down a bit, I decided to write about this, to remind me later, after I graduate from this master’s program, that there was this moment during my thesis work when I felt extremely frustrated… at myself, for being so careless.

The deadline is getting near, less than two months to be precise. So I try my best not to waste any time, given that right until this moment I’m still struggling with the experiments.

The thing is, working in this computational linguistics (or natural language processing) field means that I have to deal with a whole lot of text. Long story short, now I have to build a co-occurrence frequency matrix for pairs of words in the text, which contains approximately 120 million sentences in total. Well, it’s not as simple as pairs of words actually, since I have to include the grammatical properties of each word, such as the POS tag, or the category. Sorry, sounds a bit technical, I know. Just skip it.
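For the curious, a toy sketch of the idea in Python. This is not my actual pipeline: counting each unordered pair once per sentence, the `(word, POS)` tuple format, and the name `cooccurrence_counts` are all assumptions made for illustration, and the two sentences are made up.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences):
    """Count how often two (word, POS) items appear in the same sentence.

    Each sentence is a list of (word, pos_tag) tuples. A pair is counted
    once per sentence, and pairs are stored in sorted order so that
    (a, b) and (b, a) end up in the same cell.
    """
    counts = Counter()
    for sentence in sentences:
        items = sorted(set(sentence))
        for first, second in combinations(items, 2):
            counts[(first, second)] += 1
    return counts


# Hypothetical toy corpus, already tokenized and POS-tagged.
sentences = [
    [("chisel", "NN"), ("cut", "VB"), ("wood", "NN")],
    [("chisel", "NN"), ("wood", "NN")],
]
counts = cooccurrence_counts(sentences)
print(counts[(("chisel", "NN"), ("wood", "NN"))])
```

A sparse counter like this only stores pairs that actually occur, which matters when the full matrix would be 20,000 x 20,000.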

Anyway, to be able to go on to the next experiment, I should make sure that the words which will be evaluated are contained in the matrix *my supervisor emphasized this, twice*. I wrote a simple shell script to check it, done. Then I started building the matrix, of size 20,000 x 20,000, and it took around 15 hours to extract the matrix from only 1 million sentences. Thanks to computer clusters, I managed to get the matrix extracted from 40 million sentences in 3 days.

While waiting for the rest to finish, I decided to use the currently available matrix for the next experiment, and see what would happen. There are 44 words (all nouns) in the gold standard to be evaluated. So after extracting those 44 words from the matrix, there should be 44 rows in the submatrix. I started to panic when I only saw 43 rows in the output file. Sh*t, I’m screwed.

Apparently, I was missing a chisel! This one word alone forced me to delete all of the results that I’d got during these last 3 days, and start anew. I was too confident and too lazy to double-check, and look at what happened. Please learn from this, dear me!
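For future me, a minimal sketch of the kind of check worth running before spending days on extraction. This is not the shell script I actually used; the function name `missing_targets` and the example word lists (apart from "chisel") are hypothetical.

```python
def missing_targets(targets, vocabulary):
    """Return the target words that are absent from the matrix vocabulary."""
    vocab = set(vocabulary)
    return [word for word in targets if word not in vocab]


# Hypothetical gold-standard words and matrix row labels.
targets = ["hammer", "chisel", "knife"]
vocabulary = ["hammer", "knife", "saw"]

missing = missing_targets(targets, vocabulary)
if missing:
    print("Missing from matrix:", ", ".join(missing))
```

Printing the missing words by name, instead of only comparing row counts, shows right away which word needs fixing.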