Showing posts from September, 2014

A Random Line From A File

The problem at work was the following: given a relatively large text file (with a little over 5,000,000 lines), write a procedure to return a random line out of this file. One naïve implementation that can be thought of would be the following: count the number of lines (O(n)), read off all the lines into an array (O(n)) and finally return a random line by randomly choosing one from the bounded array. That would cost not only O(2n), but O(n)-space which leads to yet extra processing time. A better approach would be the following: as you read the lines off of the file, have an index that indicates the current line that you're processing (line 1, 2, 3, and so on). Call this index the "lineIndex" counter. Then, here is the magical trick: select the current line as the one to be chosen with a probability 1/lineIndex . Suppose that your file only has one line. That line will surely be the one picked by this algorithm since the probability 1/1 means 100% chance. If the file has

The Dancing Men

In 1903, in southern London, a serial killer was on the loose, with already five murder cases confirmed and attributed to the perpetrator. Police wasn’t getting any closer to stopping him, or her. The serial killer was codenamed “The Dancing Killer” for the simple fact that he (or she) used to leave funny encoded notes depicting a set of “dancing men” by their victims’ bodies. No one was able to decipher those messages. After the killer’s last attack, the following message was recovered:   A brilliant detective was actually able to crack down this last message, leading to the apprehension of the assassin. In 1903 there was no computers around. Thankfully, that’s no longer the case. We can solve this puzzle with computers now, and that’s what this post is about. We can start by assuming the symbols in the note map to English letters, which would be the simplest mapping possible. We can also assume that the “flags” in the dancing men’s hands are non-interesting artifacts (a