Wednesday, February 06, 2013

UVA 12526 - Cellphone Typing

You are part of a research team that is developing a new technology to save time when typing text messages in mobile devices. This new technology aids users to skip key-strokes by the following conditions:
  1. The first letter always has to be input manually (can't be skipped).
  2. If a given prefix \(c_{1}, c_{2}, \cdots, c_{n}\) has been input into the system, and the user tried to insert a letter c such that for all the words that starts with \(c_{1}, c_{2}, \cdots, c_{n}\) also starts with \(c_{1}, c_{2}, \cdots, c_{n}\mathbf{c}\) then this key-stroke can be skipped. Otherwise the system waits for the users input.
Your task is, given the dictionary of words, calculate the average of key-strokes to input those words in the new system.

This problem can be solved using the Trie data structure, if you are not familiar with Tries I personally recommend the topcoder tutorial.

Back to the business :) In order to get a better intuition of the task let's first build a Trie with the following words  `hello', `hell', `heaven' and `bye':

As we can see in the graph bellow the numbers in red are the vertex prefix counter, in simple terms, we can say that every time a word is insert in our Trie a certain path is follow along the tree, and for each of the vertex in that path we keep a counter that records all the visits.

For example the letter H has a prefix count of 3 because there are 3 words that begin with H, `hello', `hell' and `heaven'. We could also say that the prefix count of H is 3 because there are 3 paths that crossed by H.

Understanding this let's focus in our goal the average number of key-strokes per word. Looking the picture below is not hard to see that we can only skip key-strokes when two consecutive nodes has the same prefix count.
Let's try to convince ourselves about the previous statement, assume that we are moving along a path in our Trie, along this path we visit the nodes in the following order \(1, 2, 3, \cdots, n, n + 1\) if a node \(n\) and \(n+1\) had the same prefix count implies that all the paths that past through \(n+1\) also past through \(n\) so we can say then that all the words that cross by the node \(n\) also cross by the node \(n+1\).

Finally, To get our result we just need to create a method that count the number of key-strokes. This method should not count the strokes of consecutive vertex with the same prefix count. The overall time complexity of this algorithm is \(O(|L|)\) where \(|L|\) is the length of the concatenate dictionary.

1 comment:

A said...

Good to see you blogging again. I always enjoy your posts.