Thoughts on Scribe

As someone who works with autocompletion, this week has been a good one. Google launched two products relevant to my research; the first is Google Scribe, a Labs experiment that uses Web n-grams to assist in sentence construction. This system addresses the same problem as my VLDB’07 paper, “Effective Phrase Prediction” (paper, slides). The paper proposes a data structure called the FussyTree to efficiently serve phrase suggestions, and provides a metric I called the “Total Profit Metric” (TPM) to evaluate phrase prediction systems. Google Scribe looks quite promising, and I thought I’d share my observations.

To simplify writing, let’s quickly define the problem using a slide from the slide deck: given the context (the text already typed) and a prefix (the word or words currently being typed), the system suggests a phrase that completes the prefix.

Query Time:
Latency while typing is quite impressive. There is no evidence of speculative caching (à la Google Instant), but the interaction is fairly fluid, despite the fact that an HTTP GET is sent to a Google Frontend Server on every keystroke. I’m a little surprised that there isn’t a latency check (or, if there is one, its threshold is too low): GET requests are issued even when I’m typing too fast for the UI to keep up, so many of the results are already useless by the time the server responds.
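As a hypothetical sketch of the kind of latency check I mean (this is not anything Scribe actually does), the client could simply skip the request when the gap since the previous keystroke is too small; the threshold below is an assumption:

```python
import time

# Hypothetical client-side check, not Scribe's actual behavior: skip the
# completion request when the user is typing faster than a pause threshold,
# since those results would be stale before they could be rendered.
MIN_PAUSE = 0.15  # assumed threshold, in seconds

_last_keystroke = 0.0

def on_keystroke(buffer: str, send_request) -> None:
    """Called on every keystroke; send_request(text) issues the HTTP GET."""
    global _last_keystroke
    now = time.monotonic()
    paused_long_enough = (now - _last_keystroke) >= MIN_PAUSE
    _last_keystroke = now
    if paused_long_enough:
        send_request(buffer)
```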

Length of Completion:
My experience with Google Scribe is that completions are quite short; I expected it to produce longer completions as I gave it more data, but I couldn’t get it to suggest beyond three words.

Length of Prefix+Context:
It looks like the length of the prefix + context (context being the text before the prefix, used to bias completions) is 40 characters, with no special treatment of word endings. At every keystroke, the previous 40 characters are sent to the server, and completions come back. So as I was typing a sentence, the requests looked like this:

this is a forty character sentence and i
his is a forty character sentence and it
is is a forty character sentence and it
s is a forty character sentence and it i
(and so on)

I’m not sure what the benefit of sending requests for partial words is. It’s hard to discern the prefix from the context by inspection, but the prefix seems to be quite small (2-3 words), which sounds right.
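For concreteness, here is a hypothetical split of the 40-character window into context and prefix; the two-word prefix size is just my guess from inspection, not documented behavior:

```python
# Hypothetical split of the 40-character window into "context" and "prefix";
# the two-word prefix is an assumption based on inspection, not a documented value.
def split_window(window: str, prefix_words: int = 2):
    words = window.split(" ")
    prefix = " ".join(words[-prefix_words:])   # the part the model tries to complete
    context = " ".join(words[:-prefix_words])  # text used to bias completions
    return context, prefix

context, prefix = split_window("is a forty character sentence and it i")
# context -> "is a forty character sentence and"
# prefix  -> "it i"
```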

Prediction Confidence:
Google Scribe always displays a list of completions. This isn’t ideal, since it is often making arbitrary low-confidence predictions. That makes sense for a demo, but since there is a distraction cost associated with each completion, it would be valuable to show completions only when they are high-confidence. Confidence can either be calculated using TPM or learned from usage data (which I hope Scribe is collecting!).
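A minimal sketch of what I mean, assuming a per-suggestion confidence score is available (the threshold and scores here are made up):

```python
# Hypothetical filter: surface a suggestion only when its confidence clears a
# threshold, to avoid the distraction cost of low-confidence predictions.
CONFIDENCE_THRESHOLD = 0.6  # assumed value; would ideally be tuned on usage data

def filter_suggestions(suggestions):
    """suggestions: list of (completion, confidence) pairs."""
    return [(text, conf) for text, conf in suggestions if conf >= CONFIDENCE_THRESHOLD]

print(filter_suggestions([("of the", 0.82), ("in order to", 0.31)]))
# -> [('of the', 0.82)]
```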

Prediction Quality:
People playing with Scribe produced sentences such as “hell yea it is a good idea to have a look at the new version of the Macromedia Flash Player to view this video” and “Designated trademarks and brands are the property of their respective owners and are”. I find these sentences interesting because they are both very topical; i.e., they look more like outliers from counting boilerplate text on webpages than the “generic” sentences you’d find in, say, an email. To address this and produce more “generic” completions, one solution is to cluster the corpus into multiple topic domains and ensure that a completion is not just popular in one isolated domain.
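Here is a rough sketch of that clustering idea under assumed data: a completion counts as “generic” only if its popularity is spread across several topic domains rather than concentrated in one (the domains, counts, and thresholds are all invented for illustration):

```python
# Hypothetical diversity check over per-domain phrase counts; all numbers below
# are invented for illustration, not measurements from any real corpus.
def is_generic(counts_by_domain, min_domains=3, min_share=0.05):
    """counts_by_domain: dict mapping topic domain -> count of the phrase."""
    total = sum(counts_by_domain.values())
    if total == 0:
        return False
    supported = sum(1 for c in counts_by_domain.values() if c / total >= min_share)
    return supported >= min_domains

boilerplate = {"ecommerce": 9800, "email": 40, "news": 60}  # popular in one domain only
generic = {"ecommerce": 300, "email": 450, "news": 250}     # spread across domains
print(is_generic(boilerplate), is_generic(generic))  # False True
```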

I was also interested in knowing, “How many keystrokes will this save?” To measure this, we can use TPM; the slide deck walks through the metric with an example calculation.
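As a rough illustration of the kind of accounting TPM does (the exact definition is in the paper and slides; the distraction cost and the events below are placeholders, not real measurements):

```python
# Back-of-the-envelope keystroke-savings calculation in the spirit of TPM;
# the weights and events are placeholders, not the paper's exact formulation.
def total_profit(events, d=0.5):
    """events: list of (suggestion_shown, accepted, chars_saved) tuples."""
    profit = 0.0
    for shown, accepted, chars_saved in events:
        if shown:
            profit -= d                # every popup has a distraction cost
            if accepted:
                profit += chars_saved  # keystrokes the user did not have to type
    return profit

# Frequent short suggestions vs. rarer long ones (made-up numbers):
short_and_frequent = [(True, True, 4), (True, False, 0), (True, True, 3), (True, False, 0)]
long_and_rare = [(True, True, 18), (False, False, 0), (True, False, 0), (False, False, 0)]
print(total_profit(short_and_frequent), total_profit(long_and_rare))  # 5.0 17.0
```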

While it would be nice to see a comparison of the FussyTree method and Google Scribe in terms of Precision, Recall, and TPM, constructing such an experiment is hard, since training a FussyTree over web-sized corpora would require significant instrumentation. Based on a few minutes of playing with it, I think Scribe will outperform the FussyTree method on Recall because of its small window size; that is, it will produce short suggestions that are often correct. However, if we take the distraction factor of the suggestions themselves into account, Scribe in its current form will do poorly, since it pulls up a suggestion for every word. This can be fixed by making longer suggestions and by considering prediction confidence.

Overall, I am really glad that systems like these are making it into the mainstream. The more exposure these systems get, the better and more accurate they can become, saving us time and letting us interact with computers more effectively!


About the author:

Arnab Nandi is an Assistant Professor in the Department of Computer Science and Engineering at The Ohio State University. You can read more about him here.

