February 11th, 2009

Purple Summer

For those who care, I will be working at Yahoo! Research in New York this summer. The offer letter just arrived in well-designed (and very purple) packaging, with seventeen thousand million documents to sign. Nonetheless, I’m excited about working there!


February 8th

360° view from Mt. Everest

(via APOD)


February 5th

what is wrong with the new gmail buttons

I’m having trouble getting used to the new Gmail buttons. Here’s the explanation, in a simple animated GIF. My motor memory remembers the position of the “Delete” button, and finds the “Move” button in its place.

Also, “Move to”? WTF? It looks like the fruit of a religious compromise between the “Folders are evil, Labels are googly” and “Folders are how the world works, you hippies” camps. Hey design-by-committee… Look, it’s your new friend, Gmail!


Database Autocompletion Demo

The video overview of the SIGMOD 2007 demo is now up!


My Research Papers, now more accessible

Many readers have complained that this blog is always full of artsy and time-wasting material… “what about all the technical stuff? Aren’t you a computer person?!” they ask. To pacify these masses, I have just converted three of my recent papers to HTML format. For the first two, I used the HEVEA LaTeX to HTML converter, which I found slightly better than LaTeX2HTML. For the third paper, I have inexplicably misplaced the source files, and hence the HTMLization was done via Gmail’s PDF Viewer.

The picture above is from the 2007 SIGMOD demo paper. I’ll post videos of the demo in a later post. Here’s a quick preview of each paper:

  • Qunits: queried units in database search CIDR, 2009
    Keyword search against structured databases has become a popular topic of investigation, since many users find structured queries too hard to express, and enjoy the freedom of a “Google-like” query box into which search terms can be entered. Attempts to address this problem face a fundamental dilemma. Database querying is based on the logic of predicate evaluation, with a precisely defined answer set for a given query. On the other hand, in an information retrieval approach, ranked query results have long been accepted as far superior to results based on boolean query evaluation. As a consequence, when keyword queries are attempted against databases, relatively ad-hoc ranking mechanisms are invented (if ranking is used at all), and there is little leverage from the large body of IR literature regarding how to rank query results.
  • Effective Phrase Prediction VLDB, 2007
    Autocompletion is a widely deployed facility in systems that require user input. Having the system complete a partially typed “word” can save user time and effort. In this paper, we study the problem of autocompletion not just at the level of a single “word”, but at the level of a multi-word “phrase”. There are two main challenges: one is that the number of phrases (both the number possible and the number actually observed in a corpus) is combinatorially larger than the number of words; the second is that a “phrase”, unlike a “word”, does not have a well-defined boundary, so that the autocompletion system has to decide not just what to predict, but also how far. We introduce a FussyTree structure to address the first challenge and the concept of a significant phrase to address the second. We develop a probabilistically driven multiple completion choice model, and exploit features such as frequency distributions to improve the quality of our suffix completions. We experimentally demonstrate the practicability and value of our technique for an email composition application and show that we can save approximately a fifth of the keystrokes typed.
  • Assisted querying using instant-response interfaces SIGMOD, 2007
    We demonstrate a novel query interface that enables users to construct a rich search query without any prior knowledge of the underlying schema or data. The interface, which is in the form of a single text input box, interacts in real-time with the users as they type, guiding them through the query construction. We discuss the issues of schema and data complexity, result size estimation, and query validity, and provide novel approaches to solving these problems. We demonstrate our query interface on two popular applications: an enterprise-wide personnel search, and a biological information database.
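
For the curious, here is a rough Python sketch of the phrase-prediction idea from the VLDB abstract above. It is not the FussyTree from the paper: it simply counts word n-grams in a toy corpus and suggests the most frequent multi-word phrases that start with whatever has been typed. The function names, the `max_len` cap (a crude stand-in for deciding "how far" to predict) and the sample sentences are all made up for illustration.

```python
from collections import Counter


def build_phrase_counts(corpus, max_len=3):
    """Count every word n-gram of length 1..max_len in the corpus."""
    counts = Counter()
    for sentence in corpus:
        words = sentence.lower().split()
        for start in range(len(words)):
            for n in range(1, max_len + 1):
                if start + n <= len(words):
                    counts[tuple(words[start:start + n])] += 1
    return counts


def complete(counts, typed, k=3):
    """Return up to k multi-word phrases that extend the typed text,
    most frequent first (ties broken alphabetically)."""
    typed = typed.lower().lstrip()
    if not typed:
        return []
    candidates = []
    for phrase, count in counts.items():
        text = " ".join(phrase)
        # Only suggest genuine phrases (2+ words) that go beyond the input.
        if len(phrase) > 1 and text.startswith(typed) and text != typed:
            candidates.append((count, text))
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return [text for _, text in candidates[:k]]


if __name__ == "__main__":
    # Hypothetical mini-corpus of email-like sentences.
    corpus = [
        "please let me know",
        "please let me know if you can",
        "please find attached",
        "let me know",
    ]
    counts = build_phrase_counts(corpus)
    print(complete(counts, "please l"))
```

The linear scan over all counted phrases is fine for a toy corpus; a real system would need an index such as a trie, which is the scaling problem the paper's data structure is meant to address.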

February 3rd

Larry's Bell Labs Days

Larry Luckham documents the people at Bell Labs:

In the late ’60’s I worked for Bell Labs for a few years managing a data center and developing an ultra high speed information retrieval system. It was the days of beehive hair on the women and big mainframe computers. One day I took a camera to work and shot the pictures below. I had a great staff, mostly women except for the programmers who were all men. For some reason only one of them was around for the pictures that day.

A fun set of pictures.


January 31st

These are a few of my favorite things

From a comment on reddit:

Comments on code and captions on kittens
Getting that bug and bad grammar writtens
Working in python and dealing with strings
These are a few of my favorite things

Making good memes and hard logic riddles
Mispells and root shells and cheap ramen noodles
Living near an exchange with really low pings
These are a few of my favorite things

Girls in geek shirts with xkcd dresses
Using grep dict/words to find one that matches
Fantasy actors that say “good tidings”
These are a few of my favorite things

When the net lags
When the kill stings
When I’m feeling sad
I simply remember my favorite things
And then I don’t feel so bad


January 25th

Her Morning Elegance

Video for Her Morning Elegance, a song by Oren Lavie. The song is also featured in the Chevy Malibu ad, but this video is so much better:


January 24th

Liquid Food

I recently underwent oral surgery to have 3 impacted wisdom teeth removed. Having 3 stitches in my jaw, it’s a little hard to eat anything that doesn’t go through a straw. Ironically, I’m not even allowed straws, since the suction isn’t good for the stitches. So what does one eat in this situation? Stuff that I have consumed so far:

* Strawberry Milk shake
* Blended milk and cereal
* Blended carrot and tomato soup
* Strawberry Cheesecake Shake
* Clam chowder
* Orange juice

Today I’m planning to try ratatouille, which Katherine’s making for me. Also in the works is blended chicken soup; I’ll post the recipe once it’s done.

I also found an interesting blog called Jaw Dropping Blends which hosts recipes for people just like me!


January 19th

scalable oxymoron

Just encountered a funny error at one of the blogs I read:

That being said, High Scalability is a must read for server people; consistently great content, news and linkage:

This site tries to bring together all the lore, art, science, practice, and experience of building scalable websites into one place so you can learn how to build your website with confidence.