Oh, the SAA is going to love this! The Daily Show featured the Grateful Dead Archivist position that was noticed by BoingBoing last Monday. Some choice quotes from the episode about the job:
“UC Santa Cruz is looking for a Grateful Dead Archivist. They’re looking for someone who loves the Grateful Dead, and yet somehow has exceptional organizational skills. So basically what they’re saying… is that they need a miracle”
And then it gets better (or worse, depending on whether you have a Master’s in Archives). Here’s what Jon Stewart has to say about archivists:
“By the way, a masters degree in Archives Management? What does that mean? “Oh I can archive things alphabetically or numerically”. What?! Alphanumerically? Slow down, I don’t have a doctorate! There you have it, 4 years of undergrad, 2 years of graduate school and now you can spend your days picking blotter acid coming out of Phil Lesh’s underwear from the Blues for Allah tour.”
On October 9, 2009, U.S. President Barack Obama was awarded the Nobel Peace Prize less than one year after his taking office (in fact, the nominations closed on February 1, about 11 days after Obama took office). While the committee praised his ambitious foreign policy agenda, it acknowledged that he had not yet actually achieved many of the goals that he had set out to accomplish. Former Polish President Lech Wałęsa, a 1983 Nobel Peace laureate, commented: “So soon? Too early. He has no contribution so far. He is still at an early stage. He is only beginning to act.”
This is pretty amazing news. My Facebook, News and IM streams are flooded with one-liners. I thought I’d collect them all:
“I too would like a Nobel Peace Prize for the thesis I am about to write in the future.” — me
“it’s a pretty swell booby prize for losing out on the Olympics” – n.d.
“Surely preventing Sarah Palin from taking over the free world deserves a prize… even if it is a Nobel?” — v.b.
““NASA bombs moon”; “Obama wins Nobel Prize” — is today Onion News Day?” — me
“Barack Obama linked to terrorist Yasser Arafat” — fark via a.a.
“The Nobel? Really? I mean, cool…but it seems like we have our cart on the wrong side of the horse. Not that it isn’t a very nice cart.” — c.m.
“…thinks they might as well have given him the Nobel Prize for Literature, Chemistry (we’ve all seen the shirtless photos), Physics and Economics as well. Oh and made him a Knight Commander of the Order of the Bath” — r.d.
“Nobel Committee Rewards Obama For Not Being Bush” — f.n.
“I just want to point out that the Nobel Committee made its decision BEFORE Miley Cyrus quit Twitter.” — j.h.
“Obama will win a second Nobel next year if he can restrain himself from reacting to the snark generated by this one.” — m.w.
“Pretty sure Obama will just trade in his Nobel for a Google Wave invite.” — t.b.
“The news of Obama’s Nobel Peace Prize spreads. Across the miles I can almost HEAR my dad’s eyes rolling.” — p.g.
“Obama wins Nobel Peace Prize? About time Rakhi Sawant wins an Oscar, then.” — s
“If you don’t think Obama deserves that Nobel, then you’ve never seen Sasha and Malia fight.” — a.e.
“Apparently Arizona State has a higher standard than the Nobel Committee. Good thing I never tried to apply there.” — r.m.
Oracle Man Larry Ellison gives his two cents about the new Cloud Computing hype:
“Do you think they run on Water vapor? It’s Databases, and Operating Systems, and Memory, and Microprocessors and the Internet! What are you talking about!”
Michigan Today has a slideshow tribute to Tony Rosenthal, abstract artist and sculptor, who passed away this July. A Michigan alum, I know of him primarily from his Rosenthal Cubes, a pair of identical 15-foot cubes called Endover and Alamo. Endover is located on Central Campus in Ann Arbor, while Alamo is located at Astor Place in Manhattan, New York.
I think the difference between the two is that the New York cube has a platform and is a little harder to spin (yes, I’ve spun both!). I still find it amazing that the 41-year-old sculptures are fully functional despite being exposed to the elements for so long.
Techcrunch continued their usual Yahoo-bashing with this story today:
It appears that a few days ago there was a slight change to Flickr’s logo: an addition of a small Yahoo logo to the right side so it reads “Flickr from Yahoo.” In response, many Flickr users have taken to the photo-sharing site’s forums to express their horror at the Yahoo’s new branding of Flickr.
There is definitely some truth to the community backlash, but what I see as more aggravating is a great missed branding opportunity for Yahoo!.
Flickr and Delicious have both been adamant opponents of Yahoo! branding. Even though Yahoo! owns it, the Delicious frontpage doesn’t contain a single mention of Yahoo. Both sites’ communities are predominantly “indie” brand lovers, and they don’t want “the man” to infringe on their beloved service (even if the man is running it).
What’s crazy is that Yahoo recently launched a $100 million campaign called “Y!ou and Yahoo!”. What’s also interesting is that Flickr actually had a tagline that said “Flickr loves you” (in place of Flickr BETA), which reflected Flickr’s personality. People got used to it, and some even thought it was cute.
The last thing you want to do is force a new logo onto the community in an ungraceful manner. Here’s a convenient solution: morph the “loves you” logo into the “Y!ou and Yahoo!” campaign with a “flickr loves Y!ou” logo, killing two birds with one stone. The community sees a subtle evolution of the existing logo, and the “Y!ou” campaign gets placed in front of a huge community.
Finance software giant Intuit is buying personal finance startup Mint.com for $170 million. Personally, I wasn’t too happy about this. Jason Fried makes a valid point:
Mint was a key leader of the next generation of game changers. And now it’s property of Intuit — the poster-child for the last generation. What a loss. Is that the best the next generation can do? Become part of the old generation? How about kicking the shit out of the old guys? What ever happened to that?
First thing I did when I heard about the deal? Delete my Mint.com account.
Just got done with the HAMSTER presentation; here is the paper, and here are my abstract and slides:
We address the problem of unsupervised matching of schema information from a large number of data sources into the schema of a data warehouse. The matching process is the first step of a framework to integrate data feeds from third-party data providers into a structured-search engine’s data warehouse. Our experiments show that traditional schema-based and instance-based schema matching methods fall short. We propose a new technique based on the search engine’s clicklogs. Two schema elements are matched if the distribution of keyword queries that cause click-throughs on their instances are similar. We present experiments on large commercial datasets that show the new technique has much better accuracy than traditional techniques.
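To make the matching idea in the abstract concrete, here is a minimal sketch. It assumes each schema element comes with a list of the queries that led to clicks on its instances, and it uses Jensen-Shannon divergence as the similarity measure. That measure is an assumption chosen for illustration, not necessarily what the paper uses, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def query_distribution(queries):
    """Turn a list of clicked-through queries into a probability distribution."""
    counts = Counter(queries)
    total = sum(counts.values())
    return {query: count / total for query, count in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0.0 for identical, 1.0 for disjoint."""
    keys = set(p) | set(q)
    mix = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl_to_mix(dist):
        return sum(v * math.log2(v / mix[k]) for k, v in dist.items() if v > 0)

    return 0.5 * kl_to_mix(p) + 0.5 * kl_to_mix(q)


def match_elements(feed_clicks, warehouse_clicks):
    """Map each feed element to the warehouse element with the closest query distribution."""
    warehouse = {name: query_distribution(qs) for name, qs in warehouse_clicks.items()}
    matches = {}
    for name, queries in feed_clicks.items():
        dist = query_distribution(queries)
        matches[name] = min(warehouse, key=lambda w: js_divergence(dist, warehouse[w]))
    return matches


# Hypothetical toy data: element name -> queries that led to clicks on its instances.
feed = {"japanese toys": ["pokemon", "pokemon", "hello kitty"]}
warehouse = {
    "children's collector items": ["pokemon", "pokemon cards", "hello kitty"],
    "laptops": ["thinkpad", "macbook"],
}
print(match_elements(feed, warehouse))
```

In this toy run the feed element shares most of its queries with one warehouse element and none with the other, so the divergence alone picks the match. A real system would presumably also need a cutoff below which no match is reported.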
I received a few questions after the talk, hence I thought I’d put up a quick FAQ:
Q: Doesn’t the time period of the clicklog affect your integration quality?
A: Yes. And we consider this a good thing. This allows trend information to come into the system, e.g. “pokemon” queries will start coming in, and merge “japanese toys” with “children’s collector items”. Unpopular items that are not searched for may not generate a mapping, but then again, this may be ok since the end goal was to integrate searched-for items.
Q: You use clicklogs. I am a little old company/website owner X. Since my company’s name doesn’t start with G, M or Y, I don’t have clicklogs. How do I use your method?
A: You already have clicklogs. Let’s say you are trying to merge your company/website X’s data with company Y’s data. Since both you (X) and Y have websites, you both run HTTP servers, which have the facility to log requests. Look through your HTTP server referral logs for strings like: URL:http://x.com REFERRER: http://www.google.com/?q=$search_string$
This is your clicklog. The URL http://x.com has the query $search_string$. You can grep the logs of both websites to create clicklogs, which can then be used for integration.
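Here is a rough sketch of that extraction step. It assumes you have already pulled (URL, referrer) pairs out of whatever log format your HTTP server writes, and that the search engine puts the query in a `q` parameter, as in the example string above. The function names are hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def query_from_referrer(referrer, param="q"):
    """Return the search query embedded in a referrer URL, or None if there is none."""
    values = parse_qs(urlparse(referrer).query).get(param)
    return values[0].strip().lower() if values else None


def build_clicklog(requests):
    """Build (query, url) pairs from an iterable of (url, referrer) pairs."""
    clicklog = []
    for url, referrer in requests:
        query = query_from_referrer(referrer)
        if query:  # skip direct visits and referrers without a search query
            clicklog.append((query, url))
    return clicklog


# Hypothetical log entries for website x.com.
requests = [
    ("http://x.com/items/42", "http://www.google.com/search?q=Blue+Suede+Shoes"),
    ("http://x.com/items/42", "-"),
    ("http://x.com/items/7", "http://www.google.com/?q=pokemon"),
]
for query, url in build_clicklog(requests):
    print(query, url)
```

Other search engines use different parameter names, so the `param` argument would need adjusting per referrer host.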
Q: My website is not very popular and I don’t have that many clicks from search engines. What do I do?
A: Yup, this is a very real case. Specifically, you might have a lot of queries for some of your items, but not for others. This can be balanced out. See the section in our paper about Surrogate Clicklogs. Basically you can use a popular website’s clicklog as a “surrogate” log for your database. From the paper:
…we propose a method by which we identify surrogate clicklogs for any data source without significant web presence. For each candidate entity in the feed that does not have a significant presence in the clicklogs (i.e. clicklog volume is less than a threshold), we look for an entity in our collection of feeds that is most similar to the candidate, and use its clicklog data to generate a query distribution for the candidate object.
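A minimal sketch of the surrogate idea quoted above: if an entity has fewer clicks than a threshold, borrow the clicklog of the most similar entity that has enough. The similarity measure here (Jaccard overlap of name tokens), the threshold value and all names are assumptions for illustration. The quoted passage does not say how similarity is computed.

```python
def tokens(name):
    """Split an entity name into a set of lowercase word tokens."""
    return set(name.lower().split())


def jaccard(a, b):
    """Jaccard overlap of two token sets, 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def surrogate_clicklog(candidate, clicklogs, threshold=5):
    """Return the candidate's own queries, or those of its most similar well-clicked entity.

    clicklogs maps entity name -> list of queries that led to clicks on it.
    """
    own = clicklogs.get(candidate, [])
    if len(own) >= threshold:
        return own
    rich = [e for e, qs in clicklogs.items() if e != candidate and len(qs) >= threshold]
    if not rich:
        return own  # nothing to borrow from
    best = max(rich, key=lambda e: jaccard(tokens(candidate), tokens(e)))
    return clicklogs[best]


# Hypothetical clicklogs: one sparse entity and two with enough volume.
clicklogs = {
    "canon eos 450d camera": ["canon 450d", "eos 450d", "canon dslr", "450d price", "canon camera"],
    "dell inspiron laptop": ["dell laptop", "inspiron", "dell inspiron", "cheap laptop", "dell"],
    "canon eos 500d camera": ["canon 500d"],
}
print(surrogate_clicklog("canon eos 500d camera", clicklogs))
```

The sparse entity ends up with the query list of its nearest neighbour, which can then be turned into a query distribution like any other clicklog.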
Q: I am an academic and do not have access to a public clicklog, or a public website to get clicklogs from. How do I use this technique?
A: Participate in the Lemur project and get your friends to participate too.
I’m looking forward to my talk at VLDB 2009 in Lyon, France. I will be presenting “HAMSTER: Using Search Clicklogs for Schema and Taxonomy Matching”, which is joint work I did with Phil Bernstein during my internship at Microsoft Research. The talk is scheduled for Tuesday 25, 2009 at 2pm in the Rhône 2 room at the conference venue.
Also look out for my labmate Bin Liu’s paper with our advisor, “Using Trees to Depict a Forest”.
CUHK Professor Yufei Tao’s homepage has this interesting tidbit:
My pledges as a reviewer:
I will treat your work with respect.
I will spend enough time with your paper. I will not make any decision without a good understanding.
In case I decide to recommend rejection, I will do so on solid grounds. I do not reject papers based on subjective and vacuous statements such as “I don’t like this idea”.
I will write reviews in a courteous manner. I have seen harsh reviews by other people which heavily mention my publications, and thus make people feel I was the reviewer. I will never do anything like this.