author

Reputation Misrepresentation, Trail Paranoia and other side effects of Liking the World

trafficspike

A few months ago, I wrote up some quick observations about Facebook’s then just-launched “Like” button, pitching “Newsfeed Spam” as a problem exacerbated by the new Like Buttons. The post went “viral”, so to speak, bouncing off Techmeme, ReadWriteWeb / NYTimes, even German news websites. Obviously this is nothing compared to “real” traffic on the Internet, but it was fun to watch the link spread. This is meant to be a follow-up to that post, based on thoughts I’ve had since.

In this post, I'll be writing about five "issues" with the Like button, followed by four "solutions" to these issues. Since this is a slightly long post, here's an outline:


Big Deal!


facebook stats

The Facebook Like Button has been huge success. With over 3 billion buttons served, and major players such as IMDB and CNN signing up to integrate the button (and other social plugins) into their websites, the chance of encountering a Facebook Like button while browsing on the web is quite high; if not certain. Many folks have questioned whether this is a big deal -- IFRAME and javascript based widgets have been around for a long time (shameless self-plug: Blogsnob used a javascript-based widget to cross polinate blogs across the internet as early as 8 years ago). Using the social concept of showing familiar faces to readers isn't new either; MyBlogLog has been doing it for a while. Then why is this silly little button such an issue? The answer is persistent user engagement. With 500 million users, out of which 50% of them log into Facebook at any given day, you're looking at an audience of 250 million users. If you're logged into Facebook while browsing any website with a social plugin, the logged in session is used. Now if you're like me, you'll probably have "remember me" checked at login, which means you're always logged into Facebook. What this means is that on any given day, Facebook has the opportunity to reach 250 million people throughout their web browsing experience; not just when they're on Facebook.com[1]. So clearly, from a company's perspective, this is important. It is a pretty big deal! But why is this something Facebook users need to be educated about? Onwards to the next section!

Issues with the Like Button


Readers should note the use of the word "Issues", as opposed to "Security vulnerability", "Privacy Leak", "Design Flaw", "Cruel Price of Technology", or "Horrible Transgression Against Humankind". Each issue has its own kind of impact on the user, you're welcome to decide which is which!

Screen shot 2010-07-21 at 1.37.51 AM

To better understand the issues with the Like button, let's understand what the Like button provides:
1) It provides a count of the number of people who currently "Like" something.
2) It provides a list of people you know who have liked said object, with profile pictures.
3) It provides the ability to click the button and instantaneously "Like" something, triggering an update on your newsfeed.
All of this is done using an embedded IFRAME -- a little Facebook page within the main page that displays the button.

In the next few paragraphs, we'll see some implications of this button on the web.

Reputation Misrepresentation


The concept of reputation misrepresentation is quite simple:
a not-so-popular website can use another website's reputation to make the site seem more reputed or established to the user.

Here's a quick diagram to explain it:

reputation misrepresentation

Simply put, as of now, any website(e.g. a web store) can claim they are popular (especially with your friends) to gain your trust. Since Facebook doesn't check referrer information, Facebook really doesn't have the power to do anything about this either. A possible solution is to include verifying information inside the like button, which ruins the simplicity of it all.

Browse Trail Inference


This one is a more paranoid concept, but I've noticed that people don't realize it until I spell it out for them:
Facebook is indirectly collecting your entire browsing history for all websites that have Facebook widgets. You don’t have to click any like buttons, just visiting sites like IMDB.com or CNN.com or BritneySpears.com will enable this.

Here's how it works:

browsetrail

Here, our favorite user Jane is logged into Facebook, and visits 2 pages on IMDB.com, checks the news on CNN, and then heads to Yelp to figure out where to eat. Interestingly enough, Facebook records all this information, and can tie it to her Facebook profile, and can thus come up with inferences like "Jane likes Romantic Movies, International News and Thai Food -- let's show her some ads for romantic getaways to Bali!"

(Even worse, if Jane unwittingly visits a nefarious website which coincidentally happens to have the Like button, Facebook gets to know about that too!)

Most modern browsers send the parent document's URL as HTTP_REFERER information to Facebook via the Like IFRAME, which allows Facebook to implicitly record a fraction of your browsing history. Since this information is much more voluminous than your explicit "Likes"; a lot more information can be data-mined from it; which can then be used for "Good"(i.e. adding value to Facebook) or "Evil"(i.e. Ads! Market data!)

What I like about this is that this is an ingenious system to track user's browsing behavior. Currently, companies like Google, Yahoo and Microsoft(Bing/Live/MSN) have to convince you to install a browser toolbar which has this minuscule clause in its agreement that you share back ALL your browsing history, which can be used to better understand the Web(and make more money, etc. etc.). Since Facebook is getting all websites to install this; it gets the job done without getting you to install a toolbar! I'll be discussing how I deal with this in the last section, "My solution".

Newsfeed Spam


In a previous post, I demonstrated how users could be tricked into "Liking" things they didn't intend to, leading to spam in their friends' newsfeeds. A month later, security firm Sophos reported an example of this, where users were virally tricked into spreading a trojan virus through Facebook Likes, something that could easily be initiated by Like buttons across the web, where you can easily be tricked into liking arbitrary things.

Again, this issue has the same root cause as Reputation Misrepresentation: since all the Like button shows you is a usercount, pictures and the button itself, there really is no way to know what you're liking. A solution to this is to use a bookmarklet in your browser, which is under your control.

"Likejacking"


This interesting demo by Eric Kerr demonstrates how to force unwitting users into clicking arbitrary like buttons. The way this works is by making a transparent like button, and make it move along with the users mouse cursor. Since the user is bound to click on the page at some point of time, they're bound to click the Like button instead.

Like Switching


likeswitch

Like switching is an alternative take on Like Jacking -- the difference is that the user is explicitly shown a like button with a prestigious like count and familiar friends first. When a user reaches out to click on it, the like button is swapped out for a different one, triggered by an onmouseover event from the rectangle around the button.

"Solutions"

Given these issues, let's discuss some solutions, responses and fixes. Note the use of quotes -- for many people can argue that nothing is broken, so we don't need solutions! Regardless, one piece of good news is that the W3C is aware of the extensive use of IFRAMES on the web, and has introduced a new "sandbox" attribute for IFRAMES. This will lead to more fine-grained control of social widgets. For example, if we can then set our browsers to force "sandbox" settings for all Facebook IFRAMES, we can avoid handing over our browsing history to Facebook.


Facebook's approach


While I don't expect companies to rationalize every design decision with their users, I am glad that some Facebook engineers are reaching out via online discussions. Clearly this is not representative of the whole company, but here's a snippet:
Also, in case it wasn't clear, as soon as we identify a domain or url to be bad, it's impossible to reach it via any click on facebook, so even if something becomes bad after people have liked it, we still retroactively protect users.

I like this approach because it fits in well with the rest of the security infrastructure that large companies have: the moment a URL is deemed insecure anywhere on the site, all future users are protected from that website. However, this approach doesn't solve problems with user trust -- it's relying on the fact that Facebook has flagged every evil website in the world before you chanced upon it -- something I wouldn't bet my peace of mind on. It's as if the police told you "We will pursue serial killers only after the first murder!"Would you sleep better knowing that? In essence, this approach is great when you're looking at it from the side of protecting 500 million users. But as one of the 500 million, it kinda leaves you out in the dark!


Secure Likes

As we mentioned in the Reputation Misrepresentation section, another interesting improvement would be to include some indication of the URL that is being "Liked" inside the button itself. An option is to display the URL as a tooltip when the user hovers his/her cursor over the button, especially if it disagrees with the parent frame's URL. Obviously placing the whole URL would make the button large and ugly. A possible compromise is to include the favicon(the icon that shows up for each site in your browser) right inside the Like button. The user can simply check if the browser icon is the same as the one on the like button to make sure it's safe. This way, if a website wants to (mis)use BritneySpears.com's Like Button, it will be forced to use BritneySpears.com's favicon too! Here's a mockup of what "Secure Like" would look like for IMDB:

securelike


A browser-based approach


Screen shot 2010-07-26 at 5.11.57 AM

This approach, best exemplified by "Social Web" browser Flock and recently acknowledged by folks at Mozilla, makes you log into the browser, not a web site. All user-sensitive actions(such as "Liking" a page) have to go through the browser, making it inherently more secure.

My Current Solution


dock

At this point, I guess it's best to conclude with what my solution to dealing with all these issues is. My solution is simple: I run Google and Facebook services in their own browsers, separate from my general web surfing. As you can see from the picture of my dock, my GMail and Facebook are separate from my Chrome browser. That way, I appear logged out[2]. Google Search and Facebook Likes when I surf the web or search for things. On a Mac, you can do this using Fluid.app; on Windows you can do this using Mozilla Prism.

And that brings us to the end of this rather long and winded discussion about such a simple "Like" button! Comments are welcome. Until the next post -- Surf safe, and Surf Smart!

 

 

Footnotes:
[1] To my knowledge, there is only one other company that has this level of persistent engagement: Google's GMail remembers logins more aggressively than Facebook. When you're logged into Gmail, you're also logged into Google Search, which means they log your search history as a recognized user. This is usually a good thing for the user, since Google then has a chance to personalize your search. Google actually takes it a step further and personalizes even for non-logged in users.

[2] Yes, they can still get me by my IP, but that's unlikely when I'm usually behind firewalls.

 

Cite this post!:


@article{reputationmisrepresentation,
title={{Reputation Misrepresentation, Trail Paranoia and other side effects of Liking the World}},
author={Nandi, A.},
year={2010},
journal={{Arnab's World}}
}

if heaven and hell decide

My dear beloved reader,

I know you’ve read more than one post similar to this one. It is unfortunate, but I do have to write you another “please don’t be mad I am insanely busy” post. As the experienced weblog reader may know, the verbosity of any blog is usually inversely proportional to the activity in the author’s life. And in my case, work, play, and Life in General has been too fracking intense to describe in a short, excuse-making post. It is sad though — because these are the days one would like to journal meticulously, penning every single detail for friends and family to read, and for my own future entertainment. Instead, I will have to do with faint, faulty, fragmented memories from my defective mind; and make up the rest of it in my favor. I guess I’ll have to ask you to do the same.

Regards,
Arnab

|

now reading

I’ve become a big fan of xkcd, and the author in general, who is also responsible for, amongst other things, this comic series, a binary preference voting engine, efforts for the conservation of gravity, and building robots at NASA Langley. You sir, seriously deserve a real Real Men of Genious award.

|

An Introduction to Digital Watermarking

With the recent spate of cases involving fake currency, no one needs to be reminded of the importance of watermarking. A watermark is a form, image or text that is impressed onto paper, which provides evidence of its authenticity. Digital watermarking is an extension of this concept in the digital world.

PWNtcha!

Captchas, especially visual ones that rely on image recognition have always been a controversial idea; but the truth is that we're in a pretty bad situation with comment spam, and other forms of automated bot attacks, and this looks like the only trick in the book that's (somewhat) effective. I have a tiny hand in this mess, I'm the author of the quickly and dirtily written, ill maintained captcha module for Drupal. It sounds like a war cry in a HongKong action flick, but PWNtcha - captcha decoder, stands for "Pretend We

|