
Beyond question
October 02, 2006
It's funny how a set of instructions - an algorithm - written by people can come to be granted, by those same people, a superhuman authority. As if a machine fashioned by man should, upon trembling into motion, shed its earthly origin and assume a god-granted imperium, beyond our small-minded questioning.
Last week, CNET's Elinor Mills reported on how a web search for "Martin Luther King" returns, as its first result on Google and as its second result on Windows Live Search, a web site (martinlutherking.org) operated by a white supremacist organization named Stormfront. The site, titled "Martin Luther King Jr.: A True Historical Examination," refers to King as "The Beast" and says he was "just a sexual degenerate, an America-hating Communist, and a criminal betrayer of even the interests of his own people." The site also features an essay on "Jews & Civil Rights" by former Ku Klux Klan official David Duke.
What's remarkable, though, is not that a search algorithm might be gamed by extremists but that the owners of the algorithm might themselves defend the offensive result - and reject any attempt to override it as an assault on the "integrity" of their system. AOL, because it subcontracts its search results to Google, finds itself in the uncomfortable position of promoting the white supremacist site to its customers. In response to an inquiry from CNET, the company was quick to distance itself from the search result and to place the responsibility for it on Google:
AOL spokesman Andrew Weinstein said the company has contacted Google about the Martin Luther King search results. "We get all of our organic search results from Google, as you know, so we don't set the algorithms by which they are ranked. Although we can't micro-manage billions of search results, our users would not expect this to be the first result for that common search, and we do not want to promote the Web sites of hate organizations, so we have asked Google to remove this particular site from the results it provides to us."
That seems like an entirely reasonable position. Clearly, a white supremacist site is not the site that any rational person would consider an appropriate recommendation for someone looking for information on a black civil-rights leader. But Google doesn't seem to agree. In fact, in responding to CNET, it defends the King result as being "relevant to the query" and suggests that it is evidence of the integrity of the Google PageRank algorithm:
At Google, a Web site's ranking is determined by computer algorithms using thousands of factors to calculate a page's relevance to any given query, a company representative said. The company can't tweak the results because of that automation and the need to maintain the integrity of the results, she said. "In this particular example, the page is relevant to the query and many people have linked to it, giving it more PageRank than some of the other pages. These two factors contribute to its ranking," the representative wrote in an e-mail.
A Microsoft spokesman is even more explicit in asserting that the King result is a manifestation of algorithmic "integrity":
The results on Microsoft's search engine are "not an endorsement, in any way, of the viewpoints held by the owners of that content," said Justin Osmer, senior product manager for Windows Live Search. "The ranking of our results is done in an automated manner through our algorithm which can sometimes lead to unexpected results," he said. "We always work to maintain the integrity of our results to ensure that they are not editorialized."
By "editorialized" he seems to mean "subjected to the exercise of human judgment." And human judgment, it seems, is an unfit substitute for the mindless, automated calculations of an algorithm. We are not worthy to question the machine we have made. It is so pure that even its corruption is a sign of its integrity.
Comments
I think that the main reason that the "no human intervention" policy is in place is so that they can defend themselves from anyone asking for the search results to be manipulated, which is what it essentially boils down to. It's quite similar to the ISPs' "common carrier" defense -- you don't want to be legally liable for the order (or content) of the sites returned by a search.
For example, let's say a particular product is bad, and someone puts up a site detailing all the problems with it and it hits #1 in Google. What is to prevent the manufacturer from legally going after Google to suppress that result? The policy helps them in that case.
Posted by: Ashish Kulkarni at October 2, 2006 02:06 PM
I don't think it's about legal liability as much as it's about not having to pay people to manually vet the results.
Posted by: Kevin Murphy at October 2, 2006 02:16 PM
Or perhaps when Google says that the site is relevant to the query they're absolutely right. When you start trying to sanitize the information that surrounds something as heated as civil rights, bigotry and racism, you start glossing over the real problems. By returning this site in its search results Google is simply replicating the state of affairs and providing the user with a first hand example of the kind of ignorance and hate that King himself dealt with on a daily basis. To shield a user from that fact... and from the fact that these people still exist in frighteningly large numbers... would be the truly irresponsible act.
Posted by: Michael Turro at October 2, 2006 02:25 PM
I think Google and Microsoft are acting from a reasonable position. It's a slippery slope when they start editing random issues. This can easily happen since there are likely a number of positive sites watering down each other's PageRank leaving the sole negative site to rise up.
However, I think it would also be reasonable for them to analyze the situation and really see if they could make some algorithmic improvements. For example, Google would know that a user clicked to the site, then came back and clicked to the next entry and did not come back. That might suggest that the first result wasn't what they were looking for. If enough people do this, the result should be devalued.
Posted by: pwb at October 2, 2006 02:54 PM
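The click-back signal pwb describes can be sketched in a few lines. This is only an illustration under stated assumptions, not how Google or any search engine actually works: the session format (one list of clicked URLs per search, in click order), the function names `click_stats` and `rerank`, and the `penalty` and `min_clicks` parameters are all hypothetical.

```python
from collections import Counter

def click_stats(sessions):
    """Count clicks and 'bounces' per URL.

    sessions: list of click sequences for one query, each a list of
    result URLs in the order the user clicked them (assumed format).
    A bounce means the user clicked a result and then clicked another one.
    """
    clicks, bounces = Counter(), Counter()
    for seq in sessions:
        for i, url in enumerate(seq):
            clicks[url] += 1
            if i < len(seq) - 1:  # the user came back and tried another result
                bounces[url] += 1
    return clicks, bounces

def rerank(base_scores, sessions, penalty=0.5, min_clicks=3):
    """Return URLs ordered by base score, discounted by bounce rate.

    base_scores: dict of URL -> score from the existing ranking.
    penalty and min_clicks are made-up tuning knobs for this sketch.
    """
    clicks, bounces = click_stats(sessions)
    adjusted = {}
    for url, score in base_scores.items():
        if clicks[url] >= min_clicks:  # ignore results with too little data
            score *= 1.0 - penalty * (bounces[url] / clicks[url])
        adjusted[url] = score
    return sorted(adjusted, key=adjusted.get, reverse=True)
```

With base scores of `{"a": 1.0, "b": 0.9, "c": 0.4}` and three sessions in which users click "a" and then move on to "b", the discounted order becomes b, a, c. Note that a signal like this can itself be gamed by anyone able to generate fake click sessions.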
Googling link:martinlutherking.org shows who links to the site. It appears to be a popular example of online misinformation. These links from critical authors seem to have boosted its rank — probably not what the authors intended, eh?
Posted by: Sid Steward at October 2, 2006 03:15 PM
If people are so upset about this, they can google bomb other King sites so it'll bump them up in rankings. Your site has a PR of 7, give them some google juice and the problem is fixed.
Posted by: Simon Owens at October 2, 2006 04:01 PM
It very much reminds me of Jaron Lanier's point about the Turing test:
Turing's mistake was that he assumed that the only explanation for a successful computer entrant would be that the computer had become elevated in some way; by becoming smarter, more human. There is another, equally valid explanation of a winning computer, however, which is that the human had become less intelligent, less human-like.
Posted by: EzraBall at October 2, 2006 04:29 PM
Simon- I think that's the point: the World According to Google is a popularity contest. Yet it holds enough credibility to upset people. If a human editor were responsible there could be some accounting for it. Since it's defended as the result of an innocent algorithm we should instead re-evaluate our notions about search engines. GIGOogle? ;-)
Posted by: Sid Steward at October 2, 2006 04:35 PM
Hopefully this discloses the lack of a search algorithm and not of the search base.
See my cartoon.
Bye,
Oliver
Posted by: Oliver Widder at October 2, 2006 05:18 PM
There is another, equally valid explanation of a winning computer, however, which is that the human had become less intelligent, less human-like.
Lanier makes a good point. In fact, as I read a longer version of the response from the anonymous Google representative (shared with me by Elinor Mills) I had the sense that it could very well have been generated automatically by a computer: "In response to your below story, Google looks objectively at many factors to determine how to order results. In this particular example, the page is relevant to the query and many people have linked to it, giving it more PageRank than some of the other pages. These two factors contribute to its ranking. A site's ranking in Google's search results is automatically determined by computer algorithms using thousands of factors to calculate a page's relevance to a given query."
Posted by: Nick Carr at October 2, 2006 05:20 PM
What Google needs is a user option to bury a link, much like Digg's "Bury Story" feature.
Posted by: Alan Morrison at October 2, 2006 05:34 PM
the page is relevant to the query and many people have linked to it
Yes, many people have linked to it not necessarily out of endorsement, but out of ignorance. This is similar to sites on Digg rising to popularity, etc. A network that relies on its members for quality is only going to be as good as the weaker parts of that network. Crowds are not always that wise.
I encountered this particular case, by the way, a few years ago when teaching my Internet and Society course to undergrads. Our class fell on MLK Day and so I told students that instead of meeting (so they could attend univ events), their assignment was to blog about a Web site related to the day. A student linked to and discussed in detail this specific site. It was not clear whether she had realized the origins of the source. I was left in a tricky position. If she was aware of the site's origins and still decided to blog about it, should I comment? I ended up sending her an email pointing out the origin of the site and leaving it up to her as to whether she wanted to leave it up on her blog or not. She thanked me for the information and admitted to having missed this "detail". Instead of deleting the post, however, she added an additional entry on her blog describing this entire experience. I think it made for a very educational one in the end and one that we also discussed in class later that week. (Since everyone blogged pseudonymously no one was put on the spot.)
Posted by: eszter at October 2, 2006 05:51 PM
Remember my aphorism here:
Google ranks popularity, not authority
We're re-iterating the difference between "most popular reference" and "most authoritative reference".
It's the difference between "fame" and "infamy".
But do you have any idea what a can of worms it would open up to manually change results?
Posted by: Seth Finkelstein at October 2, 2006 06:00 PM
Nick- They clearly use anonymity in order to dodge accountability. Or, maybe their faceless appeal is a clue to a larger mystery: Google isn't only run and managed by robots — it is used by robots. And consider: what use does a super-race of robots have for civil liberties anyhow? They truly don't understand the issue at hand. ;-)
Posted by: Sid Steward at October 2, 2006 06:46 PM
Not thinking about manually changing results, really. From what I glean from http://www.digg.com/faq, it seems the "Bury Story" feature is automated and rule-based.
Last.fm has the "Ban" button, which seems at this point to be relevant only to a user's own preferences (i.e., you're banning the tune from ever being played again on your device), but could have utility community-wide. Google could put a "Ban" button on its toolbar, and quantify the number of times users clicked "Ban" after retrieving a result that wasn't useful.
You could make it an option to retrieve only results that were filtered with "Ban" button feedback. Those who want unfiltered results could opt out.
Posted by: Alan Morrison at October 2, 2006 06:57 PM
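A rough sketch of the opt-out "Ban" filter Alan describes might look like the following. Everything here is an assumption made for illustration: the function name `filter_results`, the per-URL ban and view counts, and the `max_ban_rate` and `min_views` thresholds are hypothetical, and nothing is taken from Digg, Last.fm, or Google.

```python
def filter_results(results, ban_counts, view_counts,
                   max_ban_rate=0.2, min_views=50, filtered=True):
    """Drop results that too large a share of users have banned.

    results: ranked list of URLs for a query.
    ban_counts / view_counts: dicts of URL -> number of "Ban" clicks and
    number of times the result was shown (assumed to be collected elsewhere).
    filtered=False returns the untouched ranking for users who opt out.
    """
    if not filtered:
        return list(results)
    kept = []
    for url in results:
        views = view_counts.get(url, 0)
        bans = ban_counts.get(url, 0)
        # Only act once there is enough feedback to mean something.
        if views >= min_views and bans / views > max_ban_rate:
            continue
        kept.append(url)
    return kept
```

The thresholds are where the hard decisions hide: set them low and a small group can make a result disappear, set them high and the button does nothing.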
The problem here is expecting google not to be google, to expect it not to use its own stated method of operation. To have some "expert" decide which results are more truthy than others not only invites liability, but it also sets up the same problem you complain about now but just from a different angle. File this article under "french military victories".
Posted by: Chris_B at October 2, 2006 08:27 PM
I find it hard to believe that Google or Microsoft have never intervened to remove an offensive search result. Given their boilerplate response, it seems likely that no human at Google has even looked at this yet. I bet with more heat someone would perk up and make the change quietly. Of course they'd probably do it algorithmically with some clever tweak, lest they sacrifice their silly robot purity.
Posted by: Kevin Arthur at October 2, 2006 10:25 PM
Seth makes a good point -- would you rather that Google just decided by fiat which King site should get the top rank? The algorithm is not the culprit -- all it does is reflect the behaviour of linkers on the Internet. It's called voting with your feet (or your links). I know you're not a big fan of democracy though, Nick, so I don't expect that argument to persuade you :-)
Posted by: mathewi at October 2, 2006 10:32 PM
By the way, see my report:
"Jew Watch", Google, and Search Engine Optimization
for discussion about an earlier similar incident.
Posted by: Seth Finkelstein at October 2, 2006 10:48 PM
Is this site truly popular? Many people are defending the *principle* that search sites should return sites that are popular.
It does not look like this site is actually the most popular Martin Luther King site in the world. The vast majority of people who talk about MLK do not flame him, so why would they read or link to this site? Further, the site itself indicates that it presents an alternative history. This site calls itself "a true historical examination".
It looks like Google and Microsoft have both been gamed. They need to improve their algorithm.
Posted by: lexspoon at October 3, 2006 04:31 AM
There is plenty of evidence that Google is willing to edit search results - at least sometimes.
This comment from elsewhere says it all (though I would love to see a complete transcription of Brin's remarks at the conference in question):
http://www.hyperorg.com/blogger/mtarchive/002542.html#comment-32779
"I seem to remember when we heard Sergey Brin speak at Supernova in Santa Clara a year and half ago, he talked about how Google does look at the results and tweaks them in some cases. If memory serves, the two specific examples he gave were suicide and heart attack, where Google looked at the top results generated algorithmically, then tweaked to make sure that a suicide prevention hotline came up on the top for suicide, and a page about what to do if you feel you are having a heart attack comes up for the second. A quick Google shows results consistent with my memory -- if you would like me to check my notes to find out exactly what he said, I can.
So it sounds like Google thinks some things are important enough to ignore the algorithms, and some aren't. I'd ask Google how do those decisions get made? And how does the fact that some of these rankings (e.g. miserable failure, Jew) are the results of gaming the algorithms influence whether they adjust the results or not?
Once Google admit to having changed anything, as they did, then they really have opened Pandora's box for themselves.
I'm glad to see someone trying to keep them honest."
Posted by: David Brake at October 3, 2006 04:34 AM
"What Google needs is a user option to bury a link, much like Digg's "Bury Story" feature."
The "bury" feature will not work. We would create a new "arms race" where someone (person or organization) with enough resources and determination can bury any link.
In the end, what Elinor asks for is a kind of "safe search" feature where certain sites are filtered out based on their content. We had a similar discussion some time ago when talking about Wikipedia's "inclusionists" and "deletionists".
Solving the issue of popularity versus authority, good versus bad and so on requires a third party solution. Rating agencies will need to appear, where humans vet the quality of sites and then internet users choose what filtering mechanism they want to be applied to their search results.
Practically we will satisfy our urge to have fast answers by delegating our responsibility to decide what is relevant to a third party.
Posted by: Dragos at October 3, 2006 04:35 AM
Does anyone else see the irony in a company whose motto is "Do no evil" suggesting that their computer algorithm (which has no notions of good or evil) should control the dispensation of their signature product?
Posted by: Phil Gilbert at October 3, 2006 09:03 AM
Maybe Google should start thinking on how to embed Asimov's laws into their algorithms.
The first law
"A robot may not injure a human being, or, through inaction, allow a human being to come to harm"
can be rewritten as "An algorithm may not mislead a human being, or incite him/her to harm others"
SF becomes reality sooner than we expected
Posted by: Dragos at October 3, 2006 10:22 AM
Wake up people! The results in this case, in any case, are about relevance... not popularity... and certainly not about making moral judgments on right/wrong or good/evil. The only question on the table is whether or not the presence of hate speech is relevant to MLK's life struggle. I tend to think it most definitely is.
In simply negating what King fought against we limit any chance at achieving what he fought for. Would we be any better off if this site were to be pushed back into the shadows where only the dark depressed and disturbed dare seek it out? Is it not better to bring this type of overt racism into the sunlight where discussion and dialogue can expose it for what it is? Shouldn't the open air of public debate be the tonic which we apply to the bleeding sores of these ideas?
Perhaps I'm wrong... perhaps we should sweep this type of thing under the rug. But if we do... if we have Google change their algorithm to protect us and insulate us from this kind of vile hatred... aren't we then yielding REAL power to the computer?
Posted by: Michael Turro at October 3, 2006 11:09 AM
Nick, I think all the evidence that your blame for this episode is misplaced can be found in these comments. When those who agree with you suggest a "Bury This!" option, which would effectively let any group hide any story from the public, I think it's clear why editorializing needs to be kept separate from the underlying data whenever possible. By all means have a separate filter layer on top, but as search engines become the way we find information on the web, we must make every effort to keep them inclusive at their lowest level.
Posted by: Anthony Cowley at October 3, 2006 01:56 PM
Responding to mturro... the issue isn't whether it should be censored (as you and others note). The issue is whether this is even remotely the "most relevant" result for a search for "martin luther king."
I think it's safe to say that any mainstream editor would include hate speech (and FBI wiretaps, etc.) as relevant to a discussion of Martin Luther King, Jr., but I don't think any would believe these were close to the top of relevance. (Jim Crow laws, segregation, and other forms of opposition to MLK would be much more relevant as the anti-positions... hate speech from a neo-Nazi organization currently in existence would be way down the list).
So, to me, as we ascribe a certain amount of importance to Google societally, it's important that we understand its limitations... and I also think it's prudent for Google to acknowledge (and continually improve) their limitations.
Software is being asked to take on more and more "advisory" (editorial?) roles... we in the industry (I am one of those) need to understand its limitations, so that we can eliminate them (or work around them with... God forbid... humans...)
Posted by: Phil Gilbert at October 3, 2006 03:17 PM
Obviously I'm not advocating that Google stop trying to improve its search algorithms... of course they should. What I object to is the idea that they should make modifications in order to eliminate a specific offensive result from a particular search. That is introducing an editorial perspective to search... ascribing to it a function that is best left to a human being. Fortunately a human is most likely the one who initiated the search to begin with and it falls upon that person to deal with those editorial decisions and to decide what they accept as good or bad information... not the machine or its inventor.
As for the relevance of hate to the life and work of MLK... why would the distant history of Jim Crow or segregation be more relevant than the explicit ignorance of a group espousing hateful ideas now? I would think that the blinding hate displayed by the site in question is perhaps the best example... here and now... in the flesh... of what King was dealing with. The site serves as an extremely visible reminder that his struggle is not HISTORY but real and relevant TODAY.
Posted by: Michael Turro at October 4, 2006 11:45 AM
This shows the great ability Google has to check political facts as Eric Schmidt claimed (see Techdirt).
I made a small cartoon.
Bye,
Oliver
Posted by: Oliver Widder at October 4, 2006 03:35 PM
Unfortunately, we seem to be missing the basic end-user rule of modern capitalism: caveat emptor. Google, Microsoft, et al. don't -care- about truth or integrity, except as it impacts their basic business (making money for their shareholders). In this case, they get more kudos from being "impartial", as indeed they are.
And if the white dudes behind martinlutherking.org can play the game better than anyone else, then surely that is not Google's "fault". No one is to "blame". It is another sign of Adam Smith's invisible hand in action. Revel in it, folks! This is what has made the Internet the place it is today.
(BTW, when I just did a search, the site is now #1. Free advertising from this column. Ta da!)
-mark.
Posted by: mark5009 at October 4, 2006 06:05 PM
The first issue is why this site name was not yanked by the powers that be; is it still ICANN? Secondly, the responses by Google and Microsoft seem kind of silly. Why can't the algorithm be tweaked? They have to know that this is not the info people are looking for.
If I searched for info on Dr. King and this is the first site or even the 100th site that popped up I would be PO'd. How is the opinion of a fringe group (I hope!) useful to me looking for legit info on King? They should get off their high horse and take another look at their algorithms.
Posted by: GaryValan at October 4, 2006 07:55 PM
There are two separate questions: How can algorithmically produced results be questioned? And what does integrity mean with regard to search engine results?
I think the answer to the first cannot be that old simplistic complaint about ascribing god-like authority to computers. What should be questioned is the algorithm and the issue is that we don't know how it works, so we cannot question it.
The other question about the integrity of search engine results cannot be answered by asking about the morality of their content. But you are right to question whether a racist pamphlet is the most relevant information about Martin Luther King. Whether it is the most relevant or not, however, depends on the intentions of the person who does the searching. This person, like the people who linked to the racist page, might be interested in racist attacks on Martin Luther King. Or they might be more interested in a biography of King, which is what you get when you check Wikipedia.
And there we are right at the center of the problem that search engines try to solve, which is to figure out what question the searcher has in mind based on the words he enters. In the case of Google, the question turns out to be whether the number of links to a page (and other criteria) are a good enough match for most searchers' intentions.
Google's definition of integrity is that nobody has tricked the algorithm into ranking a page higher than it would otherwise be. I think to accept this definition doesn't mean to ascribe god-like authority to the algorithm. However, we should be able to question the algorithm. We should ask for it to be made public so it can be subjected to public scrutiny. But to ask for an ethics filter is not very different from asking for censorship. Who does the censoring? Google? Mr. Bush? The Chinese government? You?
Posted by: fauigerzigerk at October 5, 2006 03:34 AM
At the heart of Google's algorithm is the assumption that a link is a positive indication of a site's value. This idea stems from the academic practice of citations. When you write an academic paper, you cite references to other papers in order to build upon the author's previous work.
This works very well in academia because everyone is an adult and there is no benefit to be gained by gaming the system. Google uses and extends this citation idea to create its page ranking of importance. On the open web neither condition holds, so you could conclude that PageRank is broken.
Having a human intervene in this process is anathema to Google because a) in its view everything can be resolved with maths and b) intervening invites political arguments and accusations of bias. Earlier in these comments someone mentioned that Google had 'fixed' some results for heart attacks and suicide. This was a foolish thing done for good reasons.
Issues like this will continue to arise until someone cracks the semantic issue of understanding the intent behind a search request, the holy search grail of natural language search. When someone simply types "Martin Luther King" into a search engine, there is no indication of the context in which the search should occur, so you get what you consider are inappropriate results.
In summary, PageRank is not broken, you need to tell Google (and MSN) more about what you are looking for.
Posted by: Rob Jones at October 6, 2006 07:31 AM
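The link-as-citation idea Rob describes can be made concrete with a toy version of the published PageRank recurrence. This is a textbook simplification, not Google's production ranking (which, per the company's own statement, uses many other factors), and the page names and link graph below are invented for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links: dict mapping each page to the list of pages it links to.
    Every link counts as a vote, whatever the linker's opinion of the target.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Page with no outgoing links: spread its rank evenly.
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank

# Hypothetical graph: two admirers each link to a different tribute page,
# while two critics both link to the same site in order to denounce it.
example = {
    "fan1": ["tribute_a"],
    "fan2": ["tribute_b"],
    "critic1": ["hate_site"],
    "critic2": ["hate_site"],
}
ranks = pagerank(example)
```

In this made-up graph the denounced site ends up with a higher score than either tribute page, because the favorable links are split across two pages while the critical links all point at one. The arithmetic has no way to tell a citation from a condemnation.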
While the white supremacist ranking is disturbing, it should be pointed out that it comes out this way partly because of the habit of referring to Martin Luther King _Junior_ as simply MLK. Type in his full name, and the first hit in rankings is the one at Stanford.
To me this just points up that one has to be careful what you're asking for when you do a search. IOW, do a sloppy search and you're not necessarily going to get an accurate result.
Posted by: JohnN at October 10, 2006 03:55 PM
Seems to me that we like to shift blame whenever we can. The only problem with doing this is that although it may make us feel better about ourselves in the short-term, in the long-term it makes us weaker and hands over the power to some other 'thing' that is causing us grief and is out of our control/ is controlling us.
We can't chop and change to make these searches personalised - Bertram Brookes understood this with his fundamental problem of information science. So rather than change the system to be all things to all people, make people understand the system (or become information literate) so they can use these tools with confidence. A quick evaluation of the authority and agenda of the MLK site provides us as much 'information' as the content itself, and is a valid consideration in any research on this topic.
Posted by: Andrew at October 16, 2006 09:44 PM