
Should the Net forget?

August 26, 2007

The New York Times recently got some search-engine-optimization religion, and as a result its articles, including old stories from its vast archives, are now more likely to appear at or near the top of web searches. But the tactic has had an unintended consequence, writes the paper's public editor, Clark Hoyt, in a thought-provoking article today: "Long-buried information about people that is wrong, outdated or incomplete is getting unwelcome new life. People are coming forward at the rate of roughly one a day to complain that they are being embarrassed, are worried about losing or not getting jobs, or may be losing customers because of the sudden prominence of old news articles that contain errors or were never followed up."

Hoyt tells the story of one man, a former New York City official named Allen Kraus, who resigned his position back in 1991 after a run-in with a new boss. The resignation was briefly, and incorrectly, tied to a fraud investigation that was going on at the same time in his office. The Times published stories about the affair, including one with the headline "A Welfare Official Denies He Resigned Because of Inquiry" - and that headline now appears at the top of the results you get if you google Kraus's name. Kraus is, with good reason, unhappy that his good name is, sixteen years after the fact, again being tarnished.

Many other people now find themselves in similar predicaments, and they are contacting the Times (and, one assumes, other papers and magazines) and asking that the offending stories be removed from the archives. The Times, of course, has routinely refused such requests, not wanting to get into the business of rewriting the historical record. Deleting the articles, one of the paper's editors told Hoyt, would be "like airbrushing Trotsky out of the Kremlin picture."

But if the Times is using search-engine-optimization techniques to push articles toward the top of search-engine results, does it have any ethical obligation to ensure that old errors, distortions, or omissions in its reporting don't blight people's current reputations? Times editors are discussing the problem, writes Hoyt, and in some cases the paper has added corrections to old stories when proof of an error has been supplied.

The Times's predicament highlights a broader issue about the web's tenacious but malleable memory. Hoyt touches on this issue in his article:

Viktor Mayer-Schönberger, an associate professor of public policy at Harvard’s John F. Kennedy School of Government, ... thinks newspapers, including The Times, should program their archives to “forget” some information, just as humans do. Through the ages, humans have generally remembered the important stuff and forgotten the trivial, he said. The computer age has turned that upside down. Now, everything lasts forever, whether it is insignificant or important, ancient or recent, complete or overtaken by events. Following Mayer-Schönberger’s logic, The Times could program some items, like news briefs, which generate a surprising number of the complaints, to expire, at least for wide public access, in a relatively short time. Articles of larger significance could be assigned longer lives, or last forever.

With search engine optimization - or SEO, as it's commonly known - news organizations and other companies are actively manipulating the Web's memory. They're programming the Web to "remember" stuff that might otherwise have become obscure by becoming harder to find. So if we are programming the Web to remember, should we also be programming it to forget - not by expunging information, but by encouraging certain information to drift, so to speak, to the back of the Web's mind?
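To make the idea of programmed forgetting concrete, here is a minimal sketch in Python of the kind of expiry rule Mayer-Schönberger's suggestion implies. Everything in it is an assumption for illustration: the article kinds, the one-year shelf life for news briefs, and the names `SHELF_LIFE`, `Article`, and `robots_directive` are hypothetical, and nothing here describes how the Times actually runs its archive. The sketch leaves the article itself untouched and only decides whether the page should ask search engines not to index it, using the standard `noindex` value of the robots meta tag.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical shelf lives per article kind. None means "never expires".
# The one-year figure for news briefs is an assumption, not a Times policy.
SHELF_LIFE = {
    "news_brief": timedelta(days=365),
    "feature": None,
}


@dataclass
class Article:
    kind: str        # e.g. "news_brief" or "feature" (hypothetical labels)
    published: date


def robots_directive(article: Article, today: date) -> str:
    """Return the value for the page's robots meta tag.

    The article is never deleted. Once its shelf life has passed, the page
    asks search engines not to index it, while links on it are still followed.
    Kinds missing from SHELF_LIFE are treated as never expiring.
    """
    shelf_life = SHELF_LIFE.get(article.kind)
    if shelf_life is None:
        return "index, follow"
    if today - article.published > shelf_life:
        return "noindex, follow"
    return "index, follow"


if __name__ == "__main__":
    brief = Article(kind="news_brief", published=date(1991, 1, 1))
    print(robots_directive(brief, date(2007, 8, 26)))
```

Under a rule like this, an expired brief stays available to anyone who goes looking in the archive itself; it simply stops being offered to search engines. That is a blunt form of forgetting, since the page drops out of results altogether rather than drifting down them, but it does not expunge anything from the record.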

Comments

Mayer-Schönberger's logic is terrible. According to Hoyt, he claims that "humans have generally remembered the important stuff and forgotten the trivial." There is no way that we can know whether the stuff that we've forgotten is trivial or nontrivial, as we have forgotten it. Any candidates that we know are trivial, like, "There used to be a doughnut shop across from the library," are things that we have not successfully forgotten.

It's entirely possible that we have forgotten a hell of a lot of crucial information. The only sign of that would be that suddenly everything would seem to have gone crazy. Perhaps our leaders would seem significantly less prudent than their predecessors. They might even invade a sovereign nation and overthrow the government without devising a plan for the aftermath.

Regarding Clark Hoyt's real dilemma, about whether to correct the falsehoods that his paper has spread, I would say that while preserving errors for the sake of the integrity of the historical record is reasonable, the paper has a moral obligation to try just as hard to preserve the integrity of the present, i.e., it should print corrections, no matter how long ago the error was committed.

Brandon

Posted by: pingswept at August 26, 2007 11:55 PM

It would be nice if the New York Times updated its archive with in-article errata: a small box at the top of the article, with dated corrections, before proceeding with the rest of the archived material.

They could start with Walter Duranty, move on to Tet.....

There's an interesting second problem in there, about how high placement in the Google algorithm ensures that you become the authority on a search term, regardless of how authoritative you might actually be. The NYT article from 1991 is not the best source of info about that person, but it becomes the most important item in Google Reality, because so many people link to the New York Times domain. Google Reality is not Web Reality, which, as Pierre Salinger would remind us, is not the same as World Reality.

If NYT moves from a frozen printed mental model to a more bloglike mental model then it becomes easier to correct mistakes. I'm not sure how Google will adapt to the people who are targeting its first page of results -- try learning about a movie or book sometime, and most of what you'll pull up are stores. It's a problem.

Posted by: John Dowdell at August 27, 2007 09:54 AM

>> With search engine optimization - or SEO, as it's commonly known - news organizations and other companies are actively manipulating the Web's memory. They're programming the Web to "remember" stuff that might otherwise have become obscure by becoming harder to find.


Not really, to be honest. They *are* making their content visible on the search engine that has a near monopoly on the search market at the moment, but Google is not the web, nor is it guaranteed to be the ruler of search in the English-speaking world forever.

Posted by: Martin Belam at August 27, 2007 10:06 AM

The net shouldn't forget. But in this case it ought to correct itself. The NYT shouldn't be serving up article summaries to the world that it now knows to be misleading. They could mark the story with a correction, or an editor's note.

(Google lets you manually assign importance to your own site's pages through sitemaps. You can use that to force less relevant material down the results. We use it at Zopa to deprioritize our small-print & legal pages.)

Although, a Google search for Allen Kraus now returns 10+ bloggers talking about how misleading the top result is. (Bloggers who will all be linking to each other...) So Google Reality is potentially self-healing. (Providing you're slandered in a way that high pagerank Web 2.0 bloggers care about.)

I wonder what sort of SEO religion the NYT caught. In my search engine optimisation java .net financial development experience this stuff is generally snake oil :-D

Posted by: Thomas at August 27, 2007 10:51 AM

It would be nice to make the issue of culpability moot, bypassing the moral dilemma with a true read-write Web, but we are only in Web 2.0. The infrastructure is not yet up to what's needed.

Tom Foremski has argued for the Right to Respond of wronged parties exposed in inaccurate data. And cleaning this data is part of the goal of Web 3.0. Dilemmas such as you detail here are part of the surge towards this I think.

I've written this up better, I hope, in a post today with some links for those interested:
http://www.hunterhost.com/65/justice-truth-money/

Posted by: Ross Hunter at August 27, 2007 12:35 PM

Okay--here's a novel idea that some of us bloggers use, that might cost the NYTimes some bucks to implement, but is probably the most ethical: go into the old articles, put an "Update" on the top that then links to corrected information or to follow-up articles on the same story.

Yes, I know...it would cost *money* to do this, and lord knows we can't throw money at a fair and ethical solution to a rather nasty problem.

Posted by: tish grier at August 27, 2007 02:56 PM

Who is responsible?
Google, the New York Times, or ourselves, who seem to trust Google's ranking?
See my small cartoon.

Bye,
Oliver

Posted by: Oliver Widder at August 27, 2007 04:37 PM

What? Are you saying that the NYT prints calumny (Sorry, but that is what you describe) and cannot correct itself *once* a day? And that it's Google's fault? They were wrong not to correct that the day they printed it: sorry to have them realize how unprofessional most journalists have been for a long time, but I can assure you I am the least surprised given how journalistic prejudi--- sorry: angle, distorts reality.

I am stunned too that allowing the involved parties to express themselves (Isn't that what journalists are supposed to do *before* writing a paper?) is seen as an impossible chore. I am a bit surprised to see that Nick doesn't consider this option either: as technology is always to blame, it's necessarily SEO's fault if the NYT has been lying through its teeth in 1995.

Posted by: Bertil at August 31, 2007 08:39 AM

Rough Type is:

Written and published by
Nicholas Carr
