
Veropedia and the Wikipedia mine

October 29, 2007

A year ago, I wondered "why no devious entrepreneurs have made a concerted effort to take Wikipedia's content, which is of course free to be reused in any way, and reformat and rebrand it as an attractive commercial site." Why not mine Wikipedia's unexploited commercial value?

Companies like Metaweb, through its Freebase service, and Radar Networks, through its Twine service, have begun mining Wikipedia, at least indirectly. They use the site's content to help populate the Semantic Web databases they're building. Embedded in Wikipedia's structure and links is a lot of information about the relationships between things, and this information may have considerable commercial value in building a more intelligent web.

Now, we have a startup - Veropedia - that hopes to more directly exploit Wikipedia by presenting, in effect, a premium edition of the free encyclopedia. It's a cream-skimming - or cream-scraping - operation that aims to make money through advertising. As Slashdot explains, Veropedia is an effort "to collect the best of Wikipedia's content, clean it up, vet it, and save it in a quality stable version that cannot be edited. To qualify for inclusion in Veropedia, a Wikipedia article must contain no cleanup tags, no 'citation needed' tags, no disambiguation links, no dead external links, and no fair use images after which candidates for inclusion are reviewed by recognized academics and experts."

Veropedia claims that its relationship with Wikipedia is symbiotic - that even as it sucks up the Wik's economic value it will help improve the quality of its host. As one of Veropedia's developers explains, "Veropedia has a very comprehensive article checker that points out just about every flaw with an article that a computer program can find. But articles aren’t edited on Veropedia. Veropedia contributors must go and edit the article on Wikipedia, fixing up all the flaws, until a quality version is ready for importation to Veropedia. So everyone wins: both Wikipedia and Veropedia get improved articles."
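To make the idea of an "article checker" concrete, here is a minimal sketch of how two of the inclusion rules quoted above (no cleanup tags, no "citation needed" tags) might be tested against an article's raw wikitext. It is not Veropedia's actual code, which the post does not show. The function names and the template patterns are assumptions for illustration, and the rules that need network lookups (dead external links, disambiguation links) are left out.

```python
import re

# Hypothetical patterns: the real checker's rules are not given in the post.
# Each pattern looks for the opening of a wikitext template, e.g. {{cleanup}}.
CHECKS = {
    "cleanup tag": re.compile(r"\{\{\s*cleanup\b", re.IGNORECASE),
    "citation needed tag": re.compile(r"\{\{\s*(citation needed|fact)\b", re.IGNORECASE),
}


def find_flaws(wikitext):
    """Return the names of the checks that the wikitext fails, in CHECKS order."""
    return [name for name, pattern in CHECKS.items() if pattern.search(wikitext)]


def qualifies(wikitext):
    """An article qualifies, under these two rules only, when no check fails."""
    return not find_flaws(wikitext)
```

Under this sketch, a contributor would run `find_flaws` on the current Wikipedia revision, fix whatever it reports on Wikipedia itself, and repeat until `qualifies` returns `True`.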

This is a slick variation on the sharecropping business model. Veropedia doesn't even have to invest in the upkeep of the plantation; it just swoops in, grabs the most valuable crops, and sells them down the street at its own farm stand. To switch back to the mining metapher, the Veropedians don't have to get their hands dirty - they leave the shovels and pickaxes to the Wikipedians. Here's how the self-described "group of Wikipedians" who started Veropedia put it: "if you think of Wikipedia as a diamond mine, we think of ourselves as jewelers who provide a finished product to the public."

Will it work? Probably not. A quick look at the Veropedia site doesn't reveal much added value. It looks like a shoestring operation that's pretty much doing straight screen-scrapes at this point. In following links, you end up bouncing back to Wikipedia in a confusing way. And I don't see any evidence of these "recognized academics and experts" who are supposed to be reviewing and "verofying" everything. If you're going to monetize Wikipedia's content through an alternative encyclopedia site, you first have to overcome Wikipedia's momentum - and that means offering users clear and compelling benefits. Veropedia doesn't.

There may be diamonds in them thar Wikipedian hills - or at least some sparkling chunks of cubic zirconia - but Veropedia is still miles from pay dirt.


Forking or repackaging Wikipedia is one of those ideas that sounds a lot easier than it is.

No fork would have the fantastic Google-ranking-power that Wikipedia has.

Posted by: Seth Finkelstein at October 29, 2007 01:05 PM

Okay, so some other site comes up with an algorithm that exposes what's goofy on Wikipedia. Then Wikipedia comes up with one too and the other guys are unnecessary. That seems a losing game.

The "algorithm" that seems in place is that I think people are learning how to use Wikipedia. If it's a verifiable fact, like a date something happened, it's almost certainly right. If it's a matter of opinion, it's probably biased left, but you can read the article to see if the other side got in there, or if it has a strong whiff of "needs cleanup." And if something ever got mentioned in The Simpsons or inspired an indie band song, it's absolutely certain to be in there.

Beyond basic fact-checking or instant briefing on something (which one was the Thirty Years War again?), though, you should know that you need to go somewhere else. (I realize lots of high schoolers don't see it that way, but add that to the things you LEARN in high school.) I used to keep the mini Columbia Encyclopedia by my desk in case I suddenly needed to know when Bertrand Russell died. Wikipedia serves the same function-- and no more.

Posted by: Mgmax at October 29, 2007 02:21 PM

Wikipedia's strength isn't its accuracy but its comprehensiveness.

My 7th-grade daughter asks about the US Constitution for her history class. Where do I send her? Wikipedia. I know it will be accurate enough; she needs to know which amendment covers cruel and unusual punishment, and the subtleties of the Supremes' changing stance on the death penalty are irrelevant.

I saw a picture of Robert Plant and wondered how old he was. Wikipedia.

Who was it who grabbed the high ground at Gettysburg? I know Wikipedia will remind me it was Buford and I don't need to dig out my copy of The Killer Angels.

Who makes disk partitioning software? It's in Wikipedia.

I know that for any common knowledge item, it will almost certainly be in there, and the info will be accurate enough. One-stop shopping, Internet style. It's faster and easier than Google or Live Search.

If I really want to understand Robert Plant's singing, I'll listen to how he handles phrases on Song to the Siren compared to the Tim Buckley original. When my daughter hits high school and really wants to dig into this country's thinking on the death penalty, I'll suggest to her the actual court decisions. And it wasn't until I stood on the broad field called Oak Ridge and looked back at Cemetery Hill that I began to understand just what Buford's insight meant.

But below that level -- for most of us, most of the time -- Wikipedia suffices. It's good enough.

Attacking it for accuracy is like complaining that the Red Sox defense isn't as good as that of the Rockies. It's true, but it's not dispositive in terms of the end result.

Posted by: George Geist at October 29, 2007 03:12 PM

Wiki is a very useful site however it does have a very distinct left leaning slant. Like all sources, one needs to evaluate content critically more than an algorithm can provide.

Posted by: Sophos at October 29, 2007 07:11 PM

> Wiki is a very useful site however it does have a very distinct left leaning slant.

Maybe so. Wikipedians "do nuance". If that makes them elitist and intellectual and people perceive them as left leaning, so be it. In a time when the right has allowed themselves to be represented by the likes of Fox News, I can't say I sympathize.

I'm quite sure properly attributed, factual information from the right is welcome in the Wikipedia.

Posted by: smarvelous at October 30, 2007 03:25 PM

"The facts have a liberal bias." - Stephen Colbert

Posted by: Seth Finkelstein at October 30, 2007 05:28 PM

"Maybe so. Wikipedians "do nuance". If that makes them elitist and intellectual and people perceive them as left leaning, so be it."

Yeah, intellectual, that's the first word that comes to mind when noticing that there's 2000 words on the history of France and 10,000 words on the history of the Buffyverse.

Posted by: Mgmax at October 30, 2007 09:58 PM

Checked Veropedia's Henry James article, which I hauled through the FA gauntlet on Wikipedia. Vero's version was a word-for-word scrape from Wikipedia. If this is all the site offers, why bother? At least answers.com combined the article with content about James from several other sources.

Posted by: Casey Abell at October 31, 2007 11:55 AM

By the way, the "recent changes" list on Vero shows something like forty edits per day. Even Citizendium does at least ten times that pace.

Posted by: Casey Abell at October 31, 2007 12:02 PM

Well, you speak of Veropedia like it is trying to be a for-profit commercial venture. This is speculation. You also say that "to switch back to the mining metapher (sic), the Veropedians don't have to get their hands dirty - they leave the shovels and pickaxes to the Wikipedians."

The truth is, however, that all of the contributors to Veropedia were selected by Danny Wool. Each one of them is a heavy contributor to Wikipedia. Many of the uploaded articles were actually created or substantially worked upon by the Veropedia members who uploaded them.

Danny has plans to speak to academics and experts in the field to get their opinion of the articles that have been uploaded.

There are a number of issues with the project: the parser needs work on its interface, at the moment it is actually just a scraper of a particular revision, so it has bugs, and there are many more great articles to upload. But... it's a new project.

Anyway, at the very least, it's not doing anyone any harm. I don't really see where the problem is, but then again I'm one of those who upload via the site.

Given that you don't really like Web 2.0, and you dislike Wikipedia even more, your comments are really just pure speculation. You don't really have any more clue than the next man about Veropedia!

Posted by: Ta.bu.shi.da.yu at November 1, 2007 08:24 AM

> Well, you speak of Veropedia like it is trying to be a for-profit commercial venture. This is speculation.

Veropedia is registered in the state of Florida as a for-profit commercial venture, as documented here.

Posted by: Nick Carr at November 1, 2007 11:40 AM

Rough Type is:

Written and published by
Nicholas Carr
