While most of the discussion at this week’s Open Source Business Conference was refreshingly pragmatic, focused on the commercial role and prospects of open source software, there were a few more cosmic moments. Notably, Mitch Kapor brought a bit of Wikimania to the proceedings, offering a Zen-like “meditation” on Wikipedia as a harbinger of a much broader open-source movement in the future. (Wikipreneur Ross Mayfield summarizes the talk.) Kapor believes that the community-run online encyclopedia explodes the myth “that someone has to be in charge” as well as the assumption “that experts count.” He argues that Wikipedia shows you can create high-quality products through the contributions of a broad, democratic community of amateurs, a self-governing collective operating on the internet without any hierarchy. That, in Kapor’s view, is “the next big thing.”
Kapor’s argument hinges on the contention that Wikipedia is actually good. In recent months, Wikipedia’s content has come under considerable criticism, accused of everything from libel to infantilism. Like many of the encyclopedia’s defenders, Kapor counters those criticisms by citing a recent article in the journal Nature that ostensibly proves that the quality of Wikipedia is “roughly equivalent” to that of the venerable Encyclopedia Britannica. The Nature article has become something of a get-out-of-jail-free card for Wikipedia and its fans. Today, whenever someone raises questions about the encyclopedia’s quality, the readymade retort is: “Nature says it’s as good as Britannica.”
Kapor’s remarks inspired me to take a look at that much-cited Nature article. I found that it was something less than I had expected. It is not one of the peer-reviewed, expert-written research articles for which the journal is renowned. (UPDATE: I confirmed this with the article’s author, Jim Giles. In an e-mail to me, he wrote, “The article appeared in the news section and is a piece of journalism, so it did not go through the normal peer review process that we use when considering academic papers.”) Rather, it’s a fairly short, staff-written piece based on an informal survey carried out by a group of Nature reporters. The reporters chose 50 scientific topics that are covered by both Wikipedia and Britannica, selecting entries that were of relatively similar length in both publications. For each topic, they also chose an academic expert. They then sent copies of both entries to the respective experts, asking them to list any “errors or critical omissions” appearing in the writeups. They received 42 responses.
The article itself doesn’t actually go into much detail about the survey’s findings. It says that the “expert-led investigation” revealed that “the difference in accuracy [between the encyclopedias] was not particularly great: the average science entry in Wikipedia contained around four inaccuracies; Britannica, about three.” But Nature subsequently released “supplementary information” about the survey, including more details on the methodology and a full list of the errors cited by the experts. (In total, Wikipedia had 162 errors while Britannica had 123.) Read together, the article and the supplementary information indicate that the survey probably exaggerated Wikipedia’s overall quality considerably.
First and most important, the survey looked only at scientific subjects. As has often been noted, Wikipedia’s quality tends to be highest in esoteric scientific and technological topics. That’s not surprising. Because such topics are unfamiliar to most people, they tend to attract a narrower and more knowledgeable group of contributors than do more general-interest subjects. Who, after all, would contribute to an entry on “kinetic isotope effect” or “Meliaceae” (both of which were in the Nature survey) other than those who have some specialized understanding of the topic? The Nature survey, in other words, played to Wikipedia’s strength.
That’s fine. Nature is, after all, a scientific journal. But, unfortunately, the narrowness of the survey has tended to get lost in media coverage of it. CNET, for instance, ran a story on the survey under the headline “Study: Wikipedia as Accurate as Britannica.” The story reported that “Nature chose articles from both sites in a wide range of topics” and that it found that “Wikipedia is about as good a source of accurate information as Britannica.” Such incomplete, if not misleading, descriptions have informed subsequent coverage. For example, one prominent technology blogger covering Kapor’s speech this week wrote simply that “a recent study showed that Wikipedia is just as accurate as the Encyclopedia Britannica.”
Second, the Nature reporters filtered out some of the criticisms offered by the experts. They note, in the supplementary information, that the experts’ reviews were “examined by Nature’s news team and the total number of errors estimated for each article. In doing so, we sometimes disregarded items that our reviewers had identified as errors or critical omissions. In particular, as we were interested in testing the entries from the point of view of ‘typical encyclopaedia users’, we felt that experts in the field might sometimes cite omissions as critical when in fact they probably weren’t – at least for a general understanding of the topic. Likewise, the ‘errors’ identified sometimes strayed into merely being badly phrased – so we ignored these unless they significantly hindered understanding.” Since the reporters don’t document the “errors or critical omissions” that they subjectively filtered out, it’s impossible to judge whether they applied more to one publication than the other. But the Nature article implies that, beyond the errors and omissions tallied by the survey, the expert reviewers offered considerable criticism of the quality of the writing in Wikipedia: “Several Nature reviewers [commented] that the Wikipedia article they reviewed was poorly structured and confusing.” The article notes further that such criticism of readability “is common among information scientists, who also point to other problems with article quality, such as undue prominence given to controversial scientific theories.” The findings of the Nature survey, in other words, appear to filter out criticisms of Wikipedia’s quality that the Nature reporters decided went beyond their definition of “accuracy.”
Third, in reporting the results, the Nature reporters view all inaccuracies as being equal. In reality, of course, there are considerable variations in the degree and importance of the inaccuracies. Fortunately, in the supplementary information, Nature documents all the errors and omissions cited by the expert reviewers. I am no expert in the subjects covered by the survey – and my judgement may be mistaken – but my sense in reading through the lists was that the inaccuracies in Wikipedia tended to be more substantial than those in Britannica. Here, for example, is a comparison of the inaccuracies noted in the first three entries (they’re arranged alphabetically):
Acheulean industry

Britannica:

1. I would not use the term ‘early Homo sapiens’. Instead, use Homo heidelbergensis.

Wikipedia:

1. Cro-Magnons (early Homo sapiens) did not use the Acheulean!!
2. Date range is off, its about 1.5 my to 200 ka
3. The following statement is inaccurate and poorly written: ‘The period during which these these tools were innovated is usually thought to be the early Paleolithic era or the beginning of the middle Paleolithic era.’
4. I have no idea what this following statement means: ‘However, the Acheulean industry continued to be used by some primitive hominid cultures up until 100,000 years ago.’ It’s not correct.
5. This is an awful set of sentences: ‘by efficient scavengers, who were still preyed upon frequently by larger animals and often bewildered by their environment. Adversely, Acheulean tools gave their masters the ability to hunt and defend themselves successfully and gave them the distinction of being equally as deadly as the greatest predators of the prehistoric Earth.’ Early hominins were probably hunting and scavenging. Acheulean hominins also likely scavenged and hunted. Acheuelean tools are often associated with large carcasses, suggesting that they had access to large quantities of meat. The sentence about Acheulean hominins abilities is overstated.
6. Regarding Asia, I would say West and Southern Asia. Acheulean hominins did not spread to Eastern Asia.
7. The statement ‘It flourished roughly 400,000 to 100,000 years ago in Eastern Europe and Northern Asia.’ has nothing to do with the Acheulean, I am not sure what it means.
Agent Orange

Britannica:

1. A very minor error is that Agent Orange is considered by the Vietnamese to be the cause of the diseases listed in the second paragraph from the 1970s to the present, not just from the 1970s to the ’90s.
2. The entry should include the statement that other mixtures containing dioxin were also sprayed, including Agents Purple, Pink and Green, albeit in lesser amounts.

Wikipedia:

1. This entry implies that it was the herbicides that are problematic, which is not the case. It was dioxin, a byproduct of manufacture of 2,4,5-T that is of concern. Dioxin is persistent in the environment and in the human body, whereas the herbicides are not. In addition, there was a significant amount of dioxin in Agents Purple, Pink and Green, all of which contained 2, 4, 5 – T as well. However, we have less information on these compounds and they were used in lesser quantities.
2. The entry is on the verge of bias, at least. By use of the word “disputedly” in the second sentence there is at least an implication that the evidence of harm to exposed persons is in question. That is not the case, and the World Health Organization has identified dioxin as a “known human carcinogen”, and other organizations such as the US National Academy of Sciences has documented harmful effects to US Air Force personnel.
Aldol reaction

Britannica:

1. The aldol REACTION is not the same as the aldol CONDENSATION.
2. Sodium hydroxide is by no means the only base to be used in the aldol and acid catalysed aldol reactions also occur (usually with concomitant loss of water).
3. The reaction steps in the second reaction sequence should be equilibria up to the dehydration step.
4. In particular, there is no mention of the acid catalysed process and scant mention of related reactions

Wikipedia:

1. The mechanisms of base and acid catalysed aldol reactions should have every step as an equilibrium process
2. The acid catalysed process should include the dehydration step, which occurs spontaneously under acid conditions and, being effectively irreversible, pulls the equilibrium through to product.
3. The statement that LDA is avoided [if] at all possible as it is difficult to handle is rubbish. Organic chemists routinely use this reagent – which they either make as required or use commercially available material.
If you were to state the conclusion of the Nature survey accurately, then, the most you could say is something like this: “If you only look at scientific topics, if you ignore the structure and clarity of the writing, and if you treat all inaccuracies as equivalent, then you would still find that Wikipedia has about 32% more errors and omissions than Encyclopedia Britannica.” That’s hardly a ringing endorsement.
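For readers who want to check the arithmetic, the 32% figure and the per-entry averages follow directly from the totals in Nature’s supplementary information (162 errors for Wikipedia, 123 for Britannica, across the 42 expert responses). A quick sketch:

```python
# Figures from Nature's supplementary information
wikipedia_errors = 162
britannica_errors = 123
expert_responses = 42

# Wikipedia's excess error rate relative to Britannica
excess = (wikipedia_errors - britannica_errors) / britannica_errors
print(f"Wikipedia had {excess:.0%} more errors than Britannica")  # ~32%

# Per-entry averages, matching the article's "around four" and "about three"
print(f"Wikipedia: {wikipedia_errors / expert_responses:.1f} errors per entry")
print(f"Britannica: {britannica_errors / expert_responses:.1f} errors per entry")
```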
The problem with those who would like to use “open source” as a metaphor, stretching it to cover the production of encyclopedias, media, and other sorts of information, is that they tend to focus solely on the “community” aspect of the open-source-software model. They ignore the fact that above the programmer community is a carefully structured hierarchy, a group of talented individuals who play a critical oversight role in filtering the contributions of the community and ensuring the quality of the resulting code. Someone is in charge, and experts do count.
The open source model is not a democratic model. It is the combination of community and hierarchy that makes it work. Community without hierarchy means mediocrity.