The wisdom of a crowd is often in the eye of the beholder, but most of us understand that, at its most basic level, “crowd wisdom” refers to a fairly simple phenomenon: when you ask a whole bunch of random people a question that can be answered with a number (eg, what’s the population of Swaziland?) and then you add up all the answers and divide the sum by the number of people providing those answers – ie, calculate the average – you’ll frequently get a close approximation of the actual answer. Indeed, it’s often suggested, the crowd’s average answer tends to be more accurate than an estimate from an actual expert. As the science writer Jonah Lehrer put it in a column in the Wall Street Journal on Saturday:

The good news is that the wisdom of crowds exists. When groups of people are asked a difficult question – say, to estimate the number of marbles in a jar, or the murder rate of New York City – their mistakes tend to cancel each other out. As a result, the average answer is often surprisingly accurate.

To back this up, Lehrer points to a new study by a group of Swiss researchers:

The researchers gathered 144 Swiss college students, sat them in isolated cubicles, and then asked them to answer [six] questions, such as the number of new immigrants living in Zurich. In many instances, the crowd proved correct. When asked about those immigrants, for instance, the median guess of the students was 10,000. The answer was 10,067.

Neat, huh?

Except, well, it’s not quite that clear-cut. In fact, it’s not clear-cut at all. If you read the paper, you’ll find that the crowd did not “prove correct” in many instances. The only time the crowd proved even close to correct was in the particular instance cited by Lehrer – and that was only because Lehrer used the median answer rather than the mean. In most cases, the average answer provided by the crowd was wildly wrong.

Peter Freed, a neuroscience researcher at Columbia, let loose on Lehrer in a long, amusing blog post, arguing that he (Lehrer) had misread the evidence in the study. Freed pointed out that if you look at the crowd’s average answers – “average” as in “mean” – to the six questions the researchers posed, you’ll find that they are, as Freed says, “horrrrrrrrrrrrrendous”:

… the crowd was hundreds of percents – yes, hundreds of percents – off the mark. They were less than 100% off in response to only one out of the six questions! At their worst – to take a single value, as Lehrer wrongly did with the 0.7% [median] – the 144 Swiss students, as a true crowd (unlike the 0.7%), guessed that there had been 135,051 assaults in 2006 in Switzerland – in fact there had been 9,272 – an error of 1,356%.
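Freed’s headline figure is easy to check. A quick sketch using the two numbers quoted above (the crowd’s mean guess and the official assault count):

```python
# Reproducing Freed's error percentage from the figures quoted above.
mean_guess = 135_051  # crowd's arithmetic-mean estimate of 2006 Swiss assaults
truth = 9_272         # assaults officially registered in 2006

error_pct = (mean_guess - truth) / truth * 100
print(int(error_pct))  # → 1356, i.e. the crowd overshot by roughly 1,356%
```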

Or, as the researchers themselves report:

In our case, the arithmetic mean performs poorly, as we have validated by comparing its distance to the truth with the individual distances to the truth. In only 21.3% of the cases is the arithmetic mean closer to the truth than the individual first estimates.

So, far from providing evidence that supports the existence of the wisdom-of-crowds effect, the study actually suggests that the effect may not be real at all, or at least may be a much rarer phenomenon than we assume.

But since this is statistics, that’s by no means (no pun intended) the end of the story. As the researchers go on to explain, it’s quite natural for a crowd’s average answer, calculated as the mean, to be way too high – and hence ridiculously unwise. That’s because, while individuals’ underestimates for these kinds of questions are bounded at zero, there’s no upper bound to their overestimates. “In other words,” as the researchers write, “a minority of estimates are scattered in a fat right tail,” which ends up skewing the mean far beyond any semblance of “wisdom.”
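A toy illustration of that fat-tail effect (made-up guesses, not the study’s data): a single unbounded overestimate drags the arithmetic mean far from the truth while barely moving the median.

```python
import statistics

truth = 10_000
# Five plausible guesses plus one fat-tail overestimate. Underestimates are
# bounded at zero, but there is no ceiling on how high an overestimate can go.
guesses = [4_000, 7_000, 9_000, 12_000, 15_000, 500_000]

print(statistics.mean(guesses))    # → 91166.66..., wildly above the truth
print(statistics.median(guesses))  # → 10500.0, close to the truth
```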

Fortunately (or not), the arcane art of statistics allows you to correct for the crowd’s errors. By massaging the results – “tuning” them, as the researchers put it – you can effectively weed out the overestimates and (presto-chango) manufacture a wisdom-of-crowds effect. In this case, the researchers performed this magic by calculating the “geometric mean” of the group’s answers rather than the simple “arithmetic mean”:

As a large number of our subjects had problems choosing the right order of magnitude of their responses, they faced a problem of logarithmic nature. When using logarithms of estimates, the arithmetic mean is closer to the logarithm of the truth than the individuals’ estimates in 77.1% of the cases. This confirms that the geometric mean (i.e., exponential of the mean of the logarithmized data) is an accurate measure of the wisdom of crowds for our data.

Got that?
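The recipe in the quoted passage – log the estimates, take the arithmetic mean of the logs, then exponentiate – is only a few lines. A minimal sketch, with illustrative guesses rather than the study’s data:

```python
import math

def geometric_mean(estimates):
    # "Exponential of the mean of the logarithmized data," per the paper.
    logs = [math.log(x) for x in estimates]
    return math.exp(sum(logs) / len(logs))

# Guesses spread symmetrically in order of magnitude around 10,000:
guesses = [100.0, 1_000.0, 10_000.0, 100_000.0, 1_000_000.0]

print(round(geometric_mean(guesses)))  # → 10000, recovering the central value
print(sum(guesses) / len(guesses))     # → 222220.0, the arithmetic mean
```

Because the errors here are symmetric in order of magnitude rather than in absolute terms, the geometric mean lands on the central value while the arithmetic mean is dominated by the single million-scale guess.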

Well, it further turns out that the median answer – the centermost individual answer – in a big set of answers often replicates, roughly, the geometric mean. Again, that’s no big surprise. The median, like the geometric mean, serves to neutralize wildly wrong guesses. It hides the magnitude of people’s errors. The researchers point this fact out in their paper, but Freed, having criticized Lehrer for a sloppy reading of the study, seems to have overlooked that point. Which earns Freed a righteous tongue-lashing from another blogger, the physics professor Chad Orzel:

Freed’s proud ignorance of the underlying statistics completely undermines everything else. His core argument is that the “wisdom of crowds” effect is bunk because the arithmetic mean of the guesses is a lousy estimate of the real value. Which is not surprising, given the nature of the distribution – that’s why the authors prefer the geometric mean. He blasts Lehrer for using a median value as his example, without noting that the median values are generally pretty close to the geometric means – all but one are within 20% of the geometric mean – making the median a not-too-bad (and much easier to explain) characterization of the distribution.
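Orzel’s observation is easy to reproduce on toy data: when guesses scatter multiplicatively, the median and the geometric mean land close together, since neither cares how far out the extreme guesses sit. (Illustrative numbers only, not the study’s.)

```python
import math
import statistics

# Guesses scattered multiplicatively, with a long right tail:
guesses = [1_000.0, 3_000.0, 9_000.0, 30_000.0, 120_000.0]

geo = math.exp(statistics.mean(math.log(g) for g in guesses))
med = statistics.median(guesses)

# The median sits within 20% of the geometric mean, as Orzel notes
# holds for all but one of the study's questions.
print(abs(med - geo) / geo < 0.2)  # → True
```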

You get the sense that this could go on forever. And I sort of hope it does, because I enjoyed Lehrer’s original column (the main point of which, by the way, was that the more a crowd socializes the less “wise” it becomes), and I enjoyed Freed’s vigorous debunking of Lehrer’s reading of (one part of) the study, and I also enjoyed Orzel’s equally vigorous debunking of (one part of) Freed’s debunking.

But beyond the points and counterpoints, there is a big picture here, and it can be described this way: Even in its most basic expression, the wisdom-of-crowds effect seems to be exaggerated. In many cases, including the ones covered by the Swiss researchers, it’s only by using a statistical trick that you can nudge a crowd’s responses toward accuracy. By looking at the geometric mean rather than the simple arithmetic mean, the researchers performed the statistical equivalent of cosmetic surgery on the crowd: they snipped away those responses that didn’t fit the theoretical wisdom-of-crowds effect that they wanted to display. As soon as you start massaging the answers of a crowd in a way that gives more weight to some answers and less weight to other answers, you’re no longer dealing with a true crowd, a real writhing mass of humanity. You’re dealing with a statistical fiction. You’re dealing, in other words, not with the wisdom of crowds, but with the wisdom of statisticians. There’s absolutely nothing wrong with that – from a purely statistical perspective, it’s the right thing to do – but you shouldn’t then pretend that you’re documenting a real-world phenomenon.

Freed gets at this point in a comment he makes on Orzel’s post:

Statistics’ dislike of long right tails is *not a scientific position.* It is an aesthetic position that, at least personally, I find robs us of a great deal of psychological richness … [T]o understand the behavior of a crowd – a real world crowd, not a group of prisoners in segregation – or of society in general, right tails matter, and extreme opinions are over-weighted.

The next time somebody tells you about a wisdom-of-crowds effect, make sure you ask them whether they’re talking about a real crowd or a statistically enhanced crowd.

Dliman

Also see this recent paper in the Proceedings of the National Academy of Sciences: “How social influence can undermine the wisdom of crowd effect.” Crowd estimates work best when members don’t compare notes!

Michael Johnston

The prevailing notion (the wisdom of the crowd) about The Wisdom of Crowds, therefore, is itself wrong. Toss in a little bit of the herd mentality and you have the perfect explanation for the mortgage mess and a lot of other things that the crowd should have avoided in their infinite collective wisdom.

Nick Carr

Dliman, That’s actually the same paper discussed in this post.

Luciano Fuentes

There’s also a kind of meta-proof in the nature of the exchanges.

When the patient response of an intelligent, well-intentioned neuroscientist is itself challenged by an equally intelligent response from the physicist, does it not highlight the tremendous weight you must attach to context and subtlety when trying to answer any question?

These two are quibbling about the nature and interpretation of statistical distributions after all – something I’ve never seen presented in any text as a starting-point for debate.

Constance Campana

Dr. Martin Luther King said in his famous essay “Letter From Birmingham Jail” that “groups are more immoral than individuals.” I don’t trust crowds or myself in one.

Seth Finkelstein

> it’s only by using a statistical trick that you can nudge a crowd’s responses toward accuracy.

Sigh. Nick. I like this post in terms of the general tenor, but sometimes a humanities background is just not the best, well, let’s say there’s probably some neurological area of the brain that isn’t developed. As opposed to an area which is strengthened by studying mathematics.

I’ve always wanted to debunk the “wisdom of crowds”, because it’s wrong for many reasons. One reason you’re approaching here is that one has to already know what the correct aggregation procedure is in the first place. If you know that, you’ve already solved the problem, so there’s basically no benefit. The wisdom isn’t in the crowds – they’re just dumb data for the processing. The wisdom is in the aggregation procedure (this is why Google works!).

But nothing says the aggregation procedure is a simple average. That’s just a story that’s told, and it’s not even true. Talking about “ask them whether they’re talking about a real crowd or a statistically enhanced crowd” is almost nonsensical – it’s basically “ask them whether they mean a simple average or more complicated analysis”. Which is not really the point.

Marajit.wordpress.com

The wisdom of the crowd – according to Jonah Lehrer. Another case of mis-choosing your prophets. But then Lehrer is a journalist. He seems motivated by the same thing you are Mr Carr. Though while you and a few others point out the inadequacies as independents, he has a voice on a very influential newspaper. Probably his views reach far more of the crowd, and in a democracy that’s as far as it goes. We’re taught to vote, not understand. The result? People read the headlines not the rough type. “The Wall Street Journal says I’m wise.” And the eccentric critics? They show no common ground that would inspire faith. Blind faith is alive and well in the land of the secular.

Nick Carr

Sigh. Seth, we seem to be in raging agreement, but perhaps your lack of a humanities background prevents you from seeing that. :-)

At a purely statistical level, I fully appreciate that “nothing says the aggregation procedure is a simple average.” If your goal is to take a distribution of estimates with a fat right tail and derive the most accurate estimate from that distribution, then of course you’re going to apply a geometric mean, or a median, rather than a simple average. The point is that after you’ve done that, you shouldn’t say that what you’ve found is “the wisdom of a crowd.” What you’ve found is a statistical means of removing the “crowdiness” from a crowd (ie, its tendency to err on the high side). As you say, the wisdom from this exercise comes not from the crowd; it comes from the aggregation procedure. As I say, it’s the wisdom of statisticians, not the wisdom of crowds. As Freed says, you’ve left the real world for the world of statistics (so don’t pretend you’re documenting a real world phenomenon). The three points are the same point, and it is indeed “really the point.”

To make this as clear as possible (even for math majors), I’ve added a sentence to the end of the penultimate paragraph.

Nick

Cw

The problem is that data which has not been statistically sorted is all but useless. Crowds can include people with IQs below 70, children, mass murderers, the insane… need I go on? Statistical sorting and elimination of the outer bounds is a fundamental piece of statistics and, so long as it is done well (i.e. there is a legitimate outer bound), it can be useful.

Of course, this is why the median is preferable in many circumstances — averages, even with the elimination of outliers, are often deceptive.

Nick Carr

Cw,

Exactly. In order to be rendered “wise,” a crowd needs to be statistically purified of its weakest members. At which point, of course, it is no longer a crowd.

Nick Carr

By the way, here are the six questions asked in the study:

1. What is the population density in Switzerland in inhabitants per square kilometer?

2. What is the length of the border between Switzerland and Italy in kilometers?

3. How many new inhabitants did Zurich gain in 2006?

4. How many murders were officially registered in Switzerland in 2006?

5. How many rapes were officially registered in Switzerland in 2006?

6. How many assaults were officially registered in Switzerland in 2006?

If you went out and posed these questions to experts in the relevant disciplines – Swiss demographics, European geography, and Swiss crime – I’m going to guess the experts would kick the crowd’s collective ass, using whatever statistical measure you want to apply to the crowd’s answers (arithmetic mean, geometric mean, or median). Anybody want to disagree with that?

In fact, if anybody knows such experts, please ask them the questions, record their answers and post them here in a comment. (Don’t let them look up the answer – they have to draw on their own expertise.) Thanks.

Philip Klop

Nick,

It’s still a real-world phenomenon, and not just because statisticians and their tricks are as much part of the real world as you and I ;)

The basic question underlying all this is whether an anonymous “crowd” dataset can yield more accurate estimates than those made by individual “experts”. And the answer seems to be “yes”!

A smart statistician who wants to make a prediction and has no prior information about the problem should prefer the crowd data to an expert’s opinion. What particular method he decides to use to extract this information from the dataset has nothing to do with the fundamental question. He is free to use whatever tool he chooses. There’s no prescription that he can only use the mean, or the median, or…

You seem to be saying that “if you filter the anonymous dataset it’s no longer a crowd”, but it is! The extremes are extreme only relative to the other answers, no crowd member is excluded a priori. If filtering out the extremes of the dataset yields a more accurate prediction, then that’s what any rational person should do. The input is a crowd dataset, the output should be a number, regardless of the method one uses. At which point in the procedure the crowd loses its “crowdness” is completely irrelevant (a dataset is not a crowd to begin with; “the wisdom of crowds” is just a catchy label), all that matters is the accuracy of the final answer.

Nick Carr

Philip,

> At which point in the procedure the crowd loses its “crowdness” is completely irrelevant (a dataset is not a crowd to begin with; “the wisdom of crowds” is just a catchy label), all that matters is the accuracy of the final answer.

Right. I totally agree. But (if I may repeat myself again) what is irrelevant to the statistical problem is quite relevant to the ideological aura that surrounds discussions of “the wisdom of the crowd.” (The phrase has been sold as more than a meaningless “catchy label,” even though that’s how you see it.) If, as you say, you have to remove the “crowdness” from the crowd in order to find the “wisdom of the crowd,” then you kind of have a semantic problem, no? You can’t take a crowd of people, turn it into a crowd of numbers, massage the numbers to get an optimal result, and then go back and say, “See, the crowd of people is wise!” That’s cheating.

Is there anyone else out there who can explain the point I’m trying to make? Because I seem to be failing miserably.

Seth Finkelstein

> Is there anyone else out there who can explain the point I’m trying to make? Because I seem to be failing miserably.

Nick, the point as I see it is the deliberate mystification of data-processing as some sort of populist, ANTI-EXPERT, new discovery. This is partially fueled by the reality that new large masses of data and new computational applications have yielded some amazing stuff (e.g. Google). But that’s also spawned an inevitable hucksterism and con-jobs which are trying to sell it for their own ends in business and politics.

See a blog post I wrote a while back:

“Wikipedia, De-skilling, and The Wisdom Of Darts”

http://sethf.com/infothought/blog/archives/001017.html

“There’s a well-known experiment in picking stocks: dartboards are competitive with individual money managers – but nobody talks about the “wisdom of darts” (because there are no DartBoard 2.0 salesmen …).”

However, I think you’re “failing miserably” because you’re getting caught up in focusing on the details of data-processing itself – claiming, as I see it, that if every single data point is not used, that’s somehow wrong or cheating. Well, no, excluding outliers is quite reasonable sometimes – it can be cheating, but it’s not so _per se_. There’s an entire statistical literature about when it’s justified or not. This is what I’m talking about with the humanities/mathematics divide.

In essence, you’re getting the right answer for some of the wrong reasons. There’s a math joke about this process:

64 / 16 = 4 (imagine slashes through the sixes – you “cancel” them, leaving 4/1)

Yes, 64/16 is equal to four, but “cancel the sixes” is not how you do it.

That’s sort of like this statement of yours above:

“As Freed says, you’ve left the real world for the world of statistics (so don’t pretend you’re documenting a real world phenomenon)”

Statistics *is* the real world, that’s the whole point of it. You’re trying to make a social point, but at times you’re phrasing it, wrongly, as some sort of technological, methodological objection.

Seth Finkelstein

Fixed link (if anyone is still reading …)

Wikipedia, De-skilling, and The Wisdom Of Darts

And note it’s mentioned in

Junk science – the oil of the new web

Philip Klop

Nick,

The problem demands that the crowd be somehow reduced to a single number, thus the “crowdness” is inevitably lost, independent of the particulars of the reduction. This is not cheating, but the very rule according to which the game is to be played.

As to your larger point, let me draw the following analogy. Is the wisdom of “The Shallows” contained in the book itself, in the brain of its author, or in that of the educated reader? Outside the context of a philosophical debate, you can’t reasonably take issue with someone taking any of those positions (as long as he does so non-exclusively). Similarly, in the context of an informal debate (including all scholarly debates except those dealing specifically with the nature of information), it’s fair to say that the wisdom is contained in the crowd (the collective author), the dataset (the book) and the statistician (the educated reader).

Philip Klop

Seth,

Nobody talks about the wisdom of darts because the implication is supposed to be that money managers are charlatans (they perform no better than “random”), not that darts are wise. If darts performed better than random (something crowds certainly do!), they could become promising contenders for the predicate “wise”.

I think context is pivotal in determining how effective a “wisdom of the crowd” approach will be: it can be successful when you can reasonably expect a collective to have more relevant extractable information than a single expert (people are never just guessing; they are always extrapolating from whatever information they have), but it is a waste of time otherwise. In a sense, it’s similar to a Monte Carlo method, but with lots of potential for difficult-to-detect bias.

Seth Finkelstein

Philip,

Note my following remark – “because there are no DartBoard 2.0 salesmen”. Imagine the possible line: “You, small investor, can benefit from our DARTBOARD 2.0. You don’t need those elite priest money managers, who think they’re so smart, because they studied finance and accounting. With the DARTBOARD 2.0 kit we’ll sell you for a small portion of their fees, you can do even better, with the magic of the wisdom of darts.” (The ironic thing is that this pitch is actually true :-) ).

What I’m trying to express is that there are reasons behind all this. It’s tied into a longstanding campaign to denigrate expertise and replace it with marketing and propaganda. It’s not new – pointy-headed intellectuals, the Common Man archetype, etc. But it’s worth analyzing this particular formation. Also, I fear it’s disturbingly successful, I think due to connections to wider successful trends against scholarship, replacing it with partisan advocacy.

Every confidence hustler tells the mark how wise the sucker supposedly is, and this con is no exception.

Jake Hammer

I just want to quibble with this:

“Statistics *is* the real world, that’s the whole point of it.”

If you’re appealing to the same senses of the terms here as NC, this is false. ‘Statistics’ is not ‘the real world’. An intuition pump from a clearer case: I can (a) drop a ball off the roof of my building, and then (b) analyze this phenomenon as an energy transformation problem. But a != b. To claim that the energy transformation problem “is” the ball dropping from the height is, to be charitable, naive. Likewise for the converse.

Nick Carr

Right. The map is not the territory.

So, anyway, I’ve been pondering this some more, with a view to understanding what conception of “the crowd” (the actual people making actual individual judgments) each of the statistical averaging techniques is expressing.

Sir Francis Galton, back in his original 1907 article on estimating the weight of an ox, preferred the median because he thought it best expressed the judgment of the crowd as a democratic entity. He wrote: “According to the democratic principle of ‘one vote one value,’ the middlemost estimate expresses the vox populi, every other estimate being condemned as too low or too high by a majority of the voters.” So the median represents the “wisdom” of the crowd not as a group of individuals but as a democratic polity – extreme views are weeded out. (Indeed, Galton used the median even though the mean actually provided a more accurate estimate in this particular case.)

James Surowiecki, at the beginning of his book The Wisdom of Crowds, suggests that the mean is the right measure for “the collective wisdom” of the crowd. He writes, referring to Galton’s experiment: “If the crowd were a single person, that [ie, the mean] was how much it would have guessed the ox weighed.” The mean treats the crowd not as a democratic polity but as a collective of individuals, each of whose judgments receives an equal weight. There’s no attempt to weed out extreme judgments because they are every bit as much a part of the crowd’s “collective wisdom” as more accurate judgments. I would argue that when most of us think about the wisdom-of-crowds effect, we’re thinking in Surowiecki’s terms – and hence we’re thinking of the simple arithmetic mean.

OK. So I understand the conceptions of the crowd implicit in both median (polity) and mean (collective wisdom). I’m not sure I understand the conception of the crowd – again, I emphasize I’m talking about the human crowd – implicit in the geometric mean. If someone might enlighten me, I would appreciate it. What version of the real world is the geometric mean giving us?

Seth Finkelstein

Jake – As I hoped was clear from context, I meant “Statistics *is* the real world, that’s the whole point of it” in reply to “As Freed says, you’ve left the real world for the world of statistics (so don’t pretend you’re documenting a real world phenomenon)”. That is, the whole point of statistics is to document real-world phenomena – how to make sense of a collection that otherwise would have no meaning. The idea that the arithmetic mean of all data points is somehow not statistics and hence “real world”, but any more sophisticated analysis is somehow “statistics” and hence less real, seems to be just because the former is simple but the latter is less familiar.

(It reminds me that in the history of mathematics, people first had a lot of trouble with zero, and then with negative numbers – these weren’t considered “real world” when they came into widespread use. But now even humanities majors consider them among the things an educated person should understand :-)).

Nick, when you say “James Surowiecki, at the beginning of his book The Wisdom of Crowds, suggests that the mean is the right measure for “the collective wisdom” of the crowd.” – NO NO NO. I don’t think he meant AT ALL any formulation like “In every case, in every situation, “collective wisdom” is defined to be the arithmetic mean”. You’re misreading a single example as a general statement. If you “would argue that when most of us think about the wisdom-of-crowds effect, we’re thinking in Surowiecki’s terms – and hence we’re thinking of the simple arithmetic mean.”, I’d say that’s not Surowiecki’s terms, and it’s a straw-man oversimplification of the idea of data-processing/aggregation.

“I’m not sure I understand the conception of the crowd – again, I emphasize I’m talking about the human crowd – implicit in the geometric mean. If someone might enlighten me, I would appreciate it. What version of the real world is the geometric mean giving us?”

This is getting a bit caught up in metaphor. In politics, we talk about “the center between left and right”, but there’s no literal left and right, and hence no center (and what is “distance”?). We recognize that’s just a handy terminology (and tedious people sometimes belabor the point that it doesn’t work everywhere). If you’re drawing societal-decision implications from the arithmetic mean and the median, then the geometric mean is, I don’t know, some sort of proportional voting scheme where larger entities get more votes than smaller ones, but done with limiting factors to keep the relative power somewhat balanced. There’s probably a better analogy, but I can’t think of it now.

Nick Carr

Seth, Your reformulation of my formulation of Surowiecki’s point is quite a distortion. Here’s what Surowiecki writes:

Note that Surowiecki, who is obviously thoroughly familiar with Galton’s work, chooses to use the mean rather than following Galton himself in choosing to use the median. (Galton never even mentioned the mean in the original paper, so far as I can see, only in a follow-up letter.) Surowiecki is not going against Galton for no reason. He has thought about it, and he has explicitly decided to use the mean rather than the median (as I too would have done, by the way). Surowiecki is here, as I wrote, “suggest[ing] that the mean is the right measure for ‘the collective wisdom’ of the crowd.” Please note my use of the verb “suggest.” I am not saying, in your words, “In every case, in every situation, ‘collective wisdom’ is defined to be the arithmetic mean.” As we both know, Surowiecki goes on in the book to discuss a variety of manifestations of “the wisdom of crowds” as well as the knuckleheadedness of crowds.

Anyway, the point of my goddam comment (here he goes again) is that an intelligent person, looking to represent a particular real-world conception of a group of human beings, might choose either the median or the arithmetic mean as a statistical technique. But what conception of a real-world crowd is the geometric mean attempting to represent? Seth’s game, but labored, explanation doesn’t work for me because it’s an attempt to *retrofit* a statistical technique chosen purely for its statistical efficacy back to a real-world crowd. That’s different from choosing the geometric mean with the intent of representing some conception of a crowd.

Nick

Zjelveh

Hi Nick,

I have to respectfully disagree with you here.

There are many measures of central tendency and the arithmetic average is just one of them.

It happens that when the underlying distribution is symmetric (as in the Galton example above), the mean does a good job of capturing the average (as does the median).

But when we’re dealing with non-symmetric and skewed distributions, then the arithmetic mean loses some of its usefulness. For example, in the case of U.S. income, the mean income is significantly higher than median income. Which is the better measure of the average? (BTW, the dictionary definition of ‘average’ is NOT the arithmetic mean.)

And on page 3 of the paper you mention, the authors note that their data is not normally distributed. I try to explain a bit why the geometric mean might make more sense in a post here.

I’m not a Nassim Taleb disciple, but one of the nice things I’ve heard him say is that we have to become better at distinguishing processes that are generated from a normal distribution versus those generated from other types of distributions.

Zubin

Nick Carr

Zubin,

You’re not disagreeing with me, respectfully or otherwise. I agree with everything you write. I understand why the geometric mean is useful in the statistical analysis of right-skewed (or left-skewed) distributions. My point is that the skewness of the distribution tells us something important about the wisdom of the crowd (and its limits).

The skewness is not just a statistical phenomenon. It is also a human phenomenon.

Thanks, though, for the comment,

Nick

Zjelveh

As you say, this is an argument that could go on forever.

Rereading your original post and your recent comment, it seems that your main contention is that the researchers appear to have first looked at their results through the arithmetic mean, didn’t see a wisdom of crowds, and then tried another method to see if they could pull out a desired effect. As you likely know, this is not at all an unusual workflow for the typical researcher. But it’s also frowned upon by most hardcore statisticians, and I am sympathetic to that. It’s the negative connotation that used to be associated with ‘data mining’.

But, it doesn’t necessarily follow from this that we shouldn’t look for wisdom-of-crowd effects through the geometric mean or whatever else might work.

If there is a theory to support it — as I believe there might be here based on the observed distribution and properties of the geo. mean — then why not?