In The Google Enigma, I argue that the key to understanding Google’s business is complements theory. Literally everything that happens on the internet is a complement to the company’s main search and ad-serving businesses. Any net activity not only provides a potential new opportunity to distribute ads; it also provides a new opportunity to collect data, and it’s the richness of Google’s data that underpins its success in both search and ad-serving (which, in a virtuous cycle, bring in even more data).
In an illuminating post today, Brad Burnham explains why the collection of data is of such fundamental importance to Google’s competitive strategy:
In economic terms data has an increasing marginal utility [while] most physical objects have a decreasing marginal utility. When it is raining my first umbrella keeps me dry, a second may be handy if the first blows out, but a third is unlikely to be used. This is true of shirts, steaks, houses, of almost anything you can think of except data. Data has the opposite characteristic. Each incremental point of data adds value to the ones you already have. It is easy to see this in the context of an advertising network. If the ad network knows that a user is female it can show more relevant ads. But if the ad network knows that female’s age, it can do even better, and data about location, household income, and recent web sites visited all add value to the existing data points, making it possible to show more and more relevant ads. Google’s services all benefit from additional data, albeit in different ways.
More than that, Google’s core profit-making activity – distributing tailored ads – benefits from all the data collected by all its other services. In many cases, one can hypothesize, Google can afford to operate services at a loss simply because the data it collects about users, their preferences, and their patterns of activity is so valuable to its core business.
“So what does all this mean about the market for web services?” asks Burnham. “It means that we all need to begin to think about the degree to which Google’s enormous data asset will allow it to dominate this important sector.” Because data has an increasing marginal utility, having access to more data can provide, over the long run, a profound competitive advantage across a wide swath of web businesses – particularly those supported by advertising. That kind of dominance can end up putting a damper on entrepreneurship and innovation. “The source of the threat here,” Burnham explains, “is a data differential. Google has so much more data at their fingertips that even if a startup does a much better job leveraging data to deliver recommendations, Google could potentially provide a better value proposition to the end user with an inferior algorithm powered by more data, sourced from a broader range of services.”
Whether it’s increasing returns to scale, the network effect, or the increasing marginal utility of data, the great decentralized net unleashes strong forces that promote the centralization and consolidation of information, profits, and control. The big get bigger, and then they get bigger still.
I agree with the conclusion of the article, but I’m not sure data always has increasing marginal utility. Sure, knowing she is female helps. Knowing she is single helps. Knowing her age helps.
Does knowing her favorite Golden Girls character help?
So do I. The statement from Brad Burnham suggests a possible antidote to data collection:
“The shift that unlocks another era of innovation will occur when users begin to understand their role in this ecosystem and have the tools at hand to direct what is now an unconscious contribution in a way that ensures continued innovation on their behalf.”
Alan
Alan, so far the average man seems to know less and less of his role in the variety of “ecosystems” he participates in (often unknowingly), and as these roles multiply there’s no reason to believe one would be more conscious of them.
Mr Burnham’s analogies are flawed (even specious). If you force me to take a second identical umbrella, I’ll force you to take a second piece of identical data — in this case the umbrella actually has more marginal utility than the second piece of identical (redundant) data.
A better analogy is umbrella, raincoat, boots. Or steak, wine, salad, dessert. These have fantastic marginal utility. At least as much as additional (different) data.
Along the lines of what Chris said, ask any investor: “more data” does not necessarily make you a better stock picker. Especially if the data is BS (or manipulated, or fake!). Same goes for advertising.
Of course, the people selling you the data want you to believe it does, and that reality does not suit someone who might be looking to unload one of his investments (Twitter, etc.) onto Google :)
Besides, almost everyone I know uses adblockers/filters anyway.
Even Google conceded to the opposite argument that more data does not have great marginal utility beyond 18 months or so. They announced that they would not be storing search results beyond a set period of time.
A quick read of Taleb’s book – Black Swan or Fooled by Randomness – would do the reader (and the blogger) much good.
-Anshu
I’m not sure data always has increasing marginal utility
Not every piece of data will necessarily provide added value, but in general data exhibits increasing marginal utility, as Google’s success clearly shows.
so far the average man seems to know less and less of his role in the variety of “ecosystems” he participates in (often unknowingly), and as these roles multiply there’s no reason to believe one would be more conscious of them.
Precisely correct. The theoretical “antidote” is an antidote to a problem that most people don’t consider a problem.
Mr Burnham’s analogies are flawed (even specious).
I don’t know. They seemed to me like a nice way to illustrate an economic concept in an easily understandable way.
almost everyone I know uses adblockers/filters anyway
You run with a rarefied crowd.
Even Google conceded to the opposite argument that more data does not have great marginal utility beyond 18 months or so. They announced that they would not be storing search results beyond a set period of time.
Google conceded no such thing. I believe it continues to store search terms indefinitely, but it did agree, in response to privacy concerns, to de-link the terms from some personal identifiers after 18-24 months. But saying that the marginal utility of search terms may decay over time is hardly the same as saying that collecting lots of data doesn’t help Google enhance the value of its services and compete successfully with rivals.
There is also something else going on here that is fundamentally wrong.
What Burnham is really saying – and as we’ve seen over and over with Google – is that they are not putting their customers first.
It appears that their every intention to provide a product (particularly a business application) is driven first and foremost by the data and information it can capture, rather than by a commitment to serving the customer’s best interest.
This is a strategy that will continue to haunt them and bite them.
Consumers are more likely to “accept” Google’s ulterior motives – but businesses are not as likely.
I agree with others that more data isn’t always better, and there are many interesting angles to this question.
More data of some sorts is probably only of increasing value to people who might do ill with it. Who other than the Stasi want to know exactly where you go and when, what sexual peccadilloes you have, etc.? (Okay, sex shops and escort services might want to know the latter.)
Nick: Google…continues to store search terms indefinitely, but it did agree, in response to privacy concerns, to de-link the terms from some personal identifiers after 18-24 months
Google’s action here concedes the big problem: that collection of vast amounts of detail is liable to cause a backlash, which is definitely not in the interest of Google, advertisers, or anyone else with innocent uses for that data. All it’s going to take is some scandal where, for example, a president is impeached because he was caught looking for three seconds at an Amazon page for “Perverts of the Caribbean,” to which he was sent by some malware that landed on his computer. Anyone who’s ever had their identity stolen or an error in their credit report knows a big downside to the collection and sale of personal data.
But this whole area is very interesting.
Forgot this point: Yahoo has suffered considerable embarrassment over ratting out Chinese dissidents, and this illustrates a thinning line between data gathered by companies like Google and Yahoo, and government databases.
Right on cue, the FBI has announced a big biometric database project:
http://news.yahoo.com/s/nm/20071222/tc_nm/fbi_biometrics_dc
Let’s have a show of hands for those who trust the FBI not to abuse personal data.
I’m not paranoid about stuff like this, but it becomes easier every day to understand people who are.
“Whether it’s increasing returns to scale, the network effect, or the increasing marginal utility of data, the great decentralized net unleashes strong forces that promote the centralization and consolidation of information, profits, and control. The big get bigger, and then they get bigger still.”
First, I disagree that the marginal utility of data always increases, so even if Google has (or had, or will have) more data than Yahoo or MSN, that may not be a competitive edge.
Second, I don’t need to buy the world’s books and keep them in my house to get smarter. The strategy of storing data in-house (by Google) goes somewhat against the model of the net – after all there is a reason we have URIs. Right? If one were to assume that storing data in-house is a competitive edge then shouldn’t the same apply to enterprises too – why should a bank outsource to a SaaS banking application vendor?
Third, there are other entities beyond the obvious few (Google, Yahoo!, Amazon, etc.) that have a lot of user-specific data. These include banks and credit card companies, healthcare providers, grocery stores with membership programs, telcos and so on. In fact, Telcos – if they were to ever get their act together – could easily mine the social graph.
In summary, I do agree with you that Google is in a pretty good position. But it is my view that it has more to do with its ability to mine the data than the quantity of the data. In today’s world, there is no dearth of data.
Nick
I am a bit surprised by this post and I find it quite irrelevant in the light of recent Google announcements. Why do you think that “data” is the ultimate “ad” weapon? There is one which is a lot simpler: how about having your personal adsistant? How about I tell and train Google and let it eavesdrop on what I do, so that I only get the kind of ads I am really interested in? The adsistant would work every time I access media: TV, phone, internet. Ideally, I would like it to work only when I ask for it, but that’s not going to fly.
Don’t you get it? Google now has a stronghold on each channel, so they can deliver a single ubiquitous adsistant. Do you think advertisers will spend their advertising dollars anywhere other than Google? With that kind of money, Google is going to buy shows to display on YouTube. Current TV networks are going to become content producers. The ad market is going to go away from them completely.
I am actually surprised that Apple has not entered the ad market. After all, they are one of the best potential competitors to Google, and if they don’t they will be marginalized (Apple TV, iPhone, and iTunes alike), because at the end of the day, I, as a consumer, view a personal adsistant as a killer app to help me live (and spend) better. I don’t see how “data” really plays a major role here. Sure, it is there to train the adsistant, but the name of the game is really about controlling channels. Microsoft, as usual, is about ten years behind.
JJ-
Anshu,
> In fact, Telcos – if they were to ever get their act
> together – could easily mine the social graph.
How can I put this? Cough, cough: they are. Big time. The only thing is, it’s so good that it’s really concerning, and they can’t advertise it too much. But if you have a marketing pitch for “I know who your friends are, let me lower your bills for all of you at once” that is not creepy, you are rich.
“Mr Burnham’s analogies are flawed (even specious). If you force me to take a second identical umbrella, I’ll force you to take a second piece of identical data — in this case the umbrella actually has more marginal utility than the second piece of identical (redundant) data.”
This is a fair point. I was comparing data as a class of things to specific items of clothing. The correct analogy is probably between data as a class and clothing as a class or between a specific data point and a specific item of clothing. It is true that a single data point probably has a decreasing marginal utility to a single user.
But I think dubdub is wrong to dismiss the idea that data has increasing marginal utility. Viewed as a class of things, clothing still has a decreasing marginal utility. You can only use so much of it. But data as a class seems to be increasingly valuable, at least to a point.
I acknowledge that it is too simplistic to say that clothing has a decreasing marginal utility and that data has an increasing marginal utility. If all you have is a shirt, then a pair of pants is useful. A second pair of pants may also be useful for fancy occasions, so both data and clothing initially have an increasing marginal utility. It is also probably true that the incremental value of an additional data point at some point begins to fall. The difference is the shape of the curves. It seems intuitively that data’s increasing marginal utility lasts longer and that, when it begins to decrease, the down slope is flatter than with clothing. (I realize that the idea of fashion really messes up clothing as an analogy and that the marginal utility of another pair of shoes may be different for different people, so don’t take any of this too seriously.)
But I remain convinced that data is different than clothing and other physical goods. Data interacts with other data in a way that clothing does not. In many contexts, one data point seems to add value to another data point, so that the sum of the two is more useful than either data point is by itself. It is this characteristic of data that may be the engine that powers network effects. Data is also a non-rival good, meaning that we can both possess it, which is not true of shirts. Finally, data can be re-used in more different contexts than physical goods. It may be some combination of these characteristics that leads to the notion that data has an increasing marginal utility.
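The claim that one data point can add value to another can be made concrete with a toy model. What follows is only a sketch under invented assumptions, not anything Google or Burnham describes: the two binary user attributes, the two ads "X" and "Y", and the click probabilities are all hypothetical, chosen so that each attribute is useless alone but valuable in combination. Marginal utility here means the gain in expected click rate from revealing one more attribute to the ad network.

```python
from fractions import Fraction
from itertools import product

# Hypothetical click probabilities: ad "X" suits users whose two attributes
# match, ad "Y" suits users whose attributes differ.
HIGH, LOW = Fraction(10, 100), Fraction(2, 100)

def click_prob(ad, user):
    a, b = user
    matches = (a == b)
    if ad == "X":
        return HIGH if matches else LOW
    return LOW if matches else HIGH

USERS = list(product([0, 1], repeat=2))  # four equally likely user types
ADS = ["X", "Y"]

def best_ctr(known):
    """Expected click rate when the network sees only the attributes whose
    indices are in `known` and shows each visible segment its best ad."""
    segments = {}
    for user in USERS:
        key = tuple(user[i] for i in known)
        segments.setdefault(key, []).append(user)
    total = Fraction(0)
    for members in segments.values():
        # Pick the single ad that collects the most clicks in this segment.
        total += max(sum(click_prob(ad, u) for u in members) for ad in ADS)
    return total / len(USERS)

ctr_none = best_ctr(())       # network knows nothing
ctr_one = best_ctr((0,))      # network knows the first attribute only
ctr_both = best_ctr((0, 1))   # network knows both attributes

gain_first = ctr_one - ctr_none
gain_second = ctr_both - ctr_one
print(ctr_none, ctr_one, ctr_both, gain_first, gain_second)
```

In this deliberately extreme setup the first attribute adds nothing and the second adds everything, which is the interaction effect in its purest form. Real attributes would usually carry some value on their own, and, as several commenters point out, redundant or low-quality data would add little or none.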
Data collection and the protection of (personal) data is an important issue. There are many government-funded agencies that handle reports of abuse of information, and filing a complaint does work. All readers from the Netherlands may file a report with this non-profit organisation.