
Google in the middle

Three truths:

1. Google is a middleman made of software. It’s a very, very large middleman made of software. Think of what Goliath or the Cyclops or Godzilla would look like if they were made of software. That’s Google.

2. The middleman acts in the middleman’s interest.

3. The broader the span of the middleman’s control over the exchanges that take place in a market, the greater the middleman’s power and the lesser the power of the suppliers.

For much of the first decade of the Web’s existence, we were told that the Web, by efficiently connecting buyer and seller, or provider and user, would destroy middlemen. Middlemen were friction, and the Web was a friction-removing machine.

We were misinformed. The Web didn’t kill mediators. It made them stronger. The way a company makes big money on the Web is by skimming little bits of money off a huge number of transactions, with each click counting as a transaction. (Think trillions of transactions.) The reality of the web is hypermediation, and Google, with its search and search-ad monopolies, is the king of the hypermediators.

Which brings us to everybody’s favorite business: the news. Newspapers, or news syndicators like the Associated Press, bemoan the power of the middlemen, or aggregators, to get between them and their readers. They particularly bemoan the power of Google, because Google wields, by far, the greatest power. The editor of the Wall Street Journal, Robert Thomson, calls Google a “tapeworm.” His boss, Rupert Murdoch, says Google is engaged in “stealing copyrights.”

Others see Thomson and Murdoch as hypocritical crybabies. To them, Google is the good guy, the benevolent middleman that fairly parcels out traffic, by the trillions of page views, to a multitude of hungry web sites. It’s the mommy bird dropping little worm fragments into the mouths of all the baby birds. Scott Rosenberg points out that Google makes it simple for newspapers or any other site operators to opt out of its general search engine and all of its subsidiary search services, including Google News. “Participation in Google is voluntary,” he writes. Yet no one opts out. Participation is not only voluntary but “is also pretty much universal, because of the benefits. When users are seeking what you have, it’s good to be found.”

Rosenberg is correct, but he misses, or chooses not to acknowledge, the larger point. When a middleman controls a market, the supplier has no real choice but to work with the middleman – even if the middleman makes it impossible for the supplier to make money. Given the choice, most people will choose to die of a slow wasting disease rather than to have their head blown off with a bazooka. But that doesn’t mean that dying of a slow wasting disease is pleasant.

As Tom Slee explains, Google’s role as the dominant middleman in the digital content business resembles Wal-Mart’s role as the dominant middleman in the consumer products business. Because of the vastness of Wal-Mart’s market share, consumer goods companies have little choice but to sell their wares through the retailing giant, even if the retailing giant squeezes their profit margin to zilch. It’s called leverage: Play by our rules, or die.

Sometimes “voluntary” isn’t really “voluntary.”

When it comes to Google and other aggregators, newspapers face a sort of prisoner’s dilemma. If one of them escapes, its competitors will pick up the traffic it loses. But if all of them stay, none of them will ever get enough traffic to make sufficient money. So they all stay in the prison, occasionally yelling insults at their jailer through the bars on the door.
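That structure is worth making concrete. Below is a toy payoff table in Python; every number in it is invented for illustration, chosen only so that staying in Google’s index is the better move for each paper no matter what its rival does, even though all of them staying leaves everyone poorer than if all of them left.

    # Hypothetical payoffs for two competing papers deciding whether to stay in
    # or leave Google's index. The numbers are invented; only the structure matters.
    payoffs = {
        # (my choice, rival's choice): (my payoff, rival's payoff)
        ("stay", "stay"):   (1, 1),   # everyone scraps over fragmented traffic
        ("stay", "leave"):  (4, 0),   # the paper that stays inherits the defector's traffic
        ("leave", "stay"):  (0, 4),
        ("leave", "leave"): (3, 3),   # scarcity returns; both can charge readers
    }

    for my_choice in ("stay", "leave"):
        for rival_choice in ("stay", "leave"):
            mine, _ = payoffs[(my_choice, rival_choice)]
            print(f"I {my_choice:5} / rival {rival_choice:5} -> my payoff {mine}")

    # Whatever the rival does, "stay" pays more than "leave", so both papers stay,
    # and both end up at (1, 1) instead of the (3, 3) they'd get by leaving together.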

None of this, by the way, should be taken as criticism of Google. Google is simply pursuing its own interests – those interests just happen to be very different from the interests of the news companies. What Google can, and should, be criticized for is its disingenuousness. In an official response to the recent criticism of its control over news-seeking traffic, Google rolled out one of its lawyers, who put on his happy face and wrote: “Users like me are sent from different Google sites to newspaper websites at a rate of more than a billion clicks per month. These clicks go to news publishers large and small, domestic and international – day and night. And once a reader is on the newspaper’s site, we work hard to help them earn revenue. Our AdSense program pays out millions of dollars to newspapers that place ads on their sites.”

Wow. “A billion clicks.” “Millions of dollars.” Such big numbers. What Google doesn’t mention is that the billions of clicks and the millions of ad dollars are so fragmented among so many thousands of sites that no one site earns enough to have a decent online business. Where the real money ends up is at the one point in the system where traffic is concentrated: the Google search engine. Google’s overriding interest is to (a) maximize the amount and velocity of the traffic flowing through the web and (b) ensure that as large a percentage of that traffic as possible goes through its search engine and is exposed to its ads. One of the most important ways it accomplishes that goal is to promote the distribution of as much free content as possible through as many sites as possible on the web. For Google, any concentration of traffic at content sites is anathema; it would represent a shift of power from the middleman to the supplier. Google wants to keep that traffic fragmented. The suppliers of news have precisely the opposite goal.

Take a look at the top topic on Google News right now:

googlenews.jpg

Look, in particular, at the number of stories on this topic that Google already has in its database: 11,264. That’s a staggeringly large number. To Google, it’s a beautiful number. To the 11,264 news sites competing for a measly little page view, and the infinitesimal fraction of a penny the view represents, it’s death.
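To see why a number like that is death, run the arithmetic. Only the 11,264 figure comes from the screenshot above; the traffic and ad-rate figures below are invented assumptions, there purely to show how thinly the money spreads.

    # Back-of-the-envelope sketch of the fragmentation argument.
    competing_stories = 11_264        # stories Google News lists for one topic (from the screenshot)
    topic_page_views = 2_000_000      # hypothetical: total clicks the topic generates
    revenue_per_1000_views = 5.00     # hypothetical: $5 of ad revenue per thousand views

    views_per_story = topic_page_views / competing_stories
    revenue_per_story = views_per_story / 1000 * revenue_per_1000_views

    print(f"{views_per_story:,.0f} views per story")            # roughly 178
    print(f"${revenue_per_story:.2f} in ad revenue per story")  # roughly $0.89

Even a topic generating millions of clicks is worth less than a dollar to any one of the sites chasing it, while every one of those clicks passes through, and is monetized by, the same search box.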

As I’ve written before, the essential problem facing the online news business is oversupply. The cure isn’t pretty. It requires, first, a massive reduction of production capacity – ie, the consolidation or disappearance of lots of news outlets. Second, and dependent on that reduction of production capacity, it requires news organizations to begin to impose controls on their content. By that, I don’t mean preventing bloggers from posting fair-use snippets of articles. I mean curbing the rampant syndication, authorized or not, of full-text articles. Syndication makes sense when articles remain on the paper they were printed on. It doesn’t make sense when articles float freely across the global web. (Take note, AP.)

Once the news business reduces supply, it can begin to consolidate traffic, which in turn consolidates ad revenues and, not least, opens opportunities to charge subscription fees of one sort or another – opportunities that today, given the structure of the industry, seem impossible. With less supply, the supplier gains market power at the expense of the middleman.

The fundamental problem facing the news business today does not lie in Google’s search engine. It lies in the structure of the news business itself.

U. of Phoenix nixes Twitter U.

Wendy Paul, executive director of public relations for the University of Phoenix, offers an official response to my April Fools post: “University of Phoenix is not going to deliver courses via Twitter. With the limited characters you can post on Twitter, this wouldn’t be a feasible platform for a robust and quality academic curriculum.”

Typical ivory-tower elitist.

Google lifts its skirts

Yesterday was a remarkable day for the small, slightly obsessed band of Google data-center watchers, of whom I am one. Around each of the company’s sprawling server farms is a high metal fence patrolled by a particularly devoted squad of rent-a-cops, who may or may not be cyborgian in nature. Ordinary humans seeking a peek at the farms have been required to stand at the fence and gaze at the serene exteriors of the buildings, perhaps admiring the way the eponymous clouds of steam rise off the cooling towers in the morning:

steam.jpg

[photo by Toshihiko Katsuda]

Everything inside the buildings was left to the imagination.

No more. Yesterday, without warning, Google lifted its skirts and showed off its crown jewels. (I think you may need to be Scottish to appreciate that rather grotesquely mixed metaphor.) At the company’s Data Center Energy Summit, it showed a video of the computer-packed shipping containers that it confirmed are the building blocks of its centers (proving that Robert X. Cringely was on the money after all), provided all sorts of details about the centers’ operations, and, most shocking of all, showed off one of its legendary homemade servers.

When Rich Miller, of Data Center Knowledge fame, posted a spookily quiet video of the server yesterday – the video looks like a Blair Witch Project outtake – I initially thought it was an April Fools joke.

But then I saw some sketchy notes about the conference that Amazon data-center whiz James Hamilton had posted on his blog, and it started to become clear that it was no joke:

Containers Based Data Center

· Speaker: Jimmy Clidaras

· 45 containers (222 kW each / max is 250 kW – 780 W/sq ft)

· Showed pictures of containerized data centers

· 300×250’ of container hangar

· 10MW facility

· Water side economizer

· Chiller bypass …

The server pictured in Miller’s video was the real deal – down to the ingeniously bolted-on battery that allows short-term power backup to be distributed among individual servers rather than centralized in big UPS stacks, as is the norm in data-center design.

Now, CNET’s Stephen Shankland provides a further run-down of the Google disclosures, complete with a diagram of the container-based centers and close-up shots of those idiosyncratic servers, the design of which, said Googler Ben Jai, was “our Manhattan Project.”

GoogleServer.jpg

[photo by Stephen Shankland]

I was particularly surprised to learn that Google rented all its data-center space until 2005, when it built its first center. That implies that The Dalles, Oregon, plant (shown in the photo at the top of this post) was the company’s first official data smelter. Each of Google’s containers holds 1,160 servers, and the facility’s original server building had 45 containers, which means that it probably was running a total of around 52,000 servers. Since The Dalles plant has three server buildings, that means – and here I’m drawing a speculative conclusion – that it might be running around 150,000 servers altogether.
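Here, for what it’s worth, is the arithmetic behind those estimates, using the figures from the conference notes quoted above. The three-building extrapolation is my speculation, not anything Google disclosed.

    servers_per_container = 1_160
    containers_per_building = 45
    server_buildings = 3                       # The Dalles reportedly has three

    first_building = servers_per_container * containers_per_building
    print(first_building)                      # 52,200 - "around 52,000"
    print(first_building * server_buildings)   # 156,600 - "around 150,000"

    # Sanity checks on the density figures from the notes:
    container_power_watts = 250_000            # max per container
    watts_per_sq_ft = 780
    print(container_power_watts / watts_per_sq_ft)        # ~320 sq ft, the footprint
                                                          # of a standard 40 x 8 ft container
    print(container_power_watts / servers_per_container)  # ~215 watts per server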

Here are some more details, from Rich Miller’s report:

The Google facility features a “container hangar” filled with 45 containers, with some housed on a second-story balcony. Each shipping container can hold up to 1,160 servers, and uses 250 kilowatts of power, giving the container a power density of more than 780 watts per square foot. Google’s design allows the containers to operate at a temperature of 81 degrees in the hot aisle. Those specs are seen in some advanced designs today, but were rare indeed in 2005 when the facility was built.

Google’s design focused on “power above, water below,” according to [Jimmy] Clidaras, and the racks are actually suspended from the ceiling of the container. The below-floor cooling is pumped into the hot aisle through a raised floor, passes through the racks and is returned via a plenum behind the racks. The cooling fans are variable speed and tightly managed, allowing the fans to run at the lowest speed required to cool the rack at that moment …

[Urs] Holzle said today that Google opted for containers from the start, beginning its prototype work in 2003. At the time, Google housed all of its servers in third-party data centers. “Once we saw that the commercial data center market was going to dry up, it was a natural step to ask whether we should build one,” said Holzle.

I have to confess that I suddenly feel kind of empty. One never fully appreciates the pleasure of a good mystery until it’s uncloaked.

UPDATE: In an illuminating follow-up post, James Hamilton notes that both the data-center design and the server that Google showed off at the meeting are likely several generations behind what Google is doing today. So it looks like the mystery remains at least partially cloaked.

Twitter U.

Realtime is going to college. The University of Phoenix, having pioneered web-based learning and built one of the largest “virtual campuses” in Second Life, is now looking to become the dominant higher-education institution on Twitter. The biggest for-profit university in the world, UoP will roll out this fall a curriculum of courses delivered almost entirely through the microblogging service, according to an article in the new issue of Rolling Stone (not yet posted online). The first set of courses will be in the school’s Business and Management, Technology, and Human Services programs and will allow students to earn “certificates.” But the school plans to rapidly expand the slate of Twitter courses, according to dean of faculty Robert Stanton, and will within three years “offer full degree programs across all our disciplines.” Stanton tells Rolling Stone that Twitter, as a “near-universal, bidirectional communication system,” offers a “powerful pedagogical platform ideally suited to the mobile, fast-paced lives of many of our students.”

Most of the instruction in the Twitter courses will be done through the 140-character “tweets” for which the service is famous, though instructors are also expected to occasionally refer to longer online documents by including “short URL” links in the tweets. “The goal,” says Stanton, “is to keep instruction within the Twitter system to the extent possible. We see the 140-character text limit as more an opportunity than a challenge. It further condenses and democratizes higher education, delivering knowledge and other relevant content to the student in a low-cost and efficient manner.” All examinations will be conducted through exchanges of tweets, according to Stanton.

That sounds bizarre to me, but I admit to being behind the times when it comes to virtual learning. Why not snippetize education? After all, you have to connect with students using the platforms they understand, and things like weighty textbooks and musty classrooms seem increasingly twentieth century.

Potemkinpedia

Today’s Sunday Times features an interesting essay on Wikipedia by Noam Cohen, Rough Type’s Journalist of the Week (my last post was inspired by his article on ghosttwittering). Cohen draws an elaborate parallel between Wikipedia and a city:

With its millions of visitors and hundreds of thousands of volunteers, its ever-expanding total of articles and languages spoken, Wikipedia may be the closest thing to a metropolis yet seen online … The search for information resembles a walk through an overbuilt quarter of an ancient capital. You circle around topics on a path that appears to be shifting … Wikipedia encourages contributors to mimic the basic civility, trust, cultural acceptance and self-organizing qualities familiar to any city dweller. Why don’t people attack each other on the way home? Why do they stay in line at the bank? … The police may be an obvious answer. But this misses the compact among city dwellers. Since their creation, cities have had to be accepting of strangers — no judgments — and residents learn to be subtly accommodating, outward looking.

It’s a nice conceit, and not unilluminating.

But Cohen gets carried away by his metaphor. There’s more than a hint of the Potemkin Village in his idealized portrait of Wikipedia:

It is [the site’s] sidewalk-like transparency and collective responsibility that makes Wikipedia as accurate as it is. The greater the foot traffic, the safer the neighborhood. Thus, oddly enough, the more popular, even controversial, an article is, the more likely it is to be accurate and free of vandalism.

Except, well, that’s not entirely true. One of the main reasons that the most popular and most controversial Wikipedia articles have come to be more “accurate and free of vandalism” than they used to be has nothing to do with “sidewalk-like transparency and collective responsibility.” It’s the fact that Wikipedia has imposed editorial controls on those articles, restricting who can edit them. Wikipedia has, to play with Cohen’s metaphor, erected a lot of police barricades, cordoning off large areas of the site and requiring would-be editors to show their government-issued ID cards before passing through.

If, as a stranger, you visit a relatively unpopular and noncontroversial Wikipedia article – like, say, “Toothpick” – you’ll find a welcoming tab at the top that encourages you to “edit this page”:

toothpick.jpg

But if you go to a popular and controversial article, you’ll almost certainly find that the “edit this page” tab is nowhere to be seen (in its place is an arcane “view source” tab). The welcome mat has been removed and replaced by a barricade. Here, for instance, is the page for “George W. Bush”:

bush.jpg

And here’s the page for “Barack Obama”:

obama.jpg

And here’s the page for “Islam”:

islam.jpg

And here’s the page for “Jimmy Wales”:

wales.jpg

And here’s the page for “Britney Spears”:

spears.jpg

And here’s the page for “Sex”:

sex.jpg

You get the picture.

All these pages are what Wikipedia calls “protected,” which means that only certain users are allowed to edit them. The editing of “semi-protected” pages is restricted to “autoconfirmed users” – that is, users who have formally registered on the site and who “pass certain thresholds for age and editcount” – and the editing of “fully protected” pages is limited to official Wikipedia administrators. (Another set of “page titles” is under “creation protection” to prevent them from being created in the first place.) Many of Wikipedia’s most-visited pages are currently under some form of protection, usually semi-protection.
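You can inspect the barricades yourself: the MediaWiki software that runs Wikipedia exposes each article’s protection settings through its standard query API (action=query, prop=info, inprop=protection). Here is a minimal sketch using only Python’s standard library; it will, of course, report whatever protection levels are in force when you run it, which may differ from what the screenshots above show.

    import json
    import urllib.parse
    import urllib.request

    API = "https://en.wikipedia.org/w/api.php"

    def protection_status(title):
        # Ask the MediaWiki API for the page's protection entries.
        params = urllib.parse.urlencode({
            "action": "query",
            "titles": title,
            "prop": "info",
            "inprop": "protection",
            "format": "json",
        })
        req = urllib.request.Request(
            API + "?" + params,
            headers={"User-Agent": "protection-check-sketch/0.1 (illustrative example)"},
        )
        with urllib.request.urlopen(req) as resp:
            pages = json.load(resp)["query"]["pages"]
        page = next(iter(pages.values()))
        return page.get("protection", [])   # an empty list means anyone may edit

    for title in ["Toothpick", "George W. Bush", "Barack Obama", "Islam"]:
        print(title, protection_status(title))

A protected article comes back with entries like {"type": "edit", "level": "autoconfirmed", "expiry": "infinity"}, meaning only autoconfirmed users may edit it; a level of "sysop" means administrators only.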

The reason for instituting such controls is, according to Wikipedia, “to prevent vandalism to popular pages.” Accuracy, in other words, requires top-down controls as well as bottom-up collective action. So when Cohen declares that “sidewalk-like transparency and collective responsibility” are what “makes Wikipedia as accurate as it is,” he’s not telling us the whole story. He’s giving us the official Chamber of Commerce view.

Now, for the great majority of people who consult Wikipedia, such distinctions don’t matter. They’re not interested in how the sausage is made. As long as the information is accurate enough and informative enough for their purposes, they’re content. But, as Cohen notes toward the end of his article, arguments about how Wikipedia works involve “a true clash of ideas.” In Wikipedia and other online communities we see both the possibilities and the limitations of “collective responsibility.” And so, when someone raises a Potemkin facade, it’s important to peek behind it.

The shame of it is that Cohen’s metaphor, and article, would have become even richer had he given us the full story of how order is maintained on the crowded streets of Wiki City.

An amateur among professionals

The new issue of Wired has a nifty little article by Rex Sorgatz (wasn’t that the name of Julia’s rich dickhead husband in Brideshead Revisited?) that provides a flowchart-styled guide to the blowhards of the Internet. Sorgatz manages to squeeze me into the group, smackdab between Michael Arrington and Jeff Jarvis. I am, of course, at once pleased and humbled to be included among such august company. But I also feel a slight sense of shame in that I don’t think I’ve fully earned the honor. I have enough self-awareness to know this: I’ll never be able to blow as hard as those guys.

The coming of the megacomputer

Here’s an incredible, and telling, data point. In a talk yesterday, reports the Financial Times’ Richard Waters, the head of Microsoft Research, Rick Rashid, said that about 20 percent of all the server computers being sold in the world “are now being bought by a small handful of internet companies,” including Microsoft, Google, Yahoo and Amazon.

Recently, total worldwide server sales have been running at around 8 million units a year. That means that the cloud giants are gobbling up more than a million and a half servers annually. (What’s not clear is how Google fits into these numbers, since last I heard it was assembling its own servers rather than buying finished units.)
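For the record, the arithmetic behind “more than a million and a half” is just Rashid’s 20 percent share applied to that rough 8-million-unit run rate:

    # Rashid's 20 percent share applied to the worldwide run rate cited above.
    worldwide_servers_per_year = 8_000_000
    cloud_giants_share = 0.20
    print(f"{worldwide_servers_per_year * cloud_giants_share:,.0f} servers a year")   # 1,600,000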

Waters says this about Rashid’s figure: “That is an amazing statistic, and certainly not one I’d heard before. And this is before cloud computing has really caught on in a big way.” What we’re seeing is the first stage of a rapid centralization of data-processing power – on a scale unimaginable before. At the same time, of course, the computing power at the edges, ie, in the devices that we all use, is also growing rapidly. An iPhone would have qualified as a supercomputer a few decades ago. But because the user devices draw much of their functionality (and data) from the Net, it’s the centralization trend that’s the key one in reshaping computing today.

Rashid also pointed out, according to Waters, that “every time there’s a transition to a new computer architecture, there’s a tendency simply to assume that existing applications will be carried over (ie, word processors in the cloud). But the new architecture actually makes possible many new applications that had never been thought of, and these are the ones that go on to define the next stage of computing.” The consolidation of server sales into the hands of just a few companies also portends a radical reshaping of the server industry, something already apparent in the vigorous attempts by hardware vendors to position themselves as suppliers to the cloud.