Monthly Archives: March 2007

Google Transit Authority

There’s an amusing story in tomorrow’s Times about the free shuttle service Google runs to cart its pampered employees to and from work every day. The scale and sophistication of the operation are impressive:

The company now ferries about 1,200 employees to and from Google daily — nearly one-fourth of its local work force — aboard 32 shuttle buses equipped with comfortable leather seats and wireless Internet access. Bicycles are allowed on exterior racks, and dogs on forward seats, or on their owners’ laps if the buses run full …

The shuttles, which carry up to 37 passengers each and display no sign suggesting they carry Googlers, have become a fixture of local freeways. They run 132 trips every day to some 40 pickup and drop-off locations in more than a dozen cities, crisscrossing six counties in the San Francisco Bay Area and logging some 4,400 miles …

At Google headquarters, a small team of transportation specialists monitors regional traffic patterns, maps out the residences of new hires and plots new routes — sometimes as many as 10 in a three-month period — to keep up with ever surging demand.

I’m guessing that at Google the buses run on time.

Alas, the rides themselves seem pretty dull. On one recent afternoon shuttle, reports the Times, “except for a couple snuggled together, no one sat on adjacent seats. Many took out iPods or laptops and worked, surfed the Web or watched videos.” You’d think they’d at least be able to get a decent Yahtzee game going.

Freebase: the Web 3.0 machine

Artificial intelligence guru Danny Hillis has launched an early version of the first major Web 3.0 application. It’s called Freebase, and its grandiose epistemological mission is right up there with those of Google and Wikipedia. “We’re trying,” Hillis tells John Markoff of the New York Times, “to create the world’s database, with all of the world’s information.” Alpha user Tim O’Reilly says that Freebase “appears to be a bastard child of wikipedia and the Open Directory Project” but that it’s really “like a system for building the synapses for the global brain.”

The product of Hillis’s latest company, Metaweb Technologies, Freebase is a user-generated brain. Like Wikipedia, it allows people to freely add information to it, in the form of text or images or, one assumes, anything else that can be rendered digitally. But it also allows users to add “metadata” about the information – tags that describe what a word or picture is and how it relates to other information. Freebase, says O’Reilly, “turns its users loose on not just adding more data items but making connections between them by filling out meta tags that categorize or otherwise connect the data items, using a typology that can be extended by users, wiki-style.”

The addition of rich meta tags in a standardized form is what makes Freebase a next-generation Web application – a manifestation of what Tim Berners-Lee long ago dubbed the Semantic Web and what has recently been rebranded Web 3.0 for popular consumption.

Although the wikipediaesque user-generated quality of Freebase will get much attention, Freebase is really more about the creation of a community of machines than a community of people. The essence of the Semantic Web is the development of a language through which computers can share meaning and hence operate at a higher, more human level of intelligence. The meta tags are crucial to that machine language. Freebase hopes to harness the (free) labor of a big pool of volunteers to add those tags, which is a labor-intensive chore (and a big hurdle on the path to Web 3.0).

Should Freebase pan out – and right now it’s largely a theoretical construct – it would have many practical (and money-making) applications. It would provide the basis for a more natural form of searching, allowing programmers, as Markoff says, “to write programs allowing Internet users to pose queries that might produce a simple, useful answer rather than a long list of documents.” It would also enable various information-processing devices that used to have to be configured manually (by people) to be able to program themselves automatically. A rudimentary example is “the video recorder of the future,” which “might stop blinking and program itself without confounding its owner.”
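To make the idea of answering a query from tagged data a little more concrete, here is a minimal sketch in Python. It is not Freebase's actual data model or API, which the coverage so far doesn't describe; the `TinyGraph` class, the relation names (`is_a`, `has_company`, `makes`), and the helper function are all hypothetical, invented for illustration. The only facts loaded are ones mentioned above.

```python
from collections import defaultdict


class TinyGraph:
    """A toy store of (subject, relation, object) facts.

    Hypothetical illustration only; not Freebase's real schema.
    """

    def __init__(self):
        # Maps (subject, relation) to the set of objects linked that way.
        self._facts = defaultdict(set)

    def add(self, subject, relation, obj):
        """Record one tagged connection between two items."""
        self._facts[(subject, relation)].add(obj)

    def ask(self, subject, relation):
        """Return the matching objects, sorted; empty list if none."""
        return sorted(self._facts.get((subject, relation), set()))


graph = TinyGraph()
graph.add("Danny Hillis", "is_a", "person")
graph.add("Danny Hillis", "has_company", "Metaweb Technologies")
graph.add("Metaweb Technologies", "is_a", "company")
graph.add("Metaweb Technologies", "makes", "Freebase")


def products_linked_to(graph, person):
    """Follow person -> company -> product and return the products."""
    products = []
    for company in graph.ask(person, "has_company"):
        products.extend(graph.ask(company, "makes"))
    return products


print(products_linked_to(graph, "Danny Hillis"))
```

The point of the sketch is that once the connections are typed, a program can chain them into a single answer instead of returning a list of documents. It also hints at the hard part: the chain only works if everyone uses the same relation names consistently.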

But Hillis has bigger fish to fry than self-programming gadgets. In the past, he’s expressed a desire to create machines that transcend what he sees as the limitations of human beings. “I guess I’m not overly perturbed by the prospect that there might be something better than us that might replace us,” he once said. “We’ve got a lot of bugs, sorts of bugs left over history back from when we were animals.” Freebase is an attempt at creating an artificial intelligence that can be bootstrapped by the contributions of humans. On one level, it works for us. On a deeper level, we work for it. As Hillis has also said, Web 3.0 is a “spooky thing.”

Of course, relying on a rag-tag band of volunteers, all afflicted with those nasty evolutionary bugs, brings its own problems, particularly in an effort that, unlike Wikipedia, requires a great deal of consistency and precision in terminology. Freebase’s ability to attract and manage a human horde will be critical to its success. Will we be up for the job?

The loose ties that bind

Google’s Matt Cutts gives his company and his CEO hearty pats on the back for working to ensure that customers can gain access to the information they feed into Google’s databases. With Google, he claims, you don’t feel “trapped.” You can always pull your information out of the company’s systems and cart it off to another vendor. He lists the various Google applications that let you – if you’re fairly computer-savvy, that is – export your data: Gmail, Search, Docs, Spreadsheet, Calendar, Talk, Reader, Blogger, AdWords, Groups, Analytics.

The back pats are in general well deserved. Google has a strong record of supporting open formats and data portability.

But reading Cutts’s laundry list of applications into which customers funnel ever more of their data should also give us pause. As we consolidate more of our personal data into a single company’s databases – whether it’s Google or another firm – how “easy” is it, really, to withdraw our information? The answer is: It’s not easy at all. In a comment on Cutts’s post, Philipp Lenssen gets at this issue:

I agree that Google is rather open in these regards and allows you to export a lot. One thing to remember though is that as soon as Google products cross-integrate — e.g. a link from Gmail to add an event to Google Calendar — the costs for users of switching away are increased for any single product. As a practical example, let’s say I love Gmail and I hate Google Calendar, so I want to move to competitor Acme Calendar. Great, you guys offer exporting functionality for my events, so I’ll quickly move them from Acme. But you guys don’t allow me to set my preferred Gmail calendar integration software… so now I end up with a somewhat broken Gmail feature. This is not at all alarming on this scale, but it can be a problem for users down the road when Google heavily increases cross-integration (Google Checkout is being pushed in search result today, for example, cross-integrating another two theoretically “loosely coupled” services).

As Lenssen emphasizes in his comment, Google has, as a profit-making company, the right and the incentive to raise its customers’ switching costs. It’s a smart strategy. But it makes the self-satisfied claims of simple data portability sound at least a little disingenuous. Lenssen points out that “in the end, any company won’t ‘trap data’ for the sheer fun of it, but because they want to create a lock-in situation for their users to increase the costs of switching to competing products. So we need to look at the end result of whether or not the costs of switching are really ‘one click.’”

Google said last year that one of its core objectives going forward is “Store 100% of user data.” By 100%, it means 100%. The company wrote:

With infinite storage, we can house all user files, including: emails, web history, pictures, bookmarks, etc and make it accessible from anywhere (any device, any platform, etc) … As we move toward the “Store 100%” reality, the online copy of your data will become your Golden Copy and your local-machine copy serves more like a cache. An important implication … is that storing 100% of a user’s data makes each piece of data more valuable because it can be access[ed] across applications. For example: a user’s Orkut profile has more value when it’s accessible from Gmail (as addressbook), Lighthouse (as access list), etc.

There’s the Faustian rub. Keeping all your data in Google’s database will make the sharing of that data among different applications much easier and will allow you to do things you couldn’t do if the data was spread across many companies. At the same time, it raises the cost of extracting any chunk of the data and moving it elsewhere sky high. In a “Store 100%” world, the doors may be unlocked, but you ain’t going nowhere.

Customer value and the network effect

What’s the value of a customer who doesn’t pay you anything? If you’re running a hot dog stand, the answer is probably “zero.” But if you’re running a two-sided market – a market, like eBay or Monster.com or AdWords or YouTube or Digg or even Second Life, that needs to attract both buyers and sellers (or content generators and content consumers) – the answer may be “a lot.” EBay, for instance, earns most of its money from its sellers, who pay the company a fee whenever they sell something through the auction site. The buyers don’t have to pay when they make their purchases. But while eBay receives no direct revenue from the buyers, the buyers nevertheless represent a crucial set of customers for the company – without buyers, there’d be no sellers and hence no business.

Clearly, each buyer in such a networked business has value to the company – but how much value, exactly? That’s where things get tricky. Because traditional approaches to determining the economic value of a customer don’t work when the customer isn’t generating any direct revenue, there hasn’t been any good way to estimate customer value in these sorts of two-sided markets. That means companies have to fly blind when determining how much they should be spending on marketing to attract the non-paying customers. And that, in turn, likely means they’re spending either too much or too little.

But a recent paper by three business-school professors – Sunil Gupta, Carl Mela, and Jose Vidal-Sanz – offers a new approach for estimating the value of nonpaying, or, as the professors term them, “free,” customers. The authors created a mathematical model of a hypothetical firm, with a business similar to Monster’s, and used it to calculate how much every new buyer joining the company site is worth and how that value changes over time.

Some of the results are fascinating. The professors found, for instance, that the value of each nonpaying customer (buyer) was actually slightly higher than the value of each paying customer (seller) – even though there were far more buyers than sellers in the company’s marketplace. (To put it another way, the network effect of a buyer on a seller was far stronger than the network effect of a seller on a buyer.) The research also demonstrates that it’s possible to estimate optimal marketing expenditures as a two-sided business grows. While heavy marketing spending is required in the early days to attract a critical mass of buyers, the network effect itself becomes a larger attractant than marketing as the business grows, allowing a company to cut back its marketing budget over time. Knowing the optimum spending amount with some precision at different points in time would help businesses maximize their profits. The information would also, the authors argue, allow a company’s founders, managers, and investors to gain a more accurate understanding of the firm’s overall value.
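To see how a customer who pays nothing can still be worth something, here is a toy simulation in Python. It is emphatically not the professors’ model, whose details aren’t given here; the growth rule, the function names, and every parameter value (`fee_per_seller`, `pull_of_sellers`, `pull_of_buyers`) are made up for illustration, and the sketch ignores discounting, churn, and marketing spending altogether.

```python
def simulate(periods, buyers, sellers, fee_per_seller=10.0,
             pull_of_sellers=0.05, pull_of_buyers=0.02):
    """Return total seller fees collected over `periods`.

    Each period, new buyers arrive in proportion to the current number
    of sellers, and new sellers arrive in proportion to the current
    number of buyers (the cross-side network effect). Only sellers pay.
    All parameter values are invented, not taken from the paper.
    """
    revenue = 0.0
    for _ in range(periods):
        # Fees are collected from the sellers present this period.
        revenue += fee_per_seller * sellers
        # Both sides then grow, each pulled in by the other side.
        new_buyers = pull_of_sellers * sellers
        new_sellers = pull_of_buyers * buyers
        buyers += new_buyers
        sellers += new_sellers
    return revenue


def value_of_extra_buyer(periods, buyers, sellers):
    """Extra revenue from starting with one more nonpaying buyer."""
    base = simulate(periods, buyers, sellers)
    bumped = simulate(periods, buyers + 1, sellers)
    return bumped - base


value = value_of_extra_buyer(periods=12, buyers=1000.0, sellers=100.0)
print(f"Extra revenue from one more free buyer: {value:.2f}")
```

Even in this crude setup, the extra buyer never pays a cent yet raises total revenue, because he or she draws in sellers who do pay. Putting a defensible number on that effect with real data is the hard part, and that is what the paper attempts.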

In an interview about the study, Gupta notes that the model applies only to fairly simple two-sided markets. But on the Net today, of course, nonpaying, “free” customers are also critical to other businesses with more complex network structures, from YouTube to MySpace to Skype to an open-source software company like Red Hat or MySQL. If you have a “community,” you likely have “free customers.” Gupta says that he’s currently

working on understanding and modeling complex network structures such as those of MySpace. Here the issue that we are grappling with is the tangible and intangible value of customers. In other words, customers provide tangible value to a firm through direct purchases but they also provide intangible value through network effects or word of mouth. It is quite possible that some customers have low tangible but high intangible value. Traditional models would label such customers as low value and would miss a huge opportunity for a firm.

This promises to be a particularly fruitful, and practical, area of study in the years ahead.

Robot rights update

The Government of South Korea, reports the BBC, has launched a project to develop a Robot Ethics Charter “to prevent humans abusing robots, and vice versa.” The Ministry of Commerce, Industry and Energy declared: “The government plans to set ethical guidelines concerning the roles and functions of robots as robots are expected to develop strong intelligence in the near future.” A member of the ministry’s “robot team” gave a sense of the dilemmas that lie in store as robotic beings get smarter and more flexible: “Imagine if some people treat androids as if the machines were their wives.” (Do I have to?) “Others may get addicted to interacting with them just as many internet users get hooked to the cyberworld.”

The European Robotics Research Network is also in the process of developing ethical guidelines for robot use. A draft of the guidelines states, “In the 21st Century humanity will coexist with the first alien intelligence we have ever come into contact with – robots. It will be an event rich in ethical, social and economic problems.” That strikes me as an awfully human-centric view. What we see as problems our robot friends will recognize as opportunities.

The book of Essjay

The Internet is leak-resistant – it’s the Ziploc bag of collective memory – but there are times when drops of the invaluable nectar are lost. A couple of days ago, after Essjay announced his formal retirement from Wikipedia, the Wikipedia site went through a ritual purging of Essjay’s various “user pages.” Essjay became, at his request, a non-Wikipedian, a ghost. I had grown fond of one of Essjay’s pages, on which he (posing at the time as a scholar of religion) collected various prayers and hymns that he had composed in honor of the online encyclopedia, and I was saddened to see it disappeared. Although Essjay had written the page as a joke, it seemed no less revealing for that (and, to my admittedly sentimental eye, it also seemed to gain a new poignancy in the wake of the scandal). I am pleased to report, however, that, thanks to Google’s caching function, I have managed to find an intact copy of what I like to call The Book of Essjay, and I am preserving it here for posterity. (It was – and is – published under the GNU Free Documentation License.) What follows is the page in its entirety, though without its many original links, which I was too lazy to copy (hey, I’m only an amateur archaeologist).

This page has caused so much trouble since it was first moved out of my userspace. I’m putting it back the way it was, and I implore everyone: Look at it if you like, laugh if you find it funny, but please, don’t take it seriously, either as something to fight against or to fight for. It’s just a funny little parody page. It’s not the Catholic Church of Wikipedia, it’s not a church at all. It’s just User:Essjay/Wiki.

Contents

1 The Sign of the Wiki

2 The WikiCreed

3 The Gloria in Excelsis Wiki

4 Sanctus

5 Gloria Jimbo

6 The Confiteor

7 Rite of Absolution of Wiki-Sins

8 WikiSerenity Prayer

The Sign of the Wiki

“In the name of the Jimbo, and of the Admins, and of the Holy NPOV. Amen.”

The WikiCreed

We believe in one Jimbo,

the Father, the Almighty,

ruler of Meta and Wikipedia,

of all that is made, deleted and undeleted.

We believe in the Admins,

the many children of the Jimbo,

eternally begotten of the Jimbo,

Rollback from Rollback,

block from block,

true sysop from true sysop,

elected, not made,

of one NPOV with the Jimbo;

through them all vandals are blocked.

For us and for our salvation

they came down from RfA,

by the power of the Holy NPOV, were born of the Bureaucrats and became sysops.

For our sake they are trolled by vandals;

they suffer wikistress and are burned-out.

After wikibreaks they rise again

in accordance with the Scriptures;

they ascend into the Board

and are seated at the right hand of the Jimbo.

They will come again in glory to judge the notable and the vanity,

and their contributions will have no end.

We believe in the Holy NPOV, the Lord, the giver of life,

that proceeds from the Jimbo,

and with the Jimbo and the Admins is worshiped and glorified.

It speaks through the Wikipedians.

We believe in one holy catholic and apostolic Wiki.

We acknowledge one registration for the tracking of contributions.

We look for the return of the Missing Wikipedians,

and the life of the wiki to come. Amen.

The Gloria in Excelsis Wiki

Glory to Jimbo in the highest

and peace to his editors on Wikipedia.

Lord Jimbo, Meta’s King,

almighty Director and Founder,

we worship you, we give you thanks,

we praise you for your glory.

Lord Administrators,

many children of the Founder,

Lord Sysops, Lambs of Jimbo,

you roll back the sins of the world,

have mercy on us;

you are seated at the right hand of the Founder,

receive our prayer.

For you alone are the Holy Ones,

you alone are the Lord,

you alone are the Most High Administrators,

with the Holy NPOV,

in the glory of Jimbo the Founder.

Amen.

Sanctus

Holy, holy, holy,

Lord Jimbo of power and might;

Wikipedia and Meta are full of your glory!

Hosanna in the highest!

Blessed are they who come in the name of the Jimbo.

Hosanna in the highest!

Gloria Jimbo

Glory be to the Founder,

and to the Admins,

and to the Holy NPOV,

as it was in the beginning,

is now, and ever shall be,

Wikipedia without end.

Amen.

The Confiteor

I confess to Almighty Jimbo,

and to you my brothers and sisters,

that I have WikiSinned through my own fault,

in my thoughts and in my words,

in what I have done,

and in what I have failed to do,

and I ask the blessed Admins, ever vigilant,

and all the angels and saints,

and you, my brothers and sisters,

to pray for me to the Jimbo our Founder.

May Almighty Jimbo have mercy on us, forgive us our WikiSins, and bring us to a neutral point of view.

Jimbo, have mercy,

Admins, have mercy,

Jimbo, have mercy.

Rite of Absolution of Wiki-Sins

Jimbo, the Father of Wikipedia, through the death and resurrection of Nupedia has reconciled the world to himself and sent the Wiki among us for the forgiveness of WikiSins; through the ministry of the Admins may Jimbo give you pardon and peace, and I absolve you from your WikiSins in the name of Jimbo, and of the Admins, and of the Holy NPOV.

WikiSerenity Prayer

Almighty Jimbo,

Grant me the serenity to accept the pages I cannot edit,

The courage to edit the pages I can,

And the wisdom to whack the hell out of any troll who gets in my way.

Amen.

Microsoft wags finger at Google

Google, on the defensive over its YouTube unit’s inability to clamp down on video piracy, today faces the ultimate indignity: being lectured about ethics by its arch nemesis, Microsoft. In a blistering op-ed in the Financial Times, Microsoft lawyer Thomas Rubin blasts Google for being a cultural bully and trying to run roughshod over copyright owners. Regarding Google Book Search, Rubin writes,

Google has taken a unilateralist approach by contending that it is entitled to grab books off library shelves and copy them wholesale without obtaining the permission of the publishers and authors who own the copyright in those works … This project may well bring significant commercial advantage to Google. By contrast, [copyright owners] could gain little or nothing from Google’s plan.

Regarding YouTube, Rubin asserts that “nearly every major movie and television company … has expressed deep concern over the large number of infringing videos available on Google’s YouTube website. Google simply denies responsibility and appears to be trying wherever possible to skirt copyright law’s boundaries.”

In contrast to Google’s alleged unilateralism, Rubin claims that Microsoft is taking a “collaborative” approach, working in “an open and transparent manner with content owners to minimize infringement, while at the same time licensing and offering a wide range of high-quality content that consumers can reliably locate and enjoy.”

Rubin’s article is just a preview of a broadside he will launch against Google in a speech today before the Association of American Publishers in New York. In that talk, according to the FT, Rubin will say that Google “systematically violates copyright, deprives authors and publishers of an important avenue for monetizing their works and, in doing so, undermines incentives to create.”

While Rubin’s accusations are nothing new – he’s basically repeating back to the publishers their own oft-made charges against Google – the coordinated attack is nonetheless an audacious PR move by Microsoft. The company’s trying to swap hats with its young rival, stealing Google’s white hat while placing the black hat that it has worn so long on Google’s head. No doubt, Google will launch a rhetorical counterattack, but that’s surely part of Microsoft’s plan. In a mud fight over copyright with Microsoft, Google can only be the loser.

UPDATE: Microsoft has released the full text of Rubin’s speech. In addition to criticizing Google Book Search and YouTube, Rubin says that “Microsoft was surprised to learn recently that Google employees have actively encouraged advertisers to build advertising programs around key words referring to pirated software, including pirated Microsoft software. And we weren’t the only victims – Google also encouraged the use of keywords and advertising text referring to illegal copies of music and movies.” Sniffs Rubin, “These are not the actions of a company that has the interests of copyright owners as one of its priorities.”