Are we missing the point about cloud computing?
That question has been rattling around in my mind for the last few days, as the chatter about the role of the cloud in business IT has intensified. The discussion to date has largely had a retrospective cast, focusing on the costs and benefits of shifting existing IT functions and operations from in-house data centers into the cloud. How can the cloud absorb what we’re already doing? is the question that’s being asked, and answering it means grappling with such fraught issues as security, reliability, interoperability, and so forth. To be sure, this is an important discussion, but I fear it obscures a bigger and ultimately more interesting question: What does the cloud allow us to do that we couldn’t do before?
The history of computing has been a history of falling prices (and consequently expanding uses). But the arrival of cloud computing – which transforms computer processing, data storage, and software applications into utilities served up by central plants – marks a fundamental change in the economics of computing. It pushes down the price and expands the availability of computing in a way that effectively removes, or at least radically diminishes, capacity constraints on users. A PC suddenly becomes a terminal through which you can access and manipulate a mammoth computer that literally expands to meet your needs. What used to be hard or even impossible suddenly becomes easy.
My favorite example, which is about a year old now, is both simple and revealing. In late 2007, the New York Times faced a challenge. It wanted to make available over the web its entire archive of articles, 11 million in all, dating back to 1851. It had already scanned all the articles, producing a huge, four-terabyte pile of images in TIFF format. But because TIFFs are poorly suited to online distribution, and because a single article often comprised many TIFFs, the Times needed to translate that four-terabyte pile of TIFFs into more web-friendly PDF files. That’s not a particularly complicated computing chore, but it’s a large computing chore, requiring a whole lot of computer processing time.
Fortunately, a software programmer at the Times, Derek Gottfrid, had been playing around with Amazon Web Services for a number of months, and he realized that Amazon’s new computing utility, Elastic Compute Cloud (EC2), might offer a solution. Working alone, he uploaded the four terabytes of TIFF data into Amazon’s Simple Storage Service (S3) utility, and he hacked together some code for EC2 that would, as he later described in a blog post, “pull all the parts that make up an article out of S3, generate a PDF from them and store the PDF back in S3.” He then rented 100 virtual computers through EC2 and ran the data through them. In less than 24 hours, he had his 11 million PDFs, all stored neatly in S3 and ready to be served up to visitors to the Times site.
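To make the shape of that job concrete, here is a minimal, single-machine sketch of the loop Gottfrid describes, written against today's boto3 SDK (which postdates his 2007 work; the real job was parallelized across the 100 instances using Hadoop). The bucket names and key layout are assumptions for illustration only.

```python
# A single-machine sketch of the per-article conversion Gottfrid
# describes: pull an article's TIFF parts from S3, build one PDF,
# store it back in S3. Bucket names and key layout are hypothetical.
import io

import boto3                  # current AWS SDK; the 2007 job predates it
from PIL import Image         # Pillow can write multi-page PDFs

s3 = boto3.client("s3")
SRC, DST = "nyt-tiff-archive", "nyt-pdf-archive"   # illustrative buckets

def convert_article(article_id, tiff_keys):
    # Fetch every TIFF that makes up the article.
    pages = []
    for key in tiff_keys:
        body = s3.get_object(Bucket=SRC, Key=key)["Body"].read()
        pages.append(Image.open(io.BytesIO(body)).convert("RGB"))
    # Combine the page images into a single PDF and store it back.
    buf = io.BytesIO()
    pages[0].save(buf, format="PDF", save_all=True, append_images=pages[1:])
    s3.put_object(Bucket=DST, Key=f"{article_id}.pdf", Body=buf.getvalue())
```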
The total cost for the computing job? Gottfrid told me that the entire EC2 bill came to $240. (That’s 10 cents per computer-hour times 100 computers times 24 hours; there were no bandwidth charges since all the data transfers took place within Amazon’s system – from S3 to EC2 and back.)
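The arithmetic is easy to check:

```python
# Checking the EC2 bill as stated: 10 cents per instance-hour,
# 100 instances, 24 hours, and no bandwidth charges.
rate_per_instance_hour = 0.10
instances, hours = 100, 24
print(f"${rate_per_instance_hour * instances * hours:.2f}")   # $240.00
```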
If it hadn’t been for the cloud, Gottfrid told me, the Times might well have abandoned the effort. Doing the conversion would have taken either a whole lot of time or a whole lot of money, and it would have been a big pain in the ass. With the cloud, though, it was fast, easy, and cheap, and it required only a single employee to pull it off. “The self-service nature of EC2 is incredibly powerful,” says Gottfrid. “It is often taken for granted but it is a real democratizing force in lowering the barriers.” Because the cloud makes hard things easy, using it, Gottfrid told BusinessWeek’s Stephen Baker, “is highly addictive.” The Times has gone on to use S3 and EC2 for other chores, and, says Gottfrid, “I have ideas for countless more.”
The moral of this story, for IT types, is that they need to look at the cloud not just as an alternative means of doing what they’re already doing but as a whole new form of computing that provides, today, a means of doing things that couldn’t be done before or that at least weren’t practicable before. What happens when the capacity constraints on computing are lifted? What happens when employees can bypass corporate systems to perform large-scale computing tasks in the cloud for pennies? What happens when computer systems are built on the assumption that they will be broadly shared rather than used in isolation?
I think we will find that a whole lot happens, and it will go well beyond IT-as-usual. When electricity became a utility – cheap and ubiquitous – it didn’t just reduce the cost of running existing factory machines. As I describe in my book The Big Switch, it allowed a creative fellow like Henry Ford to build an electrified assembly line and change manufacturing forever. It’s natural to see a new technology through the lens of the technology it supplants, but that’s a blinkered view, and it can blind you to the future.
The NY Times example is a poster child, but it is just one case, and to some extent it sits on the fringes of enterprise IT. How many times a year do you really run into a situation like the NY Times’s? Not a lot.
For cloud computing to be taken seriously by the CIO, it has to have a significant impact on his portfolio, and at this point in time that impact really comes from cost savings as a result of moving to real-time provisioning and pooled capacity; i.e., “How can the cloud absorb what we’re already doing?”
“What does the cloud allow us to do that we couldn’t do before?” is a longer-term question, typically one for a CTO or innovation officer to look at. And frankly, there may be very few answers coming from within the enterprise; the answers will really come from outside. See http://www.gandalf-lab.com/blog/2008/05/realtime-analytics-for-rest-of-us.html
As an example: here is someone (http://www.webscalesolutions.com) trying to build a small business in a really, really small niche that until now required massive investments in hardware and databases just to get started.
Examples like these are what enterprises will leverage.
I couldn’t agree more with this post. People haven’t even begun to explore how disruptive the ‘cloud’ is on a prospective/innovative basis. When we launched our web app a year ago we used a mix of physical hardware and AWS services. This last weekend, we moved our entire footprint into the cloud… We think we may be one of the largest web apps to do this to date.
Cost was part of the equation for us, but it wasn’t the prime driver. Being 100% in the cloud opens up all sorts of otherwise unthinkable business model and growth opportunities for us – we can build and sell access to our systems in all sorts of ways that would be impossible in a hardware bound world – we just posted on the decision at http://drop.io/blog
Hi Nick.
My first comment on your blog, so feel free to edit/reject if I don’t have the correct style.
You’re absolutely right. The big issue is not that we can save some money through cloud computing (though that’s certainly very important) but rather that cloud computing will enable us to live and work in completely new ways.
Over on my blog http://cloudn.com I’ve recently been writing about how cloud computing will radically reshape the global enterprise computing industry, as that industry faces a perfect storm of new forces in 2008 – cloud computing, parallel computing, consumerization of software, and the live data explosion. I’ll mention just two areas where cloud computing will enable highly disruptive new capabilities.
Continuous Intelligence. With exponentially exploding volumes of live data to be continuously analyzed, filtered, correlated and matched, cloud computing will enable a radical shift from pull to push. Always-on cloud services will continuously digest new live data appearing all over the web, and automatically push highly personalized relevant knowledge extracted from those live feeds to millions of users in real-time. Information overload will change from being a problem to being a major driver of competitive advantage for businesses, and we will see a dramatic reduction in noise-to-signal ratios for individuals.
The Consumerized Enterprise. Twenty years ago (when much of today’s enterprise software was designed!), information workers had access internally, within the organization, to almost 100% of the data they needed to do their work. Today that figure is around 20%, and it’s falling rapidly. Cloud computing will unleash a new generation of enterprise services that enables those workers to continuously analyze, model, filter and mash up data (live and historical) from all over their company and all over the web. The impact on business productivity will be profound.
Bill McColl
CEO, Cloudscale Inc.
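McColl’s first area, the shift from pull to push, is easy to sketch in miniature. In the toy example below, standing user interest profiles are matched against each item as it arrives, and hits are pushed to subscribers immediately; the profiles and the keyword-overlap matching rule are invented purely for illustration.

```python
# A toy sketch of pull-to-push: standing user profiles are matched
# against each arriving item, and hits are pushed out immediately.
# The profiles and the keyword-overlap rule are purely illustrative.
subscriptions = {
    "alice": {"cloud", "ec2", "utility"},
    "bob": {"fraud", "detection"},
}

def push(user, item):
    # Stand-in for a real push channel (email, feed, mobile alert).
    print(f"notify {user}: {item}")

def on_new_item(item):
    words = set(item.lower().split())
    for user, interests in subscriptions.items():
        if interests & words:        # any keyword overlap counts as a hit
            push(user, item)

on_new_item("Amazon cuts EC2 utility pricing")   # -> notify alice
```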
I have two dedicated windows servers leased from GoDaddy, and I remote desktop into them to do my work. All of my personal and business applications and information are on these remote servers, and my local machines are used as essentially dumb terminals.
As a result, I can be equally productive anywhere there is an internet connection, and using most any computer.
Perhaps the cloud will lead enterprises along a similar road, where location of the work force becomes less and less important.
The NY Times example is the future of Cloud Computing: you cannot predict how smart people will use the Cloud.
But CIOs need facts to embrace this change, and they need a clear ROI. Nowadays, the price of computing power in the cloud is still too high. It will come down in the future, and the benefits will become clearer. But we are still in the early-adopter phase, and the economies of scale are not reaching customers yet.
@ Ted Murphy: Why two servers?
And how much do they cost, just out of curiosity?
Two game-changing uses, not unlike TIFF-to-PDF conversion, are still too resource-hungry to have really taken off, but they might soon. Voice manipulation, notably speech recognition and translation, should become a massive phenomenon thanks to clouds. Google started by making campaign videos searchable (giving everybody the opportunity to play Jon Stewart, having fun with Candidate A debating against himself from a few years ago); now imagine that for every video. For academics this means seminar to paper, class to book. It won’t replace the time needed to let ideas sink in and to structure them, but, not unlike what you described in your Atlantic piece, it will completely reshape our thinking process.
Another surprisingly demanding application is semantic technology. Those things get very complex, very fast. Any mildly intuitive result demands massive computing power — but could really benefit from pre-processed reasoning, something clouds could render feasible.
The shift of many components of the computing stack from a product-based to a service-based economy, and hence the provision of those components as standard services, will lead to an acceleration in the rate of business evolution on the web.
Standard stuff – Herbert Simon, Theory of Hierarchy and componentisation. The rate of evolution of any system is dependent upon the organisation of its subsystems.
The capex to opex conversion and so forth are small fry compared to this effect. Nick, you already knew this.
For anyone else, I covered this again recently at FOWA.
Excellent post, and very thought-provoking. Some of the comments so far seem to still see this from the point of view of the large enterprise, and that may miss the real growth area. “Gee whizz, I can get commodity services I already run in my data centre” seems to be what they’re saying. The real power is that small businesses, or individuals, can in minutes have 100 machines running their tasks for a few bucks. At the moment, the small businesses don’t know it, but they will, and suddenly the CIO of the NYT will find that he has no more or less access to large-scale computing resources than anyone else. In fact, large legacy systems will become a liability. Now that is disruptive! That’s very much what you imply, Nick, but I think it’s hard to communicate: it’s not just that people will be doing new things, but that *new people* will be doing the same old things only a few people were able to do before, and they will also be doing completely new things.
I was already thinking in these terms just this week, because outside the day-job with BCS I do some very basic IT work for a small local charity. I found myself rolling out Google Apps to them for email, and looking at EC2 as an online backup solution. I’ve even considered getting them onto USB keyrings with Ubuntu 8.10 on an encrypted volume, so they can just use whatever machine happens to be available and have their central storage in the cloud – not sure they’re ready for that yet! However it goes, they will in effect get enterprise-grade IT services at a per-seat cost much lower than enterprises usually pay – but most small businesses wouldn’t know about it or have the capability to access it. Mom-and-Pop local IT service businesses that sell to them are still pushing old school solutions to these guys, but that will change.
I really, really, really dislike the gee-whizziness of “cloud computing”. It’s like “in cyberspace” or “emergent” – a head-in-the-clouds (literally!) way of making a buzzword. I prefer “utility computing”.
The NYTimes example is our old friend “parallel processing” come back again tricked out in the 21st century version of timesharing.
Abstracting a one-off, nonlinear process does not validate the entire hypothesis. The blog post also points out:
1) the substantial technical challenges involved.
2) the fact that business management was NOT involved.
3) pretty much only one person was able to make it happen.
So far, I fail to believe that this will scale to wider applications, probably because of the entropy that will set in.
One respondent makes 3 objections:
1) the substantial technical challenges involved.
So don’t take on tasks that are challenging? I don’t think that’s what you mean.
2) the fact that business management was NOT involved.
This is bad because? As a technical manager I would be delighted with an employee that exhibited such initiative. There are too few!
3) pretty much only one person was able to make it happen.
Again, this is inappropriate because?
To me this concept threatens the very basis of IT organizations: standardization, rigid processes, and SOPs. Nick’s post is technical only in form; this is really a discussion of people and their level of creativity.
The good news is, the vast majority of IT people who are not creative will be culled out in a form of natural selection as the momentum of utility computing increases.
It could not happen soon enough.
-mew
Interesting comments. Thanks.
“I prefer ‘utility computing.’” Me too, not least because it’s far more resistant to being hijacked by marketers and charlatans than is the cutesy “cloud computing.” Then again, I suppose that’s precisely why “cloud computing” took off.
Excellent post.
In the last 25 years in IT, I’ve watched the pendulum swing from centralized to distributed to centralized to distributed … the oscillations never cease. When we re-partition our applications in the next re-design, we can have endless debates on which portions run locally, which run on the back end tier(s), and which run in the cloud. And the endless oscillations that will follow… sigh… anyone got a 3-apex pendulum?
Nick –
That’s a good illustration from The New York Times.
Another timely example comes from our President-Elect’s use of text messaging, if you consider the cellular infrastructure part of the cloud. This is doing something that we couldn’t do before: potentially, it’s crowdsourcing democracy.
Regarding the GoDaddy Dedicated Server option, you can get them for as low as $90/month. It’s a great solution, if not exactly the service’s originally intended use. I have a client that is running almost their entire operation in a similar fashion using a dedicated, hosted server at Rackspace (a more expensive option which we selected for their excellent service and support).
The only thing we aren’t using it for is e-mail; there are better managed solutions out there for that. Staff log in via remote desktop from wherever they happen to be to access documents or financial applications. It’s been more than a year since making the transition and the only hiccups have been printing related… an issue that is becoming less and less relevant as people become more comfortable reading and manipulating their documents online.
Compared to another client of a similar size in the same sector, the solution is dirt-cheap and maintenance-free. Unfortunately, that sector is real estate development, so the long-term outcomes of the effort probably aren’t going to be revealed, at least in this case.
Computational science gets a big boost, as would any “well known” supercomputing field where there are more ideas on the shelf than there are CPUs at hand. Remember, it wasn’t that long ago that SETI@home was such a clever hack… it symbolizes a lot of pent-up demand. In my limited but non-trivial experience doing scientific computing for modern labs: they’ll take as much just-a-few-hours-per-pop, cluster-type supercomputing as they can get on the cheap. There are a lot of ideas for computations sitting on paper only (figuratively speaking), gathering dust on researchers’ shelves. So, that’s one thing.
Another is the dissolution of Google’s main advantages: a non-profit, pay-for-use crawler cache and MapReduce engine, etc. The ad business can be completely decentralized, at a cost savings to advertisers and a revenue benefit to publishers. Of course, Google Search itself has a distinct shelf-life as utility computing evolves — but don’t ask me to embarrass myself with a predicted date or business model.
Another thing: chickens will come home to roost on privacy matters as scammers start using really clever forms of compute/bandwidth/storage-intensive data mining to focus their efforts.
On a different plane of reality: there will be a renewed effort by some firms to subsume the Internet and try to replace it with a gigantic terminal server. Watch for serious proposals to abandon IP in all but name and supplant it with dedicated dumb-terminal connections, just as Bell envisioned when they first saw Unix.
-t
Hi Nick, great article. It underlines the importance of cloud computing. One point about the NYT TIFF-to-PDF conversion task, where Derek used AWS EC2 to do the parallel computation: I didn’t see any mention of the data-transfer cost of uploading the 4TB of TIFF images into S3.
Doing some simple computations: Amazon would charge about $409.60 for uploading 4TB of data into S3, and an additional $261.12 for downloading the processed PDF files, which were 1.5TB in size. That is about $670.72 in total. In addition, there will be bandwidth charges for moving this 5.5TB of data through the NYT datacenter, 4TB out and 1.5TB in; I am sure that will be on the order of $400-$600.
In addition to that, consider the amount of time it would take to transfer that much data. At 10Mbps, it would take 53.4 days.
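Those estimates are straightforward to reproduce. The per-gigabyte prices below are the ones implied by the commenter’s figures, and the exact day count depends on whether you use binary or decimal terabytes:

```python
# Reproducing the commenter's back-of-the-envelope numbers.
# Prices implied by the figures above: $0.10/GB in, $0.17/GB out.
upload_gb = 4 * 1024            # 4 TB of TIFFs, in binary gigabytes
download_gb = 1.5 * 1024        # 1.5 TB of PDFs
print(f"${upload_gb * 0.10:.2f}")      # $409.60 to upload
print(f"${download_gb * 0.17:.2f}")    # $261.12 to download

# Transfer time over a sustained 10 Mbps link.
total_bits = (upload_gb + download_gb) * 1024**3 * 8
days = total_bits / 10e6 / 86400
print(f"{days:.1f} days")       # ~56 days with binary TB (~51 with
                                # decimal); the 53.4 quoted above sits
                                # in between, depending on conventions
```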
Using Hadoop on EC2 is definitely a great idea, and is very helpful; however, the locality of the data also matters a lot. Moving data, in my opinion, costs a lot and can sometimes outweigh the computational costs.
Thanks,
Mukul.
http://mukulblog.blogspot.com
The Cloud’s Achilles Heel: First, I am a huge Nick fan, have read all the books, and have delivered a vision for cloud computing for my enterprise: how and where IT services are delivered shouldn’t matter. That being said, I need to make a point here as to why most Fortune-class enterprises will be slow to adapt. The “Edison” analogy has one flaw: the electric companies by and large “own” the grid from end to end. They own the “transport layer” and can provide reasonable assurance that electricity will be delivered 7×24 as expected. Cloud services, by contrast, are independent of the transport layer, a.k.a. the WWW. Take Vonage, for example: a great example of a cloud-based service delivering important functionality at a reasonable cost. But enterprises will be very skeptical of that model, because the transport is unreliable for enterprise-class applications. So indeed there will be “one off” examples such as the NYT, but for the cloud to graduate into a full enterprise utility, the transport-layer issues need to be addressed. (Note: the issues in question are performance-related; there is no assurance of end-to-end quality of service or throughput.) So in the end we need to temper some of the enthusiasm with the reality of the situation, at least for enterprise-class services.
Mukul, Thanks. One further clarification: The New York Times uses S3 to serve the PDFs to site visitors, so the costs related to S3 uploading and storage have to be spread across that ongoing part of its online business rather than attributed to the EC2 task. You would, in judging the economics, have to compare the S3 costs to what the Times would have to spend to store and transmit the archive PDFs from its own systems.
ITDirector: A point well taken. In The Big Switch, I spend quite a bit of time discussing the vulnerabilities of the Net as a transport layer, and as you point out these shortcomings need to be taken into account in any balanced view of cloud computing. I expect the transport layer will become more robust, as businesses demand it, but there are still many open questions there.
Nick
I’m not completely sure how Vonage is a “cloud-based” service; they just sell access to a bundle of bridges from internet SIP calls to the regular phone system. Nothing really happens in between.
Utility computing as offered by Amazon (I prefer this term to the often misused and overhyped “cloud computing”) is a great enabler for companies of all sizes, from the NY Times to one-person startups.
Personal example: recently I had to run some massive data-mining calculations for my new fraud-detection business. It would have taken several weeks to run this job using the hardware currently available to me, and those machines would not have been able to do anything else.
Using EC2, I was able to do the calculations in several hours using several large instances with the workload evenly split among them.
I didn’t have to worry about buying terabytes of storage either – Amazon’s EBS and S3 provide me with virtually unlimited storage capacity.
No contracts to sign, no setups or activations. 80 cents per hour per large machine – once the job is done, all of them are turned off and I don’t spend a penny more.
Moreover, I can now take on many more jobs like this, jobs that I would not have been able to perform previously.
It’s fair to say that without EC2, this business would not be possible.
I don’t have to take out a bank loan to buy new servers or sign monthly hosting contracts and then watch my money disappear when the servers are sitting idle but I still have to pay for them…
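For a sense of those economics, here is a hedged sketch. Only the $0.80-per-hour large-instance rate comes from the comment above; the fleet size, runtime, and round-robin split are illustrative assumptions.

```python
# A hedged sketch of the elastic-batch economics described above.
# Only the $0.80/hour large-instance rate comes from the comment;
# the fleet size, runtime, and job list are illustrative.

def split_evenly(jobs, n_instances):
    """Round-robin assignment so each instance gets an even share."""
    return [jobs[i::n_instances] for i in range(n_instances)]

jobs = list(range(1000))              # stand-in for the mining tasks
shards = split_evenly(jobs, 20)       # one shard per rented instance

rate, instances, hours = 0.80, 20, 6  # hypothetical on-demand run
print(f"${rate * instances * hours:.2f}")   # $96.00, then shut it all off
```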
1. The same logic that makes it easier for users to rent a 100-computer array than to buy one also makes it easier for them to rent any kind of specialized hardware than to buy it.
And since the economics of special hardware (be it programmable or fixed) are orders of magnitude better than CPUs’, I can see increasing demand for specialized hardware at the expense of the CPU.
This might also lead to changes in the hardware value chain.
2. Currently, most academic work isn’t released or commercialized to the public, not even in binary form.
This is because of the large amount of work needed for commercialization, including documentation, testing on various platforms, optimization for those platforms, and handling business issues.
But if I develop my algorithm on the cloud from the start, I can offer it as a web service with simple billing and IP management, offer it through a marketplace, or simply put a link in my paper about the algorithm. That could break many of those barriers (see the sketch below).
Then developers could rapidly try my algorithm (no need to buy a whole library; they’d buy only my algorithm, on demand), integrate it with their code and other libraries, and would be able to sell the result (also as a cloud service).
This could be a big change in software R&D.
3. Let’s say I’m a big company. I have a big piece of software with one specific module that doesn’t perform. I can put a special version of the software on the cloud, with the option to plug in a better module in place of the bad one, and, on the InnoCentive model, outsource the work to the world. Then people could test the fully integrated software, etc.
Here the cloud acts as a sort of IP control mechanism.
This could also work for simulated systems, not only software.
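The second point above, an algorithm offered as a metered web service, can be made concrete with a small sketch. Everything here is hypothetical: the endpoint, the per-call price, and the in-memory usage meter are stand-ins for whatever billing and IP management a real cloud marketplace would supply.

```python
# A hypothetical sketch of point 2: a research algorithm exposed as a
# pay-per-call web service. The endpoint, price, and in-memory meter
# are illustrative stand-ins for real marketplace billing.
from flask import Flask, request, jsonify

app = Flask(__name__)
PRICE_PER_CALL = 0.001      # hypothetical on-demand price, in dollars
usage = {}                  # api_key -> call count (persist in production)

def research_algorithm(values):
    # Stand-in for the algorithm published in the paper.
    return sorted(values)

@app.route("/v1/run", methods=["POST"])
def run():
    key = request.headers.get("X-Api-Key", "anonymous")
    usage[key] = usage.get(key, 0) + 1                 # meter the call
    result = research_algorithm(request.get_json()["values"])
    return jsonify(result=result, amount_billed=usage[key] * PRICE_PER_CALL)

if __name__ == "__main__":
    app.run()
```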