You might be surprised to find out how big the infrastructure for cloud and Web 2.0 is. It is mind-blowing. Microsoft has acknowledged packing more than 1 million servers into its datacenters, and by some accounts that is fewer than Google's massive server count but a bit more than Amazon's.
Facebook's server count is said to have skyrocketed from 30,000 in 2012 to 180,000 just this past August, serving more than 900 million users. And the social media giant is even putting its considerable weight behind the Open Compute effort to make servers fit better in a rack and draw less power. The list of mega infrastructures also includes Tencent, Baidu and Alibaba, and the roster goes on and on.
Even more jaw-dropping is that nearly 99.9% of these hyperscale infrastructures are built with servers featuring direct-attached storage. That's right – the servers do the computing and store the data. In other words, no special, dedicated storage gear. Yes, your Facebook photos, your SkyDrive personal cloud and all the content you use for entertainment, on-demand video and gaming is stored inside the server.
Direct-attached storage reigns supreme
Everything in these infrastructures – compute and storage – is built out of x86-based servers with storage inside. What's more, direct-attached storage is growing many times faster than any other storage deployment in IT. Rising deployments of cloud, or cloud-like, architectures are behind much of this expansion.
The prevalence of direct-attached storage is not unique to hyperscale deployments. Large IT organizations are looking to reap the rewards of creating similar on-premises infrastructures. The benefits are impressive: Build one kind of infrastructure (server racks), host anything you want (any of your properties), and scale easily when you need to. TCO is much lower than for infrastructures relying on network storage or SANs.
With direct-attached storage you no longer need dedicated appliances for your database tier, your email tier, your analytics tier, your EDA tier. All of that can be hosted on scalable, shared-nothing infrastructure. And just as with hyperscale, the storage is all in the server. No SAN storage required.
Open Compute, OpenStack and software-defined storage drive DAS growth
Open Compute is part of the picture. A recent Open Compute show I attended was sponsored mostly by hyperscale customers and suppliers, and many big-bank IT folks attended. But Open Compute isn't the only initiative driving growing deployments of direct-attached storage – so are software-defined storage and OpenStack. Big application vendors such as Oracle, Microsoft, VMware and SAP are also on board, providing solutions that support server-based storage/compute platforms that are easy and cost-effective to deploy, maintain and scale, and need no external storage (or SAN, including all-flash arrays).
So if you are a network-storage or SAN manufacturer, you have to be doing some serious thinking (many already are) about how you're going to catch and ride this huge wave of growth.
Tags: Alibaba, Amazon, Baidu, cloud computing, DAS, direct attached storage, enterprise, enterprise IT, Google, hyperscale, Microsoft, Open Compute, OpenStack, Oracle, SAP, Tencent, VMware
I've been travelling to China quite a bit over the last year or so. I'm sitting in Shenzhen right now (if you know Chinese internet companies, you'll know who I'm visiting). The growth is staggering. I've had a bit of a trains, planes and automobiles experience this trip, and that's exposed me to parts of China I never would have seen otherwise. Just to accommodate sheer population growth and the modest increase in wealth, there is construction everywhere – a press of people and energy, constant traffic jams, unending urban centers, and most everything is new. Very new. It must be exciting to be part of that explosive growth. What a market. I mean – come on – there are 1.3 billion potential users in China.
The amazing thing for me is the rapid growth of hyperscale datacenters in China, which is truly exponential. Their infrastructure growth has been 200%-300% CAGR for the past few years. It's also fantastic walking into a building in China, say Baidu, and feeling very much at home – just like you walked into Facebook or Google. It's the same young vibe, energy, and ambition to change how the world does things. And it's also the same pleasure – talking to architects who are super-sharp, have few technical prejudices, and have very little vanity – just a will to get down to business and solve problems. Polite, but blunt. We're lucky that they recognize LSI as a leader, and are willing to spend time to listen to our ideas, and to give us theirs.
Even their infrastructure has a similar feel to the US hyperscale datacenters. The same, only different. ;-)
A lot of these guys are growing revenue at 50% per year, with several getting 50% gross margin. Those are nice numbers in any country. One has $100s of billions in revenue. And they're starting to push out of China. So far their pushes into Japan have not gone well, but other countries should be better. They all have unique business models. "We" in the US like to say things like "Alibaba is the Chinese eBay" or "Sina Weibo is the Chinese Twitter"… but that's not true – they all have more hybrid, unique business models, and so their datacenter goals, revenue and growth have a slightly different profile. And there are some very cool services that simply are not available elsewhere. (You listening, Apple®, Google®, Twitter®, Facebook®?) But they are all expanding their services, products and user base. Interestingly, there is very little public cloud in China, so there are no real equivalents to Amazon's services or Microsoft's Azure. I have heard about current development of that kind of model with the government as initial customer. We'll see how that goes.
Hundreds of thousands of servers. They're not the scale of Google, but they sure are the scale of Facebook, Amazon, Microsoft…. It's a serious market for an outfit like LSI. Really, it's a very similar scale now to the US market. Close to 1 million servers installed among the main 4 players, and exabytes of data (we've blown past mere petabytes). Interestingly, they still use many co-location facilities, but that will change. More important – they're all planning to probably double their infrastructure in the next 1-2 years – they have to – their growth rates are crazy.
Often 5 or 6 distinct platforms, just like the US hyperscale datacenters: database platforms, storage platforms, analytics platforms, archival platforms, web server platforms…. But they tend to be a little more like the racks of traditional servers that enterprise buys, with integrated disk bays, still a lot of 1G Ethernet, and they are still mostly from established OEMs. In fact, I just ran into one OEM's American GM, who I happen to know, in Tencent's offices today. The typical servers have 12 HDDs in drive bays, though they are starting to look at SSDs as part of the storage platform. They do use PCIe® flash cards in some platforms, but the performance requirements are not as extreme as you might imagine. Reasonably low latency and consistent latency are the premium they are looking for from these flash cards – not maximum IOPS or bandwidth – very similar to their American counterparts. I think hyperscale datacenters are sophisticated in understanding what they need from flash, and not requiring more than that. Enterprise could learn a thing or two.
Some server platforms have RAIDed HDDs, but most are direct-mapped drives with a high-availability (HA) layer across the server center – Hadoop® HDFS or self-developed Hadoop-like platforms. Some have also started to deploy microserver archival "bit buckets": a small ARM® SoC with 4 HDDs totaling 12 TBytes of storage, giving densities like 72 TBytes of file storage in 2U of rack. While I can only find about 5,000 of those in China, and they are first-generation experiments, it's the first of a growing wave of archival solutions based on lower-performance ARM servers. The feedback is clear – they're not perfect yet, but the writing is on the wall. (If you're wondering about the math, that's 5,000 x 12 TBytes = 60 Petabytes….)
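As a quick sanity check on those numbers, here's a back-of-the-envelope sketch. The per-drive capacity and nodes-per-2U figures are my assumptions (3 TByte HDDs, 6 microserver nodes per 2U) – they're simply what makes the 12 TByte per node and 72 TByte per 2U figures above work out:

```python
# Back-of-the-envelope check of the microserver archival "bit bucket" numbers.
# Assumptions (not stated in the post): 3 TB per HDD, 6 nodes per 2U of rack.
HDDS_PER_NODE = 4
TB_PER_HDD = 3
NODES_PER_2U = 6          # implied by 72 TB per 2U / 12 TB per node
DEPLOYED_NODES = 5_000    # rough count of first-generation units in China

tb_per_node = HDDS_PER_NODE * TB_PER_HDD           # 12 TB per ARM SoC node
tb_per_2u = tb_per_node * NODES_PER_2U             # 72 TB of file storage in 2U
total_pb = DEPLOYED_NODES * tb_per_node / 1_000    # 60 PB across the deployment

print(tb_per_node, tb_per_2u, total_pb)  # 12 72 60.0
```

Even at experimental scale, 60 Petabytes is a serious amount of archival storage – which is why the "writing on the wall" comment above matters.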
Yes, power is important here, but maybe more than we're used to. It's harder to get licenses for power in China, so it's really important to stay within the envelope of power your datacenter has. You simply can't get more. That means they have to deploy solutions that do more in the same power profile, especially as they move out of co-located datacenters into private ones. Annually: 50% more users supported, more storage capacity, more performance, more services, all in the same power. That's not so easy. I would expect solar power in their future, just as Apple has done.
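To make that constraint concrete, here's a rough sketch (my illustrative numbers, not any datacenter's actual plan): if the work served tracks users and grows 50% a year while the power budget stays flat, then performance-per-watt has to compound at that same rate.

```python
# Illustrative only: what a fixed power envelope implies for efficiency.
# Assumption: served load tracks users, growing 50% per year as the post states.
GROWTH = 1.5       # 50% annual growth in users / work served
POWER_BUDGET = 1.0 # flat power budget (new licenses are hard to get)

for years in (1, 2, 3):
    # Work that must be done per unit of power, relative to today.
    required_perf_per_watt = GROWTH ** years / POWER_BUDGET
    print(years, round(required_perf_per_watt, 2))
# 1 1.5
# 2 2.25
# 3 3.38
```

Within three years, the same watts have to do roughly 3.4x the work – which is exactly why these datacenters care so much about efficiency per power profile.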
Here's where it gets interesting. They are developing a cousin to OpenCompute called Scorpio. It's Tencent, Alibaba, Baidu, and China Telecom driving the standard so far. The goals are similar to OpenCompute's, but more aligned to standardized sub-systems that can be co-mingled from multiple vendors. There is some harmonization and coordination between OpenCompute and Scorpio, and in fact the Scorpio companies are members of OpenCompute. But where OpenCompute is trying to change the complete architecture of scale-out clusters, Scorpio is much more pragmatic – some would say less ambitious. They've finished version 1 and rolled out about 200 racks as a "test case" to learn from. Baidu was the guinea pig. That's around 6,000 servers. They weren't expecting more from version 1; they're trying to learn. They've made mistakes, learned a lot, and are working on version 2.
Even if it's not exciting, it will have an impact because of the sheer size of deployments these guys are getting ready to roll out in the next few years. They see the progression as: 1) they were using standard equipment; 2) they're experimenting and learning from trial runs of Scorpio versions 1 and 2; and then they'll work on 3) new architectures that are efficient, powerful, and different.
Information is pretty sketchy if you are not one of the member companies or one of their direct vendors. We were just invited to join Scorpio by one of the founders, and would be the first group outside of China to do so. If that all works out, I'll have a much better idea of the details, and hopefully can influence the standards to be better for these hyperscale datacenter applications. Between OpenCompute and Scorpio we'll be seeing a major shift in the industry – a shift that will undoubtedly be disturbing to a lot of current players. It makes me nervous, even though I'm excited about it. One thing is sure – just as the server market volume is migrating from traditional enterprise to hyperscale datacenter (25-30% of the server market and growing quickly), we're starting to see a migration to Chinese hyperscale datacenters from US-based ones. They have to grow just to stay still. I mean – come on – there are 1.3 billion potential users in China….
Tags: Alibaba, Amazon, Apple, ARM, Baidu, China, China Telecom, datacenter, Facebook, Google, Hadoop, hard disk drive, HDD, hyperscale, Microsoft, OpenCompute, Scorpio, Shenzhen, Sina Weibo, solid state drive, SSD, Tencent, Twitter