Big data and Hadoop are all about exploiting new value and opportunities with data. In financial trading, business and some areas of science, it’s all about being fastest or first to take advantage of the data. The bigger the data sets, the smarter the analytics. The next competitive edge with big data comes when you layer in flash acceleration. The challenge is scaling performance in Hadoop clusters.
The most cost-effective option emerging for breaking through disk-to-I/O bottlenecks to scale performance is to use high-performance read/write flash cache acceleration cards for caching. This is essentially a way to get more work for less cost, by bringing data closer to the processing. The LSI® Nytro™ product has been shown during testing to improve the time it takes to complete Hadoop software framework jobs up to a 33%.
Flash cache cards increase Hadoop application performance
Combining flash cache acceleration cards with Hadoop software is a big opportunity for end users and suppliers. LSI estimates that less than 10% of Hadoop software installations today incorporate flash acceleration1. This will grow rapidly as companies see the increased productivity and ROI of flash to accelerate their systems. And Hadoop software adoption is also growing fast. IDC predicts a CAGR of as much as 60% by 20162. Drivers include IT security, e-commerce, fraud detection and mobile data user management. Gartner predicts that Hadoop software will be in two-thirds of advanced analytics products by 20153. Many thousands of Hadoop software clusters are already deployed.
Where flash makes the most immediate sense is with those who have smaller clusters doing lots of in-place batch processing. Hadoop is purpose-built for analyzing a variety of data, whether structured, semi-structured or unstructured, without the need to define a schema or otherwise anticipate results in advance. Hadoop enables scaling that allows an unprecedented volume of data to be analyzed quickly and cost-effectively on clusters of commodity servers. Speed gains are about data proximity. This is why flash cache acceleration typically delivers the highest performance gains when the card is placed directly in the server on the PCI Express® (PCIe) bus.
Combining the best of flash and HDDs to drive higher performance and storage capacity
PCIe flash cache cards are now available with multiple terabytes of NAND flash storage, which substantially increases the hit rate. We offer a solution with both onboard flash modules and Serial-Attached SCSI (SAS) interfaces to enable high-performance direct-attached storage (DAS) configurations consisting of solid state and hard disk drive storage. This couples the low-latency performance benefits of flash with the capacity and cost-per-gigabyte advantages of HDDs.
To keep the processor close to the data, Hadoop uses servers with DAS. And to get the data even closer to the processor, the servers are usually equipped with significant amounts of random access memory (RAM). An additional benefit: Smart implementation of Hadoop and flash components can reduce the overall server footprint and simplify scaling, with some solutions enabling up to 128 devices to share a very high bandwidth interface. Most commodity servers provide 8 or less SATA ports for disks, reducing expandability.
Hadoop is great, but flash-accelerated Hadoop is best. It’s an effective way, as you work to extract full value from big data, to secure a competitive edge.
I’ve been travelling to China quite a bit over the last year or so. I’m sitting in Shenzhen right now (If you know Chinese internet companies, you’ll know who I’m visiting). The growth is staggering. I’ve had a bit of a trains, planes, automobiles experience this trip, and that’s exposed me to parts of China I never would have seen otherwise. Just to accommodate sheer population growth and the modest increase in wealth, there is construction everywhere – a press of people and energy, constant traffic jams, unending urban centers, and most everything is new. Very new. It must be exciting to be part of that explosive growth. What a market. I mean – come on – there are 1.3 billion potential users in China.
The amazing thing for me is the rapid growth of hyperscale datacenters in China, which is truly exponential. Their infrastructure growth has been 200%-300% CAGR for the past few years. It’s also fantastic walking into a building in China, say Baidu, and feeling very much at home – just like you walked into Facebook or Google. It’s the same young vibe, energy, and ambition to change how the world does things. And it’s also the same pleasure – talking to architects who are super-sharp, have few technical prejudices, and have very little vanity – just a will to get to business and solve problems. Polite, but blunt. We’re lucky that they recognize LSI as a leader, and are willing to spend time to listen to our ideas, and to give us theirs.
Even their infrastructure has a similar feel to the US hyperscale datacenters. The same only different. ;-)
A lot of these guys are growing revenue at 50% per year, several getting 50% gross margin. Those are nice numbers in any country. One has $100’s of billions in revenue. And they’re starting to push out of China. So far their pushes into Japan have not gone well, but other countries should be better. They all have unique business models. “We” in the US like to say things like “Alibaba is the Chinese eBay” or “Sina Weibo is the Chinese Twitter”…. But that’s not true – they all have more hybrid business models, unique, and so their datacenter goals, revenue and growth have a slightly different profile. And there are some very cool services that simply are not available elsewhere. (You listening Apple®, Google®, Twitter®, Facebook®?) But they are all expanding their services, products and user base. Interestingly, there is very little public cloud in China. So there are no real equivalents to Amazon’s services or Microsoft’s Azure. I have heard about current development of that kind of model with the government as initial customer. We’ll see how that goes.
100’s of thousands of servers. They’re not the scale of Google, but they sure are the scale of Facebook, Amazon, Microsoft…. It’s a serious market for an outfit like LSI. Really it’s a very similar scale now to the US market. Close to 1 million servers installed among the main 4 players, and exabytes of data (we’ve blown past mere petabytes). Interestingly, they still use many co-location facilities, but that will change. More important – they’re all planning to probably double their infrastructure in the next 1-2 years – they have to – their growth rates are crazy.
Often 5 or 6 distinct platforms, just like the US hyperscale datacenters. Database platforms, storage platforms, analytics platforms, archival platforms, web server platforms…. But they tend to be a little more like a rack of traditional servers that enterprise buys with integrated disk bays, still a lot of 1G Ethernet, and they are still mostly from established OEMs. In fact I just ran into one OEM’s American GM, who I happen to know, in Tencent’s offices today. The typical servers have 12 HDDs in drive bays, though they are starting to look at SSDs as part of the storage platform. They do use PCIe® flash cards in some platforms, but the performance requirements are not as extreme as you might imagine. Reasonably low latency and consistent latency are the premium they are looking for from these flash cards – not maximum IOPs or bandwidth – very similar to their American counterparts. I think hyperscale datacenters are sophisticated in understanding what they need from flash, and not requiring more than that. Enterprise could learn a thing or two.
Some server platforms have RAIDed HDDs, but most are direct map drives using a high availability (HA) layer across the server center – Hadoop® HDFS or self-developed Hadoop like platforms. Some have also started to deploy microserver archival “bit buckets.” A small ARM® SoC with 4 HDDs totaling 12 TBytes of storage, giving densities like 72 TBytes of file storage in 2U of rack. While I can only find about 5,000 of those in China that are the first generation experiments, it’s the first of a growing wave of archival solutions based on lower performance ARM servers. The feedback is clear – they’re not perfect yet, but the writing is on the wall. (If you’re wondering about the math, that’s 5,000 x 12 TBytes = 60 Petabytes….)
Yes, it’s important, but maybe more than we’re used to. It’s harder to get licenses for power in China. So it’s really important to stay within the envelope of power your datacenter has. You simply can’t get more. That means they have to deploy solutions that do more in the same power profile, especially as they move out of co-located datacenters into private ones. Annually, 50% more users supported, more storage capacity, more performance, more services, all in the same power. That’s not so easy. I would expect solar power in their future, just as Apple has done.
Here’s where it gets interesting. They are developing a cousin to OpenCompute that’s called Scorpio. It’s Tencent, Alibaba, Baidu, and China Telecom so far driving the standard. The goals are similar to OpenCompute, but more aligned to standardized sub-systems that can be co-mingled from multiple vendors. There is some harmonization and coordination between OpenCompute and Scorpio, and in fact the Scorpio companies are members of OpenCompute. But where OpenCompute is trying to change the complete architecture of scale-out clusters, Scorpio is much more pragmatic – some would say less ambitious. They’ve finished version 1 and rolled out about 200 racks as a “test case” to learn from. Baidu was the guinea pig. That’s around 6,000 servers. They weren’t expecting more from version 1. They’re trying to learn. They’ve made mistakes, learned a lot, and are working on version 2.
Even if it’s not exciting, it will have an impact because of the sheer size of deployments these guys are getting ready to roll out in the next few years. They see the progression as 1) they were using standard equipment, 2) they’re experimenting and learning from trial runs of Scorpio versions 1 and 2, and then they’ll work on 3) new architectures that are efficient and powerful, and different.
Information is pretty sketchy if you are not one of the member companies or one of their direct vendors. We were just invited to join Scorpio by one of the founders, and would be the first group outside of China to do so. If that all works out, I’ll have a much better idea of the details, and hopefully can influence the standards to be better for these hyperscale datacenter applications. Between OpenCompute and Scorpio we’ll be seeing a major shift in the industry – a shift that will undoubtedly be disturbing to a lot of current players. It makes me nervous, even though I’m excited about it. One thing is sure – just as the server market volume is migrating from traditional enterprise to hyperscale datacenter (25-30% of the server market and growing quickly), we’re starting to see a migration to Chinese hyperscale datacenters from US-based ones. They have to grow just to stay still. I mean – come on – there are 1.3 billion potential users in China….
Tags: Alibaba, Amazon, Apple, ARM, Baidu, China, China Telecom, datacenter, Facebook, Google, Hadoop, hard disk drive, HDD, hyperscale, Microsoft, OpenCompute, Scorpio, Shenzhen, Sina Weibo, solid state drive, SSD, Tencent, Twitter