Scaling compute power and storage in space-constrained datacenters is one of the top IT challenges of our time. With datacenters worldwide pressed to maximize both within the same floor space, the central challenge is increasing density.
At IBM we continue to design products that help businesses meet their most pressing IT requirements, whether that’s optimizing data analytics and data management, supporting the fastest-growing workloads such as social media and cloud delivery or, of course, increasing compute and storage density. Our technology partners are a crucial part of our work, and this week at AIS we are teaming with LSI to showcase our new high-density NeXtScale computing platform and x3650 M4 HD server. Both leverage LSI® SAS RAID controllers for data protection, and the x3650 M4 HD server features an integrated, leading-edge LSI 12Gb/s SAS RAID controller.
NeXtScale System – ideal for HPC, cloud service providers and Web 2.0
The NeXtScale System®, an economical addition to the IBM System® family, maximizes usable compute density by packing up to 84 x86-based systems and 2,016 processing cores into a standard 19-inch rack to enable seamless integration into existing infrastructures. The family also enables organizations of all sizes and budgets to start small and scale rapidly for growth. The NeXtScale System is an ideal high-density solution for high-performance computing (HPC), cloud service providers and Web 2.0.
The System x3650 M4 HD, IBM’s newest high-density storage server, is designed for data-intensive analytics or business-critical workloads. The 2U rack server supports up to 62% more drive bays than the System x3650 M4 platform, providing connections for up to 26 2.5-inch HDDs or SSDs. The server is powered by the Intel Xeon processor E5-2600 family and features up to 6 PCIe 3.0 slots and an onboard LSI 12Gb/s SAS RAID controller. This combination gives a big boost to data applications and cloud deployments by increasing the processing power, performance and data protection that are the lifeblood of these environments.
IBM dense storage solutions to help drive data management, cloud computing and big data strategies
Cloud computing and big data will continue to have a tremendous impact on the IT infrastructure and create data management challenges for businesses. At IBM, we think holistically about the needs of our customers and believe that our new line of dense storage solutions will help them design, develop and execute on their data management, cloud computing and big data strategies.
Tags: 12Gb/s SAS RAID controller, AIS, cloud service providers, dense computing, enterprise, enterprise IT, high performance computing, HPC, IBM, Intel Xeon processor, NeXtScale computing platform, PCIe 3.0, web 2.0, x3650 M4 HD server
Each new generation of NAND flash memory reduces the fabrication geometry – the dimension of the smallest part of an integrated circuit used to build up the components inside the chip. That means there are fewer electrons storing the data, leading to increased errors and a shorter life for the flash memory. No need to worry. Today’s flash memory depends upon the intelligence and capabilities of the solid state drive (SSD) controller to help keep errors in check and get the longest life possible from flash memory, making it usable in compute environments like laptop computers and enterprise datacenters.
Today’s volume NAND flash memory uses a 20nm and 19nm manufacturing process, but the next generation will be in the 16nm range. Some experts speculate that today’s controllers will struggle to work with this next generation of flash memory to support the high number of write cycles required in datacenters. Also, the current multi-level cell (MLC) flash memory is transitioning to triple-level cell (TLC), which has an even shorter life expectancy and higher error rates.
Can sub-20nm flash survive in the datacenter?
Yes, but it will take a flash memory controller with smarts the industry has never seen before. How intelligent? Sub-20nm flash will need to stretch the life of the flash memory beyond the flash manufacturer’s specifications and correct far more errors than ever before, while still maintaining high throughput and very low latency. And to protect against periodic error correction algorithm failures, the flash will need some kind of redundancy (backup) of the data inside the SSD itself.
When will such a controller materialize?
LSI this week introduced the third generation of its flagship SSD and flash memory controller, called the SandForce SF3700. The controller is newly engineered and architected to solve the lifespan, performance, and reliability challenges of deploying sub-20nm flash memory in today’s performance-hungry enterprise datacenters. The SandForce SF3700 also enables longer periods between battery recharges for power-sipping client laptop and ultrabook systems. It all happens in a single ASIC package. The SandForce SF3700 is the first SSD controller to include both PCIe and SATA host interfaces natively in one chip to give customers of SSD manufacturers an easy migration path as more of them move to the faster PCIe host interface.
How does the SandForce SF3700 controller make sub-20nm flash excel in the datacenter?
Our new controller builds on the award-winning capabilities of the current SandForce SSD and flash controllers. We’ve refined our DuraWrite™ data reduction technology to streamline the way it picks blocks, collects garbage and reduces the write count. You’ll like the result: longer flash endurance and higher read and write speeds.
The SandForce SF3700 includes SHIELD™ error correction, which applies LDPC and DSP technology in unique ways to correct the higher error rates of new generations of flash memory. SHIELD technology uses a multi-level error correction schema to optimize the time it takes to return correct data. Also, with its exclusive Adaptive Code Rate feature, SHIELD leverages DuraWrite technology’s ability to span internal NAND flash boundaries between the user data space and the flash manufacturer’s dedicated ECC field. Other controllers use only one ECC code rate for flash memory – the single largest size, designed to support the end of the flash’s life. Early in the flash life, a much smaller ECC is required, and SHIELD technology scales down the ECC accordingly, diverting the remaining free space to additional over provisioning. SHIELD gradually increases the ECC size over time as the flash ages and failures increase, but does not use the largest ECC size until the flash is nearly at the end of its life.
Why is this good? Greater over provisioning over the life of the SSD improves performance and increases endurance. SHIELD also allows the ECC field to grow even larger after the flash reaches its specified end of life. The big takeaway: all of these SHIELD capabilities increase flash write endurance many times beyond the manufacturer’s specification. In fact, at the 2013 Flash Memory Summit exposition in Santa Clara, CA, SHIELD was shown to extend the endurance of a particular Micron NAND flash by nearly six times.
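To make the tradeoff concrete, here is a minimal sketch, with invented numbers, of how an adaptive code rate scheme of the kind described above could trade ECC bytes for over provisioning as flash wears. The code sizes and rated life below are assumptions for illustration, not SHIELD’s actual parameters:

```python
# Toy model of adaptive ECC sizing (all numbers hypothetical).
# Early in flash life the raw bit-error rate is low, so a small ECC
# field suffices; the bytes saved versus the worst-case code are
# diverted to extra over provisioning. As program/erase (P/E) cycles
# accumulate, the code grows toward its end-of-life maximum.

MAX_ECC_BYTES = 224  # worst-case, end-of-life ECC field per page (assumed)

def ecc_bytes_needed(pe_cycles: int, rated_life: int = 3000) -> int:
    """Scale the ECC field with wear: start small, grow to the maximum."""
    wear = min(pe_cycles / rated_life, 1.0)
    min_ecc = MAX_ECC_BYTES // 4  # assumed early-life code size
    return int(min_ecc + wear * (MAX_ECC_BYTES - min_ecc))

def extra_over_provisioning(pe_cycles: int) -> int:
    """Bytes per page reclaimed for over provisioning at this wear level."""
    return MAX_ECC_BYTES - ecc_bytes_needed(pe_cycles)

# A fixed-rate controller would reserve all 224 bytes from day one;
# here a fresh page frees most of them for over provisioning instead.
for cycles in (0, 1500, 3000):
    print(cycles, ecc_bytes_needed(cycles), extra_over_provisioning(cycles))
```

In this model a fresh page carries a quarter-size code and donates the rest of the ECC field to over provisioning; only a fully worn page pays the full end-of-life ECC cost.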
That’s not all. The SandForce SF3700 controller’s RAISE™ data reliability feature now offers stronger protection, including full die failure and more options for protecting data on SSDs with low (e.g., 32GB & 64GB) and binary (256GB vs. 240GB) capacities.
So what about end user systems?
The beauty of all SandForce flash and SSD controllers is their onboard firmware, which takes the one common hardware component – the ASIC – and adapts it to the user’s storage environment. For example, in client applications the firmware helps the controller preserve SSD power to enable users of laptop and ultrabook systems to remain unplugged longer between battery recharges. In contrast, enterprise environments require the highest possible performance and lowest latency. This higher performance draws more power, a tradeoff the enterprise is willing to make for the fastest time-to-data. The firmware makes other similar tradeoffs based on which storage environment it is serving.
Although most people consider enterprise and client storage needs to be very diverse, we think the new SandForce SF3700 Flash and SSD controller showcases the perfect balance of power and performance that any user hanging ten can appreciate.
With the much-anticipated launch of 12Gb/s SAS MegaRAID and 12Gb/s SAS expanders featuring DataBolt™ SAS bandwidth-aggregation technologies, LSI is taking the bold step of moving beyond traditional IO performance benchmarks like IOMeter to benchmarks that simulate real-world workloads.
In order to fully illustrate the actual benefit many IT administrators can realize using 12Gb/s SAS MegaRAID products on their new server platforms, LSI is demonstrating application benchmarks on top of actual enterprise applications at AIS.
For our end-to-end 12Gb/s SAS MegaRAID demonstration, we chose Benchmark Factory® for Databases running on a MySQL Database. Benchmark Factory, a database performance testing tool that allows you to conduct database workload replay, industry-standard benchmark testing and scalability testing, uses real database application workloads such as TPC-C, TPC-E and TPC-H. We chose the TPC-H benchmark, a decision-support benchmark, because of its large streaming query profile. TPC-H shows the performance of decision support systems – which examine large volumes of data to simplify the analysis of information for business decisions – making it an excellent benchmark to showcase 12Gb/s SAS MegaRAID technology and its ability to maximize the bandwidth of PCIe 3.0 platforms, compared to 6Gb/s SAS.
The demo uses the latest Intel R2216GZ4GC servers based on the new Intel® Xeon® processor E5-2600 v2 product family to illustrate how 12Gb/s SAS MegaRAID solutions are needed to take advantage of the bandwidth capabilities of PCIe® 3.0 bus technology.
When the benchmarks are run side-by-side on the two platforms, you can quickly see how much faster data transfers complete and how much more efficiently the Intel servers handle data traffic. We used Quest Software’s Spotlight® on MySQL tool to monitor and measure data traffic from end storage devices to the clients running the database queries. More importantly, the test also illustrates how many more user queries the 12Gb/s SAS-based system can handle before complete resource saturation – 60 percent more than 6Gb/s SAS in this demonstration.
What does this mean for IT administrators? Clearly, resource utilization is much higher, improving the total cost of ownership (TCO) of their database servers. Or, conversely, 12Gb/s SAS can reduce cost per IO, since fewer drives can be used with the server to deliver the same performance as the previous 6Gb/s SAS generation of storage infrastructure.
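The fewer-drives arithmetic is easy to sketch. This back-of-the-envelope model (all figures are my own assumptions for illustration, not measured numbers from the demo) shows why doubling the per-lane link rate can reduce the drive count needed to reach a link-limited throughput target:

```python
import math

# 6Gb/s vs 12Gb/s SAS, assuming 8b/10b encoding (~80% efficiency).
def usable_mb_per_s(lane_gbps: float, encoding_eff: float = 0.8) -> float:
    """Approximate usable bandwidth of one SAS lane in MB/s."""
    return lane_gbps * 1000 * encoding_eff / 8

def drives_for_target(target_mb_s: float, lane_gbps: float,
                      drive_media_mb_s: float = 900) -> int:
    """Drives needed to reach a target throughput when each drive is
    capped by min(its media speed, its link speed)."""
    per_drive = min(drive_media_mb_s, usable_mb_per_s(lane_gbps))
    return math.ceil(target_mb_s / per_drive)

# A fast SSD (assumed 900 MB/s media) is choked to ~600 MB/s by a
# 6Gb/s link but runs unthrottled on 12Gb/s, so the same 7,200 MB/s
# target needs fewer drives on the faster interface.
print(drives_for_target(7200, 6))    # drives on 6Gb/s SAS
print(drives_for_target(7200, 12))   # drives on 12Gb/s SAS
```

The effect only appears when the drive media outruns the link, which is exactly the situation fast SSDs created for 6Gb/s SAS.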
The staggering growth of smart phones, tablets and other mobile devices is sending a massive flood of data through today’s mobile networks. Compared to just a few years ago, we are all producing and consuming far more videos, photos, multimedia and other digital content, and spending more time in immersive and interactive applications such as video and other games – all from handheld devices.
Think of mobile, and you think remote – using a handheld when you’re out and about. But according to the Cisco® VNI Mobile Forecast 2013, while 75% of all videos today are viewed on mobile devices, by 2017 46% of mobile video will be consumed indoors (at home, at the office, at the mall and elsewhere). With the widespread implementation of IEEE® 802.11 WiFi on mobile devices, much of that indoor video traffic will be routed through fixed broadband pipes.
Unlike residential indoor solutions, enterprise and public-area access infrastructures – serving users both indoors and outdoors – are much more diverse and complicated. For example, current access layer architectures include Layer 2/3 wiring closet switches and WiFi access points, as shown below. Mobile service providers are seeking architectures that let them take advantage of both indoor enterprise and public-area access infrastructure. These architectures must integrate seamlessly with existing mobile infrastructures and require no investment in additional access equipment by service providers, so that providers can deliver a consistent, quality experience for end users indoors and outdoors. For their part, mobile service providers must:
The following figure shows the three possible paths that mobile service providers wanting to offer indoor enterprise/public-area coverage can take. Approach 1 is ideal for enterprises trying to improve coverage in particular areas of a corporate campus. Approaches 2 and 3 not only provide uniform coverage across the campus but also support differentiating capabilities such as the allocation of application- and mobility-centric radio spectrum across WiFi and cellular frequencies. A key factor to consider when evaluating these approaches is the extent to which equipment ownership is split between the enterprise and the mobile service provider. Approaches 2 and 3 increase capital expenditures for the operator because of the radio heads and small cells that need to be deployed across the enterprise or public campus. At AIS, LSI is demonstrating approach 2.
The last but certainly not least important consideration in choosing between approaches 2 and 3 is whether the indoor/outdoor small cells employ self-organizing network (SON) techniques. For service providers, the small cells ideally would be self-organizing, with the macro cells serving any additional management functions. The advantage of approach 3 is that it offloads more of the macro cell traffic and makes the various campus small cells self-organizing, significantly reducing operational costs for the service provider.
LSI’s Accelerating Innovation Summit in San Jose has given me a sneak peek of some solutions our partners are putting together to solve datacenter challenges. Such is the case with EMC’s ScaleIO business unit (EMC recently acquired ScaleIO), which has rolled out some nifty software that helps streamline VDI (Virtual Desktop Infrastructure) scaling.
As I shared in a previous blog, VDI deployments are growing like gangbusters. It’s easy to see why. The manageability and security benefits of a virtualized desktop environment are tough to beat. Deploying and supporting hundreds of desktops as VDI instances on a single server lets you centralize desktop management and security. Another advantage is that patches, security updates, and hardware and software upgrades demand much less overhead. VDI also dramatically reduces the risk that desktop users will breach security by making it easier to prevent data from being copied onto portable media or sent externally.
Mass boots drag down VDI performance
But as with all new technologies, a number of performance challenges can crop up when you move to a virtual world. In enterprise-scale deployments, VDI performance can suffer when the IT administrator attempts to boot all those desktops Monday morning or reboot after Patch Tuesday. What’s more, VDI performance can drop significantly when users all log in at the same time each morning. In addition, virtualized environments sometimes are unfriendly to slews of users trying to access files simultaneously, making them wait because of the heavy traffic load. One bottleneck often is legacy SAN-connected storage, since file access requests are queued through a single storage controller. And of course increasing the density of virtual desktops supported by a server can exacerbate the whole performance problem.
VDI deployments are ripe for distributed storage, and the ScaleIO ECS (Elastic Converged Storage) software is a compelling solution, incorporating an elastic storage infrastructure that scales both capacity and performance with changing business requirements. The software pools local direct attached storage (DAS) on each server into a large storage repository. If desktops are moved between physical servers, or if a server fails, the datacenter’s existing high-speed network moves data to the local storage of the new server.
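A toy model (with invented IOPS figures, not ScaleIO measurements) illustrates why pooled local storage scales through a boot storm where a single shared controller saturates:

```python
# Boot-storm model: a shared SAN controller is a fixed-size funnel,
# while pooled direct-attached storage adds capacity with every server.
# All numbers below are hypothetical, for illustration only.

def boot_minutes(desktops: int, total_iops: float,
                 ios_per_boot: int = 30_000) -> float:
    """Rough time to boot `desktops` VMs through `total_iops` of storage."""
    return desktops * ios_per_boot / total_iops / 60

SAN_CONTROLLER_IOPS = 100_000       # one shared controller (assumed)
SERVERS, LOCAL_IOPS = 20, 40_000    # pooled DAS contribution per server

print(boot_minutes(500, SAN_CONTROLLER_IOPS))     # single funnel
print(boot_minutes(500, SERVERS * LOCAL_IOPS))    # pooled across servers
```

The point of the sketch is the scaling behavior, not the absolute times: the SAN funnel is the same size no matter how many servers you add, while the pooled configuration grows its aggregate IOPS with every server that joins the cluster.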
LSI Nytro and ScaleIO ECS software boost VDI session number, reduce costs
In an AIS demonstration, the ScaleIO ECS software leverages the application acceleration of the LSI® Nytro™ MegaRAID® card to significantly increase the number of VDI sessions the VDI server can support, reducing the cost of each VDI session by up to 33%. Better yet, application acceleration gives users shorter response times than they see on their laptops. By using the ScaleIO ECS software and Nytro MegaRAID card, customers get the benefits of high-availability storage and intelligent flash acceleration at a more budget-friendly price point than comparable SAN-based solutions.
Back in the 1990s, a new paradigm was forced on space exploration. NASA faced big cost cuts. But grand ambitions for missions to Mars were still on its mind. The problem was it couldn’t dream and spend big. So the NASA mantra became “faster, better, cheaper.” The idea was that the agency could slash costs while still carrying out a wide variety of programs and space missions. This led to some radical rethinks, and some fantastically successful programs that had very outside-the-box solutions. (Bouncing Mars landers, anyone?)
That probably sounds familiar to any IT admin. And that spirit is alive at LSI’s AIS – The Accelerating Innovation Summit, which is our annual congress of customers and industry pros, coming up Nov. 20-21 in San Jose. Like the people at Mission Control, they all want to make big things happen… without spending too much.
Take technology and line of business professionals. They need to speed up critical business applications. A lot. Or IT staff for enterprise and mobile networks, who must deliver more work to support the ever-growing number of users, devices and virtualized machines that depend on them. Or consider mega datacenter and cloud service providers, whose customers demand the highest levels of service, yet get that service for free. Or datacenter architects and managers, who need servers, storage and networks to run at ever-greater efficiency even as they grow capability exponentially.
(LSI has been working on many solutions to these problems, some of which I spoke about in this blog.)
It’s all about moving data faster, better, and cheaper. If NASA could do it, we can too. In that vein, here’s a look at some of the topics you can expect AIS to address around doing more work for fewer dollars:
And, I think you’ll find some astounding products, demos, proof of concepts and future solutions in the showcase too – not just from LSI but from partners and fellow travelers in this industry. Hey – that’s my favorite part. I can’t wait to see people’s reactions.
Since rethinking how to do business, NASA has embarked on nearly 60 Mars missions. Faster, better, cheaper. It can work here in IT too.
Tags: 12Gb/s SAS, AIS, big data analytics, cloud infrastructure, cloud services, datacenter, flash, flash memory, hyperscale datacenters, NAS, NASA, SAN, SDN, shareable DAS, software-defined networks, sub-20nm flash, triple-level cell flash, VDI, web 2.0
You may have noticed I’m interested in Open Compute. What you may not know is I’m also really interested in OpenStack. You’re either wondering what the heck I’m talking about or nodding your head. I think these two movements are co-dependent. Sure they can and will exist independently, but I think the success of each is tied to the other. In other words, I think they are two sides of the same coin.
Why is this on my mind? Well – I’m the lucky guy who gets to moderate a panel at LSI’s AIS conference, with the COO of Open Compute, and the founder of OpenStack. More on that later. First, I guess I should describe my view of the two. The people running these open-source efforts probably have a different view. We’ll find that out during the panel.
I view Open Compute as the very first viable open-source hardware initiative that general business will be able to use. It’s not just about saving money for rack-scale deployments. It’s about having interoperable, multi-source systems that have known, customer-malleable – even completely customized and unique – characteristics including management. It also promises to reduce OpEx.
Ready for Prime Time?
But the truth is Open Compute is not ready for prime time yet. Facebook developed almost all the designs for its own use and gifted them to Open Compute, and they are mostly one or two generations old. And somewhere between 2,000 and 10,000 Open Compute servers have shipped. That’s all. But, it’s a start.
More importantly though, it’s still just hardware. There is still a need to deploy and manage the hardware, as well as distribute tasks, and load balance a cluster of Open Compute infrastructure. That’s a very specialized capability, and there really aren’t that many people who can do that. And the hardware is so bare bones – with specialized enclosures, cooling, etc – that it’s pretty hard to deploy small amounts. You really want to deploy at scale – thousands. If you’re deploying a few servers, Open Compute probably isn’t for you for quite some time.
I view OpenStack in a similar way. It’s also not ready for prime time. OpenStack is an orchestration layer for the datacenter. You hear about the “software defined datacenter.” Well, this is it – at least one version. It pools the resources (compute, object and block storage, network, and memory at some time in the future), presents them, allows them to be managed in a semi-automatic way, and automates deployment of tasks on the scaled infrastructure. Sure there are some large-scale deployments. But it’s still pretty tough to deploy at large scale. That’s because it needs to be tuned and tailored to specific hardware. In fact, the biggest datacenters in the world mostly use their own orchestration layer. So that means today OpenStack is really better at smaller deployments, like 50, 100 or 200 server nodes.
The synergy – 2 sides of the same coin
You’ll probably start to see the synergy. Open Compute needs management and deployment. OpenStack prefers known homogenous hardware or else it’s not so easy to deploy. So there is a natural synergy between the two. It’s interesting too that some individuals are working on both… Ultimately, the two Open initiatives will meet in the big, but not-too-big (many hundreds to small thousands of servers) deployments in the next few years.
And then of course there is the complexity of the interaction of for-profit companies and open-source designs and distributions. Companies are trying to add to the open standards. Sometimes to the betterment of standards, but sometimes in irrelevant ways. Several OEMs are jumping in to mature and support OpenStack. And many ODMs are working to make Open Compute more mature. And some companies are trying to accelerate the maturity and adoption of the technologies in pre-configured solutions or appliances. What’s even more interesting are the large customers – guys like Wall Street banks – that are working to make them both useful for deployment at scale. These won’t be the only way scaled systems are deployed, but they’re going to become very common platforms for scale-out or grid infrastructure for utility computing.
Here is how I charted the ecosystem last spring. There’s not a lot of direct interaction between the two, and I know there are a lot of players missing. Frankly, it’s getting crazy complex. There has been an explosion of players, and I’ve run out of space, so I just haven’t gotten around to updating it. (If anyone engaged in these ecosystems wants to update it and send me a copy, I’d be much obliged! Maybe you guys at Nebula? ;-))
An AIS keynote panel – What?
Which brings me back to that keynote panel at AIS. Every year LSI has a conference that’s by invitation only (sorry). It’s become a pretty big deal. We have some very high-profile keynotes from industry leaders. There is a fantastic tech showcase of LSI products, partner and ecosystem companies’ products, and a good mix of proof of concepts, prototypes and what-if products. And there are a lot of breakout sessions on industry topics, trends and solutions. Last year I personally escorted an IBM fellow, Google VPs, Facebook architects, bank VPs, Amazon execs, flash company execs, several CTOs, some industry analysts, database and transactional company execs…
It’s a great place to meet and interact with peers if you’re involved in the datacenter, network or cellular infrastructure businesses. One of the keynotes is actually a panel of 2. The COO of Open Compute, Cole Crawford, and the co-founder of OpenStack, Chris Kemp (who is also the founder and CSO of Nebula). Both of them are very smart, experienced and articulate, and deeply involved in these movements. It should be a really engaging, interesting keynote panel, and I’m lucky enough to have a front-row seat. I’ll be the moderator, and I’m already working on questions. If there is something specific you would like asked, let me know, and I’ll try to accommodate you.
You can see more here.
Yeah – I’m very interested in Open Compute and OpenStack. I think these two movements are co-dependent. And I think they are already changing our industry – even before they are ready for real large-scale deployment. Sure they can and will exist independently, but I think the success of each is tied to the other. The people running these open-source efforts might have a different view. Luckily, we’ll get to find out what they think next month… And I’m lucky enough to have a front-row seat.