Software-defined datacenters (SDDC) and software-defined storage (SDS) are big movements in the industry right now. Just read the trade press or attend any conference and you’ll see that – it’s a big deal. We’re seeing for-pay vendors providing solutions, as well as strong ecosystems evolving around open source solutions. It’s not hard to see why – enterprises need to deploy large-scale compute clusters, and that takes either deep expertise that’s very rare, or orchestration tools that have not existed in the past. It’s the “necessity being the mother of invention” thing…
So datacenters are being forced to deploy large-scale clusters to handle the scale of compute needed, and the amount of data that is being captured, analyzed and stored. As an industry, then, we’re being forced to simplify applications as well as the management and deployment of these large-scale clusters. That’s great for datacenters. It’s even better that we’re figuring out how to provide those expanded resources for less money, and with fewer people to manage them. (Well, it’s probably good for everyone but the sys admins…)
These new technologies are the key enabler. This blog, the second in my three-part series (based on interesting questions I was asked by CEO & CIO, a Chinese business magazine) examines how SDDC and SDS are helping enterprises get more out of their datacenter gear. You can read part 1 here.
CEO & CIO: What are your views on software-defined storage? What’s the development roadmap of LSI in achieving software-defined storage?
We see SDS as one of a number of vital changes underway in the datacenter. SDS promises to span some or all of file, object, key-value and block in order to pool resources and to simplify the infrastructure required in a datacenter, as well as to smooth the migration to object or key-value storage over time. Great examples of these SDS solutions are Ceph, Swift, Cinder, Gluster and VSAN / VVOLs. The model brings great benefits in datacenter management, resource pooling and allocation, and usability. The main problem is performance – and by that I do not mean a lack of extreme performance. I mean poor performance that damages TCO, reduces the efficiency of the infrastructure and increases costs, and that is much worse than you would get without the SDS layer. These solutions work, but they compromise resource efficiency. Many require flash integrated in the system simply to maintain existing performance. However, this is a permanent change in how storage is used and deployed, and it’s a good change.
While block is what underlies most storage and will continue to for some time, the system and application level view is changing. We view SDS as having great synergy with LSI’s architectural direction – shared DAS infrastructure and the ability to add “above the block” capability like quality of service (QoS), direct key/value hardware, etc., and bring improved performance and resource efficiency. Together, SDS + LSI innovation = resource pooling and allocation, including flash and cool/cold storage, management and virtual machine (VM) agility, performance and resource efficiency.
As a result, there has been tremendous interest from SDS vendors to work with us, to demonstrate prototype systems, and to make solutions better. We are working with many SDS partners to provide complete solutions. This is not a one-size-fits-all world, so there will be several solutions. Those solutions are not ready yet, but they’re coming, and will probably displace the older file and block storage systems we know and love.
CEO & CIO: Industry giants such as Intel have outlined their visions for software-defined datacenters. Chinese Internet giants have also put forward similar plans. What views does LSI have on software-defined datacenter?
If you view the AIS keynote, you’ll see we believe this is a critical part of the future datacenter. But just one critical part. Interestingly, we had Intel present as well during AIS.
SDDC creates a critical control plane for the datacenter. It is the software abstraction model that enables resource pooling: pooling of compute, storage and network, with memory in the near future. It enables the automation and allocation of tasks and resources in the datacenter. The leading models are VMware® SDDC and OpenStack® software, but there are others that are important too. They’re just a little less public right now. Anyway – it’s way too early to predict which will be dominant. Just like SDS, SDDC gives up performance and efficiency in exchange for simplified control and abstraction. As a result, it’s not a very useful concept, at least not at hyperscale levels, without hardware that really, truly supports and enables it. As the datacenter has changed from a compute-centric model to a dataflow model, the storage and network and, soon, memory become very important. They dictate the useful work that can be gotten from the datacenter.
I believe we are, as an industry, at the start of the hardware transition to support these. We are building hardware solutions for storage and network that are being designed into products today. We are working very closely with three of the largest datacenters in the US, and two in China to build not just the SDDC, but the pooled hardware infrastructure that is needed to make it work.
It’s critical to understand that SDDC solutions really work, but often the performance and efficiency are – well – terrible. That’s been the evolution in computer science and computer architecture since the beginning. You raise the abstraction level, which simplifies development and support, but that either causes poor performance or requires more hardware capability that is architected to support those abstractions.
As a result, it’s really difficult to talk about SDDCs without a rack-scale architecture to support them. So we are working closely with the key SDDC software solutions/vendors, even the ones I didn’t list, to integrate and optimize the solutions to make the SDDC actually work. We have been working very closely with VMware and the OpenStack community, and we are changing the way the software plane interacts with the pooled resources. Again, there has been so much interest in our shared DAS, incorporating flash in the same architecture and management, and our Axxia® SDN control plane processor for networks.
I talk about rack-scale architectures to support SDS in the second half of this keynote and in my blog “China: A lot of talk about resource pooling, a better name for disaggregation.”
Summary: So I believe SDS is a big movement, it’s a good thing, and it’s here to stay. But… the performance is poor today. Very poor. That’s where we come in, with hardware that enables SDS and not only makes performance acceptable, but helps make it excellent, and improves efficiency and cost too. And SDDC is also a massive movement that will define the future datacenter. But it is intertwined with the rack-level concepts of pooling or disaggregation to make it really compelling. Again – that’s where we come in.
These were good questions that were interesting to answer. I hope the answers are interesting to you too. I’ll post some more soon about how the Chinese Internet giants differ from other customers, and about forward-looking technologies.
It’s the start of the new year, and it’s traditional to make predictions – right? But predicting the future of the datacenter has been hard lately. There have been and continue to be so many changes in flight that possibilities spin off in different directions. Fractured visions through a kaleidoscope. Changes are happening in the businesses behind datacenters, the scale, the tasks and what is possible to accomplish, the value being monetized, and the architectures and technologies to enable all of these.
A few months ago I was asked to describe the datacenter in 2020 for some product planning purposes. Dave Vellante of Wikibon & John Furrier of SiliconANGLE asked me a similar question a few weeks ago. 2020 is out there – almost 7 years. It’s not easy to look into the crystal ball that far and figure out what the world will look like then, especially when we are in the midst of those tremendous changes. For some context I had to think back 7 years – what was the datacenter like then, and how profound have the changes been over the past 7 years?
And 7 years ago, our forefathers…
It was a very different world. Facebook barely existed, and had only just moved beyond “university only” membership. Google was using Velcro, Amazon didn’t have its services, and cloud was a non-existent term. In fact DAS (direct-attached storage) was on the decline because everyone was moving to SAN/NAS. 10GE networking was in the future (1GE was still in growth mode). Linux was not nearly as widely accepted in enterprise – Amazon was in the vanguard of making it usable at scale (with Werner Vogels saying “it’s terrible, but it’s free, as in free beer”). Servers were individual – no “PODs,” and VMware was not standard practice yet. SATA drives were nowhere in datacenters.
An enterprise disk drive topped out at around 200GB in capacity. Nobody used the term petabyte. People, including me, were just starting to think about flash in datacenters, and it was several years later that solutions became available. Big data did not even exist. Not as a term or as a technology, definitely not Hadoop or graph search. In fact, Google’s seminal paper on MapReduce had just been published, and it would become the inspiration for Hadoop – something that would take many years before Yahoo picked it up and helped make it real.
Analytics were statistical and slow, and you had to be very explicitly looking for something. Advertising on the web was a modest business. Cold storage was tape or MAID, not vast pools of cheap disks in the cloud at absurdly low price points. None of the Chinese web-cloud guys existed… In truth, at LSI we had not even started looking at or getting to know the web datacenter guys. We assumed they just bought from OEMs…
No one streamed mainstream media – TV and movies – and there were no tablets to stream them to. YouTube had just been purchased by Google. Blu-ray was just getting started and competing with HD-DVD (which I foolishly bought 7 years ago), and integrated GPS units in cars were a high-tech growth area. Neither the iPhone nor Android had launched, Danger’s Sidekick was the cool phone, flip phones were mainstream, there was no App Store or the billions of sales associated with it, and a mobile web browser was virtually useless.
Dell, IBM, and HP were the only real server companies that mattered, and the whole industry revolved around them, as well as EMC and NetApp for storage. Cisco, Lenovo and Huawei were not server vendors. And Sun was still Sun.
7 years from now
So – 7 years from now? That’s hard to predict, so take this with a grain of salt… There are many ways things could play out, especially when global legal, privacy, energy, hazardous waste recycling, and data retention requirements come into play, not to mention random chaos and invention along the way.
Compute-centric to dataflow-centric
Major applications are changing (have changed) from compute-centric to dataflow architectures. That is big data. The result will probably be a decline in the influence of processor vendors, and the increased focus on storage, network and memory, and optimized rack-level architectures. A handful of hyperscale datacenters are leading the way, and dragging the rest of us along. These types of solutions are already being deployed in big enterprise for specialized use cases, and their adoption will only increase with time. In 7 years, the main deployment model will echo what hyperscale datacenters are doing today: disaggregated racks of compute, memory and storage resources.
The datacenter is now being viewed as a profit growth enabler, rather than a cost center. That implies more compute = more revenue. That changes the investment profile and the expectations for IT. It will not be enough for enterprise IT departments to minimize change and risk because then they would be slowing revenue growth.
Customers and vendors
We are in the early stages of a customer revolt. Whether it’s deserved or not is immaterial, though I believe it’s partially deserved. Large customers have decided (and I’m doing broad brush strokes here) that OEMs are charging them too much and adding “features” that add no value and burn power, that the service contracts are excessively expensive and that there is very poor management interoperability among OEM offerings – on purpose, to maintain vendor lock-in. The cost structures of public cloud platforms like Amazon are proof there is some merit to the argument. Management tools don’t scale well, and require a lot of admin intervention. ISVs are seen as no better. Sure, the platforms and apps are valuable and critical, but they’re really expensive too, and in a few cases, open source solutions actually scale better (though ISVs are catching up quickly).
The result? We’re seeing a push to use whitebox solutions that are interoperable and simple. Open source solutions – both software and hardware – are gaining traction in spite of their problems. Just witness the latest Open Compute Summit and the adoption rate of Hadoop and OpenStack. In fact many large enterprises have a policy that’s pretty much – any new application needs to be written for open source platforms on scale-out infrastructure.
The 3 big server OEMs are struggling. Dell, HP and IBM are selling more servers, but at lower revenue. Or in the case of IBM – selling the business. They are trying to upsell storage systems to offset those lost margins, and they are trying to innovate and vertically integrate to compensate for the changes. In contrast, we’re seeing a rapid planned increase in self-built, self-architected hyperscale datacenters, especially in China. To be fair – those pressures on price and supplier revenue are not necessarily good for our industry. There are also newer entrants like Huawei and Cisco taking a noticeable chunk of the market, along with impending growth in ISV and 3rd party full-rack “shrink wrapped” systems. Everybody is joining the party.
Storage, cold storage and storage-class memory
Stepping further out on the limb, I believe (but who really knows) that by 2020 storage as we know it is no longer shipping. SMB is hollowed out to the cloud – that is – why would any small business use anything but cloud services? The costs are too compelling. Cloud storage is stratified into 3 levels: storage-class memory, flash/NVM and cool/cold bulk disk storage. Cold storage is going to be a very, very important area. You need to save that data, but spend zero power, and zero $ on storing it. Just look at some of the radical ideas to address that, like Facebook’s Blu-ray jukebox, which was masterminded by a guy I really like – Gio Coglitore – who I am very glad is getting some rightful attention. (http://www.wired.com/wiredenterprise/2014/02/facebook-robots/)
I believe that pooled storage class memory is inevitable and will disrupt high-performance flash storage, probably beginning in 2016. My processor architect friends and I have been daydreaming about this since 2005. That disruption’s OK, because flash use will continue to grow, even as disk use grows. There is just too much data. I’ve seen one massive vendor’s data showing average servers are adding something like 0.2 hard disks per year and 0.1 SSDs per year – and that’s for the average server including diskless nodes that are usually the most common in hyperscale datacenters. So growth in spite of disruption and capacity growth.
Data will be pooled, and connected by fabric as distributed objects or key/value pairs, with erasure coding. In fact, object store (key/value – whatever) may have “obsoleted” block storage. And the need for these larger objects will probably also obsolete file as we’re used to it. Sure, disk drives may still be block based, though key/value gives rise to all sorts of interesting opportunities to support variable size structures, obscure small fault domains, and variable encryption/compression without wasting space on disk platters. I even suspect that disk drives as we know them will be morphing into cold store specialty products that physically look entirely different and are made from different materials – for a lot of reasons. 15K drives will be history, and 10K drives may be too. In fact 2” drives may not make sense anymore as the laptop drive and 15K drive disappear and performance and density are satisfied by flash.
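For readers who have not met erasure coding, here is a minimal sketch of the idea, assuming the simplest possible scheme: an object is split into k data shards plus one XOR parity shard, so any single lost shard can be rebuilt from the survivors. The function names (`encode`, `recover`, `decode`) are hypothetical and made up for this illustration. Real object stores generally use stronger codes that survive several simultaneous failures, so treat this as a toy rather than how any particular product does it.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(value: bytes, k: int) -> List[bytes]:
    """Split value into k zero-padded data shards and append one parity shard."""
    shard_len = -(-len(value) // k)  # ceiling division
    padded = value.ljust(shard_len * k, b"\x00")
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    parity = reduce(xor_bytes, shards)
    return shards + [parity]


def recover(shards: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing shard (marked None) from the survivors."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity tolerates only one lost shard")
    rebuilt = list(shards)
    if missing:
        survivors = [s for s in shards if s is not None]
        # XOR of all surviving shards equals the lost shard.
        rebuilt[missing[0]] = reduce(xor_bytes, survivors)
    return rebuilt


def decode(shards: List[Optional[bytes]], original_len: int) -> bytes:
    """Reassemble the object, dropping the parity shard and the padding."""
    full = recover(shards)
    return b"".join(full[:-1])[:original_len]


# Usage: lose one shard (say, one drive in the pool) and still read the object.
value = b"cold storage object"
stored = encode(value, k=4)
stored[2] = None
restored = decode(stored, len(value))
```

The appeal for pooled storage is that the shards can sit on different drives or enclosures, and the protection overhead here is one extra shard per k rather than a full second copy.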
Enterprise becomes private cloud that is very similar structurally to hyperscale, but is simply in an internal facility. And SAN/NAS products as we know them will be starting on the long end of the tail as legacy support products. Sure new network based storage models are about to emerge, but they’re different and more aligned to key/value.
Rack-scale architectures will have taken over clustered deployments. That means pooled resources. Processing will be pools of single-socket SoC servers enabling massive clusters, rather than lots of 2-socket servers. These SoCs might even be mobile device SoCs at some point or at least derived from that – the economies of scale and fast cadence of consumer SoCs will make that interesting, maybe even inevitable. After all, the current Apple A7 in the iPhone 5S is a dual core, 64-bit ARMv8 at 1.3GHz and the whole iPhone costs as much as mainstream server processor chips. In a few years, an 8 or 16 core equivalent at 1.5GHz or 2GHz is not hard to imagine, and the cost structure should be excellent.
Rapidly evolving open source applications will have morphed into eventually consistent dataflow tasks. Or they will be emerging in-memory applications working on vast data structures in the pooled storage class memory at the rack or larger scale, which will add tremendous monetary value to businesses. Whatever the evolutionary paths – the challenge for the next 10 years is optimizing dataflow as the amount used continues to exponentially grow. After all – data has value in aggregate, so why would you throw anything away, even as the amount we generate increases?
Clusters will be autonomous. Really autonomous. As in a new term I love: “emergent.” It’s when you can start using big data analytics to monitor the datacenter, and make workload/management and data placement decisions in real time, automatically, and the datacenter begins to take on unpredicted characteristics. Deployment will be autonomous too. Power on a pod of resources, and it just starts working. Google does that already.
Layer 2 datacenter network switches will either be disappearing or will have migrated to a radically different location in the rack hierarchy. There are many ways this can evolve. I’m not sure which one(s) will dominate, but I know it will look different. And it will have different bandwidth. 100G moving to 400G interconnect fabric over fiber.
So there you have it. Guaranteed correct…
Different applications and dataflow, different architectures, different processors, different storage, different fabrics. Probably even a re-alignment of vendors.
Predicting the future of the datacenter has not been easy. There have been, and still are, so many changes happening: the businesses behind datacenters, the scale, the tasks and what is possible to accomplish, the value being monetized, and the architectures and technologies to enable all of these. But at least we have some idea what’s ahead. And it’s pretty different, and exciting.
Many of you may have heard of a poem written by Robert Fulghum 25 years ago called “All I Really Need to Know I Learned in Kindergarten.” In it he provides such pearls of wisdom as “Play fair,” “Clean up your own mess,” “Don’t take things that aren’t yours” and “Flush.” By now you’re wondering what any of this has to do with storage technology. Well, the #1 item on the kindergarten knowledge list is “Share Everything.” And from my perspective that includes DAS (direct-attached storage).
Sharable DAS has been a primary topic of discussion at this year’s annual LSI Accelerating Innovation Summit (AIS). During one keynote session I proposed a continuum of data sharing, spanning from traditional server-based DAS to traditional external NAS and SAN with multiple points in between – including external DAS, simple pooled storage, advanced pooled storage, shared storage and HA (high-availability) shared storage. Each step along the continuum adds incremental features and value, giving datacenter architects the latitude to choose – and pay for – only the level of sharing absolutely required, and no more. This level of choice is being very warmly received by the market as storage requirements vary widely among Web-cloud, private cloud, traditional enterprise, and SMB configurations and applications.
Sharable DAS pools storage for operational benefits and efficiencies
Sharable DAS, with its inherent storage resource pooling, offers a number of operational benefits and efficiencies when applied at the rack level:
LSI rolls out proof-of-concept Rack Scale architecture using sharable DAS
In addition to just talking about sharable DAS at AIS, we also rolled out a proof-of-concept Rack Scale architecture employing sharable DAS. In it we configured 20 servers with 12Gb/s SAS RAID controllers, a prototype 40-port 12Gb/s SAS switch (that’s 160 12Gb/s SAS lanes) and 10 JBODs with 12Gb/s SAS for a total of 200 disk drives – all in a single rack. The drives were configured as a single storage resource pool with our media sharing (ability to spread volumes across multiple disk drives and aggregate disk drive bandwidth) and distributed RAID (ability to disperse data protection across multiple disk drives) features. This configuration pools the server storage into a single resource, delivering substantial, tangible performance and availability improvements, when compared to 20 stand-alone servers. In particular, the configuration:
I’m sure you’ll agree with me that Rack Scale architecture with sharable DAS is clearly a major step forward in providing a wide range of storage solutions under a single architecture. This in turn provides a multitude of operational efficiencies and performance benefits, giving datacenter architects wide latitude to employ what is needed – and only what is needed.
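To put the proof-of-concept numbers in perspective, here is a small back-of-the-envelope sketch. The constants come straight from the configuration described above; everything derived from them is simple division, and the bandwidth figure is raw line rate only (usable throughput will be lower). The two `fits_*` helpers are hypothetical, and the stand-alone comparison assumes, purely for illustration, that the 200 drives would otherwise be split evenly across the 20 servers.

```python
# Figures quoted for the proof-of-concept rack.
SERVERS = 20
SWITCH_PORTS = 40
SAS_LANES = 160
LANE_GBPS = 12  # 12Gb/s SAS line rate per lane
JBODS = 10
DRIVES = 200

lanes_per_port = SAS_LANES // SWITCH_PORTS   # lanes behind each switch port
drives_per_jbod = DRIVES // JBODS            # drives in each enclosure
drives_per_server = DRIVES // SERVERS        # assumed even split if not pooled
raw_switch_gbps = SAS_LANES * LANE_GBPS      # aggregate raw line rate


def fits_standalone(drives_needed: int) -> bool:
    """Can a volume be built from one server's even share of the drives?"""
    return drives_needed <= drives_per_server


def fits_pooled(drives_needed: int) -> bool:
    """Can the same volume be built from the rack-wide pool?"""
    return drives_needed <= DRIVES


# A volume spread across 16 drives does not fit a stand-alone server's share,
# but it is easily served from the pool.
wide_volume = 16
print(lanes_per_port, drives_per_jbod, drives_per_server, raw_switch_gbps)
print(fits_standalone(wide_volume), fits_pooled(wide_volume))
```

The point of the toy comparison is only that pooling removes the per-server ceiling on how many drives a volume can span, which is what lets media sharing aggregate drive bandwidth.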
Now that we’ve tackled the #1 item on the kindergarten learning list, maybe I’ll set my sights on another item, like “Take a nap every afternoon.”