A couple of years ago I got a DSLR (digital single-lens reflex) camera. After using a compact digital camera, the DSLR opened a new world of photography for me. It was great to have the option to shoot six frames per second, use different lenses and fine-tune shutter speed, exposure and other parameters.
Learning to take my photography to a higher level, from auto to manual settings, was quite an experience. Through research and talking to friends and photographers, I discovered that I needed to learn these fundamentals:
Experimenting with each of these variables was a frustrating test of Murphy’s Law. Just when I thought I had found the right setting for, say, shutter speed, the other two would be thrown out of whack. You start to get used to shooting in some conditions, like low light, but it’s always a balancing act to understand the relationship among the three settings and how they influence each other. Should I go auto, do what the camera dictates, and settle for unsatisfying results, or go manual and struggle to get the results I want?
Fundamentals of database performance
At LSI I do a lot of database performance testing of systems, and it reminds me of photography. Many factors can affect system performance, but these are the fundamental ones:
Consider the many systems that sport fast multi-core processors but use slow non-volatile data storage that bottlenecks performance. As for memory, this IT maxim comes to mind: “Memory is like time. You can never have enough.” It would be great to always have enough memory for your working set, but this is seldom the case for large, high-transaction systems. With database systems, when the working set outgrows memory, data must be retrieved from slower non-volatile data storage or, depending on the operating system resources available, the working set of data that resides in memory must be reduced.
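To put rough numbers on that tradeoff, here is a minimal sketch of the standard weighted-average access time calculation. The latency values are hypothetical, picked only for illustration, and the function name is mine, not part of any LSI or database tool.

```python
def effective_access_time_us(hit_ratio, memory_us, storage_us):
    """Average access time when `hit_ratio` of reads are served from memory
    and the rest fall through to non-volatile storage."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * memory_us + (1.0 - hit_ratio) * storage_us


# Hypothetical latencies, assumed for illustration only.
MEMORY_US = 0.1       # data already in the in-memory working set
STORAGE_US = 5000.0   # slow non-volatile storage

if __name__ == "__main__":
    for hit_ratio in (0.90, 0.99, 0.999):
        avg = effective_access_time_us(hit_ratio, MEMORY_US, STORAGE_US)
        print(f"hit ratio {hit_ratio:.3f}: about {avg:.1f} us per access")
```

Even under these made-up numbers, the pattern is the point: a small share of accesses that miss memory dominates the average, which is why the size of the working set and the speed of the storage behind it have to be considered together.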
Server components: Working together to optimize system performance
Non-volatile storage traditionally has been the system bottleneck, but this is changing with the growing adoption of fast PCIe® NAND flash and is evident with in-memory database (IMDB) systems. I recently tested the customer preview of SQL Server 2014 In-Memory OLTP (online transaction processing), with the durable tables feature, code-named Hekaton, on Windows Server® 2012 R2.
With SQL Server 2014, tables and indexes reside in memory, not in disk-based storage, significantly increasing the performance of short, highly concurrent transactions. Even with this fast in-memory processing capability, there is a performance tradeoff to maintain durability. (See my post “In-memory shopping tip: Look for durability as much as speed.”) Keep in mind that SQL Server logging still needs to be performed on non-volatile storage, so the faster the storage responds, the better.
With SQL Server logging, if the non-volatile data storage holding the log file is not fast enough, the processor will be underutilized. One way to shift more work to the processor is to deploy the LSI® Nytro™ WarpDrive® card for faster log writes. WarpDrive is one of several LSI non-volatile NAND-flash storage products that can help boost the performance and efficiency of your server components.
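A back-of-the-envelope model shows why slow log writes leave the processor idle. This sketch assumes a deliberately simplified picture that is not how SQL Server actually schedules work: a single worker does some processor work per transaction and then waits for one durable log write before starting the next. Real systems overlap and batch log writes, so treat the numbers as hypothetical.

```python
def processor_utilization(cpu_us, log_write_us):
    """Fraction of time a single serial worker spends on processor work."""
    return cpu_us / (cpu_us + log_write_us)


def transactions_per_second(cpu_us, log_write_us):
    """Throughput of that same single serial worker."""
    return 1_000_000 / (cpu_us + log_write_us)


if __name__ == "__main__":
    CPU_US = 50.0  # hypothetical processor time per transaction
    # Hypothetical log write latencies, from slow to fast storage.
    for log_us in (5000.0, 500.0, 50.0):
        util = processor_utilization(CPU_US, log_us)
        tps = transactions_per_second(CPU_US, log_us)
        print(f"log write {log_us:>6.0f} us: "
              f"{util:6.1%} processor busy, {tps:8.0f} tx/s")
```

In this toy model, shortening the log write is the only way to raise processor utilization, which is the balancing act described above.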
System performance optimization is an art that involves testing workload types. Changing one component affects the others, so striking the right balance among the server components is key to getting the results you want. Deploying the fastest processor available might seem like the best, most obvious way to goose system performance, but it’s even more important to optimize processor, memory and storage performance to the specific workload.
During the past few years, the deployment of cloud architectures has accelerated to support various consumer and enterprise applications such as email, word processing, enterprise resource planning, customer relationship management and the like. Traditionally co-located servers, storage and networking have moved to the cloud en masse in the form of a service, with overlying applications that have been, and remain, largely insensitive to delay and jitter.
But the fast-emerging next generation of business applications requires much tighter service level agreements (SLAs) from cloud providers. Applications such as Internet of Things, smart grids, immersive communications, hosted clients and gaming are some good examples. These use cases tend to be marked by periods of high interactivity, so delay and jitter for the network, compute and storage must be minimized. During times of normal interactivity, the applications are in steady-state condition, requiring minimal SLAs from the infrastructure resources.
Emerging use cases drive demand for two-tier cloud architectures
These emerging use cases are driving the rise of two-tier cloud architectures. The key for these architectures to succeed is efficiency: they must be cost-effective to deploy and guarantee a tight SLA for applications while leaving the rest of the carrier and cloud infrastructure unchanged. What’s more, the application service needs to move closer to the end user, but only for the duration of the real-time interaction. These measures help ensure that the customer’s application-specific requirements for delay and jitter are met without requiring major upgrades to the carrier or cloud infrastructures.
In this two-tier cloud architecture, the first cloud tier, also referred to in the industry as centralized cloud, is where the applications typically reside. The second cloud tier is invoked on demand, and the application’s virtual machine along with its relevant network, application and storage data shift to this tier. Keep in mind that the second tier can be instantiated as part of an existing service provider network element or as a stand-alone infrastructure element closer to the end user.
A connected patient heart monitor provides a useful example. During most of its operational time, the device may be collecting data only periodically, and with no need for any interactions with medical staff. But when the heart monitor detects an abnormality, the application hosted in the cloud must instantly be moved closer to the user in order to provide interactivity. For this use case, the second tier cloud must host the application, assess the patient’s condition, retrieve relevant historical information and alert the medical staff for a possible medical response.
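The heart monitor example can be sketched as a small placement rule: stay in the centralized tier during steady state, shift to the second tier when an abnormality appears, and return once the interaction has passed. Everything here is a hypothetical illustration, including the names, the heart-rate thresholds and the "three calm readings" rule; none of it comes from a real monitoring product or from Axxia software.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    CENTRALIZED = 1  # first cloud tier, where the application normally resides
    EDGE = 2         # second cloud tier, closer to the end user


@dataclass
class Reading:
    heart_rate_bpm: int


def is_abnormal(reading, low=40, high=140):
    """Hypothetical abnormality test; the thresholds are illustrative only."""
    return reading.heart_rate_bpm < low or reading.heart_rate_bpm > high


def place_application(readings, calm_readings_to_return=3):
    """Return the tier hosting the application after each reading."""
    tier = Tier.CENTRALIZED
    calm = 0
    placements = []
    for reading in readings:
        if is_abnormal(reading):
            # Abnormality detected: move (or keep) the application near the user.
            tier = Tier.EDGE
            calm = 0
        elif tier is Tier.EDGE:
            # Count consecutive normal readings before moving back.
            calm += 1
            if calm >= calm_readings_to_return:
                tier = Tier.CENTRALIZED
                calm = 0
        placements.append(tier)
    return placements


if __name__ == "__main__":
    rates = [70, 180, 75, 72, 71, 70]
    for rate, tier in zip(rates, place_application([Reading(r) for r in rates])):
        print(rate, tier.name)
```

The sketch leaves out the hard part, which is actually moving the virtual machine and its data between tiers quickly enough to matter.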
The key, then, is to move applications from tier one to tier two clouds seamlessly. LSI® Axxia® multi-core communication processors feature an architectural scalability for network acceleration and computer cluster capabilities that provide this seamless bridge between the two clouds. In order for the two-tier cloud architectures to thrive, they need three fundamental elements:
a. On-demand resource provisioning
Many cloud datacenters are squarely focused on deploying end-to-end resource provisioning tools to improve efficiency. Not the least among these is the fast-growing end-to-end orchestration ecosystem for OpenStack® software, though there are many proprietary solutions. End-to-end orchestration tools need to be aware of all the second-tier cloud datacenter components. In some cases, OpenStack is even being deployed to boot up second tier cloud components. However, a big challenge remains – maintaining a steady state and full capabilities of various distributed second-tier cloud components.
b. Efficient virtual machine movements
For tiered cloud architectures to thrive, they must also transfer enough network, application and storage data to sustain continuing operations of the application at the second tier. However, many of today’s virtual machine migration solutions are not geared to moving datacenter resources efficiently. In a two-tier cloud architecture, the virtual machine migration may traverse many hops of carrier infrastructure, increasing total cost of ownership (TCO). In addition, complete virtual machine images must be transferred before the destination station can start the machines, extending the time it takes for the second tier to take control. The upshot is that optimized solutions need to be developed to enable seamless virtual machine migrations.
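A simple estimate shows why transferring a complete image before startup is costly. The sketch below assumes transfer time is just size divided by bandwidth plus a fixed latency per hop; the image sizes, bandwidth, hop count and latency are hypothetical, and the "working state only" case is an illustrative stand-in for an optimized migration, not a description of any existing product.

```python
def transfer_seconds(size_gb, bandwidth_gbps, hops, per_hop_latency_ms):
    """Rough transfer time: serialization delay plus a fixed latency per hop."""
    size_gigabits = size_gb * 8
    return size_gigabits / bandwidth_gbps + hops * per_hop_latency_ms / 1000.0


if __name__ == "__main__":
    # Hypothetical values, assumed for illustration only.
    BANDWIDTH_GBPS = 1.0
    HOPS = 12
    PER_HOP_LATENCY_MS = 5.0

    full_image = transfer_seconds(20.0, BANDWIDTH_GBPS, HOPS, PER_HOP_LATENCY_MS)
    working_state = transfer_seconds(0.5, BANDWIDTH_GBPS, HOPS, PER_HOP_LATENCY_MS)

    print(f"complete image (20 GB):   {full_image:.1f} s before tier two can start")
    print(f"working state (0.5 GB):   {working_state:.1f} s before tier two can start")
```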
c. Network and storage acceleration of resource-constrained tier-two clouds
Unlike the first cloud tier, the second cloud tier is bound to be resource-constrained, requiring significant data acceleration for both the networking and storage layers. A 16-core full SMP ARM®-based processor like the LSI Axxia 5500 processor, with its processor cores and, more importantly, its fully programmable acceleration engines for offloading security, deep packet inspection, traffic management and other functions, is well-suited for network acceleration of the second cloud tier. Keep in mind that specific acceleration needs vary based on the location of the second tier cloud. For example, the acceleration requirements of the second cloud tier would differ depending on whether it is part of a service provider access aggregation router or located on a remote lamp post. The need for security acceleration, in particular, increases tremendously in cases where data associated with particular data events must be authenticated before further processing. To support these various acceleration needs, the second cloud tier can be built out of fairly homogeneous and scalable ARM-based hardware components with differing acceleration builds tuned to the specific tasks running on them.
Momentum for greater connectivity builds
Momentum is building behind billions of connected things and machines across industrial and consumer applications to create a more connected, interactive world. Two-tier clouds and other innovative architectures are emerging at an accelerated pace to meet demand for this higher order of connectivity. And it is solutions like the LSI Axxia processor that promise to enable the scalable, flexible acceleration required for these emerging two-tier cloud architectures.
Open Compute and OpenStack are changing the datacenter world that we know and love. I thought they were having impact. Changing our OEMs and ODM products, changing what we expect from our vendors, changing the interoperability of managing infrastructure from different vendors. Changing our ability to deploy and manage grid and scale-out infrastructure. And changing how quickly and at what high level we can be innovating. I was wrong. It’s happening much more quickly than I thought.
On November 20-21 we hosted LSI AIS 2013. As I mentioned in a previous post, I was lucky enough to moderate a panel about Open Compute and OpenStack – “the perfect storm.” Truthfully? It felt more like sitting with two friends talking about our industry over beer. I hope to pick up that conversation again someday.
The panelists were awesome: Cole Crawford of Open Compute and Chris Kemp of OpenStack. These guys are not only influential. They have been involved from the very start of these two initiatives, and are in many ways key drivers of both movements. These are impressive, passionate guys who really are changing the world. There aren’t too many of us who can claim that. It was an engaging hour that I learned quite a bit from, and I think the audience did too. I wanted to share from my notes what I took away from that panel. I think you’ll be interested.
Goals and Vision: two “open source” initiatives
There were a few motivations behind Open Compute, and the goal was to improve these things.
The goal then, for the first time, is to work backwards from workload and create open source hardware and infrastructure that is openly available and designed from the start for large scale-out deployments. The idea is to drive high efficiency in cost, materials use and energy consumption. More work/$.
One surprising thing that came up – LSI is in every current contribution in Open Compute.
OpenStack layers services that describe abstractions of compute, networking and storage. LSI products tend to sit at that lowest level of abstraction, where there is now a wave of innovation. OpenStack had similar fragmentation issues to deal with and its goals are something like:
There is a certain amount of compatibility with Amazon’s cloud services. Chris’s point was that Amazon is incredibly innovative and a lot of enterprises should use it, but OpenStack enables both service providers and private clouds to compete with Amazon, and it allows unique innovation to evolve on top of it.
OpenStack and Open Compute are not products. They are “standards” or platform architectures, with companies using those standards to innovate on top of them. The idea is for one company to innovate on another’s improvements – everybody building on each other’s work. A huge brain trust. The goal is to create a competitive ecosystem and enable a rapid pace of innovation, and enable large-scale, inexpensive infrastructure that can be managed by a small team of people, and can be managed like a single server to solve massive scale problems.
Here’s their thought. Hardware is a supply chain management game + services. Open Compute is an opportunity for anyone to supply that infrastructure. And today, OEMs are killer at that. But maybe ODMs can be too. Open Compute allows innovation on top of the basic interoperable platforms. OpenStack enables a framework for innovation on top as well: security, reliability, storage, network, performance. It becomes the enabler for innovation, and it provides an “easy” way for startups to plug into a large, vibrant ecosystem. And for customers – someone said it’s “exa data without exadollar”…
As a result, the argument is this should be good for OEMs and ISVs, and help create a more innovative ecosystem and should also enable more infrastructure capacity to create new and better services. I’m not convinced that will happen yet, but it’s a laudable goal, and frankly that promise is part of what is appealing to LSI.
Open Compute and OpenStack are “peanut butter and jelly”
Ok – if you’re outside of the US, that may not mean much to you. But if you’ve lived in the US, you know that means they fit perfectly, and make something much greater together than their humble selves.
Graham Weston, Chairman of the Rackspace Board, was the one who called these two “peanut butter and jelly.”
Cole and Chris both felt the initiatives are co-enabling, and probably co-travelers too. Sure they can and will deploy independently, but OpenStack enables the management of large scale clusters, which really is not easy. Open Compute enables lower cost large-scale manageable clusters to be deployed. Together? Large-scale clusters that can be installed and deployed more affordably, and easily without hiring a cadre of rare experts.
Personally? I still think they are both a bit short of being ready for “prime time” – or broad deployment, but Cole and Chris gave me really valid arguments to show me I’m wrong. I guess we’ll see.
US or global vision?
I asked if these are US-centric or global visions. There were no qualms – these are global visions. This is just the 3rd anniversary of OpenStack, but even so, there are OpenStack organizations in more than 100 countries, 750 active contributors, and large-scale deployments in datacenters that you probably use every day – especially in China and the US. Companies like PayPal and Yahoo, Rackspace, Baidu, Sina Weibo, Alibaba, JD, and government agencies and HPC clusters like CERN, NASA, and China Defense.
Open Compute is even younger – about 2 years old. (I remember – I was invited to the launch.) Even so, most of Facebook’s infrastructure runs on Open Compute. Two Wall Street banks have deployed large clusters, with more coming, and Riot Games, which uses Open Compute infrastructure, drives 3% of the global network traffic with League of Legends. (A complete aside – one of my favorite bands to work out to did a lot of that game’s music, and the live music at the League of Legends competition a few months ago: http://www.youtube.com/watch?v=mWU4QvC09uM – not for everyone, but I like it.)
Both Cole and Chris emailed me more data after the fact on who is using these initiatives. I have to say – they are right. It really has taken off globally, especially OpenStack in the fast-paced Chinese market this year.
Book: 4th Paradigm – A tribute to computer science researcher Jim Gray
Cole and Chris mentioned a book during the panel discussion. A book I had frankly never heard of. It’s called the 4th Paradigm. It was a series of papers dedicated to researcher Jim Gray, a quiet but towering figure whom I believe I met once at Microsoft Research. The book was put together by Gordon Bell, someone I have met and have profound respect for. And there are mentions of people, places, and things that have been woven through my (long) career. I think I would sum up its thesis in a quote from Jim Gray near the start of the book:
“We have to do better producing tools to support the whole research cycle – from data capture and data curation to data analysis and data visualization.”
This is stunningly similar to the very useful big data framework we have been using recently at LSI: “capture, hold, analyze”… I guess we should have added visualize, but that doesn’t have too much to do with LSI’s business.
As an aside, I would recommend this book for the background and inspiration in why we as an industry are trying to solve many of these computer science problems, and how transformational the impact might be. I mean really transformational in the world around us, what we know, what we can do, and how quickly we can do it – which is tightly related to our CEO’s keynote and the vision video at AIS.
Demos at AIS: “peanut butter and jelly” - and bread?
Ok – I’m struggling for an analogy. We had an awesome demo at AIS that Chris and Cole pointed out during the panel. It was originally built using Nebula’s TOR appliance, Open Compute hardware, and LSI’s storage magic to make it complete. The three pieces coming together. Tasty. The Open Compute hardware was swapped out at the last minute (for safety, those boxes were meant for the datacenter – not the showcase in a hotel with tipsy techies), and the replacements were generously supplied by Supermicro.
I don’t think the proto was close to any one of our visions, but even as it stood, it inspired a lot of people, and would make a great product. A short rack of servers, with pooled storage in the rack, OpenStack orchestrating the point and click spawning and tear down of dynamically sized LUNs of different characteristics under the Cinder presentation layer, and deployment of tasks or VMs on them.
We’re working on completing our joint vision. I think the industry will be very impressed when they see it. Chris thinks people will be stunned, and the industry will be changed.
Catalyzing the market… The future may be closer than we think…
Ultimately, this is all about economics. We’re in the middle of an unprecedented bifurcation in IT use. On one hand we’re running existing apps on new, dense enterprise hardware using VMs to layer many applications on few servers. On the other, we’re investing in applications to run at scale across inexpensive clusters of commodity hardware. This has spawned a split in IT vendor business units, product lines and offerings, and sometimes even IT infrastructure management in the datacenter.
New applications and services need more infrastructure, and are getting more expensive to power, cool, purchase and run. And there is pressure to transform the datacenter from a cost center into a profit center. As these innovations start, more companies will need scale infrastructure, arguably Open Compute, and then will need an OpenStack framework to deploy it quickly.
What does this mean? With a combination of big data and mobile device services driving economic value, we may be at the point where these clusters start to become mainstream. As an industry we’re already seeing a slight decline in traditional IT equipment sales and a rapid growth in scale-out infrastructure sales. If that continues, then OpenStack and Open Compute are a natural fit. The deployment rate uptick in life sciences, oil and gas, financials this year – really anywhere there is large-scale Hadoop, big data or analytics – may be the start of that growth curve. But both Chris and Cole felt it would probably take 5 years to truly take off.
Time to Wrap Up
I asked Chris and Cole for audience takeaways. Theirs were pretty simple, though possibly controversial in an industry like ours.
Hardware vendors should think about products and how they interface and what abstractions they present and how they fit into the ecosystem. These new ecosystems should allow them to easily plug in. For example, storage under Cinder can be quickly and easily morphed – that’s what we did with our demo.
We should be designing new software to run on distributed scale-out systems in clouds. Chris went on to say their code name was “Maestro” because it orchestrates like in a symphony, bringing things together in a beautiful way. He said “make instruments for the artists out there.” The brain trust. Look for their brushstrokes.
Innovate in the open, and leverage the open initiatives that are available to accelerate innovation and efficiency.
On your next IT purchase, try an RFP with an Open Compute vendor. Cole said you might be surprised. Worst case, you may get a better deal from your existing vendor.
So, Open Compute and OpenStack are changing the datacenter world that we know and love. I thought these were having a quick impact, changing our OEMs and ODM products, changing what we expect from our vendors, changing the interoperability of managing infrastructure from different vendors, changing our ability to deploy and manage grid and scale-out infrastructure, and changing how quickly and at what high level we can be innovating. I was wrong. It’s happening much more quickly than even I thought.
Last week at LSI’s annual Accelerating Innovation Summit (AIS) the company took the wraps off a vision that should lead its technical direction for the next few years.
The LSI keynote featured a video of three situations as they might evolve in the future:
I’ll focus on just one of these to show how LSI expects the future to develop. In the bicycle accident scenario, a businessman falls to the ground while riding a bicycle in a foreign country. Security cameras that have been upgraded to understand what they see notify an emergency services agency which sends an ambulance to the scene. The paramedic performs a retinal scan on the victim, using it to retrieve his medical records, including his DNA sequence, from the web.
The businessman’s wearable body monitoring system also communicates with the paramedic’s instruments to share his vital signs. All of this information is used by cloud-based computers to determine a course of action which, in the video, requires an injection that has been custom-tuned to the victim’s current situation, his medical history, and his genetic makeup.
That’s a pretty tall order, and it will require several advances in the state of the art, but LSI is using this and other scenarios to work with its clients and translate this vision into the products of the future.
What are the key requirements to make this happen? LSI CEO Abhi Talwalkar told the audience that we need to create a society that is supported by preventive, predictive and assisted analytics to move in a direction where the general welfare is assisted by all that the Internet and advanced computing have to offer. Since data is growing at an exponential rate, he argued that this will require the instant retrieval of interlinked data objects at scale. Everything that is key to solving the task must be immediately available, and must be quickly analyzed to provide a solution to the problem at hand. The key will be the ability to process interlinked pieces of data that have not been previously structured to handle any particular situation.
To achieve this we will need larger-scale computing resources than are currently available, all closely interconnected and all operating at very high speeds. LSI hopes to tap into these needs through its strengths in networking and communications chips for communications; its HDD, server and storage array connectivity chips and boards for large-scale data; and its flash memory controller and PCIe SSD expertise for high performance.
LSI brought to AIS several of the customers and partners it is working with to develop these technologies. Speakers from Intel, Microsoft, IBM, Toshiba, Ericsson and others showed how they are working with LSI’s various technologies to improve the performance of their own systems. On the exhibition floor, booths from LSI and many of its clients demonstrated new technologies that performed everything from high-speed stock market analysis to fast flash management.
It’s pretty exciting to see a company that has a clear vision of its future and is committed to moving its entire ecosystem ahead to make that happen and help companies manage their business more effectively during what LSI calls the “Datacentric Era.” LSI has certainly put a lot of effort into creating a vision and determining where its talents can be brought to bear to improve our lives in the future.