Emerging and disruptive markets are hard to quantify and forecast: They often apply different marketing labels for the same thing, and have no baseline industry data and no consistent methods of measurement and forecasting.
But this recent Wibikon big data report is head and hands above others. This is the third edition of the report and I wanted to give a shout-out to the authors – Jeff Kelly, David Vellante and David Foyer – on this best-in-class body of work.
Behind the numbers: The way I see it, big data has two different markets with very different technology and investment requirements and pace of adoption:
And now, the color commentary on the Wikibon big data report …
I’ve discussed the notion of two different markets, consumer big data and enterprise big data, with dozens of my friends, associates and industry co-travelers. And consistently their response is “yes, we see it the same way. There are two very different big data markets: one for web scale, like Yahoo!, and another for business and industry, like Walmart and GE.” My friends at Rackspace introduced me to the terms consumer big data and enterprise big data … make sense to me.
Open Compute and OpenStack are changing the datacenter world that we know and love. I thought they were having impact. Changing our OEMs and ODM products, changing what we expect from our vendors, changing the interoperability of managing infrastructure from different vendors. Changing our ability to deploy and manage grid and scale-out infrastructure. And changing how quickly and at what high level we can be innovating. I was wrong. It’s happening much more quickly than I thought.
On November 20-21 we hosted LSI AIS 2013. As I mentioned in a previous post, I was lucky enough to moderate a panel about Open Compute and OpenStack – “the perfect storm.” Truthfully? It felt more like sitting with two friends talking about our industry over beer. I hope to pick up that conversation again someday.
The panelists were awesome: Cole Crawford of Open Compute and Chris Kemp of OpenStack. These guys are not only influential. They have been involved from the very start of these two initiatives, and are in many ways key drivers of both movements. These are impressive, passionate guys who really are changing the world. There aren’t too many of us who can claim that. It was an engaging hour that I learned quite a bit from, and I think the audience did too. I wanted to share from my notes what I took away from that panel. I think you’ll be interested.
Goals and Vision: two “open source” initiatives
There were a few motivations behind Open Compute, and the goal was to improve these things.
The goal then, for the first time, is to work backwards from workload and create open source hardware and infrastructure that is openly available and designed from the start for large scale-out deployments. The idea is to drive high efficiency in cost, materials use and energy consumption. More work/$.
One surprising thing that came up – LSI is in every current contribution in Open Compute.
OpenStack layers services that describe abstractions of computer networking and storage. LSI products tend to sit at that lowest level of abstraction, where there is now a wave of innovation. OpenStack had similar fragmentation issues to deal with and its goals are something like:
There is a certain amount of compatibility with Amazon’s cloud services. Chris’s point was that Amazon is incredibly innovative and a lot of enterprises should use it, but OpenStack enables both service providers and private clouds to compete with Amazon, and it allows unique innovation to evolve on top of it.
OpenStack and Open Compute are not products. They are “standards” or platform architectures, with companies using those standards to innovate on top of them. The idea is for one company to innovate on another’s improvements – everybody building on each other’s work. A huge brain trust. The goal is to create a competitive ecosystem and enable a rapid pace of innovation, and enable large-scale, inexpensive infrastructure that can be managed by a small team of people, and can be managed like a single server to solve massive scale problems.
Here’s their thought. Hardware is a supply chain management game + services. Open Compute is an opportunity for anyone to supply that infrastructure. And today, OEMs are killer at that. But maybe ODMs can be too. Open Compute allows innovation on top of the basic interoperable platforms. OpenStack enables a framework for innovation on top as well: security, reliability, storage, network, performance. It becomes the enabler for innovation, and it provides an “easy” way for startups to plug into a large, vibrant ecosystem. And for customers – someone said its “exa data without exadollar”…
As a result, the argument is this should be good for OEMs and ISVs, and help create a more innovative ecosystem and should also enable more infrastructure capacity to create new and better services. I’m not convinced that will happen yet, but it’s a laudable goal, and frankly that promise is part of what is appealing to LSI.
Open Compute and OpenStack are “peanut butter and jelly”
Ok – if you’re outside of the US, that may not mean much to you. But if you’ve lived in the US, you know that means they fit perfectly, and make something much greater together than their humble selves.
Graham Weston, Chairman of the Rackspace Board, was the one who called these two “peanut butter and jelly.”
Cole and Chris both felt the initiatives are co-enabling, and probably co-travelers too. Sure they can and will deploy independently, but OpenStack enables the management of large scale clusters, which really is not easy. Open Compute enables lower cost large-scale manageable clusters to be deployed. Together? Large-scale clusters that can be installed and deployed more affordably, and easily without hiring a cadre of rare experts.
Personally? I still think they are both a bit short of being ready for “prime time” – or broad deployment, but Cole and Chris gave me really valid arguments to show me I’m wrong. I guess we’ll see.
US or global vision?
I asked if these are US-centric or global visions. There were no qualms – these are global visions. This is just the 3rd anniversary of OpenStack, but even so, there are OpenStack organizations in more than 100 countries, 750 active contributors, and large-scale deployments in datacenters that you probably use every day – especially in China and the US. Companies like PayPal and Yahoo, Rackspace, Baidu, Sina Weibo, Alibaba, JD, and government agencies and HPC clusters like CERN, NASA, and China Defense.
Open Compute is even younger – about 2 years old. (I remember – I was invited to the launch). Even so, most of Facebook’s infrastructure runs on Open Compute. Two Wall Street banks have deployed large clusters, with more coming, and Riot Games, which uses Open Compute infrastructure, drives 3% of the global network traffic with League of Legends. (A complete aside – one of my favorite bands to workout with did a lot of that game’s music, and the live music at the League of Legends competition a few months ago: http://www.youtube.com/watch?v=mWU4QvC09uM – not for everyone, but I like it.)
Both Cole and Chris emailed me more data after the fact on who is using these initiatives. I have to say – they are right. It really has taken off globally, especially OpenStack in the fast-paced Chinese market this year.
Book: 4th Paradigm – A tribute to computer science researcher Jim Grey
Cole and Chris mentioned a book during the panel discussion. A book I had frankly never heard of. It’s called the 4th Paradigm. It was a series of papers dedicated to researcher Jim Grey, who was a quiet but towering figure that I believe I met once at Microsoft Research. The book was put together by Gordon Bell, someone who I have met, and have profound respect for. And there are mentions of people, places, and things that have been woven through my (long) career. I think I would sum up its thesis in a quote from Jim Grey near the start of the book:
“We have to do better producing tools to support the whole research cycle – from data capture and data curation to data analysis and data visualization.”
This is stunningly similar to the very useful big data framework we have been using recently at LSI: ”capture, hold, analyze”… I guess we should have added visualize, but that doesn’t have too much to do with LSI’s business.
As an aside, I would recommend this book for the background and inspiration in why we as an industry are trying to solve many of these computer science problems, and how transformational the impact might be. I mean really transformational in the world around us, what we know, what we can do, and how quickly we can do it – which is tightly related to our CEO’s keynote and the vision video at AIS.
Demos at AIS: “peanut butter and jelly” - and bread?
Ok – I’m struggling for analogy. We had an awesome demo at AIS that Chris and Cole pointed out during the panel. It was originally built using Nebula’s TOR appliance, Open Compute hardware, and LSI’s storage magic to make it complete. The three pieces coming together. Tasty. The Open Compute hardware was swapped out last minute (for safety, those boxes were meant for the datacenter – not the showcase in a hotel with tipsy techies) and were generously supplied by Supermicro.
I don’t think the proto was close to any one of our visions, but even as it stood, it inspireda lot of people, and would make a great product. A short rack of servers, with pooled storage in the rack, OpenStack orchestrating the point and click spawning and tear down of dynamically sized LUNs of different characteristics under the Cinder presentation layer, and deployment of tasks or VMs on them.
We’re working on completing our joint vision. I think the industry will be very impressed when they see it. Chris thinks people will be stunned, and the industry will be changed.
Catalyzing the market… The future may be closer than we think…
Ultimately, this is all about economics. We’re in the middle of an unprecedented bifurcation in IT use. On one hand we’re running existing apps on new, dense enterprise hardware using VMs to layer many applications on few servers. On the other, we’re investing in applications to run at scale across inexpensive clusters of commodity hardware. This has spawned a split in IT vendor business units, product lines and offerings, and sometimes even IT infrastructure management in the datacenter.
New applications and services are needing more infrastructure, and are getting more expensive to power, cool, purchase, run. And there is pressure to transform the datacenter from a cost center into a profit center. As these innovations start, more companies will need scale infrastructure, arguably Open Compute, and then will need an Openstack framework to deploy it quickly.
Whats this mean? With a combination of big data and mobile device services driving economic value, we may be at the point where these clusters start to become mainstream. As an industry we’re already seeing a slight decline in traditional IT equipment sales and a rapid growth in scale-out infrastructure sales. If that continues, then OpenStack and Open Compute are a natural fit. The deployment rate uptick in life sciences, oil and gas, financials this year – really anywhere there is large-scale Hadoop, big data or analytics – may be the start of that growth curve. But both Chris and Cole felt it would probably take 5 years to truly take off.
Time to Wrap Up
I asked Chris and Cole for audience takeaways. Theirs were pretty simple, though possibly controversial in an industry like ours.
Hardware vendors should think about products and how they interface and what abstractions they present and how they fit into the ecosystem. These new ecosystems should allow them to easily plug in. For example, storage under Cinder can be quickly and easily morphed – that’s what we did with our demo.
We should be designing new software to run on distributed scale-out systems in clouds. Chris went on to say their code name was “Maestro” because it orchestrates like in a symphony, bringing things together in a beautiful way. He said “make instruments for the artists out there.” The brain trust. Look for their brushstrokes.
Innovate in the open, and leverage the open initiatives that are available to accelerate innovation and efficiency.
On your next IT purchase, try an RFP with an Open Compute vendor. Cole said you might be surprised. Worst case, you may get a better deal from your existing vendor.
So, Open Compute and Openstack are changing the datacenter world that we know and love. I thought these were having a quick impact, changing our OEMs and ODM products, changing what we expect from our vendors, changing the interoperability of managing infrastructure from different vendors, changing our ability to deploy and manage grid and scale-out infrastructure, and changing how quickly and at what high level we can be innovating. I was wrong. It’s happening much more quickly than even I thought.
Tags: AIS, Alibaba, Amazon, Baidu, big data, CERN, China, China Defense, Chris Kemp, Cole Crawford, datacenter, Facebook, Hadoop, HPC, IT infrastructure, JD, Jim Grey, NASA, Nebula, Networking, Open Compute, OpenStack, PayPal, Rackspace, Riot Games, scale-out cluster, Sina Weibo, Storage, Supermicro, Yahoo
Have you ever seen the old BBC TV show “Connections”? It’s a little old now, but I loved how it followed threads through time, and I marveled at the surprising historical depth of important “inventions.” I think we need to remember that as engineers and technologists. We get caught up in the short-term tactical delivery of technology. We don’t see the sometimes immense ripples in society from our work – even years later.
I got a flurry of emails yesterday, arranging an anniversary get-together in August at the Apple campus. Why? It’s the 20th anniversary of the Newton. Ok – so this has nothing to do with LSI really, but it does have a lot to do with our everyday lives. More than you think.
So you either know the Newton and think it was a failure (think Trudeau’s famous handwriting cartoon), or you don’t and you’re wondering what the *bleep* I’m talking about. Sometimes things that don’t seem very significant early on end up having profound consequences. And I admit, the Newton was a failure, too expensive and not quite good enough, and the world couldn’t even get the concept of a general-purpose computer in your hand.
But oh – you could smell the future and get a tantalizing hint of what it would be. Remember – we’re talking 1993 here.
First – why does Rob Ober care? It’s personal. While I didn’t remotely help create the Newton, I did help bring it to market, mature the technology, and set the stage for the future (well – it’s not the future any more – it’s now). I was at Apple wrapping up the creation of the PowerPC processor and architecture, and the first Power Macs. I have a great memory around that time of getting the first Power Mac booted. Someone had the great idea of running the beta 68K emulator (to run standard Mac stuff). That was great, it worked, and then someone else said – wait – I have an Apple II emulator for the 68K Mac. So we had the very first PowerPC Mac running 68K code as a Mac to emulate a 6502 as an Apple II … and we played for hours. I also have a very clear memory of that PowerPC Mac standing shoulder-to-shoulder with the Robotron game in the Valley Green 5 building break room. It was a state-of-the-art video game and looked like this.
Yea, that shows you it was a while ago. (But it was a good game.)
A guy named Shane Robison pulled me over (yea, the same HP CTO, now CEO of FusionIO) to come fix some things on the super-hush Newton program. In the end, I took over responsibility for the processors, custom chips, communication stacks and hardware, plastics and tooling, display, touch screen, power supply, wireless, NiMH and LiION batteries… A lot. We pushed the limits of state of the art on all those fronts. It was a really important wonderful/terrible part of my career. I learned an amazing amount.
(If you’re interested in viewing a Newton from today’s perspective, there is a fascinating review here: http://techland.time.com/2012/06/01/newton-reconsidered/)
Let me start with some boring effects. We were using the ARM processor because of its low power. But. It wasn’t perfect, and ARM itself was on the edge of insolvency. We invested a sizable chunk of money, and gave it guidance on how to transition from ARM 6 to 7 to 9. ARM is alive today because of that, and the ARM 9 is still in 100’s of millions of products. And we also worked with DEC to create the StrongARM processor family, which became XScale at Intel, then went to Marvel, and also bootstrapped Atom, and, and…
The Newton needed non-volatile storage. Disks were immense, expensive and power-hungry. 2-1/2” disk? Didn’t exist. 3-1/2” was small. The only remotely cost-effective technology was called NAND flash, which was fundamentally incompatible with program execution, and nightmarish for data storage/retrieval, and unbelievably expensive per bit. I think the early Newtons were 8 Mbytes? (that’s mega not giga…). The team figured out how to make that work. Yep – that was the first use of Toshiba NAND for program/data. (I’ve been playing with flash for storage since then.)
Then some more interesting things…
I wired the Apple campus with wireless LAN base stations (it would be 6 years until Wifi, and 802.11 wasn’t even dreamt up yet) and built the wireless LAN receivers into Newtons, gave them to the Apple execs and set up their mail to be forwarded. You couldn’t even do that on laptops. We could be anywhere in the campus and instantly receive and send emails. More – we could browse the (rudimentary) web. I also worked with RIM (yea – Research In Motion – Blackberry) and Metricom to use their wireless wide area net technology to give Newtons access to email and the Web anywhere in the Bay Area. Quite a few times I was driving to meetings, wasn’t sure where to go, so pulled over and looked up the meeting in my Newton calendar, then checked the address on my browser with MapQuest. 1995. Sound familiar?
We also spent time with FedEx pitching it on the idea of a Newton-based tablet to manage inventory (integrated bar code scanner), accept signatures on screen with tablet/pen (even the upside down thing to hand it to the customer), show route maps, and cellularly send all that info back and forth for live tracking. FedEx was stunned by the concept. Sound familiar? I still have the proposal book with industrial designs in my garage. Yes, another Silicon Valley garage. Here’s what it rolled out 10 years later… which is ultimately pretty similar to our proposal.
And don’t forget Object Programming. (You remember when OOPS was a high-tech term?) I’m not really a software guy – just not my thing – but I loved programming on the Newton. In 10 minutes you could actually bang out a useful, great-looking program. Personally, I think the world would have been way better off if those object libraries had been folded into the Java object library. Even so, I get a nostalgic feel when I do iOS programming.
I even built a one-off proto that had cellphone guts inside the plastic of the Newton. (OK – it was chunky, but the smallest phones at the time were HUGE). I could make phone calls from the contacts or calendar or emails, send and receive SMS messages, and rudimentary MMS messages before there was such a thing – used just like a very overweight iPhone (OK – more like the big Samsung galaxy phones). I could even, in a pinch, do data over the GSM network – email, web, etc. It was around that time Nokia came calling and asked about our UI, our OS, our ability to used data over the GSM network… Those talks fell apart, but it was serious enough I made trips to Nokia’s mothership in Helsinki and Tampere a few times. (That’s north even for a Canadian boy…)
And then years later I got a phone call from one of the key people at Apple – Mike Culbert (who, sadly, recently passed away) – to ask about cellular/baseband chipsets and solutions. He knew I knew the technology. I introduced him to my friends at Infineon (now Intel Mobile) for a discussion on a mystery project… Those parts ended up in the iPhone. A lot of the same people and technology, just way more advanced…
iPad? Sure. A lot of the same people were involved in a Newton that never saw the light of day. The BIC. Here it is with the iPad. Again – 15 years apart.
And you remember the $100 laptop (OLPC?). As a founding board member, I brought an eMate kids Newton laptop to show the team early on. And of course the debate on disk vs. flash followed the same path as it had in Newton. Here they are together, separated by more than 10 years. And then of course, OLPC has direct genetic parentage of netbooks, which then lead to Ultrabooks… (Did you know at one point Apple was considering joining OLPC and offering Darwin/OSX as the OS? Didn’t last long.)
And then there are the people. Off the top of my head there were founders or key movers of Palm, Xbox, Kindle, Hotmail, Yahoo, Netscape, Android, WebTV (think most set-top boxes), Danger phone (you remember the sidekick?), Evernote, Mercedes research and a bunch of others. And some friends who became well-known VCs. And I still have a lot of super-talented friends from that time, many of whom are still at Apple.
Sometimes things that don’t seem very significant have profound follow-on consequences. I think we need to remember that as engineers and technologists. We don’t see the sometimes immense ripples in society from our work – even years later. Today we’re planting the seeds for all those great things in the future. I admit, the Newton was a failure, but oh – you could smell the future and get a tantalizing hint of what it would be. Remember – we’re talking 1993 here.
Tags: 802.11, Android, Apple, Apple II, ARM, BIC, Blackberry, Darwin, DEC, eMate, Evernote, FedEx, FusionIO, Hotmail, HP, Intel, iPad, iPhone, Kindle, Marvel, Mercedes, Metricom, Mike Culbert, MMS, Netscape, Newton, Nokia, object programming, OLPC, Palm, Power Mac, PowerPC, Research in Motion, Robotron, Shane Robison, SMS, StrongARM, Toshiba, Ultrabook, Web TV, Wifi, Xbox, XScale, Yahoo
I’ve just been to China. Again. It’s only been a few months since I was last there.
I was lucky enough to attend the 5th China Cloud Computing Conference at the China National Convention Center in Beijing. You probably have not heard of it, but it’s an impressive conference. It’s “the one” for the cloud computing industry. It was a unique view for me – more of an inside-out view of the industry. Everyone who’s anyone in China’s cloud industry was there. Our CEO, Abhi Talwalkar, had been invited to keynote the conference, so I tagged along.
First, the air was really hazy, but I don’t think the locals considered it that bad. The US consulate iPhone app said the particulates were in the very unhealthy range. Imagine looking across the street. Sure, you can see the building there, but the next one? Not so much. Look up. Can you see past the 10th floor? No, not really. The building disappears into the smog. That’s what it was like at the China National Convention Center, which is part of the same Olympics complex as the famous Birdcage stadium: http://www.cnccchina.com/en/Venues/Traffic.aspx
I had a fantastic chance to catch up with a university friend, who has been living in Beijing since the 90’s, and is now a venture capitalist. It’s amazing how almost 30 years can disappear and you pick up where you left off. He sure knows how to live. I was picked up in his private limo, whisked off to a very well-known restaurant across the city, where we had a private room and private waitress. We even had some exotic, special dishes that needed to be ordered at least a day in advance. Wow. But we broke Chinese tradition and had imported beer in honor of our Canadian education.
Sizing up China’s cloud infrastructure
The most unusual meeting I attended was an invitation-only session – the Sino-American roundtable on cloud computing. There were just about 40 people in a room – half from the US, half from China. Mostly what I learned is that the cloud infrastructure in China is fragmented, and probably sub-scale. And it’s like that for a reason. It was difficult to understand at first, but I think I’ve made sense of it.
I started asking why to friends and consultants and got some interesting answers. Essentially different regional governments are trying to capture the cloud “industry” in their locality, so they promote activity, and they promote creation of new tools and infrastructure for that. Why reuse something that’s open source and works if you don’t have to and you can create high-tech jobs? (That’s sarcasm, by the way.) Many technologists I spoke with felt this will hold them back, and that they are probably 3-5 years behind the US. As well, each government-run industry specifies the datacenter and infrastructure needed to be a supplier or ecosystem partner with them, and each is different. The national train system has a different cloud infrastructure from the agriculture department, and from the shipping authority, etc… and if you do business with them – that is you are part of their ecosystem of vendors, then you use their infrastructure. It all spells fragmentation and sub-scale. In contrast, the Web 2.0 / social media companies seem to be doing just fine.
Baidu was also showing off its open rack. It’s an embodiment of the Scorpio V1 standard, which was jointly developed with Tencent, Alibaba and China Telecom. It views this as a first experiment, and is looking forward to V2, which will be a much more mature system.
I was also lucky to have personal meetings with general managers,chief architects and effective CTOs of the biggest cloud companies in China. What did I learn? They are all at an inflexion point. Many of the key technologists have experience at American Web 2.0 companies, so they’re able to evolve quickly, leveraging their industry knowledge. They’re all working to build or grow their own datacenters, their own infrastructure. And they’re aggressively expanding products, not just users, so they’re getting a compound growth rate.
Here’s a little of what I learned. In general, there is a trend to try and simplify infrastructure, harmonize divergent platforms, and deploy more infrastructure by spending less on each unit. (In general, they don’t make as much per user as American companies, but they have more users). As a result they are more cost-focused than US companies. And they are starting to put more emphasis on operational simplicity in general. As one GM described it to me – “Yes, techs are inexpensive in China for maintainence, but more often than not they make mistakes that impact operations.” So we (LSI) will be focussing more on simplifying management and maintainence for them.
Baidu’s biggest Hadoop cluster is 20k nodes. I believe that’s as big as Yahoo’s – and it is the originator of Hadoop. Baidu has a unique use profile for flash – it’s not like the hyperscale datacenters in the US. But Baidu is starting to consume a lot. Like most other hyperscale datacenters, it is working on storage erasure coding across servers, racks and datacenters, and it is trying to make a unified namespace across everything. One of its main interests is architecture at datacenter level, harmonizing the various platforms and looking for the optimum at the datacenter level. In general, Baidu is very proud of the advances it has made, and it has real confidence in its vision and route forward, and from what I heard, its architectural ambitions are big.
JD.com (which used to be 360buy.com) is the largest direct ecommerce company in China and (only) had about $10 billion (US) in revenue last year, with 100% CAGR growth. As the GM there said, its growth has to slow sometime, or in 5 years it’ll be the biggest company in the world. I think it is the closest equivalent to Amazon there is out there, and they have similar ambitions. They are in the process of transforming to a self-built, self-managed datacenter infrastructure. It is a company I am going to keep my eyes on.
Tencent is expanding into some interesting new businesses. Sure, people know about the Tencent cloud services that the Chinese government will be using, but Tencent also has some interesting and unique cloud services coming. Let’s just say even I am interested in using them. And of course, while Tencent is already the largest Web 2.0 company in China, its new services promise to push it to new scale and new markets.
Extra! Extra! Read all about it …
And then there was press. I had a very enjoyable conversation with Yuan Shaolong, editor at WatchStor, that I think ran way over. Amazingly – we discovered we have the same favorite band, even half a world away from each other. The results are here, though I’m not sure if Google translate messed a few things up, or if there was some miscommunication, but in general, I think most of the basics are right: http://translate.google.com/translate?hl=en&sl=zh-CN&u=http://tech.watchstor.com/storage-module-144394.htm&prev=/search%3Fq%3Drobert%2Bober%2BLSI%26client%3Dfirefox-a%26rls%3Dorg.mozilla:en-US:official%26biw%3D1346%26bih%3D619
I just keep learning new things every time I go to China. I suspect it has as much to do with how quickly things are changing as new stuff to learn. So I expect it won’t be too long until I go to China, again…
Tags: Abhi Talwalkar, Alibaba, Amazon, Baidu, China, China Cloud Computing Conference, China National Convention Center, China Telecom, datacenter, Hadoop, hyperscale, JD.com, WatchStor, web 2.0, Yahoo