Have you ever seen the old BBC TV show “Connections”? It’s a little old now, but I loved how it followed threads through time, and I marveled at the surprising historical depth of important “inventions.” I think we need to remember that as engineers and technologists. We get caught up in the short-term tactical delivery of technology. We don’t see the sometimes immense ripples in society from our work – even years later.
I got a flurry of emails yesterday, arranging an anniversary get-together in August at the Apple campus. Why? It’s the 20th anniversary of the Newton. OK – so this has nothing to do with LSI really, but it does have a lot to do with our everyday lives. More than you think.
So you either know the Newton and think it was a failure (think Trudeau’s famous handwriting cartoon), or you don’t and you’re wondering what the *bleep* I’m talking about. Sometimes things that don’t seem very significant early on end up having profound consequences. And I admit, the Newton was a failure, too expensive and not quite good enough, and the world couldn’t even get the concept of a general-purpose computer in your hand.
But oh – you could smell the future and get a tantalizing hint of what it would be. Remember – we’re talking 1993 here.
First – why does Rob Ober care? It’s personal. While I didn’t remotely help create the Newton, I did help bring it to market, mature the technology, and set the stage for the future (well – it’s not the future any more – it’s now). I was at Apple wrapping up the creation of the PowerPC processor and architecture, and the first Power Macs. I have a great memory around that time of getting the first Power Mac booted. Someone had the great idea of running the beta 68K emulator (to run standard Mac stuff). That was great, it worked, and then someone else said – wait – I have an Apple II emulator for the 68K Mac. So we had the very first PowerPC Mac running 68K code as a Mac to emulate a 6502 as an Apple II … and we played for hours. I also have a very clear memory of that PowerPC Mac standing shoulder-to-shoulder with the Robotron game in the Valley Green 5 building break room. It was a state-of-the-art video game and looked like this.
Yea, that shows you it was a while ago. (But it was a good game.)
A guy named Shane Robison pulled me over (yea, the same HP CTO, now CEO of FusionIO) to come fix some things on the super-hush Newton program. In the end, I took over responsibility for the processors, custom chips, communication stacks and hardware, plastics and tooling, display, touch screen, power supply, wireless, NiMH and Li-ion batteries… A lot. We pushed the limits of state of the art on all those fronts. It was a really important wonderful/terrible part of my career. I learned an amazing amount.
(If you’re interested in viewing a Newton from today’s perspective, there is a fascinating review here: http://techland.time.com/2012/06/01/newton-reconsidered/)
Let me start with some boring effects. We were using the ARM processor because of its low power. But. It wasn’t perfect, and ARM itself was on the edge of insolvency. We invested a sizable chunk of money, and gave it guidance on how to transition from ARM 6 to 7 to 9. ARM is alive today because of that, and the ARM 9 is still in hundreds of millions of products. And we also worked with DEC to create the StrongARM processor family, which became XScale at Intel, then went to Marvell, and also bootstrapped Atom, and, and…
The Newton needed non-volatile storage. Disks were immense, expensive and power-hungry. 2-1/2” disk? Didn’t exist. 3-1/2” was small. The only remotely cost-effective technology was called NAND flash, which was fundamentally incompatible with program execution, and nightmarish for data storage/retrieval, and unbelievably expensive per bit. I think the early Newtons were 8 Mbytes? (that’s mega not giga…). The team figured out how to make that work. Yep – that was the first use of Toshiba NAND for program/data. (I’ve been playing with flash for storage since then.)
Then some more interesting things…
I wired the Apple campus with wireless LAN base stations (it would be 6 years until Wifi, and 802.11 wasn’t even dreamt up yet) and built the wireless LAN receivers into Newtons, gave them to the Apple execs and set up their mail to be forwarded. You couldn’t even do that on laptops. We could be anywhere in the campus and instantly receive and send emails. More – we could browse the (rudimentary) web. I also worked with RIM (yea – Research In Motion – Blackberry) and Metricom to use their wireless wide area net technology to give Newtons access to email and the Web anywhere in the Bay Area. Quite a few times I was driving to meetings, wasn’t sure where to go, so pulled over and looked up the meeting in my Newton calendar, then checked the address on my browser with MapQuest. 1995. Sound familiar?
We also spent time with FedEx pitching it on the idea of a Newton-based tablet to manage inventory (integrated bar code scanner), accept signatures on screen with tablet/pen (even the upside down thing to hand it to the customer), show route maps, and cellularly send all that info back and forth for live tracking. FedEx was stunned by the concept. Sound familiar? I still have the proposal book with industrial designs in my garage. Yes, another Silicon Valley garage. Here’s what it rolled out 10 years later… which is ultimately pretty similar to our proposal.
And don’t forget Object Programming. (You remember when OOPS was a high-tech term?) I’m not really a software guy – just not my thing – but I loved programming on the Newton. In 10 minutes you could actually bang out a useful, great-looking program. Personally, I think the world would have been way better off if those object libraries had been folded into the Java object library. Even so, I get a nostalgic feel when I do iOS programming.
I even built a one-off proto that had cellphone guts inside the plastic of the Newton. (OK – it was chunky, but the smallest phones at the time were HUGE). I could make phone calls from the contacts or calendar or emails, send and receive SMS messages, and rudimentary MMS messages before there was such a thing – used just like a very overweight iPhone (OK – more like the big Samsung Galaxy phones). I could even, in a pinch, do data over the GSM network – email, web, etc. It was around that time Nokia came calling and asked about our UI, our OS, our ability to use data over the GSM network… Those talks fell apart, but it was serious enough I made trips to Nokia’s mothership in Helsinki and Tampere a few times. (That’s north even for a Canadian boy…)
And then years later I got a phone call from one of the key people at Apple – Mike Culbert (who, sadly, recently passed away) – to ask about cellular/baseband chipsets and solutions. He knew I knew the technology. I introduced him to my friends at Infineon (now Intel Mobile) for a discussion on a mystery project… Those parts ended up in the iPhone. A lot of the same people and technology, just way more advanced…
iPad? Sure. A lot of the same people were involved in a Newton that never saw the light of day. The BIC. Here it is with the iPad. Again – 15 years apart.
And you remember the $100 laptop (OLPC)? As a founding board member, I brought an eMate kids Newton laptop to show the team early on. And of course the debate on disk vs. flash followed the same path as it had in Newton. Here they are together, separated by more than 10 years. And then of course, OLPC has direct genetic parentage of netbooks, which then led to Ultrabooks… (Did you know at one point Apple was considering joining OLPC and offering Darwin/OSX as the OS? Didn’t last long.)
And then there are the people. Off the top of my head there were founders or key movers of Palm, Xbox, Kindle, Hotmail, Yahoo, Netscape, Android, WebTV (think most set-top boxes), Danger phone (you remember the Sidekick?), Evernote, Mercedes research and a bunch of others. And some friends who became well-known VCs. And I still have a lot of super-talented friends from that time, many of whom are still at Apple.
Sometimes things that don’t seem very significant have profound follow-on consequences. I think we need to remember that as engineers and technologists. We don’t see the sometimes immense ripples in society from our work – even years later. Today we’re planting the seeds for all those great things in the future. I admit, the Newton was a failure, but oh – you could smell the future and get a tantalizing hint of what it would be. Remember – we’re talking 1993 here.
I was lucky enough to get together for dinner and beer with old friends a few weeks ago. Between the 4 of us, we’ve been involved in or responsible for a lot of stuff you use every day, or at least know about.
Supercomputers, minicomputers, PCs, Macs, Newton, smart phones, game consoles, automotive engine controllers and safety systems, secure passport chips, DRAM interfaces, netbooks, and a bunch of processor architectures: Alpha, PowerPC, Sparc, MIPS, StrongARM/XScale, x86 64-bit, and a bunch of other ones you haven’t heard of (um – most of those are mine, like TriCore). Basically if you drive a European car, travel internationally, use the Internet, play video games, or use a smart phone, well… you’re welcome.
Why do I tell you this? Well – first I’m name dropping – I’m always stunned I can call these guys friends and be their peers. But more importantly, we’ve all been in this industry as architects for about 30 years. Of course our talk went to what’s going on today. And we all agree that we’ve never seen more changes – inflexions – than the raft unfolding right now. Maybe it’s pressure from the recession, or maybe an unnaturally pent-up need for change in the ecosystem, but change there is.
Changes in who drives innovation, what’s needed, the companies on top and on bottom at every point in the food chain, who competes with whom, how workloads have changed from compute to dataflow, how software has moved to open source, how abstracted code is now from processor architecture, how individual and enterprise customers have been revolting against the "old" ways, old vendors, and old business models, how processors communicate, how systems are purchased, and what fundamental system architectures look like. But not much besides that…
OK – so if you’re an architect, that’s as exciting as it gets (you hear it in my voice – right?), and it makes for a lot of opportunities to innovate and create new or changed businesses. Because innovation is so often at the intersection of changing ways of doing things. We’re at a point where the changes are definitely not done yet. We’re just at the start. (OK – now try to imagine a really animated 4-way conversation over beers at the Britannia Arms in Cupertino… Yea – exciting.)
I’m going to focus on just one sliver of the market – but it’s important to me – and that’s enterprise IT. I think the changes are as much about business models as technology.
I’ll start in a strange place. In hyperscale datacenters (think social media, search, etc.), the scale of deployment changes the optimization point. Most of us are starting to get comfortable with the rack as the new purchase quantum. And some of us are comfortable with the pod or container as the new purchase quantum. But the hyperscale datacenters work more with the datacenter as the quantum. By looking at it that way, they can trade off the cost of power, real estate, bent sheet metal, network bandwidth, disk drives, flash, processor type and quantity, memory amount, where work gets done, and what applications are optimized for. In other words, we shifted from looking at local optima to looking for global optima. I don’t know about you, but when I took operations research in university, I learned there was an unbelievable difference between the two – and the global optimum was the one you wanted…
Hyperscale datacenters buy enough (top 6 are probably more than 10% of the market today) that 1) they need to determine what they deploy very carefully on their own, and 2) vendors work hard to give them what they need.
That means innovation used to be driven by OEMs, but now it’s driven by hyperscale datacenters and it’s driven hard. That global optimum? It’s work/$ spent. That’s global work, and global spend. It’s OK to spend more, even way more, on one thing if overall you get more done for the $’s you spend.
That’s why the 3 biggest consumers of flash in servers are Facebook, Google, and Apple, with some of the others not far behind. You want stuff, they want to provide it, and flash makes it happen efficiently. So efficiently they can often give that service away for free.
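The work/$ argument can be sketched in a few lines. This is only an illustration: the two server designs, their prices, their throughput figures and the budget are all made-up assumptions, not numbers from any real datacenter. It shows how a design that costs more per box can still win once you measure global work against global spend.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Design:
    """A hypothetical server design; every figure here is invented."""
    name: str
    cost_per_server: float   # dollars (assumed)
    work_per_server: float   # work units per second (assumed)


def servers_for(design: Design, budget: float) -> int:
    """Number of whole servers the budget can buy."""
    return int(budget // design.cost_per_server)


def total_work(design: Design, budget: float) -> float:
    """Global work delivered by everything the budget buys."""
    return servers_for(design, budget) * design.work_per_server


def work_per_dollar(design: Design, budget: float) -> float:
    """Global work divided by the dollars actually spent."""
    n = servers_for(design, budget)
    return (n * design.work_per_server) / (n * design.cost_per_server)


BUDGET = 1_000_000.0  # assumed total spend
disk_only = Design("disk only", cost_per_server=5_000.0, work_per_server=100.0)
with_flash = Design("disk + flash", cost_per_server=6_500.0, work_per_server=180.0)

for d in (disk_only, with_flash):
    print(d.name, servers_for(d, BUDGET), total_work(d, BUDGET),
          round(work_per_dollar(d, BUDGET), 4))
```

With these assumed numbers the flash design buys fewer servers for the same budget, yet delivers more total work, which is the only comparison that matters at datacenter scale.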
Hyperscale datacenters have started to publish their cost metrics, and open up their architectures (like OpenCompute), and open up their software (like Hadoop and derivatives). More to the point, services like Amazon have put a very clear $ value on services. And it’s shockingly low.
Enterprises have looked at those numbers. Hard. That’s catalyzed a customer revolt against the old way of doing things – the old way of buying and billing. OEMs and ISVs are creating lots of value for enterprise, but not that much. They’ve been innovating around “stickiness” and “lock-in” (yea – those really are industry terms) for too long, while hyperscale datacenters have been focused on getting stuff done efficiently. The money they save per unit just means they can deploy more units and provide better services.
That revolt is manifesting itself in 2 ways. The first is seen in the quarterly reports of OEMs and ISVs. Rumors of IBM selling its X-series to Lenovo, Dell going private, Oracle trying to shift business, HP talking of the “new style of IT”… The second is that enterprises are looking to emulate hyperscale datacenters as much as possible, and deploy private cloud infrastructure. And as often as not, those will be running some of the same open source applications and file systems as the big hyperscale datacenters use.
Where are the hyperscale datacenters leading them? It’s a big list of changes, and they’re all over the place.
But they’re also looking at a few different things. For example, global name space NAS file systems. Personally? I think this one’s a mistake. I like the idea of file systems/object stores, but the network interconnect seems like a bottleneck. Storage traffic is shared with network traffic, creates some network spine bottlenecks, creates consistency performance bottlenecks between the NAS heads, and – let’s face it – people usually skimp on the number of 10GE ports on the server and in the top of rack switch. A typical SAS storage card now has 8 x 12G ports – that’s 96G of bandwidth. Will servers have 10 x 10G ports? Yea. I didn’t think so either.
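The bandwidth arithmetic above is easy to check. This small sketch uses only the raw line rates quoted in the text (8 SAS ports at 12G, Ethernet ports at 10G) and, as a simplifying assumption, ignores encoding and protocol overhead on both sides. The function names are just illustrative.

```python
import math

SAS_PORTS = 8            # ports on the storage card, from the text
SAS_GBPS_PER_PORT = 12   # raw line rate per SAS port, from the text
ETH_GBPS_PER_PORT = 10   # raw line rate per 10GE port


def aggregate_gbps(ports: int, gbps_per_port: int) -> int:
    """Raw aggregate bandwidth of a set of identical ports."""
    return ports * gbps_per_port


def ethernet_ports_needed(storage_gbps: int,
                          eth_gbps: int = ETH_GBPS_PER_PORT) -> int:
    """Whole Ethernet ports needed to carry the same raw bandwidth."""
    return math.ceil(storage_gbps / eth_gbps)


sas_total = aggregate_gbps(SAS_PORTS, SAS_GBPS_PER_PORT)
print(sas_total, "Gb/s of SAS bandwidth")
print(ethernet_ports_needed(sas_total), "x 10GE ports to match it")
```

And that count is for storage traffic alone, before any ordinary network traffic shares the same ports.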
Anyway – all this is not academic. One Wall Street bank shared with me that – hold your breath – it could save 70% of its spend going this route. It was shocked. I wasn’t shocked so much as disbelieving, because at first blush this seems absurd – not possible. That’s how I reacted. I laughed. But… The systems are simpler and less costly to make. There is simply less there to make or ship than OEMs force into the machines for uniqueness and “value.” They are purchased from much lower margin manufacturers. They have massively reduced maintenance costs (there’s less to service, and, well, no OEM service contracts). And also important – some of the incredibly expensive software licenses are flipped to open source equivalents. Net savings of 70%. Easy. Stop laughing.
Disaggregation: Or in other words, Pooled Resources
But probably the most important trend from all of this is what server manufacturers are calling “disaggregation” (hey – you’re ripping apart my server!) but architects are more descriptively calling pooled resources.
First – the intent of disaggregation is not to rip the parts of a server to pieces to get lowest pricing on the components. No. If you’re buying by the rack anyway – why not package so you can put like with like. Each part has its own life cycle after all. CPUs are 18 months. DRAM is several years. Flash might be 3 years. Disks can be 5 to 7 years. Networks are 5 to 10 years. Power supplies are… forever? Why not replace each on its own natural failure/upgrade cycle? Why not make enclosures appropriate to the technology they hold? Disk drives need solid vibration-free mechanical enclosures of heavy metal. Processors need strong cooling. Flash wants to run hot. DRAM cool.
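A rough way to see what those life cycles imply is to count component swaps over a planning horizon. The sketch below is an assumption-laden illustration: the 15-year horizon is invented, disks and networks use the low end of the ranges quoted above, and DRAM and power supplies are left out because the text gives no number for them.

```python
HORIZON_MONTHS = 180  # assumed 15-year planning horizon

# Life cycles in months, taken from the text where it gives a number.
LIFE_MONTHS = {
    "cpu": 18,
    "flash": 36,
    "disk": 60,      # low end of "5 to 7 years"
    "network": 60,   # low end of "5 to 10 years"
}


def refreshes(life_months: int, horizon_months: int = HORIZON_MONTHS) -> int:
    """How many times a part is replaced within the horizon."""
    return horizon_months // life_months


def independent_swaps(life: dict) -> int:
    """Total swaps when every part follows its own cycle."""
    return sum(refreshes(months) for months in life.values())


def lockstep_swaps(life: dict) -> int:
    """Total swaps when the whole box is replaced on the shortest cycle."""
    shortest = min(life.values())
    return refreshes(shortest) * len(life)


print("own cycle:", independent_swaps(LIFE_MONTHS))
print("lockstep :", lockstep_swaps(LIFE_MONTHS))
```

Under these assumptions, replacing everything whenever the CPU is refreshed nearly doubles the number of parts thrown away compared with letting each technology age out on its own schedule.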
Second – pooling allows really efficient use of resources. Systems need slush resources. What happens to a system that uses 100% of physical memory? It slows down a lot. If a database runs out of storage? It blue screens. If you don’t have enough network bandwidth? The result is, every server is over-provisioned for its task. Extra DRAM, extra network bandwidth, extra flash, extra disk drive spindles. If you have 1,000 nodes you can easily strand TBytes of DRAM, TBytes of flash, and a TByte/s of network bandwidth – all wasted capacity, and all of it always burning power. Worse, if you plan wrong and deploy servers with too little disk or flash or DRAM, there’s not much you can do about it. Now think 10,000 or 100,000 nodes… Ouch.
If you pool those things across 30 to 100 servers, you can allocate as needed to individual servers. Just as importantly, you can configure systems logically, not physically. That means you don’t have to be perfect in planning ahead what configurations and how many of each you’ll need. You have sub-assemblies you slap into a rack, and hook up by configuration scripts, and get efficient resource allocation that can change over time. You need a lot of storage? A little? Higher performance flash? Extra network bandwidth? Just configure them.
That’s a big deal.
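To make the stranded-capacity point concrete, here is a toy model. The demand figures are invented: four servers whose DRAM needs peak at different times. Dedicated provisioning has to size every server for its own peak, while a pool only has to cover the worst combined moment. Real pools also need headroom and have interconnect limits, which this sketch deliberately ignores.

```python
# demand_gb[server][time_step]: hypothetical DRAM demand in GB
demand_gb = [
    [64, 16, 16, 16],
    [16, 64, 16, 16],
    [16, 16, 64, 16],
    [16, 16, 16, 64],
]


def dedicated_capacity(demand: list) -> int:
    """Each server is sized for its own peak demand."""
    return sum(max(series) for series in demand)


def pooled_capacity(demand: list) -> int:
    """The pool is sized for the peak of the combined demand."""
    return max(sum(step) for step in zip(*demand))


def stranded(demand: list) -> int:
    """Capacity bought under dedicated sizing that a pool would not need."""
    return dedicated_capacity(demand) - pooled_capacity(demand)


print("dedicated:", dedicated_capacity(demand_gb), "GB")
print("pooled   :", pooled_capacity(demand_gb), "GB")
print("stranded :", stranded(demand_gb), "GB")
```

In this made-up case more than half of the dedicated DRAM is never in use at the same time, and that is with only four servers.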
And of course, this sets the stage for immense pooled main memory – once the next generation non-volatile memories are ready – probably starting around 2015.
You can’t overestimate the operational problems associated with different platforms at scale. Many hyperscale datacenters today have around 6 platforms. Because they roll out new versions of those before old ones are retired, they often have 3 generations of each. That’s 18 distinct platforms, with multiple software revisions of each. That starts to get crazy when you may have 200,000 to 400,000 servers to manage and maintain in a lights-out environment. Pooling resources and allocating them in the field goes a huge way to simplifying operations.
Alternate Processor Architecture
It wasn’t always Intel x86. There was a time when Intel was an upstart in the server business. It was Power, MIPS, Alpha, SPARC… (and before that IBM mainframes and minis, etc). Each of the changes was brought on by changing the cost structure. Mainframes got displaced by multi-processor RISC, which gave way to x86.
Today, we have Oracle saying they’re getting out of x86 commodity servers and doubling down on SPARC. IBM is selling off its x86 business and doubling down on Power (hey – don’t confuse that with PowerPC – which started as an architectural cut-down of Power – I was there…). And of course there is a rash of 64-bit ARM server SoCs coming – with HP and Dell already dabbling in it. What’s important to realize is that all of these offerings are focusing on the platform architecture, and how applications really perform in total, not just the processor.
Let me wrap up with an email thread cut/paste from a smart friend – Wayne Nation. I think he summed up some of what’s going on well, in a sobering way most people don’t even consider.
“Does this remind you of a time, long ago, when the market was exploding with companies that started to make servers out of those cheap little desktop x86 CPUs? What is different this time? Cost reduction and disaggregation? No, cost and disagg are important still, but not new.
A new CPU architecture? No, x86 was "new" before. ARM promises to reduce cost, as did Intel.
Disaggregation enables hyperscale datacenters to leverage vanity-free, but consistent delivery will determine the winning supplier. There is the potential for another Intel to rise from these other companies.”