Last week at LSI’s annual Accelerating Innovation Summit (AIS) the company took the wraps off a vision that should lead its technical direction for the next few years.
The LSI keynote featured a video of three situations as they might evolve in the future:
I’ll focus on just one of these to show how LSI expects the future to develop. In the bicycle accident scenario, a businessman falls to the ground while riding a bicycle in a foreign country. Security cameras that have been upgraded to understand what they see notify an emergency services agency which sends an ambulance to the scene. The paramedic performs a retinal scan on the victim, using it to retrieve his medical records, including his DNA sequence, from the web.
The businessman’s wearable body monitoring system also communicates with the paramedic’s instruments to share his vital signs. All of this information is used by cloud-based computers to determine a course of action which, in the video, requires an injection that has been custom-tuned to the victim’s current situation, his medical history, and his genetic makeup.
That’s a pretty tall order, and it will require several advances in the state of the art, but LSI is using this and other scenarios to work with its clients and translate this vision into the products of the future.
What are the key requirements to make this happen? Talwalkar told the audience that we need to create a society that is supported by preventive, predictive and assisted analytics to move in a direction where the general welfare is assisted by all that the Internet and advanced computing have to offer. Since data is growing at an exponential rate, he argued that this will require the instant retrieval of interlinked data objects at scale. Everything that is key to solving the task must be immediately available, and must be quickly analyzed to provide a solution to the problem at hand. The key will be the ability to process interlinked pieces of data that have not been previously structured to handle any particular situation.
To achieve this we will need larger-scale computing resources than are currently available, all closely interconnected, that all operate at very high speeds. LSI hopes to tap into these needs through its strengths in networking and communications chips for the communications, its HDD and server and storage connectivity array chips and boards for large-scale data, and its flash controller memory and PCIe SSD expertise for high performance.
LSI brought to AIS several of the customers and partners it is working with using to develop these technologies. Speakers from Intel, Microsoft, IBM, Toshiba, Ericsson and others showed how they are working with LSI’s various technologies to improve the performance of their own systems. On the exhibition floor booths from LSI and many of its clients demonstrated new technologies that performed everything from high-speed stock market analysis to fast flash management.
It’s pretty exciting to see a company that has a clear vision of its future and is committed to moving its entire ecosystem ahead to make that happen and help companies manage their business more effectively during what LSI calls the “Datacentric Era.” LSI has certainly put a lot of effort into creating a vision and determining where its talents can be brought to bear to improve our lives in the future.
Tags: AIS, chips, communications, connectivity, data, Datacentric Era, Ericsson, flash, flash memory, hard disk drive, HDD, IBM, Intel, large-scale data, Microsoft, Networking, server, Storage, Toshiba
The lifeblood of any online retailer is the speed of its IT infrastructure. Shoppers aren’t infinitely patient. Sluggish infrastructure performance can make shoppers wait precious seconds longer than they can stand, sending them fleeing to other sites for a faster purchase. Our federal government’s halting rollout of the Health Insurance Marketplace website is a glaring example of what can happen when IT infrastructure isn’t solid. A few bad user experiences that go viral can be damaging enough. Tens of thousands can be crippling.
In hyperscale datacenters, any number of problems including network issues, insufficient scaling and inconsistent management can undermine end users’ experience. But one that hits home for me is the impact of slow storage on the performance of databases, where the data sits. With the database at the heart of all those online transactions, retailers can ill afford to have their tier of database servers operating at anything less than peak performance.
Slow storage undermines database performance
Typically, Web 2.0 and e-commerce companies run relational databases (RDBs) on these massive server-centric infrastructures. (Take a look at my blog last week to get a feel for the size of these hyperscale datacenter infrastructures). If you are running that many servers to support millions of users, you are likely using some kind of open-sourced RDB such as MySQL or other variations. Keep in mind that Oracle 11gR2 likely retails around $30K per core but MSQL is free. But the performance of both, and most other relational databases, suffer immensely when transactions are retrieving data from storage (or disk). You can only throw so much RAM and CPU power at the performance problem … sooner rather than later you have to deal with slow storage.
Almost everyone in industry – Web 2.0, cloud, hyperscale and other providers of massive database infrastructures – is lining up to solve this problem the best way they can. How? By deploying flash as the sole storage for database servers and applications. But is low-latency flash enough? For sheer performance it beats rotational disk hands down. But … even flash storage has its limitations, most notably when you are trying to drive ultra-low latencies for write IOs. Most IO accesses by RDBs, which do the transactional processing, are a mix or read/writes to the storage. Specifically, the mix is 70%/30% reads/writes. These are also typically low q-depth accesses (less than 4). It is those writes that can really slow things down.
PCIe flash reduces write latencies
The good news is that the right PCIe flash technology in the mix can solve the slowdowns. Some interesting PCIe flash technologies designed to tackle this latency problem are on display at AIS this week. DRAM and in particular NVDRAM are being deployed as a tier in front of flash to really tackle those nasty write latencies.
Among other demos, we’re showing how a Nytro™ 6000 series PCIe flash card helps solve the MySQL database performance issues. The typical response time for a small data read (this is what the database will see for a Database IO) from an HDD is 5ms. Flash-based devices such as the Nytro WarpDrive® card can complete the same read in less than 50μs on average during testing, an improvement of several orders-of-magnitude in response time. This response time translates to getting much higher transactions out of the same infrastructure – but with less space (flash is denser) and a lot less power (flash consumes a lot lower power than HDDs).
We’re also showing the Nytro 7000 series PCIe flash cards. They reach even lower write latencies than the 6000 series and very low q-depths. The 7000 series cards also provide DRAM buffering while maintaining data-integrity even in the event of a power loss.
For online retailers and other businesses, higher database speeds mean more than just faster transactions. They can help keep those cash registers ringing.
Tags: AIS, database, DRAM, e-commerce, flash, flash memory, hard disk drive, HDD, hyperscale datacenter, latency, MySQL, NVDRAM, Nytro 6000, Nytro 7000, Nytro WarpDrive, Oracle, PCIe flash, relational database, storage latency, web 2.0, write latency
Try using a sledgehammer to pack 15 pounds of potatoes into a bag with a 5-pound capacity, and what do you end up with? Too much messy and disgusting material crammed into a vessel too small for the job and a lot of sloppy overspill.
Unless you have the right sledgehammer. What does this have to do with computer storage? Plenty. And it all starts with a new data-reshaping capability of LSI® DuraWrite™ technology. Keep the numbers in mind: 15 pounds of data in a 5-pound bag. Or three units of data in a space designed for one. It’s key. More on that later.
What does DuraWrite technology do again?
My blog Write Amplification (Part 2) talks about how DuraWrite data reduction technology can make more space for over provisioning. That translates into faster data transfers, longer flash memory endurance and lower power draw. What it has NOT done is increase the space available for data storage.
DuraWrite Virtual Capacity is like the Dr. Who TARDIS
Excuse my Dr. Who phone booth reference, but for those who know the TARDIS, it provides a great analogy. For those unfamiliar with the TARDIS, it is a fictional time machine that looks like a British police call box. It is very small by external appearances, but inside it is vast in its carrying capacity, taking occupants on an odyssey through time.
DuraWrite Virtual Capacity (DVC) is a new feature of our SandForce® SSD controllers, and it’s a bit like the TARDIS. While there is no time travel involved, it does provide a lot more than can be seen. DVC takes advantage of data entropy (randomness) as data is written to the SSD. Some people like to think of it as data compression. Whatever you call it, the end result is the same—less data is written to the flash than what is written from the host. DuraWrite technology alone will increase the over provisioning of an SSD, but DVC increases the user data storage available in an SSD (not the over provisioning).
How much more space can it add?
The efficiency of DVC is inversely related to the entropy. High entropy data like JPEG, encrypted and similar compressed files do little to increase data capacity. In contrast, files like Microsoft OfficeOutlook® PST, Oracle® databases, EXE, and DLL (operating system) files have much lower entropy and can increase usable storage space on the order of two to three times for the same physical flash memory. Yes, I said two to three times. Better still, that translates into a two to three times reduction in the net cost of the flash storage. Again, no typo: two to three times more affordable. Since most enterprise deployments of flash memory are limited by the cost per GB of the flash, this kind of advance has the potential to further accelerate flash memory deployments in the enterprise.
Why hasn’t this been done in the past?
Think TARDIS again. Step into the booth, and take a joy ride through space and time. It happens on-the-fly. Simple … but, only in a fictional world. With on-the-fly data reduction and compression, the process is filled with complexities. The biggest problem is most operating systems don’t understand that the maximum capacity of a primary storage device (hard disk or solid state drive) can increase or decrease over time. However, open source operating systems can address the issue with customized drivers.
The other problem is any storage device that includes data reduction or compression must use a variable mapping table to track the location of the data on the device once data is reduced. Hard disk drives (HDDs) do not require any kind of mapping table because the operating system can write new data over old data. However, the lack of a mapping table prevents HDDs from supporting an on-the-fly data reduction and compression system.
All solid state drives (SSDs) using NAND flash feature a basic mapping table, typically called the flash translation layer (FTL). This mapping table is required because NAND flash memory pages cannot be rewritten directly, but must first be erased in larger blocks. The SSD controller needs to relocate valid data while the old data gets erased. This process, called garbage collection, uses the mapping table. However, the data reduction and compression system requires a mapping table that is variable in size. Most SSDs lack that capability, but not those using a SandForce controller, making SSDs with SandForce controllers perfectly suited to the job.
What use cases can be applied to DVC?
DVC can be used to increase usable data storage space or provide more cache capacity flexibility by two to three times. To create more usable data storage space, the operating system must be altered with new primary storage device drivers for it to understand the drive’s maximum capacity, which can fluctuate over time based on how much the data is reduced or compressed.
To support greater cache capacity flexibility, a host controller would manage the flash memory directly as a cache. The controller would isolate the flash memory capacity from the host so the operating system does not even see it. The dynamic cache capacity would increase cache performance at a lower price per GB depending upon the entropy of the data. The LSI Nytro product line and some SandForce Driven® program SSDs already support both of these use cases.
When will this appear in my personal computer?
While DVC is already being deployed and evaluated in enterprise datacenters around the world, the use in personal computers will take a bit longer due to the need to have the operating system changed with new storage device drivers that understand the fluctuating maximum capacity.
When these operating system changes come together, you will not need that sledgehammer to pack more data into your TARDIS (SSD). Now that’s a space odyssey to write home about.
Each new generation of NAND flash memory reduces the fabrication geometry – the dimension of the smallest part of an integrated circuit used to build up the components inside the chip. That means there are fewer electrons storing the data, leading to increased errors and a shorter life for the flash memory. No need to worry. Today’s flash memory depends upon the intelligence and capabilities of the solid state drive (SSD) controller to help keep errors in check and get the longest life possible from flash memory, making it usable in compute environments like laptop computers and enterprise datacenters.
Today’s volume NAND flash memory uses a 20nm and 19nm manufacturing process, but the next generation will be in the 16nm range. Some experts speculate that today’s controllers will struggle to work with this next generation of flash memory to support the high number of write cycles required in datacenters. Also, the current multi-level cell (MLC) flash memory is transitioning to triple-level cell (TLC), which has an even shorter life expectancy and higher error rates.
Can sub-20nm flash survive in the datacenter?
Yes, but it will take a flash memory controller with smarts the industry has never seen before. How intelligent? Sub-20nm flash will need to stretch the life of the flash memory beyond the flash manufacturer’s specifications and correct far more errors than ever before, while still maintaining high throughput and very low latency. And to protect against periodic error correction algorithm failures, the flash will need some kind of redundancy (backup) of the data inside the SSD itself.
When will such a controller materialize?
LSI this week introduced the third generation of its flagship SSD and flash memory controller, called the SandForce SF3700. The controller is newly engineered and architected to solve the lifespan, performance, and reliability challenges of deploying sub-20nm flash memory in today’s performance-hungry enterprise datacenters. The SandForce SF3700 also enables longer periods between battery recharges for power-sipping client laptop and ultrabook systems. It all happens in a single ASIC package. The SandForce SF3700 is the first SSD controller to include both PCIe and SATA host interfaces natively in one chip to give customers of SSD manufacturers an easy migration path as more of them move to the faster PCIe host interface.
How does the SandForce SF3700 controller make sub-20nm flash excel in the datacenter?
Our new controller builds on the award-winning capabilities of the current SandForce SSD and flash controllers. We’ve refined our DuraWrite™ data reduction technology to streamline the way it picks blocks, collects garbage and reduces the write count. You’ll like the result: longer flash endurance and higher read and write speeds.
The SandForce SF3700 includes SHIELD™ error correction, which applies LDPC and DSP technology in unique ways to correct the higher error rates from the new generations of flash memory. SHIELD technology uses a multi-level error correction schema to optimize the time to get the correct data. Also, with its exclusive Adaptive Code Rate feature, SHIELD leverages DuraWrite technology’s ability to span internal NAND flash boundaries between the user data space and the flash manufacturer’s dedicated ECC field. Other controllers only use one size of ECC code rate for flash memory – the one largest size designed to support the end of the flash’s life. Early in the flash life, a much smaller size ECC is required, and SHIELD technology scales down the ECC accordingly, diverting the remaining free space as additional over provisioning. SHIELD partially increases the ECC size over time as the flash ages to correct the increasing failures, but does not use the largest ECC size until the flash is nearly at the end of its life.
Why is this good? Greater over provisioning over the life of the SSD improves performance and increases the endurance. SHIELD also allows the ECC field to grow even larger after it reaches its specified end of life. The big takeaway: All of these SHIELD capabilities increase flash write endurance many times beyond the manufacturer’s specification. In fact at the 2013 Flash Memory Summit exposition in Santa Clara, CA, SHIELD was shown to extend the endurance of a particular Micron NAND flash by nearly six times.
That’s not all. The SandForce SF3700 controller’s RAISE™ data reliability feature now offers stronger protection, including full die failure and more options for protecting data on SSDs with low (e.g., 32GB & 64GB) and binary (256GB vs. 240GB) capacities.
So what about end user systems?
The beauty of all SandForce flash and SSD controllers is its onboard firmware, which takes the one common hardware component – the ASIC – and adapts it to the user’s storage environment. For example, in client applications the firmware helps the controller preserve SSD power to enable users of laptop and ultrabook systems to remain unplugged longer between battery recharges. In contrast, enterprise environments require the highest possible performance and lowest latency. This higher performance draws more power, a tradeoff the enterprise is willing to make for the fastest time-to-data. The firmware makes other similar tradeoffs based on which storage environment it is serving.
Although most people consider the enterprise and client storage needs are very diverse, we think the new SandForce SF3700 Flash and SSD controller showcases the perfect balance of power and performance that any user hanging ten can appreciate.
Back in the 1990s, a new paradigm was forced into space exploration. NASA faced big cost cuts. But grand ambitions for missions to Mars were still on its mind. The problem was it couldn’t dream and spend big. So the NASA mantra became “faster, better, cheaper.” The idea was that the agency could slash costs while still carrying out a wide variety of programs and space missions. This led to some radical rethinks, and some fantastically successful programs that had very outside-the-box solutions. (Bouncing Mars landers anyone?)
That probably sounds familiar to any IT admin. And that spirit is alive at LSI’s AIS – The Accelerating Innovation Summit, which is our annual congress of customers and industry pros, coming up Nov. 20-21 in San Jose. Like the people at Mission Control, they all want to make big things happen… without spending too much.
Take technology and line of business professionals. They need to speed up critical business applications. A lot. Or IT staff for enterprise and mobile networks, who must deliver more work to support the ever-growing number of users, devices and virtualized machines that depend on them. Or consider mega datacenter and cloud service providers, whose customers demand the highest levels of service, yet get that service for free. Or datacenter architects and managers, who need servers, storage and networks to run at ever-greater efficiency even as they grow capability exponentially.
(LSI has been working on many solutions to these problems, some of which I spoke about in this blog.)
It’s all about moving data faster, better, and cheaper. If NASA could do it, we can too. In that vein, here’s a look at some of the topics you can expect AIS to address around doing more work for fewer dollars:
And, I think you’ll find some astounding products, demos, proof of concepts and future solutions in the showcase too – not just from LSI but from partners and fellow travelers in this industry. Hey – that’s my favorite part. I can’t wait to see people’s reactions.
Since they rethought how to do business in 2002, NASA has embarked on nearly 60 Mars missions. Faster, better, cheaper. It can work here in IT too.
Tags: 12Gb/s SAS, AIS, big data analytics, cloud infrastructure, cloud services, datacenter, flash, flash memory, hyperscale datacenters, NAS, NASA, SAN, SDN, shareable DAS, software-defined networks, sub-20nm flash, triple-level cell flash, VDI, web 2.0
All NAND flash-based SSDs use a process called garbage collection (GC) so the flash memory can be rewritten with fresh data, enabling the SSD to function like any other rewritable storage device. The number of rewrites (program/erase cycles) possible with NAND flash memory is finite. That’s why it’s essential to ensure that each P/E cycle really counts – that is, each is performed with top efficiency – to help preserve optimum SSD performance.
Collecting the garbage takes time
In my 2011 Flash Memory Summit presentation , I went into great detail about how GC – the automatic memory management process of clearing invalid data from memory to give new data a clean slate – works in an SSD. Here’s a recap: flash memory is organized in groups of pages where data can be written. Once a page is written, it cannot be rewritten until it is erased. Simple enough. But a page can only be erased within a group of typically 128 pages called a block. But wait. The complexity of writing data really starts to escalate in the case of random writes replacing previously written data. Random writes put the new data in previously erased pages elsewhere, peppering a block of valid data with “patches of invalid data.” In order to write new data to these patches, the whole block – all 128 pages – must be erased. But first all surrounding pages with valid data must be read and then rewritten to blank pages. The newly erased block of blank pages is then ready to save new data.
So why’s this a problem? This rewriting process shares the same path to the flash memory as new data arriving from the host system. You see the issue – a bottleneck. What you may not know is that this traffic jam can severely degrade overall write performance, sometimes as much as 90%.
Why not collect the garbage when the SSD is idle?
To improve write performance, many SSDs perform idle-time GC or background GC. When the SSD is idle – not performing reads or writes from the host system – the data paths to the flash memory are open. In a perfect world, the SSD controller would move all valid data into a contiguous group of new blocks so that all the free space would be consolidated into a few very large areas. Then, when new data arrived, the controller would send it directly to the fresh blocks and be spared from having to move data around just to free up space on pages of invalid data. But the world is far from perfect.
No free lunch, even from the garbage can
As might be expected, background or idle-time GC has drawbacks. The two main downsides are:
1. For users of Ultrabooks and other mobile systems, battery power is precious. The longer users can work unplugged between battery charges, the better. To make the most of a single charge, these systems use features like DevSleep to drastically reduce power to internal components not in use. At times when no data is being stored to or retrieved from the SSD, the host system gears down the SSD into a low power state (like DevSleep) to reduce power draw. In this state, an SSD with background or idle-time GC has no power to perform GC.
The upshot is that the SSD will be very slow when the system turns it back on and starts sending new data that must be saved in the spaces not yet cleared out by garbage-collected while the SSD was asleep. Alternately, the SSD may temporarily override the low power command from the host in order to perform the background GC, pulling more power from the battery and shortening the time remaining before it needs to be recharged.
2. When the SSD is performing GC, invalid data is ignored and only valid data is moved before the block is erased. Now imagine a large 2GB file on the SSD that the user plans to delete tomorrow. The SSD has no clue this will happen, so it automatically performs background GC around the 2GB file – and all other data – today, consuming one more of the very limited, precious program/erase cycles for all the flash holding that data. Ideally, the SSD would have waited one more day to garbage collect, the user would have already deleted the file, and the SSD wouldn’t have had to move all that data to new locations. No unnecessary data movement. No unnecessary use of a precious program/erase cycle.
Many people don’t realize that the number of background reads and writes initiated by the operating system, virus checkers, browsers, etc., far outstrip the number initiated by the computer user. Some users rarely delete files, believing they’ll extend the life of their SSD. The truth is, user file deletions are a drop in the bucket. It’s not their use of the computer storage that causes the most wear and tear. Background action from applications and the operating system is the real culprit.
Is there a better option?
What’s an SSD user to do? A super-fast foreground GC engine is the best solution. The key is special hardware and firmware integrated into the flash controller that streamlines garbage collection so it can run in the foreground with incoming data. The engine also enables high-speed writes to the flash memory. By maintaining high write speeds for GC operations, the SSD can afford to leave all valid data mixed with invalid data. That way the blocks are not recycled until absolutely necessary, dramatically reducing wear. Plus, the longer the SSD waits to GC pages, the higher the likelihood other pages of data will have already been made invalid by the operating system or user. The result is lower write amplification, longer flash endurance, and even higher performance.
All LSI® SandForce® Flash Controllers employ foreground GC to provide these invaluable benefits to the user.
Where did my email go?
This week I was dragged into to the virtualized cloud kicking and screaming … well, sort of. LSI has moved me, and all my co-workers, from my nice, safe local Exchange server to one in the amorphous, mysterious cloud. Scary. My IT department says the new cloud email is a great thing. They are now promising me unlimited email storage. Long gone will be the days of harrowing emails telling me I am approaching my storage limit and soon will be unable to send new email.
With cloud email, I can keep everything forever! I am not quite sure that saving mountains of email will be a good thing :-). Other than having to redirect my tablet and smartphone to this new service, update my webmail bookmark and empty my email inbox, there was not much I had to do. So far, so good. I have not experienced any challenges or performance issues. A key reason is flash storage.
To be sure, virtualization is a great tool for improving physical server utilization and flexibility as well as reducing power, cooling and datacenter footprint costs. That’s why the use of virtualization for email, databases and desktops is growing dramatically. But virtualized servers are only as effective as the storage performance that supports them. If, as a datacenter manager, your clients cannot access their application data quickly or boot their virtual desktop in a reasonable time, your company’s productivity and client satisfaction can drop dramatically.
Today, applications most likely run on virtualized servers. The upside of server virtualization is that a company can improve server utilization and run more applications on fewer physical servers. This can reduce power consumption, make more efficient use of datacenter floor space and make it easier to configure servers and deploy applications. The cloud also helps streamline application development, allowing companies to more efficiently and cost effectively test software applications across a broad set of configurations and operating systems.
A heated dispute – storage contention
Once application testing is complete, a virtual server’s configuration and image can be put on a virtual shelf until they are needed again, freeing up memory, processing, storage and other resources on the physical server for new virtual servers with just a few keystrokes. But with virtualization and the cloud there can be downsides, like slow performance – especially storage performance.
When a number of virtual servers are all using the same physical storage, there can be infighting for storage capacity, generally known as storage contention. These internecine battles can slow application response to a frustrating glacial pace and lead to issues like VDI Boot and Login Storm that can extend the time it takes for users to login to tens of minutes.
Flash helps alleviate slowdowns in storage performance
Here is where flash comes to the rescue. New flash storage solutions are being deployed to help improve virtualized storage performance and alleviate productivity-sapping slowdowns caused by VDI Boot and Login Storm — the crush of end users booting up or logging in within a small window that overwhelms the server with data requests and degrades response times. Flash can be used as primary storage inside servers running virtual machines to dramatically speed storage response time. Flash can also be deployed as an intelligent cache for DAS- or SAN-connected storage and even as an external shared storage pool.
It’s clear that virtualization will require higher storage performance and better, more cost-effective ways to deploy flash storage. But how much flash you need depends on your particular virtualization challenge, configuration and of course budget: while flash storage is extremely fast, it is costlier than disk-based storage. So choosing the right storage acceleration solution – one is LSI® Nytro™ Application Acceleration – can be as important as choosing the right cloud provider for your company’s email.
While my email is now stored in the cloud in Timbuktu, I know the flash storage solutions in that datacenter help keep my mail quickly accessible 24/7 whether I access it from my computer, tablet or smartphone, giving my productivity a boost. I can be assured that every action item I am sent will quickly make it to my inbox and be added to my ever-growing to-do list. Now my next big challenge is to improve my own response performance to those email requests!
Most consumers are skeptical when they see a manufacturer whipping out grandiose performance claims. And for good reason. The manufacturer could be stretching the truth, twisting the results, or just being downright misleading. From this distrust grew demand for 3rd-party writers to review products, test claims and provide an unbiased analysis of the device’s performance and other capabilities – as consumers would experience themselves.
Who can really claim to be an SSD benchmarking expert?
Solid state drive (SSD) technology is still relatively new in the computer industry, and in many ways SSDs are profoundly different from hard disk drives – perhaps most notably, in the way they record data, to a NAND cell rather than on spinning media. Because of differences in their operation, SSDs have to be tested in ways that are not necessarily obvious.
Can anyone who simply runs a benchmark application claim to be an expert? I would say not. Just as anyone sitting behind the wheel of a car is not necessarily an expert driver. The problem is that it is hard to determine the thoroughness and expertise of an SSD reviewer. Does the author really understand the details behind the technology to run adequate tests and analyze the results?
Can “experts” present bad data?
Maybe it is obvious, but of course experts can be wrong, especially when they are self-proclaimed mavens without deep experience in the technology they cover. At a minimum, you can generally count on them to act in good faith – that is, to not be intentionally misleading – but they can easily be misinformed (for instance, by manufacturers) and perpetuate the misinformation. What’s more, some reviewers are pressured to do a cursory analysis of an SSD as they crank through countless product evaluations under unremitting deadlines – a crush that can cause oversights in telling aspects of a drive’s performance. In any case, it is not good to rely on bad data no matter the intention.
What makes for a thorough SSD review?
Some reviewers have gone to great lengths to ensure their SSD analysis is extremely detailed and represents a real-world environment and performance. These reviewers will generally talk about how their analysis simulates a true user or server environment. The trouble can begin if a reviewer doesn’t recognize normal operation of an SSD in its own environment. With SSDs, “normal” is when garbage collection is operating, which greatly impacts overall performance. It’s important for reviewers to recognize that, with a new SSD, garbage collection is inactive until at least one full physical capacity of data has been sequentially written to the device. For example, with a 256GB SSD, 256GB of data must be written to trigger garbage collecting. At that point, garbage collection is ongoing, the drive has reached its steady-state performance, and the device is ready for evaluation. Random writes are another story, requiring up to three passes (full-capacity writes) randomly written to the SSD before the steady-state performance level shown below is reached.
You can see that running only a few minutes of random write tests on this SSD logs performance of over 275 MB/s. However, once garbage collection starts, performance plunges and then takes up to 3 hours before the true performance of 25 MB/s (a 90% drop) is finally evident – a phenomenon that often is not communicated clearly in reviews nor widely understood.
Good benchmarkers will discuss how their review factors in both garbage collection preparation and steady-state performance testing. Test results that purportedly achieve steady state in less time than in the example above are unlikely to reflect real-world performance. This is all part of what is called SSD preconditioning, but keep in mind that different tests require different steps for preconditioning.
For additional information on this topic, you can review my presentation from Flash Memory Summit 2013 on “Don’t let your favorite benchmarks lie to you.”
For the uninitiated, low-density parity-check (LDPC) code is an error correction code (ECC) that is used to both detect and correct errors on data that is transmitted from one point to another. All ECC types include correction data, so when information is transmitted with errors, the receiver has enough information to fix the errors without having to ask the source for the data again.
This enables transmitted data to maintain a constant speed as is required with digital television signals. What you don’t want is for the image to freeze repeatedly while waiting for correction data to be sent multiple times.
LDPC code was first presented to the world by Robert G. Gallager at MIT in 1960. It was very advanced for its time and, as it turned out, required a fantastic amount of computation to use real-time. The problem was that back in 1960, vacuum tube computers of that period performed about 100 times less work than the microprocessor-powered computers of today. In 1960, you would need a computer the size of a 2,000-square-foot house to process the LDPC correction information in real-time. This was hardly economical, so LDPC was mostly lost for nearly 40 years, as other simpler codes took its place over that period.
What was old is new again
In the mid-1990s, engineers working on satellite transmissions for digital television dusted off the LDPC codes and started using them for real-time operations. By then, computer processing had seen dramatic reductions in size and costs. Fast-forward to the past 5 years: we have seen a major increase in LDPC development and use because it appears to be the best solution for high-speed data transmissions, especially those subject to heavy levels of electrical noise that induce higher error rates. Also, the processing power of target devices like WiFi receivers and HDDs has grown even stronger and faster than some of the main CPUs of a few years ago. This enables LDPC to be deployed for little additional cost with the advantage of real-time data correction superior to correction offered by simpler codes.
If you have seen one LDPC solution, have you seen them all?
Nothing could be further from the truth. For example, an LDPC solution designed for satellite communication cannot be used for HDDs since there is no direct porting of the code, though there are distinct advantages to the two engineering teams sharing their knowledge and experience through their development efforts. Take, for example, the LDPC code that LSI has been shipping in its TrueStore® HDD read channel solutions for 3 years now. When LSI acquired SandForce and started work on SHIELD™ error correction code (based on LDPC) for flash controllers, there was no direct porting of that HDD code to support SSDs. However, the HDD development team’s knowledge and experience from creating the HDD code greatly improved the SSD team’s ability to more quickly bring SHIELD technology to the next-generation SandForce flash controller.
How do LDPC solutions for SSDs differ?
Many LDPC providers claim that their offerings rival the capabilities of competitive solutions, though often they aren’t telling the whole story. All LDPC solutions start with what is called hard-decision LDPC – a digital correction algorithm that operates at line rate on all data passing through the correction engine. The algorithm uses the meta-data generated from the user and system data stored on the flash memory, and helps recreate the user data when the flash memory returns it with errors. Hard-decision LDPC catches most errors from the flash memory, though sometimes it can be overwhelmed by an inordinate number of errors. That is where soft-decision LDPC – a more analog-based correction algorithm – comes into play.
Can a soft-decision be strong enough for my data?
Soft-decision LDPC is an error correction method that looks at other information beyond the actual ECC data. Soft-decision, in a sense, looks at the meta-data of the meta-data. The simplest form of soft-decision LDPC may just re-read the data at a different reference voltage, as if asking a person “can you say that again?” More complex soft-decision might be compared to listening to a man with a heavy French accent speaking English. You know he just said something in English, but you could not clearly grasp what he said. You ask some questions and, from his answers, soon realize what he originally said and are now back on track. While this might seem more like guessing at the answer, soft-decision LDPC uses statistics to help ensure the answers are not false positive results. As a result, soft-decision LDPC uncovers a new set of engineering problems that need to be solved, opening new opportunities for flash controller manufacturers to create powerful intellectual property (IP). For that reason, you’re likely to learn very little about how a given company’s soft-decision LDPC works.
At the 2013 Flash Memory Summit in Santa Clara, California, LSI demonstrated its SHIELD Advanced Error Correction Technology. SHIELD technology includes hard and soft-decision LDPC with digital signal processing (DSP) and a number of other unique features designed to optimize future NAND flash memory operation in compute environments. One feature, called Adaptive Code Rate, works with other LSI features to enable the spare area in flash memory reserved for ECC data to occupy less space than the manufacturer allocation and then dynamically grow to accommodate inevitable increases in flash error rates. The soft-decision LDPC capability offers multiple strengths of correction, with each activating only as necessary to ensure the lowest possible real-time latency.
So it’s clear that all LDPC solutions are far from the same. When evaluating LDPC solutions, be sure to understand how they manage data correction when it exceeds the ability of the hard-decision LDPC. Also, make sure the algorithms are actually in use in a product. Otherwise, the product might turn out to be a science experiment that never works.
Tags: Adaptive Code Rate, ECC, error correction, flash controllers, flash memory, Flash Memory Summit, LDPC, NAND flash, Robert Gallager, SandForce, SHIELD, solid state drive, SSD, TrueStore
Part two of this Write Amplification (WA) series covered how WA works in solid-state drives (SSDs) that use data reduction technology. I mentioned that, with one of these SSDs, the WA can be less than one, which can greatly improve flash memory performance and endurance.
Why is it important to know your SSD write amplification?
Well, it’s not really necessary to know the write amplification of your SSD at any particular point in time, but you do want an SSD with the lowest WA available. The reason is that the limited number of program/erase cycles that NAND flash can support keeps dropping with each generation of flash developed. A low WA will ensure the flash memory lasts longer than flash on an SSD with a higher WA.
A direct benefit of a WA below one is that the amount of dynamic over provisioning is higher, which generally provides higher performance. In the case of over provisioning, more is better, since a key attribute of SSD is performance. Keep in mind that, beyond selecting the best controller, you cannot control the WA of an SSD.
Just how smart are the SSD SMART attributes?
The monitoring system SMART (Self-Monitoring, Analysis and Reporting Technology) tracks various indicators of hard disk solid state drive reliability – including the number of errors corrected, bytes written, and power-on hours – to help anticipate failures, enabling users to replace storage before a failure causes data loss or system outages.
Some of these indicators, or attributes, point to the status of the drive health and others provide statistical information. While all manufacturers use many of these attributes in the same or a similar way, there is no standard definition for each attribute, so the meaning of any attribute can vary from one manufacturer to another. What’s more, there’s no requirement for drive manufacturers to list their SMART attributes.
How to measure missing attributes by extrapolation
Most SSDs provide some list of SMART attributes, but WA typically is excluded. However, with the right tests, you can sometimes extrapolate, with some accuracy, the WA value. We know that under normal conditions, an SSD will have a WA very close to 1:1 when writing data sequentially.
For an SSD with data reduction technology, you must write data with 100% entropy to ensure you identify the correct attributes, then rerun the tests with an entropy that matches your typical data workload to get a true WA calculation. SSDs without data reduction technology do not benefit from entropy, so the level of entropy used on them does not matter.
To measure missing attributes by extrapolation, start by performing a secure erase of the SSD, and then use a program to read all the current SMART attribute values. Some programs do not accurately display the true meaning of an attribute simply because the attribute itself contains no description. For you to know what each attribute represents, the program reading the attribute has to be pre-programmed by the manufacturer. The problem is that some programs mislabel some attributes. Therefore, you need to perform tests to confirm the attributes’ true meaning.
Start writing sequential data to the SSD, noting how much data is being written. Some programs will indicate exactly how much data the SSD has written, while others will reveal only the average data per second over a given period. Either way, the number of bytes written to the SSD will be clear. You want to write about 10 or more times the physical capacity of the SSD. This step is often completed with IOMeter, VDbench, or other programs that can send large measurable quantities of data.
At the end of the test period, print out the SMART attributes again and look for all attributes that have a different value than at the start of the test. Record the attribute number and the difference between the two test runs. You are trying to find one that represents a change of about 10, or the number of times you wrote to the entire capacity of the SSD. The attribute you are trying to find may represent the number of complete program/erase cycles, which would match your count almost exactly. You might also find an attribute that is counting the number of gigabytes (GBs) of data written from the host. To match that attribute, take the number of times you wrote to the entire SSD and multiply by the physical capacity of the flash. Technically, you already know how much you wrote from the host, but it is good to have the drive confirm that value.
Doing the math
When you find candidates that might be a match (you might have multiple attributes), secure erase the drive again, this time writing randomly with 4K transfers. Again, write about 10 times the physical capacity of the drive, then record the SMART attributes and calculate the difference from the last recording of the same attributes that changed between the first two recordings. This time, the change you see in the data written from the host should be nearly the same as with the sequential run. However, the attribute that represents the program/erase cycles (if present) will be many times higher than during the sequential run.
To calculate write amplification, use this equation:
( Number of erase cycles x Physical capacity in GB ) / Amount of data written from the host in GB
With sequential transfers, this number should be very close to 1. With random transfers, the number will be much higher depending on the SSD controller. Different SSDs will have different random WA values.
If you have an SSD with the type of data reduction technology used in the LSI® SandForce® controller, you will see lower and lower write amplification as you approach your lowest data entropy when you test with any entropy lower than 100%. With this method, you should be able to measure the write amplification of any SSD as long as it has erase cycles and host data-written attributes or something that closely represents them.
Protect your SSD against degraded performance
The key point to remember is that write amplification is the enemy of flash memory performance and endurance, and therefore the users of SSDs. This three-part series examined all the elements that affect WA, including the implications and advantages of a data reduction technology like the LSI SandForce DuraWrite™ technology. Once you understand how WA works and how to measure it, you will be better armed to defend yourself against this beastly cause of degraded SSD performance.
This three-part series examines some of the details behind write amplification, a key property of all NAND flash-based SSDs that is often misunderstood in the industry.