Try using a sledgehammer to pack 15 pounds of potatoes into a bag with a 5-pound capacity, and what do you end up with? Too much messy and disgusting material crammed into a vessel too small for the job and a lot of sloppy overspill.
Unless you have the right sledgehammer. What does this have to do with computer storage? Plenty. And it all starts with a new data-reshaping capability of LSI® DuraWrite™ technology. Keep the numbers in mind: 15 pounds of data in a 5-pound bag. Or three units of data in a space designed for one. It’s key. More on that later.
What does DuraWrite technology do again?
My blog Write Amplification (Part 2) talks about how DuraWrite data reduction technology can make more space for over provisioning. That translates into faster data transfers, longer flash memory endurance and lower power draw. What it has NOT done is increase the space available for data storage.
DuraWrite Virtual Capacity is like the Dr. Who TARDIS
Excuse my Dr. Who phone booth reference, but for those who know the TARDIS, it provides a great analogy. For those unfamiliar with the TARDIS, it is a fictional time machine that looks like a British police call box. It is very small by external appearances, but inside it is vast in its carrying capacity, taking occupants on an odyssey through time.
DuraWrite Virtual Capacity (DVC) is a new feature of our SandForce® SSD controllers, and it’s a bit like the TARDIS. While there is no time travel involved, it does provide a lot more than can be seen. DVC takes advantage of data entropy (randomness) as data is written to the SSD. Some people like to think of it as data compression. Whatever you call it, the end result is the same—less data is written to the flash than what is written from the host. DuraWrite technology alone will increase the over provisioning of an SSD, but DVC increases the user data storage available in an SSD (not the over provisioning).
How much more space can it add?
The efficiency of DVC is inversely related to the entropy. High entropy data like JPEG, encrypted and similar compressed files do little to increase data capacity. In contrast, files like Microsoft OfficeOutlook® PST, Oracle® databases, EXE, and DLL (operating system) files have much lower entropy and can increase usable storage space on the order of two to three times for the same physical flash memory. Yes, I said two to three times. Better still, that translates into a two to three times reduction in the net cost of the flash storage. Again, no typo: two to three times more affordable. Since most enterprise deployments of flash memory are limited by the cost per GB of the flash, this kind of advance has the potential to further accelerate flash memory deployments in the enterprise.
Why hasn’t this been done in the past?
Think TARDIS again. Step into the booth, and take a joy ride through space and time. It happens on-the-fly. Simple … but, only in a fictional world. With on-the-fly data reduction and compression, the process is filled with complexities. The biggest problem is most operating systems don’t understand that the maximum capacity of a primary storage device (hard disk or solid state drive) can increase or decrease over time. However, open source operating systems can address the issue with customized drivers.
The other problem is any storage device that includes data reduction or compression must use a variable mapping table to track the location of the data on the device once data is reduced. Hard disk drives (HDDs) do not require any kind of mapping table because the operating system can write new data over old data. However, the lack of a mapping table prevents HDDs from supporting an on-the-fly data reduction and compression system.
All solid state drives (SSDs) using NAND flash feature a basic mapping table, typically called the flash translation layer (FTL). This mapping table is required because NAND flash memory pages cannot be rewritten directly, but must first be erased in larger blocks. The SSD controller needs to relocate valid data while the old data gets erased. This process, called garbage collection, uses the mapping table. However, the data reduction and compression system requires a mapping table that is variable in size. Most SSDs lack that capability, but not those using a SandForce controller, making SSDs with SandForce controllers perfectly suited to the job.
What use cases can be applied to DVC?
DVC can be used to increase usable data storage space or provide more cache capacity flexibility by two to three times. To create more usable data storage space, the operating system must be altered with new primary storage device drivers for it to understand the drive’s maximum capacity, which can fluctuate over time based on how much the data is reduced or compressed.
To support greater cache capacity flexibility, a host controller would manage the flash memory directly as a cache. The controller would isolate the flash memory capacity from the host so the operating system does not even see it. The dynamic cache capacity would increase cache performance at a lower price per GB depending upon the entropy of the data. The LSI Nytro product line and some SandForce Driven® program SSDs already support both of these use cases.
When will this appear in my personal computer?
While DVC is already being deployed and evaluated in enterprise datacenters around the world, the use in personal computers will take a bit longer due to the need to have the operating system changed with new storage device drivers that understand the fluctuating maximum capacity.
When these operating system changes come together, you will not need that sledgehammer to pack more data into your TARDIS (SSD). Now that’s a space odyssey to write home about.
Each new generation of NAND flash memory reduces the fabrication geometry – the dimension of the smallest part of an integrated circuit used to build up the components inside the chip. That means there are fewer electrons storing the data, leading to increased errors and a shorter life for the flash memory. No need to worry. Today’s flash memory depends upon the intelligence and capabilities of the solid state drive (SSD) controller to help keep errors in check and get the longest life possible from flash memory, making it usable in compute environments like laptop computers and enterprise datacenters.
Today’s volume NAND flash memory uses a 20nm and 19nm manufacturing process, but the next generation will be in the 16nm range. Some experts speculate that today’s controllers will struggle to work with this next generation of flash memory to support the high number of write cycles required in datacenters. Also, the current multi-level cell (MLC) flash memory is transitioning to triple-level cell (TLC), which has an even shorter life expectancy and higher error rates.
Can sub-20nm flash survive in the datacenter?
Yes, but it will take a flash memory controller with smarts the industry has never seen before. How intelligent? Sub-20nm flash will need to stretch the life of the flash memory beyond the flash manufacturer’s specifications and correct far more errors than ever before, while still maintaining high throughput and very low latency. And to protect against periodic error correction algorithm failures, the flash will need some kind of redundancy (backup) of the data inside the SSD itself.
When will such a controller materialize?
LSI this week introduced the third generation of its flagship SSD and flash memory controller, called the SandForce SF3700. The controller is newly engineered and architected to solve the lifespan, performance, and reliability challenges of deploying sub-20nm flash memory in today’s performance-hungry enterprise datacenters. The SandForce SF3700 also enables longer periods between battery recharges for power-sipping client laptop and ultrabook systems. It all happens in a single ASIC package. The SandForce SF3700 is the first SSD controller to include both PCIe and SATA host interfaces natively in one chip to give customers of SSD manufacturers an easy migration path as more of them move to the faster PCIe host interface.
How does the SandForce SF3700 controller make sub-20nm flash excel in the datacenter?
Our new controller builds on the award-winning capabilities of the current SandForce SSD and flash controllers. We’ve refined our DuraWrite™ data reduction technology to streamline the way it picks blocks, collects garbage and reduces the write count. You’ll like the result: longer flash endurance and higher read and write speeds.
The SandForce SF3700 includes SHIELD™ error correction, which applies LDPC and DSP technology in unique ways to correct the higher error rates from the new generations of flash memory. SHIELD technology uses a multi-level error correction schema to optimize the time to get the correct data. Also, with its exclusive Adaptive Code Rate feature, SHIELD leverages DuraWrite technology’s ability to span internal NAND flash boundaries between the user data space and the flash manufacturer’s dedicated ECC field. Other controllers only use one size of ECC code rate for flash memory – the one largest size designed to support the end of the flash’s life. Early in the flash life, a much smaller size ECC is required, and SHIELD technology scales down the ECC accordingly, diverting the remaining free space as additional over provisioning. SHIELD partially increases the ECC size over time as the flash ages to correct the increasing failures, but does not use the largest ECC size until the flash is nearly at the end of its life.
Why is this good? Greater over provisioning over the life of the SSD improves performance and increases the endurance. SHIELD also allows the ECC field to grow even larger after it reaches its specified end of life. The big takeaway: All of these SHIELD capabilities increase flash write endurance many times beyond the manufacturer’s specification. In fact at the 2013 Flash Memory Summit exposition in Santa Clara, CA, SHIELD was shown to extend the endurance of a particular Micron NAND flash by nearly six times.
That’s not all. The SandForce SF3700 controller’s RAISE™ data reliability feature now offers stronger protection, including full die failure and more options for protecting data on SSDs with low (e.g., 32GB & 64GB) and binary (256GB vs. 240GB) capacities.
So what about end user systems?
The beauty of all SandForce flash and SSD controllers is its onboard firmware, which takes the one common hardware component – the ASIC – and adapts it to the user’s storage environment. For example, in client applications the firmware helps the controller preserve SSD power to enable users of laptop and ultrabook systems to remain unplugged longer between battery recharges. In contrast, enterprise environments require the highest possible performance and lowest latency. This higher performance draws more power, a tradeoff the enterprise is willing to make for the fastest time-to-data. The firmware makes other similar tradeoffs based on which storage environment it is serving.
Although most people consider the enterprise and client storage needs are very diverse, we think the new SandForce SF3700 Flash and SSD controller showcases the perfect balance of power and performance that any user hanging ten can appreciate.
All NAND flash-based SSDs use a process called garbage collection (GC) so the flash memory can be rewritten with fresh data, enabling the SSD to function like any other rewritable storage device. The number of rewrites (program/erase cycles) possible with NAND flash memory is finite. That’s why it’s essential to ensure that each P/E cycle really counts – that is, each is performed with top efficiency – to help preserve optimum SSD performance.
Collecting the garbage takes time
In my 2011 Flash Memory Summit presentation , I went into great detail about how GC – the automatic memory management process of clearing invalid data from memory to give new data a clean slate – works in an SSD. Here’s a recap: flash memory is organized in groups of pages where data can be written. Once a page is written, it cannot be rewritten until it is erased. Simple enough. But a page can only be erased within a group of typically 128 pages called a block. But wait. The complexity of writing data really starts to escalate in the case of random writes replacing previously written data. Random writes put the new data in previously erased pages elsewhere, peppering a block of valid data with “patches of invalid data.” In order to write new data to these patches, the whole block – all 128 pages – must be erased. But first all surrounding pages with valid data must be read and then rewritten to blank pages. The newly erased block of blank pages is then ready to save new data.
So why’s this a problem? This rewriting process shares the same path to the flash memory as new data arriving from the host system. You see the issue – a bottleneck. What you may not know is that this traffic jam can severely degrade overall write performance, sometimes as much as 90%.
Why not collect the garbage when the SSD is idle?
To improve write performance, many SSDs perform idle-time GC or background GC. When the SSD is idle – not performing reads or writes from the host system – the data paths to the flash memory are open. In a perfect world, the SSD controller would move all valid data into a contiguous group of new blocks so that all the free space would be consolidated into a few very large areas. Then, when new data arrived, the controller would send it directly to the fresh blocks and be spared from having to move data around just to free up space on pages of invalid data. But the world is far from perfect.
No free lunch, even from the garbage can
As might be expected, background or idle-time GC has drawbacks. The two main downsides are:
1. For users of Ultrabooks and other mobile systems, battery power is precious. The longer users can work unplugged between battery charges, the better. To make the most of a single charge, these systems use features like DevSleep to drastically reduce power to internal components not in use. At times when no data is being stored to or retrieved from the SSD, the host system gears down the SSD into a low power state (like DevSleep) to reduce power draw. In this state, an SSD with background or idle-time GC has no power to perform GC.
The upshot is that the SSD will be very slow when the system turns it back on and starts sending new data that must be saved in the spaces not yet cleared out by garbage-collected while the SSD was asleep. Alternately, the SSD may temporarily override the low power command from the host in order to perform the background GC, pulling more power from the battery and shortening the time remaining before it needs to be recharged.
2. When the SSD is performing GC, invalid data is ignored and only valid data is moved before the block is erased. Now imagine a large 2GB file on the SSD that the user plans to delete tomorrow. The SSD has no clue this will happen, so it automatically performs background GC around the 2GB file – and all other data – today, consuming one more of the very limited, precious program/erase cycles for all the flash holding that data. Ideally, the SSD would have waited one more day to garbage collect, the user would have already deleted the file, and the SSD wouldn’t have had to move all that data to new locations. No unnecessary data movement. No unnecessary use of a precious program/erase cycle.
Many people don’t realize that the number of background reads and writes initiated by the operating system, virus checkers, browsers, etc., far outstrip the number initiated by the computer user. Some users rarely delete files, believing they’ll extend the life of their SSD. The truth is, user file deletions are a drop in the bucket. It’s not their use of the computer storage that causes the most wear and tear. Background action from applications and the operating system is the real culprit.
Is there a better option?
What’s an SSD user to do? A super-fast foreground GC engine is the best solution. The key is special hardware and firmware integrated into the flash controller that streamlines garbage collection so it can run in the foreground with incoming data. The engine also enables high-speed writes to the flash memory. By maintaining high write speeds for GC operations, the SSD can afford to leave all valid data mixed with invalid data. That way the blocks are not recycled until absolutely necessary, dramatically reducing wear. Plus, the longer the SSD waits to GC pages, the higher the likelihood other pages of data will have already been made invalid by the operating system or user. The result is lower write amplification, longer flash endurance, and even higher performance.
All LSI® SandForce® Flash Controllers employ foreground GC to provide these invaluable benefits to the user.
Most consumers are skeptical when they see a manufacturer whipping out grandiose performance claims. And for good reason. The manufacturer could be stretching the truth, twisting the results, or just being downright misleading. From this distrust grew demand for 3rd-party writers to review products, test claims and provide an unbiased analysis of the device’s performance and other capabilities – as consumers would experience themselves.
Who can really claim to be an SSD benchmarking expert?
Solid state drive (SSD) technology is still relatively new in the computer industry, and in many ways SSDs are profoundly different from hard disk drives – perhaps most notably, in the way they record data, to a NAND cell rather than on spinning media. Because of differences in their operation, SSDs have to be tested in ways that are not necessarily obvious.
Can anyone who simply runs a benchmark application claim to be an expert? I would say not. Just as anyone sitting behind the wheel of a car is not necessarily an expert driver. The problem is that it is hard to determine the thoroughness and expertise of an SSD reviewer. Does the author really understand the details behind the technology to run adequate tests and analyze the results?
Can “experts” present bad data?
Maybe it is obvious, but of course experts can be wrong, especially when they are self-proclaimed mavens without deep experience in the technology they cover. At a minimum, you can generally count on them to act in good faith – that is, to not be intentionally misleading – but they can easily be misinformed (for instance, by manufacturers) and perpetuate the misinformation. What’s more, some reviewers are pressured to do a cursory analysis of an SSD as they crank through countless product evaluations under unremitting deadlines – a crush that can cause oversights in telling aspects of a drive’s performance. In any case, it is not good to rely on bad data no matter the intention.
What makes for a thorough SSD review?
Some reviewers have gone to great lengths to ensure their SSD analysis is extremely detailed and represents a real-world environment and performance. These reviewers will generally talk about how their analysis simulates a true user or server environment. The trouble can begin if a reviewer doesn’t recognize normal operation of an SSD in its own environment. With SSDs, “normal” is when garbage collection is operating, which greatly impacts overall performance. It’s important for reviewers to recognize that, with a new SSD, garbage collection is inactive until at least one full physical capacity of data has been sequentially written to the device. For example, with a 256GB SSD, 256GB of data must be written to trigger garbage collecting. At that point, garbage collection is ongoing, the drive has reached its steady-state performance, and the device is ready for evaluation. Random writes are another story, requiring up to three passes (full-capacity writes) randomly written to the SSD before the steady-state performance level shown below is reached.
You can see that running only a few minutes of random write tests on this SSD logs performance of over 275 MB/s. However, once garbage collection starts, performance plunges and then takes up to 3 hours before the true performance of 25 MB/s (a 90% drop) is finally evident – a phenomenon that often is not communicated clearly in reviews nor widely understood.
Good benchmarkers will discuss how their review factors in both garbage collection preparation and steady-state performance testing. Test results that purportedly achieve steady state in less time than in the example above are unlikely to reflect real-world performance. This is all part of what is called SSD preconditioning, but keep in mind that different tests require different steps for preconditioning.
For additional information on this topic, you can review my presentation from Flash Memory Summit 2013 on “Don’t let your favorite benchmarks lie to you.”
For the uninitiated, low-density parity-check (LDPC) code is an error correction code (ECC) that is used to both detect and correct errors on data that is transmitted from one point to another. All ECC types include correction data, so when information is transmitted with errors, the receiver has enough information to fix the errors without having to ask the source for the data again.
This enables transmitted data to maintain a constant speed as is required with digital television signals. What you don’t want is for the image to freeze repeatedly while waiting for correction data to be sent multiple times.
LDPC code was first presented to the world by Robert G. Gallager at MIT in 1960. It was very advanced for its time and, as it turned out, required a fantastic amount of computation to use real-time. The problem was that back in 1960, vacuum tube computers of that period performed about 100 times less work than the microprocessor-powered computers of today. In 1960, you would need a computer the size of a 2,000-square-foot house to process the LDPC correction information in real-time. This was hardly economical, so LDPC was mostly lost for nearly 40 years, as other simpler codes took its place over that period.
What was old is new again
In the mid-1990s, engineers working on satellite transmissions for digital television dusted off the LDPC codes and started using them for real-time operations. By then, computer processing had seen dramatic reductions in size and costs. Fast-forward to the past 5 years: we have seen a major increase in LDPC development and use because it appears to be the best solution for high-speed data transmissions, especially those subject to heavy levels of electrical noise that induce higher error rates. Also, the processing power of target devices like WiFi receivers and HDDs has grown even stronger and faster than some of the main CPUs of a few years ago. This enables LDPC to be deployed for little additional cost with the advantage of real-time data correction superior to correction offered by simpler codes.
If you have seen one LDPC solution, have you seen them all?
Nothing could be further from the truth. For example, an LDPC solution designed for satellite communication cannot be used for HDDs since there is no direct porting of the code, though there are distinct advantages to the two engineering teams sharing their knowledge and experience through their development efforts. Take, for example, the LDPC code that LSI has been shipping in its TrueStore® HDD read channel solutions for 3 years now. When LSI acquired SandForce and started work on SHIELD™ error correction code (based on LDPC) for flash controllers, there was no direct porting of that HDD code to support SSDs. However, the HDD development team’s knowledge and experience from creating the HDD code greatly improved the SSD team’s ability to more quickly bring SHIELD technology to the next-generation SandForce flash controller.
How do LDPC solutions for SSDs differ?
Many LDPC providers claim that their offerings rival the capabilities of competitive solutions, though often they aren’t telling the whole story. All LDPC solutions start with what is called hard-decision LDPC – a digital correction algorithm that operates at line rate on all data passing through the correction engine. The algorithm uses the meta-data generated from the user and system data stored on the flash memory, and helps recreate the user data when the flash memory returns it with errors. Hard-decision LDPC catches most errors from the flash memory, though sometimes it can be overwhelmed by an inordinate number of errors. That is where soft-decision LDPC – a more analog-based correction algorithm – comes into play.
Can a soft-decision be strong enough for my data?
Soft-decision LDPC is an error correction method that looks at other information beyond the actual ECC data. Soft-decision, in a sense, looks at the meta-data of the meta-data. The simplest form of soft-decision LDPC may just re-read the data at a different reference voltage, as if asking a person “can you say that again?” More complex soft-decision might be compared to listening to a man with a heavy French accent speaking English. You know he just said something in English, but you could not clearly grasp what he said. You ask some questions and, from his answers, soon realize what he originally said and are now back on track. While this might seem more like guessing at the answer, soft-decision LDPC uses statistics to help ensure the answers are not false positive results. As a result, soft-decision LDPC uncovers a new set of engineering problems that need to be solved, opening new opportunities for flash controller manufacturers to create powerful intellectual property (IP). For that reason, you’re likely to learn very little about how a given company’s soft-decision LDPC works.
At the 2013 Flash Memory Summit in Santa Clara, California, LSI demonstrated its SHIELD Advanced Error Correction Technology. SHIELD technology includes hard and soft-decision LDPC with digital signal processing (DSP) and a number of other unique features designed to optimize future NAND flash memory operation in compute environments. One feature, called Adaptive Code Rate, works with other LSI features to enable the spare area in flash memory reserved for ECC data to occupy less space than the manufacturer allocation and then dynamically grow to accommodate inevitable increases in flash error rates. The soft-decision LDPC capability offers multiple strengths of correction, with each activating only as necessary to ensure the lowest possible real-time latency.
So it’s clear that all LDPC solutions are far from the same. When evaluating LDPC solutions, be sure to understand how they manage data correction when it exceeds the ability of the hard-decision LDPC. Also, make sure the algorithms are actually in use in a product. Otherwise, the product might turn out to be a science experiment that never works.
Tags: Adaptive Code Rate, ECC, error correction, flash controllers, flash memory, Flash Memory Summit, LDPC, NAND flash, Robert Gallager, SandForce, SHIELD, solid state drive, SSD, TrueStore
Part two of this Write Amplification (WA) series covered how WA works in solid-state drives (SSDs) that use data reduction technology. I mentioned that, with one of these SSDs, the WA can be less than one, which can greatly improve flash memory performance and endurance.
Why is it important to know your SSD write amplification?
Well, it’s not really necessary to know the write amplification of your SSD at any particular point in time, but you do want an SSD with the lowest WA available. The reason is that the limited number of program/erase cycles that NAND flash can support keeps dropping with each generation of flash developed. A low WA will ensure the flash memory lasts longer than flash on an SSD with a higher WA.
A direct benefit of a WA below one is that the amount of dynamic over provisioning is higher, which generally provides higher performance. In the case of over provisioning, more is better, since a key attribute of SSD is performance. Keep in mind that, beyond selecting the best controller, you cannot control the WA of an SSD.
Just how smart are the SSD SMART attributes?
The monitoring system SMART (Self-Monitoring, Analysis and Reporting Technology) tracks various indicators of hard disk solid state drive reliability – including the number of errors corrected, bytes written, and power-on hours – to help anticipate failures, enabling users to replace storage before a failure causes data loss or system outages.
Some of these indicators, or attributes, point to the status of the drive health and others provide statistical information. While all manufacturers use many of these attributes in the same or a similar way, there is no standard definition for each attribute, so the meaning of any attribute can vary from one manufacturer to another. What’s more, there’s no requirement for drive manufacturers to list their SMART attributes.
How to measure missing attributes by extrapolation
Most SSDs provide some list of SMART attributes, but WA typically is excluded. However, with the right tests, you can sometimes extrapolate, with some accuracy, the WA value. We know that under normal conditions, an SSD will have a WA very close to 1:1 when writing data sequentially.
For an SSD with data reduction technology, you must write data with 100% entropy to ensure you identify the correct attributes, then rerun the tests with an entropy that matches your typical data workload to get a true WA calculation. SSDs without data reduction technology do not benefit from entropy, so the level of entropy used on them does not matter.
To measure missing attributes by extrapolation, start by performing a secure erase of the SSD, and then use a program to read all the current SMART attribute values. Some programs do not accurately display the true meaning of an attribute simply because the attribute itself contains no description. For you to know what each attribute represents, the program reading the attribute has to be pre-programmed by the manufacturer. The problem is that some programs mislabel some attributes. Therefore, you need to perform tests to confirm the attributes’ true meaning.
Start writing sequential data to the SSD, noting how much data is being written. Some programs will indicate exactly how much data the SSD has written, while others will reveal only the average data per second over a given period. Either way, the number of bytes written to the SSD will be clear. You want to write about 10 or more times the physical capacity of the SSD. This step is often completed with IOMeter, VDbench, or other programs that can send large measurable quantities of data.
At the end of the test period, print out the SMART attributes again and look for all attributes that have a different value than at the start of the test. Record the attribute number and the difference between the two test runs. You are trying to find one that represents a change of about 10, or the number of times you wrote to the entire capacity of the SSD. The attribute you are trying to find may represent the number of complete program/erase cycles, which would match your count almost exactly. You might also find an attribute that is counting the number of gigabytes (GBs) of data written from the host. To match that attribute, take the number of times you wrote to the entire SSD and multiply by the physical capacity of the flash. Technically, you already know how much you wrote from the host, but it is good to have the drive confirm that value.
Doing the math
When you find candidates that might be a match (you might have multiple attributes), secure erase the drive again, this time writing randomly with 4K transfers. Again, write about 10 times the physical capacity of the drive, then record the SMART attributes and calculate the difference from the last recording of the same attributes that changed between the first two recordings. This time, the change you see in the data written from the host should be nearly the same as with the sequential run. However, the attribute that represents the program/erase cycles (if present) will be many times higher than during the sequential run.
To calculate write amplification, use this equation:
( Number of erase cycles x Physical capacity in GB ) / Amount of data written from the host in GB
With sequential transfers, this number should be very close to 1. With random transfers, the number will be much higher depending on the SSD controller. Different SSDs will have different random WA values.
If you have an SSD with the type of data reduction technology used in the LSI® SandForce® controller, you will see lower and lower write amplification as you approach your lowest data entropy when you test with any entropy lower than 100%. With this method, you should be able to measure the write amplification of any SSD as long as it has erase cycles and host data-written attributes or something that closely represents them.
Protect your SSD against degraded performance
The key point to remember is that write amplification is the enemy of flash memory performance and endurance, and therefore the users of SSDs. This three-part series examined all the elements that affect WA, including the implications and advantages of a data reduction technology like the LSI SandForce DuraWrite™ technology. Once you understand how WA works and how to measure it, you will be better armed to defend yourself against this beastly cause of degraded SSD performance.
This three-part series examines some of the details behind write amplification, a key property of all NAND flash-based SSDs that is often misunderstood in the industry.
In part one of this Write Amplification (WA) series, I examined how WA works in basic solid-state drives (SSDs). Part two now takes a deeper look at how SSDs that use some form of data reduction technology can see a very big and positive impact on WA.
Data reduction technology can master data entropy
The performance of all SSDs is influenced by the same factors – such as the amount of over provisioning and levels of random vs. sequential writing – with one major exception: entropy. Only SSDs with data reduction technology can take advantage of entropy – the degree of randomness of data – to provide significant performance, endurance and power-reduction advantages.
Data reduction technology parlays data entropy (not to be confused with how data is written to the storage device – sequential vs. random) into higher performance. How? When data reduction technology sends data to the flash memory, it uses some form of data de-duplication, compression, or data differencing to rearrange the information and use fewer bytes overall. After the data is read from flash memory, data reduction technology, by design, restores 100% of the original content to the host computer This is known as “loss-less” data reduction and can be contrasted with “lossy” techniques like MPEG, MP3, JPEG, and other common formats used for video, audio, and visual data files. These formats lose information that cannot be restored, though the resolution remains adequate for entertainment purposes.
The multi-faceted power of data reduction technology
My previous blog on data reduction discusses how data reduction technology relates to the SATA TRIM command and increases free space on the SSD, which in turn reduces WA and improves subsequent write performance. With a data-reduction SSD, the lower the entropy of the data coming from the host computer, the less the SSD has to write to the flash memory, leaving more space for over provisioning. This additional space enables write operations to complete faster, which translates not only into a higher write speed at the host computer but also into lower power use because flash memory draws power only while reading or writing. Higher write speeds also mean lower power draw for the flash memory.
Because data reduction technology can send less data to the flash than the host originally sent to the SSD, the typical write amplification factor falls below 1.0. It is not uncommon to see a WA of 0.5 on an SSD with this technology. Writing less data to the flash leads directly to:
Each of these in turn produces other benefits, some of which circle back upon themselves in a recursive manner. This logic diagram highlights those benefits.
So this is a rare instance when an amplifier – namely, Write Amplification – makes something smaller. At LSI, this unique amplifier comes in the form of the LSI® DuraWrite™ data reduction technology in all SandForce® Driven™ SSDs.
This three-part series examines some of the details behind write amplification, a key property of all NAND flash-based SSDs that is often misunderstood in the industry.
In today’s solid state drives (SSDs), the NAND flash memory must be erased before it can store new data. In other words, data cannot be overwritten directly as it is in a hard disk drive (HDD). Instead, SSDs use a process called garbage collection (GC) to reclaim the space taken by previously stored data. This means that write demands are heavier on SSDs than HDDs when storing the same information.
This is bad because the flash memory in the SSD supports only a limited number of writes before it can no longer be read. We call this undesirable effect write amplification (WA). In my blog, Gassing up your SSD, I describe why WA exists in a little more detail, but here I will explain what controls it.
It’s all about the free space
I often tell people that SSDs work better with more free space, so anything that increases free space will keep WA lower. The two key ways to expand free space (thereby decreasing WA) are to 1) increase over provisioning and 2) keep more storage space free (if you have TRIM support).
As I said earlier, there is no WA before GC is active. However, this pristine pre-GC condition has a tiny life span – just one full-capacity write cycle during a “fresh-out-of-box” (FOB) state, which accounts for less than 0.04% of the SSD’s life. Although you can manually recreate this condition with a secure erase, the cost is an additional write cycle, which defeats the purpose. Also keep in mind that the GC efficiency and associated wear leveling algorithms can affect WA (more efficient = lower WA).
The other major contributor to WA is the organization of the free space (how data is written to the flash). When data is written randomly, the eventual replacement data will also likely come in randomly, so some pages of a block will be replaced (made invalid) and others will still be good (valid). During GC, valid data in blocks like this needs to be rewritten to new blocks. This produces another write to the flash for each valid page, causing write amplification.
With sequential writes, generally all the data in the pages of the block becomes invalid at the same time. As a result, no data needs relocating during GC since there is no valid data remaining in the block before it is erased. In this case, there is no amplification, but other things like wear leveling on blocks that don’t change will still eventually produce some write amplification no matter how data is written.
Calculating write amplification
Write Amplification is fundamentally the result of data written to the flash memory divided by data written by the host. In 2008, both Intel and SiliconSystems (acquired by Western Digital) were the first to start talking publically about WA. At that time, the WA of all SSDs was something greater than 1.0. It was not until SandForce introduced the first SSD controller with DuraWrite™ technology in 2009 that WA could fall below 1.0. DuraWrite technology increases the free space mentioned above, but in a way that is unique from other SSD controllers. In part two of my write amplification series, I will explain how DuraWrite technology works.
This three-part series examines some of the details behind write amplification, a key property of all NAND flash-based SSDs that is often misunderstood in the industry.
It is always good to hear the opinions of your customers and end users, and in that respect June was a banner month for LSI® SandForce® flash controllers.
In a survey soliciting responses from more than 1 million members of on-line groups and other sources by IT Brand Pulse, an independent product testing and validation lab, LSI SandForce controllers ranked at the top of all six SSD controller chip sub-categories: market, price, performance, reliability, service and support, and innovation. Last August, the LSI SandForce controllers won in three of the six sub-categories, so we’re thrilled to see momentum building.
Winning all six awards is no easy task. Some of the sub-categories could be considered mutually exclusive, requiring customers to make trade-offs among product attributes. For example, often a product with the best price is considered to have skimped on quality compared to pricier solutions. A product with screaming performance, ironically, is seen as something of a market laggard because it usually does not carry the best price. So it is exciting to strike the right balance among all six measures and sweep the product category. You can find more details on the awards here: http://itbrandpulse.com/research/brand-leader-program/225-ssd-controller-chips-2013
For those of us in product marketing, winning product awards voted on by your peers can bring on a feeling similar to that warm afterglow parents bask in when they hear their child has made the honor roll or was named the valedictorian for his or her graduating class.
So please pardon us, for a moment, as we beam with pride.
It seems like our smartphones are getting bigger and bigger with each generation. Sometimes I see people holding what appears to be a tablet computer up next to their head. I doubt they know how ridiculous they look to the rest of us, and I wonder what pants today have pockets that big.
I certainly do like the convenience of the instant-on capabilities my smart phone gives me, but I still need my portable computer with its big screen and keyboard separate from my phone.
A few years ago, SATA-IO, the standards body, added a new feature to the Serial ATA (SATA) specification designed to further reduce battery consumption in portable computer products. This new feature, DevSleep, enables solid state drives (SSDs) to act more like smartphones, allowing you to go days without plugging in to recharge and then instantly turn them on and see all the latest email, social media updates, news and events.
Why not just switch the system off?
When most PC users think about switching off their system, they dread waiting for the operating system to boot back up. That is one of the key advantages of replacing the hard disk drive (HDD) in the system with a faster SSD. However, in our instant gratification society, we hate to wait even seconds for web pages to come up, so waiting minutes for your PC to turn on and boot up can feel like an eternity. Therefore, many people choose to leave the system on to save those precious moments… but at the expense of battery life.
Can I get this today?
To further extend battery life, the new DevSleep feature requires a signal change on the SATA connector. This change is currently supported only in new Intel® Haswell chipset-based platforms announced this June. What’s more, the SSD in these systems must support the DevSleep feature and monitor the signal on the SATA connector. Most systems that support DevSleep will likely be very low-power notebook systems and will likely already ship with an SSD installed using a small mSATA, M.2, or similar edge connector. Therefore, the signal change on the SATA interface will not immediately affect the rest of the SSDs designed for desktop systems shipping through retail and online sources. Note that not all SSDs are created equal and, while many claim support for DevSleep, be sure to look at the fine print to compare the actual power draw when in DevSleep.
At Computex last month, LSI announced support for the DevSleep feature and staged demonstrations showing a 400x reduction in idle power. It should be noted that a 400x reduction in power does not directly translate to a 400x increase in battery life, but any reduction in power will give you more time on the battery, and that will certainly benefit any user who often works without a power cord.