Each new generation of NAND flash memory reduces the fabrication geometry – the dimension of the smallest part of an integrated circuit used to build up the components inside the chip. That means there are fewer electrons storing the data, leading to increased errors and a shorter life for the flash memory. No need to worry. Today's flash memory depends upon the intelligence and capabilities of the solid state drive (SSD) controller to help keep errors in check and get the longest life possible from flash memory, making it usable in compute environments like laptop computers and enterprise datacenters.
Today's volume NAND flash memory uses a 20nm and 19nm manufacturing process, but the next generation will be in the 16nm range. Some experts speculate that today's controllers will struggle to work with this next generation of flash memory to support the high number of write cycles required in datacenters. Also, the current multi-level cell (MLC) flash memory is transitioning to triple-level cell (TLC), which has an even shorter life expectancy and higher error rates.
Can sub-20nm flash survive in the datacenter?
Yes, but it will take a flash memory controller with smarts the industry has never seen before. How intelligent? Sub-20nm flash will need to stretch the life of the flash memory beyond the flash manufacturer's specifications and correct far more errors than ever before, while still maintaining high throughput and very low latency. And to protect against periodic error correction algorithm failures, the flash will need some kind of redundancy (backup) of the data inside the SSD itself.
When will such a controller materialize?
LSI this week introduced the third generation of its flagship SSD and flash memory controller, called the SandForce SF3700. The controller is newly engineered and architected to solve the lifespan, performance, and reliability challenges of deploying sub-20nm flash memory in today's performance-hungry enterprise datacenters. The SandForce SF3700 also enables longer periods between battery recharges for power-sipping client laptop and ultrabook systems. It all happens in a single ASIC package. The SandForce SF3700 is the first SSD controller to include both PCIe and SATA host interfaces natively in one chip to give customers of SSD manufacturers an easy migration path as more of them move to the faster PCIe host interface.
How does the SandForce SF3700 controller make sub-20nm flash excel in the datacenter?
Our new controller builds on the award-winning capabilities of the current SandForce SSD and flash controllers. We've refined our DuraWrite™ data reduction technology to streamline the way it picks blocks, collects garbage and reduces the write count. You'll like the result: longer flash endurance and higher read and write speeds.
The SandForce SF3700 includes SHIELD™ error correction, which applies LDPC and DSP technology in unique ways to correct the higher error rates of new generations of flash memory. SHIELD technology uses a multi-level error correction schema to optimize the time needed to retrieve the correct data. Also, with its exclusive Adaptive Code Rate feature, SHIELD leverages DuraWrite technology's ability to span internal NAND flash boundaries between the user data space and the flash manufacturer's dedicated ECC field. Other controllers use only one ECC code rate for flash memory – the single largest size, designed to support the flash at the end of its life. Early in the flash life, a much smaller ECC is required, and SHIELD technology scales down the ECC accordingly, diverting the remaining free space to additional over provisioning. SHIELD gradually increases the ECC size over time as the flash ages to correct the increasing failures, but does not use the largest ECC size until the flash is nearly at the end of its life.
Why is this good? Greater over provisioning over the life of the SSD improves performance and increases endurance. SHIELD also allows the ECC field to grow even larger after the flash reaches its specified end of life. The big takeaway: all of these SHIELD capabilities increase flash write endurance many times beyond the manufacturer's specification. In fact, at the 2013 Flash Memory Summit exposition in Santa Clara, CA, SHIELD was shown to extend the endurance of a particular Micron NAND flash by nearly six times.
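The idea behind an adaptive code rate can be sketched in a few lines. The policy below is purely illustrative – the thresholds, spare-area size, and step sizes are hypothetical, since SHIELD's actual parameters are not published – but it shows the principle: allocate a small ECC while the flash is young, and hand the unused spare bytes back as over provisioning.

```python
# Illustrative sketch of an adaptive ECC code-rate policy.
# All thresholds and sizes are hypothetical, not SHIELD's real values.

def ecc_bytes_for_wear(pe_cycles: int, spare_bytes: int = 128) -> int:
    """Return the ECC allocation for a flash page at a given wear level.

    Early in life a small code suffices; the allocation grows in steps
    as the flash ages, reaching the full spare area only near end of life.
    """
    if pe_cycles < 1_000:        # fresh flash: few raw bit errors
        return spare_bytes // 4
    elif pe_cycles < 3_000:      # mid-life: moderate error rates
        return spare_bytes // 2
    else:                        # near end of life: maximum correction
        return spare_bytes

def extra_over_provisioning(pe_cycles: int, spare_bytes: int = 128) -> int:
    """Spare bytes not consumed by ECC can serve as over provisioning."""
    return spare_bytes - ecc_bytes_for_wear(pe_cycles, spare_bytes)
```

Early in life, three quarters of the spare area in this sketch is free for over provisioning; by end of life, all of it has been reclaimed by ECC.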
That's not all. The SandForce SF3700 controller's RAISE™ data reliability feature now offers stronger protection, including protection against full die failure and more options for protecting data on SSDs with low (e.g., 32GB and 64GB) and binary (256GB vs. 240GB) capacities.
So what about end user systems?
The beauty of all SandForce flash and SSD controllers is their onboard firmware, which takes the one common hardware component – the ASIC – and adapts it to the user's storage environment. For example, in client applications the firmware helps the controller preserve SSD power to enable users of laptop and ultrabook systems to remain unplugged longer between battery recharges. In contrast, enterprise environments require the highest possible performance and lowest latency. This higher performance draws more power, a tradeoff the enterprise is willing to make for the fastest time-to-data. The firmware makes other similar tradeoffs based on which storage environment it is serving.
Although most people consider enterprise and client storage needs to be very diverse, we think the new SandForce SF3700 flash and SSD controller strikes the perfect balance of power and performance that any user hanging ten can appreciate.
Part two of this Write Amplification (WA) series covered how WA works in solid-state drives (SSDs) that use data reduction technology. I mentioned that, with one of these SSDs, the WA can be less than one, which can greatly improve flash memory performance and endurance.
Why is it important to know your SSD write amplification?
Well, it's not really necessary to know the write amplification of your SSD at any particular point in time, but you do want an SSD with the lowest WA available. The reason is that the limited number of program/erase cycles that NAND flash can support keeps dropping with each new generation of flash. A low WA ensures the flash memory lasts longer than the flash on an SSD with a higher WA.
A direct benefit of a WA below one is that the amount of dynamic over provisioning is higher, which generally provides higher performance. In the case of over provisioning, more is better, since a key attribute of SSDs is performance. Keep in mind that, beyond selecting the best controller, you cannot control the WA of an SSD.
Just how smart are the SSD SMART attributes?
The monitoring system SMART (Self-Monitoring, Analysis and Reporting Technology) tracks various indicators of hard disk and solid state drive reliability – including the number of errors corrected, bytes written, and power-on hours – to help anticipate failures, enabling users to replace storage before a failure causes data loss or system outages.
Some of these indicators, or attributes, point to the status of the drive health and others provide statistical information. While all manufacturers use many of these attributes in the same or a similar way, there is no standard definition for each attribute, so the meaning of any attribute can vary from one manufacturer to another. What's more, there's no requirement for drive manufacturers to list their SMART attributes.
How to measure missing attributes by extrapolation
Most SSDs provide some list of SMART attributes, but WA typically is excluded. However, with the right tests, you can sometimes extrapolate the WA value with reasonable accuracy. We know that under normal conditions, an SSD will have a WA very close to 1:1 when writing data sequentially.
For an SSD with data reduction technology, you must write data with 100% entropy to ensure you identify the correct attributes, then rerun the tests with an entropy that matches your typical data workload to get a true WA calculation. SSDs without data reduction technology do not benefit from entropy, so the level of entropy used on them does not matter.
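One simple way to build test data at a chosen entropy level is to mix incompressible random bytes with a repetitive filler. This is a rough stand-in for the entropy controls in tools like IOMeter – the function name and the mixing scheme are my own, not part of any test tool:

```python
import os

def make_test_pattern(size: int, entropy: float) -> bytes:
    """Build a buffer whose compressibility roughly tracks `entropy`.

    entropy = 1.0 -> fully random (incompressible, "100% entropy")
    entropy = 0.0 -> fully repetitive (highly compressible)
    """
    random_len = int(size * entropy)
    return os.urandom(random_len) + b"\x00" * (size - random_len)
```

Writing a pattern built with `entropy=1.0` defeats any data reduction in the drive, while lower values let you approximate your real workload.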
To measure missing attributes by extrapolation, start by performing a secure erase of the SSD, and then use a program to read all the current SMART attribute values. Some programs do not accurately display the true meaning of an attribute simply because the attribute itself contains no description. For you to know what each attribute represents, the program reading the attribute has to be pre-programmed by the manufacturer. The problem is that some programs mislabel some attributes. Therefore, you need to perform tests to confirm the attributesâ€™ true meaning.
Start writing sequential data to the SSD, noting how much data is being written. Some programs will indicate exactly how much data the SSD has written, while others will reveal only the average data per second over a given period. Either way, the number of bytes written to the SSD will be clear. You want to write about 10 or more times the physical capacity of the SSD. This step is often completed with IOMeter, VDbench, or other programs that can send large measurable quantities of data.
At the end of the test period, print out the SMART attributes again and look for all attributes that have a different value than at the start of the test. Record the attribute number and the difference between the two test runs. You are trying to find one that represents a change of about 10, or the number of times you wrote to the entire capacity of the SSD. The attribute you are trying to find may represent the number of complete program/erase cycles, which would match your count almost exactly. You might also find an attribute that is counting the number of gigabytes (GBs) of data written from the host. To match that attribute, take the number of times you wrote to the entire SSD and multiply by the physical capacity of the flash. Technically, you already know how much you wrote from the host, but it is good to have the drive confirm that value.
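The bookkeeping in the step above – diff two SMART snapshots, then flag attributes whose change matches the number of full-drive writes – is easy to script. This sketch assumes you have already dumped the raw attribute values into dictionaries (the attribute IDs shown in the test data are arbitrary, since IDs vary by vendor):

```python
def changed_attributes(before: dict, after: dict) -> dict:
    """Return {attribute_id: delta} for attributes that moved during a test."""
    return {attr: after[attr] - before[attr]
            for attr in before
            if attr in after and after[attr] != before[attr]}

def candidate_erase_counters(deltas: dict, full_drive_writes: int,
                             tolerance: float = 0.2) -> list:
    """Flag attributes whose change is within `tolerance` of the number of
    full-capacity writes performed -- likely program/erase cycle counters."""
    return [attr for attr, delta in deltas.items()
            if abs(delta - full_drive_writes) <= tolerance * full_drive_writes]
```

An attribute counting host gigabytes written would instead change by roughly (full-drive writes × physical capacity), so you would screen for that product the same way.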
Doing the math
When you find candidates that might be a match (you might have multiple attributes), secure erase the drive again, this time writing randomly with 4K transfers. Again, write about 10 times the physical capacity of the drive, then record the SMART attributes and calculate the difference from the last recording of the same attributes that changed between the first two recordings. This time, the change you see in the data written from the host should be nearly the same as with the sequential run. However, the attribute that represents the program/erase cycles (if present) will be many times higher than during the sequential run.
To calculate write amplification, use this equation:
(Number of erase cycles × Physical capacity in GB) / (Amount of data written from the host in GB)
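The equation is a one-liner in code. The worked numbers below are invented for illustration: a 256GB drive written end-to-end ten times (2,560GB from the host) with ten observed erase cycles gives a WA of 1.0; thirty erase cycles for the same host writes gives 3.0.

```python
def write_amplification(erase_cycles: float, physical_capacity_gb: float,
                        host_writes_gb: float) -> float:
    """WA = (erase cycles x physical capacity) / host data written."""
    return (erase_cycles * physical_capacity_gb) / host_writes_gb

# Sequential run: 10 erase cycles after 10 full-drive writes
sequential_wa = write_amplification(10, 256, 2560)   # -> 1.0

# Random 4K run: same host writes, but 30 erase cycles
random_wa = write_amplification(30, 256, 2560)       # -> 3.0
```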
With sequential transfers, this number should be very close to 1. With random transfers, the number will be much higher depending on the SSD controller. Different SSDs will have different random WA values.
If you have an SSD with the type of data reduction technology used in the LSI® SandForce® controller, testing at any entropy below 100% will show the write amplification falling as the entropy of your data drops toward its lowest level. With this method, you should be able to measure the write amplification of any SSD as long as it exposes erase-cycle and host-data-written attributes, or something that closely represents them.
Protect your SSD against degraded performance
The key point to remember is that write amplification is the enemy of flash memory performance and endurance, and therefore the users of SSDs. This three-part series examined all the elements that affect WA, including the implications and advantages of a data reduction technology like the LSI SandForce DuraWrite™ technology. Once you understand how WA works and how to measure it, you will be better armed to defend yourself against this beastly cause of degraded SSD performance.
This three-part series examines some of the details behind write amplification, a key property of all NAND flash-based SSDs that is often misunderstood in the industry.
In part one of this Write Amplification (WA) series, I examined how WA works in basic solid-state drives (SSDs). Part two now takes a deeper look at how SSDs that use some form of data reduction technology can see a very big and positive impact on WA.
Data reduction technology can master data entropy
The performance of all SSDs is influenced by the same factors – such as the amount of over provisioning and levels of random vs. sequential writing – with one major exception: entropy. Only SSDs with data reduction technology can take advantage of entropy – the degree of randomness of data – to provide significant performance, endurance and power-reduction advantages.
Data reduction technology parlays data entropy (not to be confused with how data is written to the storage device – sequential vs. random) into higher performance. How? When data reduction technology sends data to the flash memory, it uses some form of data de-duplication, compression, or data differencing to rearrange the information and use fewer bytes overall. After the data is read from flash memory, data reduction technology, by design, restores 100% of the original content to the host computer. This is known as "loss-less" data reduction and can be contrasted with "lossy" techniques like MPEG, MP3, JPEG, and other common formats used for video, audio, and visual data files. These formats lose information that cannot be restored, though the resolution remains adequate for entertainment purposes.
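You can see the loss-less property in miniature with any general-purpose compressor. DuraWrite's actual algorithm is proprietary, so zlib stands in here purely to demonstrate the round trip: low-entropy data shrinks on the way in, and every byte comes back out.

```python
import zlib

# Low-entropy input: repeated patterns, like much OS and application data
original = b"operating system files often contain repeated patterns " * 64

reduced = zlib.compress(original)      # fewer bytes reach the "flash"
restored = zlib.decompress(reduced)    # read path restores the content

assert restored == original            # loss-less: 100% recovered
assert len(reduced) < len(original)    # low-entropy data shrinks
```

Run the same round trip on `os.urandom` output (100% entropy) and `reduced` comes out slightly *larger* than the input – which is exactly why high-entropy data sees no benefit from data reduction.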
The multi-faceted power of data reduction technology
My previous blog on data reduction discusses how data reduction technology relates to the SATA TRIM command and increases free space on the SSD, which in turn reduces WA and improves subsequent write performance. With a data-reduction SSD, the lower the entropy of the data coming from the host computer, the less the SSD has to write to the flash memory, leaving more space for over provisioning. This additional space enables write operations to complete faster, which translates not only into a higher write speed at the host computer but also into lower power use, because flash memory draws power only while reading or writing. Higher write speeds therefore also mean less time spent drawing power in the flash memory.
Because data reduction technology can send less data to the flash than the host originally sent to the SSD, the typical write amplification factor falls below 1.0. It is not uncommon to see a WA of 0.5 on an SSD with this technology. Writing less data to the flash leads directly to fewer program/erase cycles, greater dynamic over provisioning, and lower power consumption. Each of these in turn produces other benefits, some of which circle back upon themselves in a recursive manner. This logic diagram highlights those benefits.
So this is a rare instance when an amplifier – namely, Write Amplification – makes something smaller. At LSI, this unique amplifier comes in the form of the LSI® DuraWrite™ data reduction technology in all SandForce® Driven™ SSDs.
This three-part series examines some of the details behind write amplification, a key property of all NAND flash-based SSDs that is often misunderstood in the industry.
Imagine a bathtub full of water. You ask someone to empty the tub while you turn your back for a moment. When you look again and see the water is gone, do you just assume someone pulled the drain plug?
I think most people would, but what about the other methods of removing the water like with a siphon, or using buckets to bail out the water? In a typical bathroom you are not likely to see these other methods used, but that does not mean they do not exist. The point is that just because you see a certain result does not necessarily mean the obvious solution was used.
I see a lot of confusion in forum posts from SandForce Driven™ SSD users and reviewers over how the LSI® DuraWrite™ data reduction and advanced garbage collection technology relates to the SATA TRIM command. In my earlier blog on TRIM I covered this topic in great detail, but in simple terms the operating system uses the TRIM command to inform an SSD which information is outdated and invalid. Without the TRIM command the SSD assumes all of the user capacity is valid data. I explained in my blog Gassing up your SSD that creating more free space through over provisioning or using less of the total capacity enables the SSD to operate more efficiently by reducing the write amplification, which leads to increased performance and flash memory endurance. So without TRIM the SSD operates at its lowest level of efficiency for a particular level of over provisioning.
Will you drown in invalid data without TRIM?
TRIM is a way to increase the free space on an SSD – what we call "dynamic over provisioning" – and DuraWrite technology is another method to increase the free space. Since DuraWrite technology is dependent upon the entropy (randomness) of the data, some users will get more free space than others depending on what data they store. Since the technology works on the basis of the aggregate of all data stored, boot SSDs with operating systems can still achieve some level of dynamic over provisioning even when all other files are at the highest entropy, e.g., encrypted or compressed files.
With an older operating system or in an environment that does not support TRIM (most RAID configurations), DuraWrite technology can provide enough free space to offer the same benefits as having TRIM fully operational. In cases where both TRIM and DuraWrite technology are operating, the combined result may not be as noticeable as when they're working independently since there are diminishing returns when the free space grows to greater than half of the SSD storage capacity.
So the next time you fill your bathtub, think about all the ways you can get the water out of the tub without using the drain. That will help you remember that both TRIM and DuraWrite technology can improve SSD performance using different approaches to the same problem. If that analogy does not work for you, consider the different ways to produce a furless feline, and think about what opening graphic image I might have used for a more jolting effect. Although in that case you might not have seen this blog since that image would likely have gotten us banned from Google® "safe for work" searches.
I presented on this topic in detail at the Flash Memory Summit in 2011. You can read it in full here: http://www.lsi.com/downloads/Public/Flash%20Storage%20Processors/LSI_PRS_FMS2011_T1A_Smith.pdf
The term global warming can be very polarizing in a conversation and both sides of the argument have mountains of material that support or discredit the overall situation. The most devout believers in global warming point to the average temperature increases in the Earth's atmosphere over the last 100+ years. They maintain the rise is primarily caused by increased greenhouse gases from humans burning fossil fuels and deforestation.
The opposition generally agrees with the measured increase in temperature over that time, but claims that increase is part of a natural cycle of the planet and not something humans can significantly impact one way or another. The US Energy Information Administration estimates that 90% of the world's marketed energy consumption comes from non-renewable energy sources like fossil fuels. Our internet-driven lives run through datacenters that are well-known to consume large quantities of power. No matter which side of the global warming argument you support, most people agree that wasting power is not a good long-term position. Therefore, if the power consumed by datacenters can be reduced, especially as we live in an increasingly digitized world, this would benefit all mankind.
When we look at the most power-hungry components of a datacenter, we find mainly server and storage systems. However, people sometimes forget that those systems require cooling to counteract the heat generated, and the cooling itself consumes even more energy. So anything that can store data more efficiently and quickly will reduce both the initial energy consumption and the energy to cool those systems. As datacenters demand faster data storage, they are shifting to solid state drives (SSDs). SSDs generally provide higher performance per watt of power consumed than hard disk drives, but there is still more that can be done.
Reducing data to help turn down the heat
The good news is that there's a way to reduce the amount of data that reaches the flash memory of the SSD. The unique DuraWrite™ technology found in all LSI® SandForce® flash controllers reduces the amount of data written to the flash memory, cutting the time it takes to complete the writes and therefore reducing power consumption below the levels of other SSD technologies. That, in turn, reduces the cooling needed, further lowering overall power consumption. This data reduction is "loss-less," meaning 100% of what is saved is returned to the host, unlike MPEG, JPEG, and MP3 files, which tolerate some amount of data loss to reduce file sizes.
Today you can find many datacenters already using SandForce Driven SSDs and LSI Nytro™ application acceleration products (which use DuraWrite technology as well). When we start to see datacenters deploying these flash storage products by the millions, you will certainly be able to measure the reduction in power consumed by datacenters. Unfortunately, LSI will not be able to claim it stopped global warming, but at least we, and our customers, can say we did something to help defer the end result.