Part two of this Write Amplification (WA) series covered how WA works in solid-state drives (SSDs) that use data reduction technology. I mentioned that, with one of these SSDs, the WA can be less than one, which can greatly improve flash memory performance and endurance.
Why is it important to know your SSD write amplification?
Well, it’s not really necessary to know the write amplification of your SSD at any particular point in time, but you do want an SSD with the lowest WA available. The reason is that the limited number of program/erase cycles that NAND flash can support keeps dropping with each generation of flash developed. A low WA will ensure the flash memory lasts longer than flash on an SSD with a higher WA.
A direct benefit of a WA below one is that the amount of dynamic over provisioning is higher, which generally provides higher performance. In the case of over provisioning, more is better, since a key attribute of an SSD is performance. Keep in mind that, beyond selecting the best controller, you cannot control the WA of an SSD.
Just how smart are the SSD SMART attributes?
The monitoring system SMART (Self-Monitoring, Analysis and Reporting Technology) tracks various indicators of hard disk and solid state drive reliability – including the number of errors corrected, bytes written, and power-on hours – to help anticipate failures, enabling users to replace storage before a failure causes data loss or system outages.
Some of these indicators, or attributes, point to the status of the drive health and others provide statistical information. While all manufacturers use many of these attributes in the same or a similar way, there is no standard definition for each attribute, so the meaning of any attribute can vary from one manufacturer to another. What’s more, there’s no requirement for drive manufacturers to list their SMART attributes.
How to measure missing attributes by extrapolation
Most SSDs provide some list of SMART attributes, but WA typically is excluded. However, with the right tests, you can sometimes extrapolate, with some accuracy, the WA value. We know that under normal conditions, an SSD will have a WA very close to 1:1 when writing data sequentially.
For an SSD with data reduction technology, you must write data with 100% entropy to ensure you identify the correct attributes, then rerun the tests with an entropy that matches your typical data workload to get a true WA calculation. SSDs without data reduction technology do not benefit from entropy, so the level of entropy used on them does not matter.
To measure missing attributes by extrapolation, start by performing a secure erase of the SSD, and then use a program to read all the current SMART attribute values. Some programs do not accurately display the true meaning of an attribute simply because the attribute itself contains no description. For you to know what each attribute represents, the program reading the attribute has to be pre-programmed by the manufacturer. The problem is that some programs mislabel some attributes. Therefore, you need to perform tests to confirm the attributes’ true meaning.
Start writing sequential data to the SSD, noting how much data is being written. Some programs will indicate exactly how much data the SSD has written, while others will reveal only the average data per second over a given period. Either way, the number of bytes written to the SSD will be clear. You want to write about 10 or more times the physical capacity of the SSD. This step is often completed with IOMeter, VDbench, or other programs that can send large measurable quantities of data.
At the end of the test period, print out the SMART attributes again and look for all attributes that have a different value than at the start of the test. Record the attribute number and the difference between the two test runs. You are trying to find one that represents a change of about 10, or the number of times you wrote to the entire capacity of the SSD. The attribute you are trying to find may represent the number of complete program/erase cycles, which would match your count almost exactly. You might also find an attribute that is counting the number of gigabytes (GBs) of data written from the host. To match that attribute, take the number of times you wrote to the entire SSD and multiply by the physical capacity of the flash. Technically, you already know how much you wrote from the host, but it is good to have the drive confirm that value.
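The comparison step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the attribute numbers and raw values are hypothetical (they do not come from any particular drive or tool), the helper names are made up for this example, and the 15% tolerance is an arbitrary choice.

```python
def changed_attributes(before, after):
    """Return {attribute_id: delta} for every attribute whose raw value changed."""
    return {
        attr_id: after[attr_id] - before[attr_id]
        for attr_id in before
        if attr_id in after and after[attr_id] != before[attr_id]
    }


def find_candidates(deltas, full_writes, capacity_gb, tolerance=0.15):
    """Split changed attributes into likely erase-cycle and host-GB counters.

    full_writes: how many times the whole drive capacity was written (about 10).
    capacity_gb: physical capacity of the flash in GB.
    tolerance:   allowed relative deviation (an assumption for this sketch).
    """
    expected_gb = full_writes * capacity_gb
    erase_like = [a for a, d in deltas.items()
                  if abs(d - full_writes) <= tolerance * full_writes]
    host_gb_like = [a for a, d in deltas.items()
                    if abs(d - expected_gb) <= tolerance * expected_gb]
    return erase_like, host_gb_like


# Hypothetical raw values recorded before and after the sequential run.
before = {9: 120, 231: 0, 241: 0, 194: 30}
after = {9: 125, 231: 10, 241: 2560, 194: 30}

deltas = changed_attributes(before, after)
erase_like, host_gb_like = find_candidates(deltas, full_writes=10, capacity_gb=256)
print("changed:", deltas)
print("possible erase-cycle attributes:", erase_like)
print("possible host-GB attributes:", host_gb_like)
```

A match here is only a candidate; the random-write run described next is what confirms or rejects it.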
Doing the math
When you find candidates that might be a match (you might have multiple attributes), secure erase the drive again, this time writing randomly with 4K transfers. Again, write about 10 times the physical capacity of the drive, then record the SMART attributes and calculate the difference from the last recording of the same attributes that changed between the first two recordings. This time, the change you see in the data written from the host should be nearly the same as with the sequential run. However, the attribute that represents the program/erase cycles (if present) will be many times higher than during the sequential run.
To calculate write amplification, use this equation:
(Number of erase cycles × Physical capacity in GB) / Amount of data written from the host in GB
With sequential transfers, this number should be very close to 1. With random transfers, the number will be much higher depending on the SSD controller. Different SSDs will have different random WA values.
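As a rough worked example of the equation, here is a small Python sketch. The function name and all the numbers are assumptions chosen for illustration (a hypothetical 256 GB drive written 10 times over); real drives will report different values.

```python
def write_amplification(erase_cycles, physical_capacity_gb, host_written_gb):
    """WA = (erase cycles x physical capacity in GB) / GB written by the host."""
    if host_written_gb <= 0:
        raise ValueError("host_written_gb must be positive")
    return (erase_cycles * physical_capacity_gb) / host_written_gb


# Hypothetical sequential run: 10 erase cycles for 2,560 GB of host writes.
sequential_wa = write_amplification(10, 256, 2560)

# Hypothetical random 4K run: same host data, but 40 erase cycles recorded.
random_wa = write_amplification(40, 256, 2560)

print(f"sequential WA: {sequential_wa:.2f}")
print(f"random WA:     {random_wa:.2f}")
```

With these made-up inputs the sequential figure comes out at 1.0 and the random figure at 4.0, which matches the pattern described above without claiming anything about a specific controller.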
If you have an SSD with the type of data reduction technology used in the LSI® SandForce® controller, you will see lower and lower write amplification as you approach your lowest data entropy when you test with any entropy lower than 100%. With this method, you should be able to measure the write amplification of any SSD as long as it has erase cycles and host data-written attributes or something that closely represents them.
Protect your SSD against degraded performance
The key point to remember is that write amplification is the enemy of flash memory performance and endurance, and therefore the users of SSDs. This three-part series examined all the elements that affect WA, including the implications and advantages of a data reduction technology like the LSI SandForce DuraWrite™ technology. Once you understand how WA works and how to measure it, you will be better armed to defend yourself against this beastly cause of degraded SSD performance.
This three-part series examines some of the details behind write amplification, a key property of all NAND flash-based SSDs that is often misunderstood in the industry.
August was always an exciting time at my childhood home. We were excited that school was starting in September and mom was relieved that summer was coming to an end. I remember the annual trips to the local department stores to buy school clothes. It was always exciting to pick out new school clothing and a new winter coat. With only a few stores to choose from, many of us wore similar clothes and coats when classes started.
As consumers, we have far more fashion and store options today. There are specialty stores at the mall, big box outlets, membership stores and specialty online portals. With so many more clothing designers than in years past, retailers are also inundated with fashion choices. The question becomes, “how does the fashion chain – from textile suppliers and clothing manufacturers to the retailers themselves – choose what to carry?”
They all rely on big data to make critical decisions. Let’s go to the start of the chain: the textile manufacturer. It may analyze previous years’ orders, competitive intelligence, purchasing trend data, and raw material and manufacturing costs. While tracking analytics on one data source is relatively easy, capturing and analyzing multiple data sources can be a tremendous challenge – a point underscored in a 2012 research report from Gartner. In its analysis, Gartner found that big data processing challenges don’t come from analysis of a single data set or source but rather from the complexity of interaction between two or more data sets.
“When combining large assets and new asset types, how they relate to each other becomes more complex,” the Gartner report explains. “As more assets are combined, the tempo of record creation and the qualification of the data within use cases becomes more complex.”
The next link is the clothing companies that create the fashion. They have a much more complex job, using big data to analyze fashion trends and improve their decision-making. Information such as historical sales, weather predictions, demographic data and economic details helps them choose the right colors, sizes and price points for the clothing they make.
Swim Suits and Snow Parkas
This is where we, as consumers, come into the picture. Just as I did many years ago, people still shop for school and winter clothing this time of year. The clothes on the racks at our favorite retailer or from an online catalogue were chosen and ordered 6-9 months ago. Take Kohl’s. The nationwide retailer uses a blend of geographic weather prediction data sources to know where to best sell those snow parkas versus swim suits, economic and competitive data to price it right, demographic data sources to better predict the required sizes and customer demand, and market trends data sources to better forecast the colors and styles that will sell best. The more accurately Kohl’s buyers can predict consumer behavior using big data, the less the retailer will need to discount overstock, and the higher its sales and profit.
As I stated in my previous blog posts, the Hadoop® architecture is a great tool for efficiently storing and processing the growing amount of data worldwide, but Hadoop is only as good as the processing and storage performance that supports it. As with flu strain and weather predictions, the more data you can quickly and efficiently analyze, the more accurate your prediction. When it comes to weather and flu vaccines, these predictions can help save lives, but in the fashion industry it is all about improving the bottom line.
Whether in fashion, medical, weather or other fields, the use of Hadoop for high levels of speed and accuracy in big data analysis requires computers with application acceleration. One such tool is LSI® Nytro™ Application Acceleration. You can go to TheSmarterWayToFaster™ for more information on the Nytro product family.
Part three of this three-part series continues to examine some of the diverse and potentially life-saving uses of big data in our everyday lives. It also explores how expanded data access and higher processing and storage speed can help optimize big data application performance.
In part one of this Write Amplification (WA) series, I examined how WA works in basic solid-state drives (SSDs). Part two now takes a deeper look at how SSDs that use some form of data reduction technology can see a very big and positive impact on WA.
Data reduction technology can master data entropy
The performance of all SSDs is influenced by the same factors – such as the amount of over provisioning and levels of random vs. sequential writing – with one major exception: entropy. Only SSDs with data reduction technology can take advantage of entropy – the degree of randomness of data – to provide significant performance, endurance and power-reduction advantages.
Data reduction technology parlays data entropy (not to be confused with how data is written to the storage device – sequential vs. random) into higher performance. How? When data reduction technology sends data to the flash memory, it uses some form of data de-duplication, compression, or data differencing to rearrange the information and use fewer bytes overall. After the data is read from flash memory, data reduction technology, by design, restores 100% of the original content to the host computer. This is known as “lossless” data reduction and can be contrasted with “lossy” techniques like MPEG, MP3, JPEG, and other common formats used for video, audio, and visual data files. These formats lose information that cannot be restored, though the resolution remains adequate for entertainment purposes.
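To make the link between entropy and lossless reduction concrete, here is a minimal Python sketch. It uses the standard-library zlib compressor purely as a stand-in; it is an assumption for illustration and says nothing about which algorithms an SSD controller actually uses. The helper name and buffer sizes are also made up for this example.

```python
import os
import zlib


def stored_fraction(data):
    """Fraction of the original size left after lossless compression.

    Also checks the round trip: decompressing must return the exact input.
    """
    compressed = zlib.compress(data)
    if zlib.decompress(compressed) != data:
        raise RuntimeError("round trip failed; this would not be lossless")
    return len(compressed) / len(data)


low_entropy = b"A" * 4096          # highly repetitive data
high_entropy = os.urandom(4096)    # random bytes, close to 100% entropy

low_fraction = stored_fraction(low_entropy)
high_fraction = stored_fraction(high_entropy)

print(f"low-entropy data stored at  {low_fraction:.1%} of original size")
print(f"high-entropy data stored at {high_fraction:.1%} of original size")
```

The repetitive buffer shrinks to a small fraction of its size, while the random buffer does not shrink at all, which is why the entropy of the test data matters when measuring a data-reduction SSD.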
The multi-faceted power of data reduction technology
My previous blog on data reduction discusses how data reduction technology relates to the SATA TRIM command and increases free space on the SSD, which in turn reduces WA and improves subsequent write performance. With a data-reduction SSD, the lower the entropy of the data coming from the host computer, the less the SSD has to write to the flash memory, leaving more space for over provisioning. This additional space enables write operations to complete faster, which translates not only into a higher write speed at the host computer but also into lower power use because flash memory draws power only while reading or writing. Higher write speeds also mean lower power draw for the flash memory.
Because data reduction technology can send less data to the flash than the host originally sent to the SSD, the typical write amplification factor falls below 1.0. It is not uncommon to see a WA of 0.5 on an SSD with this technology. Writing less data to the flash leads directly to:
Each of these in turn produces other benefits, some of which circle back upon themselves in a recursive manner. This logic diagram highlights those benefits.
So this is a rare instance when an amplifier – namely, Write Amplification – makes something smaller. At LSI, this unique amplifier comes in the form of the LSI® DuraWrite™ data reduction technology in all SandForce® Driven™ SSDs.
Every year I diligently get in line for my annual flu (or more technically accurate “seasonal influenza”) shot. I’m not particularly fond of needles, but I have seen what the flu can do and how many people die each year from this seasonal virus.
When you get the flu shot – or, now, the nasal mist – you are trusting a lot of people that what you are taking will actually help protect you. According to the CDC (Centers for Disease Control and Prevention), there are three antigenic types (A, B and C) of influenza virus, and of those three types, two cause the seasonal epidemics we suffer through each year.
Not to get too technical, but I learned that the A strain is further classified by two proteins, and the resulting subtypes are given code names like H1N1, H3N2 and H5N1. They can even be updated by year if there is a change in them. An example of this was in 2009, when the H1N1 became the 2009 H1N1. So where we may just call it H1N1, the World Health Organization has a whole taxonomy to describe a seasonal influenza strain.
This taxonomy includes:
As you can see, it can really get complicated quickly. If you would like to go deeper, you can read more about this here. While much of this information seems pretty arcane to the lay reader, you quickly can see that the sheer volume of information collected, stored and analyzed to combat seasonal influenza is a great example of big data.
In the US, once the CDC sifts through this data – using big data analytics tools – it uses its findings to determine what strains might affect the US and build a flu shot to combat those strains. During the 2012/2013 season, the predominant virus was Influenza A (H3N2), though some influenza B viruses contained a dash of influenza A (H1N1) pdm09 (pH1N1). (See the full report here.)
In addition to identifying dominant viruses, the CDC also uses big data to track the spread and potential effect on the population. Reviewing information from prior outbreaks, population data, and even weather patterns, the CDC uses big data analytics to quickly estimate and attempt to determine where viruses might hit first, hardest and longest so that a targeted vaccine can be produced in sufficient quantities, in the required timeframe and even for the right geography. The faster and more accurately this can be done, the more people can get this potentially life-saving vaccine before the virus travels to their area.
As I stated in my previous blog post, the Hadoop® architecture is a great tool for efficiently storing and processing the growing amount of data worldwide, but Hadoop is only as good as the processing and storage performance that supports it. As with weather predictions, the more data you can quickly and efficiently analyze, the greater the likelihood of an accurate prediction. When it comes to weather and flu vaccines, these predictions can help save lives. In my final blog post in this series, I will explore how big data helps the fashion industry.
Whether in medical, weather or other fields that leverage big data technologies, the use of Hadoop for high levels of speed and accuracy in big data analysis requires computers with application acceleration. One such tool is LSI® Nytro™ Application Acceleration. You can go to TheSmarterWayToFaster™ for more information on the Nytro product family.
Part two of this three-part series continues to examine some of the diverse and potentially life-saving uses of big data in our everyday lives. It also explores how expanded data access and higher processing and storage speed can help optimize big data application performance.
We all watch the local weather and wonder how forecasters predict (or in some cases mis-predict) the future of weather. While they may not all agree on the forecast, they do agree that the more current and historical data you have, the better your ability to predict what might happen over the next hours, days and weeks.
A term used to describe this growing amount of information is Big Data, and more and more of it leverages Hadoop, a flexible architecture that provides the analysis tools and scalability required to comb through and utilize all available data. When recently talking to a US-based meteorologist (the technical name for a degreed weather forecaster), I learned that meteorologists rely on many different weather models from various sources to help create their forecasts.
Weather spawns downpour of Big Data
These models collect massive amounts of weather information from around the world. Using this information, computers then run billions of calculations to mimic the motion of weather patterns in the Earth’s dynamic atmosphere and produce forecasts for any given location over time. It was interesting to learn that not all weather models are equal.
While weather modeling websites worldwide collect this atmospheric data and provide it to meteorologists, the European community is seen as having the most accurate information. When I asked why, I learned that European weather modeling sites have some of the fastest computer hardware and technology, enabling them to analyze more data faster, which produces better overall forecasts. The US weather professional I spoke with tends to use these European sites as part of his analysis, and when European models conflict with those from US sites, he often leans toward the European data.
His use of the European weather modeling sites points to the value of fast, accurate analysis of Big Data. It also underscores the implications of vast amounts of data overwhelming the ability of the compute and storage resources available to process it. An accurate and timely weather forecast is critical and a bad or missed forecast can have terrible and even deadly consequences.
A case in point: Hurricane Sandy
In this article on Hurricane Sandy forecast speed and accuracy, you can see how removing just one source of data can dramatically reduce the accuracy of predicting a critical event such as where a hurricane will make landfall. To be sure, the more data you can store and the faster you can process it for analysis, the greater your potential competitive advantage, even in the vaunted halls of meteorological analysis and prediction.
The Hadoop® architecture is a great tool for efficiently storing and processing the growing amount of data worldwide, but Hadoop is only as good as the processing and storage performance that supports it. This gets interesting as you think about and explore the ripple effect of accurate or inaccurate forecasting in many areas. In my next blog post I will explore one of those – flu vaccines.
Whether in meteorology or other fields that leverage Big Data technologies, the use of Hadoop for high levels of speed and accuracy in Big Data analysis requires computers with application acceleration. One such tool is LSI® Nytro™ Application Acceleration. You can go to TheSmarterWayToFaster™ for more information on the Nytro product family.
This three-part series examines some of the diverse uses of Big Data in our everyday lives. It also explores how expanded data access and higher processing and storage speed can help optimize Big Data application performance.
In today’s solid state drives (SSDs), the NAND flash memory must be erased before it can store new data. In other words, data cannot be overwritten directly as it is in a hard disk drive (HDD). Instead, SSDs use a process called garbage collection (GC) to reclaim the space taken by previously stored data. This means that write demands are heavier on SSDs than HDDs when storing the same information.
This is bad because the flash memory in the SSD supports only a limited number of writes before it can no longer be read. We call this undesirable effect, in which the flash absorbs more writes than the host actually sent, write amplification (WA). In my blog, Gassing up your SSD, I describe why WA exists in a little more detail, but here I will explain what controls it.
It’s all about the free space
I often tell people that SSDs work better with more free space, so anything that increases free space will keep WA lower. The two key ways to expand free space (thereby decreasing WA) are to 1) increase over provisioning and 2) keep more storage space free (if you have TRIM support).
As I said earlier, there is no WA before GC is active. However, this pristine pre-GC condition has a tiny life span – just one full-capacity write cycle during a “fresh-out-of-box” (FOB) state, which accounts for less than 0.04% of the SSD’s life. Although you can manually recreate this condition with a secure erase, the cost is an additional write cycle, which defeats the purpose. Also keep in mind that the GC efficiency and associated wear leveling algorithms can affect WA (more efficient = lower WA).
The other major contributor to WA is the organization of the free space (how data is written to the flash). When data is written randomly, the eventual replacement data will also likely come in randomly, so some pages of a block will be replaced (made invalid) and others will still be good (valid). During GC, valid data in blocks like this needs to be rewritten to new blocks. This produces another write to the flash for each valid page, causing write amplification.
With sequential writes, generally all the data in the pages of the block becomes invalid at the same time. As a result, no data needs relocating during GC since there is no valid data remaining in the block before it is erased. In this case, there is no amplification, but other things like wear leveling on blocks that don’t change will still eventually produce some write amplification no matter how data is written.
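The difference between the random and sequential cases can be shown with a deliberately simplified model in Python. The assumptions are mine, not the controller's: a block is just a list of page flags, GC copies every valid page before erasing, and wear leveling and other background activity are ignored. The function name is hypothetical.

```python
def block_write_amplification(valid_flags):
    """Estimate WA for reclaiming one block in a simplified GC model.

    valid_flags: one boolean per page; True means the page still holds valid
    data that GC must copy to a new block before the erase.

    Erasing the block frees (pages - valid) pages for new host data, but the
    flash also absorbs `valid` relocation writes, so:
        WA = (host pages + relocated pages) / host pages = pages / (pages - valid)
    """
    pages = len(valid_flags)
    relocated = sum(valid_flags)
    freed = pages - relocated
    if freed == 0:
        return float("inf")  # nothing reclaimed; GC gains no space from this block
    return pages / freed


# Sequentially written block: every page was invalidated together.
sequential_block = [False] * 8
# Randomly written block: half the pages are still valid at GC time.
random_block = [True] * 4 + [False] * 4

sequential_block_wa = block_write_amplification(sequential_block)
random_block_wa = block_write_amplification(random_block)
print("sequential block WA:", sequential_block_wa)
print("random block WA:    ", random_block_wa)
```

In this toy model the fully invalid block costs nothing extra, while the half-valid block doubles the writes to flash, which is the mechanism the two paragraphs above describe.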
Calculating write amplification
Write Amplification is fundamentally the result of data written to the flash memory divided by data written by the host. In 2008, both Intel and SiliconSystems (acquired by Western Digital) were the first to start talking publicly about WA. At that time, the WA of all SSDs was something greater than 1.0. It was not until SandForce introduced the first SSD controller with DuraWrite™ technology in 2009 that WA could fall below 1.0. DuraWrite technology increases the free space mentioned above, but in a way that is different from other SSD controllers. In part two of my write amplification series, I will explain how DuraWrite technology works.
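The definition reduces to a single division, sketched here in Python. The byte counts are invented for illustration and the function name is an assumption; the point is only to show how the ratio lands above or below 1.0.

```python
def wa_factor(flash_written_gb, host_written_gb):
    """WA = data written to the flash / data written by the host."""
    return flash_written_gb / host_written_gb


# Hypothetical conventional SSD: GC relocations add 50 GB to 100 GB of host writes.
conventional_wa = wa_factor(150.0, 100.0)

# Hypothetical data-reduction SSD: 100 GB from the host shrinks to 50 GB on flash.
data_reduction_wa = wa_factor(50.0, 100.0)

print("conventional WA:  ", conventional_wa)
print("data-reduction WA:", data_reduction_wa)
```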
It is always good to hear the opinions of your customers and end users, and in that respect June was a banner month for LSI® SandForce® flash controllers.
In a survey soliciting responses from more than 1 million members of on-line groups and other sources by IT Brand Pulse, an independent product testing and validation lab, LSI SandForce controllers ranked at the top of all six SSD controller chip sub-categories: market, price, performance, reliability, service and support, and innovation. Last August, the LSI SandForce controllers won in three of the six sub-categories, so we’re thrilled to see momentum building.
Winning all six awards is no easy task. Some of the sub-categories could be considered mutually exclusive, requiring customers to make trade-offs among product attributes. For example, often a product with the best price is considered to have skimped on quality compared to pricier solutions. A product with screaming performance, ironically, is seen as something of a market laggard because it usually does not carry the best price. So it is exciting to strike the right balance among all six measures and sweep the product category. You can find more details on the awards here: http://itbrandpulse.com/research/brand-leader-program/225-ssd-controller-chips-2013
For those of us in product marketing, winning product awards voted on by your peers can bring on a feeling similar to that warm afterglow parents bask in when they hear their child has made the honor roll or was named the valedictorian for his or her graduating class.
So please pardon us, for a moment, as we beam with pride.
It seems like our smartphones are getting bigger and bigger with each generation. Sometimes I see people holding what appears to be a tablet computer up next to their head. I doubt they know how ridiculous they look to the rest of us, and I wonder what pants today have pockets that big.
I certainly do like the convenience of the instant-on capabilities my smart phone gives me, but I still need my portable computer with its big screen and keyboard separate from my phone.
A few years ago, SATA-IO, the standards body, added a new feature to the Serial ATA (SATA) specification designed to further reduce battery consumption in portable computer products. This new feature, DevSleep, enables solid state drives (SSDs) to act more like smartphones, allowing you to go days without plugging in to recharge and then instantly turn them on and see all the latest email, social media updates, news and events.
Why not just switch the system off?
When most PC users think about switching off their system, they dread waiting for the operating system to boot back up. That is one of the key advantages of replacing the hard disk drive (HDD) in the system with a faster SSD. However, in our instant gratification society, we hate to wait even seconds for web pages to come up, so waiting minutes for your PC to turn on and boot up can feel like an eternity. Therefore, many people choose to leave the system on to save those precious moments… but at the expense of battery life.
Can I get this today?
To further extend battery life, the new DevSleep feature requires a signal change on the SATA connector. This change is currently supported only in new Intel® Haswell chipset-based platforms announced this June. What’s more, the SSD in these systems must support the DevSleep feature and monitor the signal on the SATA connector. Most systems that support DevSleep will likely be very low-power notebook systems and will likely already ship with an SSD installed using a small mSATA, M.2, or similar edge connector. Therefore, the signal change on the SATA interface will not immediately affect the rest of the SSDs designed for desktop systems shipping through retail and online sources. Note that not all SSDs are created equal and, while many claim support for DevSleep, be sure to look at the fine print to compare the actual power draw when in DevSleep.
At Computex last month, LSI announced support for the DevSleep feature and staged demonstrations showing a 400x reduction in idle power. It should be noted that a 400x reduction in power does not directly translate to a 400x increase in battery life, but any reduction in power will give you more time on the battery, and that will certainly benefit any user who often works without a power cord.
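A quick back-of-the-envelope calculation in Python shows why a 400x drop in SSD idle power does not mean 400x the battery life. Every number below is a hypothetical assumption picked for illustration (battery size, idle draw of the rest of the system, SSD idle draw); only the 400x factor comes from the text above.

```python
def battery_hours(battery_wh, rest_of_system_w, ssd_w):
    """Idle battery life in hours for a given total power draw."""
    return battery_wh / (rest_of_system_w + ssd_w)


BATTERY_WH = 50.0          # hypothetical notebook battery capacity
REST_OF_SYSTEM_W = 0.5     # hypothetical idle draw of everything except the SSD
SSD_IDLE_W = 0.4           # hypothetical SSD idle draw without DevSleep
SSD_DEVSLEEP_W = SSD_IDLE_W / 400   # the 400x reduction applied to the SSD only

hours_before = battery_hours(BATTERY_WH, REST_OF_SYSTEM_W, SSD_IDLE_W)
hours_after = battery_hours(BATTERY_WH, REST_OF_SYSTEM_W, SSD_DEVSLEEP_W)
gain = hours_after / hours_before

print(f"idle hours without DevSleep: {hours_before:.1f}")
print(f"idle hours with DevSleep:    {hours_after:.1f}")
print(f"battery life gain:           {gain:.2f}x")
```

Because the rest of the system keeps drawing power, the overall gain in this sketch is under 2x, which is still a meaningful improvement for anyone working away from a power cord.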
Not likely. But you might think that solving your computer data security problems is very well possible when someone tells you that TCG Opal is the key. According to its website, “The Trusted Computing Group (TCG) is a not-for-profit organization formed to develop, define and promote open, vendor-neutral, global industry standards, supportive of a hardware-based root of trust, for interoperable trusted computing platforms.”
That might take a bit to digest, but think about TCG as a group of companies creating standards to simplify deployment and increase adoption of data security. The consortium has two better known specifications called TCG Enterprise and TCG Opal.
Sorting through the alphabet soup of data security
“Our SED with TCG Opal provides FDE.” While this might look like a spoonful of alphabet soup, it is music to the ears of a corporate IT manager. Let me break it down for those who just hear fingernails on the chalkboard. A self-encrypting drive (SED) is one that embeds a hardware-based encryption engine in the storage device. One chief benefit is that the hardware engine performs the encryption, preserving full performance of the host CPU. An SED can be a hard disk drive (HDD) or a solid state drive (SSD). True, traditional software encryption can secure data going to the storage device, but it consumes precious host CPU bandwidth. The related term, full drive encryption (FDE), is used to describe any drive (HDD or SSD) that stores data in an encrypted form. This can be through either software-based (host CPU) or hardware-based (an SED) encryption.
Most people would assume that if their work laptop were lost or stolen, they would suffer only some lost productivity for a short time and about $1,500 in hardware costs. However, a study by Intel and the Ponemon Institute found that the cost of a lost laptop totaled nearly $50,000 when you account for lost IP, legal costs, customer notifications, lost business, harm to reputation, and damages associated with compromising confidential customer information. When the data stored on the laptop is encrypted, this cost is reduced by nearly $20,000. This difference certainly supports the need for better security for these mobile platforms.
When considering a security solution for this valuable data, you have to decide between a hardware-based SED and a host-based software solution. The primary problem with software solutions is they require the host CPU to do all of the encryption. This detracts from the CPU’s core computing work, leaving users with a slower computer or forcing them to pay for greater CPU performance. Another drawback of many software encryption solutions is that they can be turned off by the computer user, leaving data in the clear and vulnerable. Since hardware-based encryption is native to the HDD or SSD, it cannot be disabled by the end user.
In April 2013, LSI and a few other storage companies worked with the Ponemon Institute to better understand the value of hardware-based encryption. You can read about the details in the study here, but the quick summary is that hardware-based encryption solutions can offer a 75% total cost savings over software-based solutions, on average.
When is this available?
At the Computex Taipei 2013 show earlier this month, LSI announced availability of a firmware update for SandForce® controllers that adds support for TCG Opal. The LSI suite at the show featured TCG Opal demonstrations using self-encrypting SSDs provided by SandForce Driven™ member companies, including Kingston, A-DATA, Avant and Edge. (Contact SSD manufacturers directly for product availability.)
Imagine you have a bathtub full of water and you ask someone to empty it while you turn your back for a moment. When you look again and see the water is gone, do you just assume someone pulled the drain plug?
I think most people would, but what about other methods of removing the water, such as using a siphon or bailing it out with buckets? In a typical bathroom you are not likely to see these other methods used, but that does not mean they do not exist. The point is that seeing a certain result does not necessarily mean the obvious solution was used.
I see a lot of confusion in forum posts from SandForce Driven™ SSD users and reviewers over how the LSI® DuraWrite™ data reduction and advanced garbage collection technology relates to the SATA TRIM command. In my earlier blog on TRIM I covered this topic in great detail, but in simple terms the operating system uses the TRIM command to inform an SSD what information is outdated and invalid. Without the TRIM command the SSD assumes all of the user capacity is valid data. I explained in my blog Gassing up your SSD that creating more free space through over provisioning or using less of the total capacity enables the SSD to operate more efficiently by reducing the write amplification, which leads to increased performance and flash memory endurance. So without TRIM the SSD operates at its lowest level of efficiency for a particular level of over provisioning.
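To put rough numbers on that last point, here is a minimal Python sketch of how TRIM changes the free space an SSD can work with. It assumes a drive whose user capacity has been fully written at least once, and it expresses over provisioning as free space divided by the data the drive must preserve. The function name, parameters, and capacities are all hypothetical and chosen for illustration; they do not describe any particular controller.

```python
def dynamic_over_provisioning(physical_gb: float, user_gb: float,
                              valid_gb: float, trim_supported: bool) -> float:
    """Free space as a fraction of the data the SSD must treat as valid."""
    if not 0 < valid_gb <= user_gb <= physical_gb:
        raise ValueError("expected 0 < valid_gb <= user_gb <= physical_gb")
    # Without TRIM, the drive cannot tell which logical blocks the host has
    # discarded, so it preserves the full user capacity (assumption: every
    # logical block has been written at least once).
    preserved_gb = valid_gb if trim_supported else user_gb
    return (physical_gb - preserved_gb) / preserved_gb


# Hypothetical drive: 256 GB of flash, 240 GB user capacity, half full.
without_trim = dynamic_over_provisioning(256, 240, 120, trim_supported=False)
with_trim = dynamic_over_provisioning(256, 240, 120, trim_supported=True)
print(f"without TRIM: {without_trim:.1%}")
print(f"with TRIM:    {with_trim:.1%}")
```

In this made-up case the drive without TRIM is limited to the fixed spare area, while the drive with TRIM can also treat the unused half of the user capacity as free space.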
Will you drown in invalid data without TRIM?
TRIM is a way to increase the free space on an SSD – what we call “dynamic over provisioning” – and DuraWrite technology is another method to increase the free space. Since DuraWrite technology is dependent upon the entropy (randomness) of the data, some users will get more free space than others depending on what data they store. Since the technology works on the basis of the aggregate of all data stored, boot SSDs with operating systems can still achieve some level of dynamic over provisioning even when all other files are at the highest entropy, e.g., encrypted or compressed files.
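The dependence on entropy can be illustrated with a short Python sketch. DuraWrite's actual algorithm is not described here, so the example substitutes general-purpose zlib compression as a stand-in for data reduction; the helper names and the sample data are hypothetical.

```python
import os
import zlib


def stored_size(data: bytes) -> int:
    """Bytes needed after reduction; never more than the raw size."""
    # zlib is only a stand-in for a data reduction engine (assumption).
    return min(len(zlib.compress(data)), len(data))


def freed_fraction(files: list[bytes]) -> float:
    """Share of the logical bytes that reduction frees across all files."""
    logical = sum(len(f) for f in files)
    physical = sum(stored_size(f) for f in files)
    return 1 - physical / logical


# Hypothetical contents: repetitive "operating system" data plus
# high-entropy data standing in for encrypted or compressed files.
os_like = b"config_key=default_value\n" * 4096
high_entropy = os.urandom(len(os_like))

print(f"high-entropy only: {freed_fraction([high_entropy]):.1%}")
print(f"aggregate:         {freed_fraction([os_like, high_entropy]):.1%}")
```

The high-entropy data frees essentially nothing on its own, but the aggregate still shows a gain because the low-entropy portion shrinks. Real operating system files will not reduce as dramatically as this deliberately repetitive sample.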
With an older operating system or in an environment that does not support TRIM (most RAID configurations), DuraWrite technology can provide enough free space to offer the same benefits as having TRIM fully operational. In cases where both TRIM and DuraWrite technology are operating, the combined result may not be as noticeable as when they’re working independently since there are diminishing returns when the free space grows to greater than half of the SSD storage capacity.
So the next time you fill your bathtub, think about all the ways you can get the water out of the tub without using the drain. That will help you remember that both TRIM and DuraWrite technology can improve SSD performance using different approaches to the same problem. If that analogy does not work for you, consider the different ways to produce a furless feline, and think about what opening graphic image I might have used for a more jolting effect. Although in that case you might not have seen this blog since that image would likely have gotten us banned from Google® “safe for work” searches.
I presented on this topic in detail at the Flash Memory Summit in 2011. You can read it in full here: http://www.lsi.com/downloads/Public/Flash%20Storage%20Processors/LSI_PRS_FMS2011_T1A_Smith.pdf