SSDs: The New King of the Data Center?
Nerval's Lobster writes "Flash storage is more common on mobile devices than data-center hardware, but that could soon change. The industry has seen increasing sales of solid-state drives (SSDs) as a replacement for traditional hard drives, according to IHS iSuppli Research. Nearly all of these have been sold for ultrabooks, laptops and other mobile devices that can benefit from a combination of low energy use and high-powered performance. Despite that, businesses have lagged the consumer market in adoption of SSDs, largely due to the format's comparatively small size, high cost and the concerns of datacenter managers about long-term stability and comparatively high failure rates. But that's changing quickly, according to market researchers IDC and Gartner: Datacenter- and enterprise-storage managers are buying SSDs in greater numbers for both server-attached storage and mainstream storage infrastructure, according to studies both research firms published in April. That doesn't mean SSDs will oust hard drives and replace them directly in existing systems, but it does raise a question: are SSDs mature enough (and cheap enough) to support business-sized workloads? Or are they still best suited for laptops and mobile devices?"
Great for some apps (see netflix blog) (Score:5, Interesting)
TL/DR: "The relative cost of the two configurations shows that over-all there are cost savings using the SSD instances"
at least for their use-case (Cassandra).
At work we also use SSDs for a couple terabyte Lucene index with great success (and far cheaper than getting a couple TB of DRAM spread across the servers instead)
Re:Great for some apps (see netflix blog) (Score:5, Interesting)
So you're replacing RAM with SSD, not HD with SSD. Interesting.
And would you even be able to do this with DRAM modules? Normal PC motherboards don't support that.
Re:Great for some apps (see netflix blog) (Score:5, Interesting)
You can build a 48-core Opteron server with 512GB of RAM for under $8000. Going over 512GB in a single server gets a lot more expensive (you either need expensive high-density modules or expensive 8-socket servers - or both) but if you can run some sort of cluster that's not a problem.
Re: (Score:2)
And would you even be able to do this with DRAM modules? Normal PC motherboards don't support that.
Even low-end (dual-CPU 2U) servers these days support either 192 or 256GB [asacomputers.com]. It's not that hard or expensive to get four 256GB or six 192GB servers.
But as that link to Netflix's blog points out - SSDs can have better price/performance than DRAM at the moment if you need a lot.
Re: (Score:2)
IOPS, blah blah blah.
The more I think about IOPS the more I think it is a manufactured statistic designed to "prove" performance yet at the same time being something you can't compare to another environment.
For example, every storage environment has a different I/O size and read/write mix, rendering IOP comparisons between storage devices moot.
Re: (Score:2)
IOPS is what matters for database/Exchange server load, and now for virtualized desktop load. Certain server uses have very well known workloads with lots of very small random I/Os. You can find software to simulate the I/O workload of an exchange server that you can easily tune to the details of your shop and use it to test-drive storage arrays.
For other sorts of storage, you only care about cost/GB and reliability, and IOPS is pretty meaningless.
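For what it's worth, the workload-dependence argument can be made concrete with back-of-envelope arithmetic. This is a toy sketch; the transaction rate, I/Os per transaction, and read mix below are invented numbers, not measurements from any real server:

```python
# Toy estimate of the IOPS a workload demands from its transaction rate
# and I/O profile. All numbers below are invented for illustration.

def required_iops(transactions_per_sec, ios_per_transaction, read_fraction):
    """Return (total, read, write) IOPS implied by a workload profile."""
    total = transactions_per_sec * ios_per_transaction
    reads = total * read_fraction
    return total, reads, total - reads

# e.g. a mail-server-like load: 500 tx/s, ~4 small random I/Os each, 60% reads
total, reads, writes = required_iops(500, 4, 0.6)
print(total, reads, writes)  # 2000 1200.0 800.0
```

The point of the exercise: two shops with the same GB of data can demand wildly different IOPS, which is why raw IOPS figures don't transfer between environments.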
Re: (Score:2)
Though modprobe bcache is probably cheaper.
Re: (Score:2)
How does that make sense? Sure, SSD is very similar to RAM physically, but it is still like a thousand times slower, is it not?
Re: (Score:2)
I don't understand the confusion, maybe a car analogy will help.
John Smith is switching from a mid-'90s Civic to a new Mustang to reduce his merge time. This represents a huge savings over buying a Porsche. Make sense now?
Re: (Score:2)
Maybe if he was using a golf cart instead of a Porsche it would be a better analogy.
Re: (Score:2)
I guess, but he's only going from 7.5 seconds (1995 Civic Si) 0-60 to about 6.8 seconds (2013 Mustang V6 automatic), so only about a 10% improvement. I think the overall improvement from a HDD to a SSD is significantly more than that. Now if you said a mid-90s Civic LX to a new Mustang GT you might have a better point.
Re: (Score:2)
A V6 Mustang is like the Matrix sequels or Star Wars prequels - enthusiasts know they don't actually exist.
Re:Great for some apps (see netflix blog) (Score:5, Informative)
How does that make sense?
As the link to Netflix pointed out -- they benchmarked the entire system with the same REST API in front.
They configured one cluster of SSD-based servers and compared it with another cluster of spinning-disk-with-large-RAM servers. It took a cluster of 15 SSD-backed servers to match the throughput of 84 RAM+spinning servers. With throughput matched, the SSD-based cluster provided better latency and lower cost.
TL/DR: "Same Throughput, Lower Latency, Half Cost".
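The headline can be sanity-checked with trivial arithmetic. The per-node cost figures below are invented placeholder units chosen only to reproduce the quoted "half cost" ratio; they are not Netflix's real numbers:

```python
# Cluster-level cost comparison in the style of the Netflix post.
# Per-node costs are made-up units, NOT real prices.

hdd_nodes, hdd_cost_per_node = 84, 10   # RAM + spinning-disk servers
ssd_nodes, ssd_cost_per_node = 15, 28   # SSD-backed servers

hdd_total = hdd_nodes * hdd_cost_per_node   # 840 units
ssd_total = ssd_nodes * ssd_cost_per_node   # 420 units

print(ssd_total / hdd_total)  # 0.5 -> same throughput at half the cost
```

Even if each SSD node costs nearly 3x a conventional node, needing fewer than a fifth as many nodes is what flips the total.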
Re: (Score:2)
but it is still like a thousand times slower, is it not?
Yes; but it's still like 5-500x faster than spinning disks too (obviously depending on whether you're talking sequential I/O or random access).
SSD & RAM (Score:2)
Yeah. When we're talking RAM, we are talking modern interfaces such as DDR (now DDR2 or DDR3), whereas NAND flash, which is what is used here, uses page-mode reads & writes. Not to mention that internal writes, which exist on flash but not on RAM, would automatically slow down the process even if the same interface were used (compare SRAM with NOR flash for a similar contrast).
I think what's contributing to the confusion is SSDs being available not just in SATA interfaces, but now, in PCIe inte
Re: (Score:2)
Ahh, I think I understand.
Netflix has loads of data, and it is all being used all the time.
Too much data for all of it to be stored in RAM. And too much being used concurrently to even just store in the currently streaming files.
Also Netflix cannot just quickly stream a file completely using all of its bandwidth, since the recipient cannot receive it any faster than their connection.
So Netflix needs to quickly stream little bits of files and switch between them rapidly. So random access. And data is only use
Re: (Score:2)
In the old days, SSD was nothing more than RAM with a battery backup.
So the idea that you would replace memory with SSD rather than spinny disk with SSD is not terribly surprising.
The same "working set" problem that applies to RAM also applies to SSD. Your solution actually has to be appropriate to the problem and SSD isn't necessarily a cure all.
SSD is new and trendy and a bit overhyped.
20x faster (Score:3, Informative)
Re: (Score:3, Insightful)
By switching to SSD's on a data intensive web application, I got 20 times speed improvement - from 20 hits per second to 400. I trust SSDs more than physical spindles any day.
When designing storage for any Business or Enterprise the disks (solid state or spinning) should always be in some sort of RAID configuration that supports disk redundancy. Failure to do this could result in loss of data when the disk eventually fails and it will. I am often asked "How long" and my answer is "How long is a peace of string".
At the moment SSD's are excellent when you need high I/O from a few disks up to say a few TB however if you look at enterprise storage solutions of 10's or even 1000's
Re:20x faster (Score:5, Insightful)
and my answer is "How long is a piece of string".
Sorry, that phrase always strikes a nerve with me. More useful answers would include an average, or even better, a graph detailing the death rate of SSDs (and how they tend to die early if they do die, but tend to last if they get past that initial phase).
Re:20x faster (Score:5, Insightful)
That requires explaining the Poisson distribution to a pointy-haired boss.
Re: (Score:3, Insightful)
The length of a piece of string is twice the distance from the center to an end.
Re:20x faster (Score:5, Funny)
"How long is a peace of string"
I have never known string to break a cease-fire.
Re: (Score:2)
Do you want to hear the joke about a piece of string?
I'm a frayed knot.
Re: (Score:2)
At the moment SSD's are excellent when you need high I/O from a few disks up to say a few TB however if you look at enterprise storage solutions of 10's or even 1000's of TBytes you are still looking at spinning media with large cache front ends (BTW I am talking about $20k up to many millions of dollars storage area networks).
Well, what you're usually looking at is a storage system with multiple types and speeds of disks that automatically moves data through the tiers depending on the frequency and type of
Re: (Score:2)
About the same as a "Concordat of Worms".
Re: (Score:2)
Reliability data? (Score:2)
I trust SSDs more than physical spindles any day.
Based on what evidence? Where is your data? Faster != More reliable. Spindle based hard drives are (usually) quite reliable and there is plenty of real world usage data documenting exactly how reliable they are. Companies with big data centers like Google have extremely detailed reliability performance figures. SSDs have a lot of advantages but they only recently have started receiving wide distribution and to date they have poor market penetration in data centers where it is easiest to measure their r
Re: (Score:2)
I have had hard disks last for 7 years. I have some now that are about 5 years old. When I can say that about an SSD, I will have more trust in them. Until then, trust is really unwarranted. Without some actual experience (yours or someone else's), you are really just engaging in a leap of "faith".
Re: (Score:2)
For what it's worth, I bought my first SSD, a 30GB OCZ Vertex SSD (original version), on 6/21/2009 (I just logged into Newegg and checked) and it's still going strong without a single problem. It's since been "demoted" to my HTPC in the living room, which has been great because the bootup is very "appliance-like" and it's com
Re: (Score:2)
Re: (Score:2, Insightful)
TRIM isn't necessary if the SSD uses spare sectors to keep the write amplification low. You can also partition the SSD to have a swath of unused space for that purpose.
Re: (Score:2, Interesting)
No, but then I can read and understand that "once the device runs out of untouched sectors" is not an "if" but a "when". An untouched sector is not the same as a spare sector either, because sectors which are used for reducing the write amplification are touched. An SSD maintains available sectors, not untouched or free sectors.
Re: (Score:3)
Modern SSDs offer under-provisioning for just this reason.
Long-term, not short-term (Score:5, Insightful)
The question is really going to be what kind of shape the drives will be in a year or so from now after 12+ months of constant heavy usage. The usage profile in consumer computers is a lot different from that in a server, and the server workload is going to stress more of the weakest areas of SSDs. And when it comes to manufacturer or lab test results, simple rule: "The absolute worst-case conditions achievable in the lab won't begin to approximate normal operating conditions in the field." So, while SSDs are definitely worth looking at, I'll let someone else do the 24-36 month real-workload stress testing on them. There's a reason they call it the bleeding edge, after all.
Re:Long-term, not short-term (Score:5, Informative)
We've been using SSDs in our servers since late 2008, starting with Fusion-io ioDrives and Intel drives since then - X25-E and X25-M, then 320, 520 and 710, and now planning to deploy a stack of S3700 and S3500 drives. Our main cluster of 10 servers has 24 SSDs each, we have another 40 drives on a dedicated search server, and smaller numbers elsewhere.
What we've found:
* Read performance is consistently brilliant. There's simply no going back.
* Random write performance on the 710 series is not great (compared to the SLC-based X25-E or ioDrives), and sustained random write performance on the mainstream drives isn't great either, but a single drive can still outperform a RAID-10 array of 15k rpm disks. The S3700 looks much better, but we haven't deployed them yet.
* SSDs can and do die without warning. One moment 100% good, next moment completely non-functional. Always use RAID if you love your data. (1, 10, 5, or 6, depending on your application.)
* Unlike disks, RAID-5 or 50 works pretty well for database workloads.
* We have noted the leading edge of the bathtub curve (infant mortality), but so far, no trailing edge as older drives start to wear out. Once in place, they just keep humming along.
* That said, we do match drives to workloads - SLC or enterprise MLC for random write loads (InnoDB, MongoDB) and MLC for sequential write/random read loads (TokuDB, CouchDB, Cassandra).
Re: (Score:2)
Re: (Score:2)
Not off hand, sorry. I haven't been the sysadmin for 18 months (moved back to programming), and I don't want to give a guess that might be off by a factor of two.
Re: (Score:3, Informative)
If you do RAID5 or RAID6, you should match your RAID block exactly to the write block size of the SSD. If you do not, then you will generally need two writes to each SSD for every actual write performed. This will reduce the lifetime for the SSD and reduces the efficiency. Most RAID controllers have no way of doing this automatically and it is not easy to learn what the write block size is on an SSD (it is not generally part of the information on the drive).
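The alignment point above can be illustrated with a toy model. This is a deliberate simplification (real flash translation layers and RAID controllers are far more complicated), and the block sizes are example values, not specs for any particular drive:

```python
# Toy model: a RAID stripe unit that is a whole multiple of the SSD's
# internal write-block size stays aligned; any other size straddles a
# block boundary, so each logical write costs roughly two physical writes.

def writes_per_logical_write(stripe_unit_kb, ssd_block_kb):
    return 1 if stripe_unit_kb % ssd_block_kb == 0 else 2

print(writes_per_logical_write(64, 16))  # 1 (aligned)
print(writes_per_logical_write(64, 24))  # 2 (misaligned: ~2x amplification)
```

Doubling physical writes per logical write roughly halves the drive's usable write endurance, which is why the parent says alignment matters for lifetime as well as efficiency.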
Re: (Score:2)
I did my first write-heavy deployment of PostgreSQL on Intel DC S3700 drives about a month ago, with each one of them replacing two Intel 710 drives. The write performance is at least doubled--the server is more than keeping up even with half the number of drives--and in some cases they easily look as much as 4X faster than the 710s. I've been able to get the 710 drives to degrade to pretty miserable read performance on mixed read/write workloads too, as low as 20MB/s, but the DC S3700 drives don't seem t
Re: (Score:2)
now planning to deploy a stack of S3700 and S3500 drives.
Yep, these are the only drives I'd recommend for enterprise use - or any other use where you want to be sure that losing power will not corrupt the data on the disk thanks to actual power-loss protection.
Intel's pricing with the S3500 places it very competitively in the market - even for desktop/laptop use I would have a hard time not recommending it over other drives unless you don't care about reliability and really need maximum random write performance or really need the lowest cost.
Re: (Score:2)
Have you looked at the price point of the ioScale cards?
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Micron and Toshiba make them, but they're hard to find. You can also get SLC ioDrives. But the Intel S3700 looks to be nearly as good, and much, much cheaper.
Re: (Score:2)
Will also depend greatly on your specific use case: whether it's lookups from a huge, mostly read-only database, or for use in a mail server which is constantly writing data as well. By my understanding at least it's the writes that wear out the SSD, not the reads.
Re: (Score:2)
Enterprise SSDs have been out for half a decade in production. I have roughly 300 enterprise SSDs and more than a thousand consumer ones in servers, and no failures. We retired many of the early enterprise SSDs well before they were pushing their write limits as we aged out servers (3-5 years service life). The consumer ones acting as read cache for local and iSCSI disk do wonders.
Silver Bullet (Score:5, Informative)
We have hundreds of SSDs in production servers. We couldn't survive without them. For heavy database workloads, they are the silver bullet to I/O problems, so much so that running a database on regular disk has become almost unimaginable. Why would you even try to do that?
Re: (Score:2)
I replaced them with spinning storage and pe
Re: (Score:2)
Depends a lot on the drive, but that can be a problem. The best solution is to either buy a drive with a significant amount of over-provisioning built in (like the Intel S3700 or Seagate 600 Pro) or over-provision it yourself. That means that when it fills up it still has plenty of spare area to remap blocks.
Enterprise drives typically have at least 20% over-provisioning; consumer drives can be 5% or less. A 400GB Seagate 600 Pro is the same as a 480GB Seagate 600, except for that setting.
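The 600 Pro example works out exactly if over-provisioning is expressed as spare area divided by user-visible capacity (a common convention, though vendors vary in how they quote it):

```python
# Over-provisioning percentage: spare flash relative to user-visible space.

def overprovision_pct(raw_gb, user_visible_gb):
    return 100.0 * (raw_gb - user_visible_gb) / user_visible_gb

print(overprovision_pct(480, 400))  # 20.0 -> the "at least 20%" enterprise figure
```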
Re: (Score:2)
It's a mix. We use enterprise drives for the really heavy stuff, and mainstream drives for data that's either read-only, read-mostly, or is in a database that does sequential writes like TokuDB or Cassandra.
Re:Silver Bullet (Score:4, Insightful)
Re:Silver Bullet (Score:5, Funny)
write wear is a read herring.
Are you sure it's not a reed salmon?
Re: (Score:2)
It's actually a read/write herring, but most of the writing is cached and deferred for actual physical writing later on.
Re: (Score:2)
He reads data off a herring and writes it on what he wears.
Re: (Score:2)
1. You're lying
2. You were using consumer level MLC drives in high write workloads (I might believe this)
Re: (Score:2)
Because if write wear is the prime failure mode and you're running RAID, you're likely to lose multiple SSDs in a relatively short interval.
Re: (Score:2)
So you replace them. I am still not seeing the issue. Surely you have hot spares.
Re: (Score:2)
Yes. And one of the reasons for having hot spares (and replicas and backups) is the chance of multiple drive failures close together. So it's not a problem if you've planned things properly, but it's something you need to consider to create a good plan in the first place.
Re: (Score:2)
SSDs don't change that either way.
This is all stuff that gets done with spinning rust as well.
Re: (Score:2)
The odds of that happening are vanishingly small.
It could even happen with spinning rust.
Near-line storage only: Has been for some years. (Score:5, Informative)
Re: (Score:2)
enterprise class SSDs not the same (Score:5, Interesting)
The enterprise class SSDs are not the same as the "consumer" ones: http://www.anandtech.com/print/6433/intel-ssd-dc-s3700-200gb-review [anandtech.com]
Don't be surprised if you stick a "consumer" grade one in a heavily loaded DB server and it dies a few months later.
Fine for random read-only loads.
And some consumer grade SSDs aren't even consumer grade (I'm looking at you OCZ: http://www.behardware.com/articles/881-7/components-returns-rates-7.html [behardware.com] ).
Price (Score:5, Interesting)
Pricing really needs to come down on these things. A single drive can easily cost as much as a server, and when you're talking about RAID setups, forget it. It's still much more effective to use magnetic drives and use aggressive memory caching for performance, if you really need that.
Another 3 to 5 years this idea might have more traction for companies that aren't Facebook or Google, but right now, SSD costs too much.
Re: (Score:2)
Even for sequential reads, SSDs can be an improvement. My laptop's SSD can easily handle 200MB/s sequential reads, and you'd need more than one spinning disk to handle that. And a lot of things that seem like sequential reads at a high level turn out not to be. Netflix's streaming boxes, for example, sound like a poster child for sequential reads, but once you factor in the number of clients connected to each one, you end up with a large number of 1MB random reads, which means your IOPS numbers translate
Re: (Score:2)
> Even for sequential reads, SSDs can be an improvement. My laptop's SSD can easily handle 200MB/s sequential reads, and you'd need more than one spinning disk to handle that.
Except that's not true across the board. You may find that a more reliable brand doesn't have sequential performance nearly that good.
For as much as SSD cost, you can easily double up on the spinning rust and still be way WAY ahead.
You can get very noticeable improvement even with spinny rust just by having more than one spindle and
Single component failure not a big deal any more. (Score:5, Informative)
I think that the wide range adoption of server SSDs also shows how far server installations have progressed toward eliminating all single points of failure.
In the past, HA and 'five nines' were something only done in a few niches, like telephony provider switches or banking big iron. Today it is common in many cloud installations and most sizeable server setups. A single component failing will not stop your service.
If your business can support the extra cost for the SSDs, a failing drive will not stop you and the performance of the service will see great improvements anyway. The power savings may even make the SSD not so costly after all.
Re: (Score:2)
Correction: a single component failing should not stop your service, if you have done your job right (either in designing and building, or in finding a vendor to provide the service). But having a single component failing can and still does ruin somebody's day on a regular basis.
Re: (Score:2)
And beyond SSD, the future is PCIe Flash (Score:2)
SSDs are slow in that they rely on old-school disk protocols like SATA. Sure, you'll get better performance than spinning disk. But if you want screaming-fast performance, you should look at flash devices connected through the PCIe bus.
Products from Fusion IO [fusionio.com] would be an example of this. Apple Mac Pro would be another: "Up to 2.5 times faster than the fastest SATA-based solid-state drive".
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
The year SATA 3 was put into production, SSD designs were reconfigured to saturate it, and those Fusion-io drives saturate their PCIe lane bandwidth....
SATA 3 was and always will be shortsighted bullshit brought to you by a consortium of asshats intentionally trying to undercut feature demand in their desperate attempt to preserve the old guard.
Re: (Score:2)
Re:And beyond SSD, the future is PCIe Flash (Score:5, Insightful)
Up to 2.5 times faster
Ah, "up to." Marketing's best friend.
Re:And beyond SSD, the future is PCIe Flash (Score:5, Funny)
They have to say up to. Reads and writes towards the inside of the chip are slower than they are towards the outside of the chip. I don't think anyone makes a constant linear velocity SSD.
Re: (Score:2)
Really though, it's linguistically equivalent to saying, "We promise that it won't be more than 2.5 times faster. Could even be slower - who knows - but it certainly isn't 3 times faster."
Re: (Score:2)
My favorite are ads that say "Save up to $X (or X%) and more!" So in other words the savings can be any amount or none at all. Whatadeal.
Re: (Score:2)
PCIe-based flash is nice; we have more than a few in production. The downside is that hot-swap PCIe motherboards are extremely expensive, and getting more than 7 PCIe slots is also nearly impossible. I can get 10 or more 2.5" hot-swaps on a 1RU server. I can get hardware RAID, even redundancy, with the right backplanes. I can connect up external chassis via SAS if I need more room (yeah, PCIe expansion chassis exist as well; they are funky to deal with at times). The use cases for needing extremely fast IO without redundancy e
Virtualisation (Score:5, Interesting)
This is being driven primarily by increasing levels of virtualisation, which turns everything into a largely random-write disk load, pretty much the worst case scenario for regular old hard disks.
Re: (Score:2)
Prices? (Score:2)
are SSDs mature enough (and cheap enough) to support business-sized workloads? Or are they still best suited for laptops and mobile devices?
I don't see maturity as a problem. If there is money to be made, drive manufacturers will throw enough engineering and computer science talent at the task of solving the teething troubles. What interests me is that if SSDs mount a major invasion of server rooms and data centers worldwide, it also means that we will now finally start to see SSD pricing drop like a rock. Cheap high-capacity external SSD drives, I can't wait. If we are lucky this will also popularize Thunderbolt with PC motherboard makers since th
Any experiences on Hybrid RAID-1? (Score:2)
Re: (Score:2)
I'm using that setup. I'm using a cheap but high-capacity OCZ drive (960GB), with a software RAID 1 mirror to a SAS drive. I'm running this on Windows, which crucially always uses the FIRST drive for reads. So reads are at SSD speeds, writes are at SAS speeds.
It's working well enough. I've not benchmarked this. We have had 1 drive failure, I suggest keeping 1 cold-spare to hand. Delivery times on SSDs are pretty variable, you won't want your entire DB running on a SAS drive for too long.
Jason.
Re: (Score:2)
Your writes will be limited to the speed of the conventional drive, so if your workload is mostly reads, then you will see a significant benefit.
Though, if your workload is mostly reads, you'd probably see the same benefit for a lot less $$$ by putting more RAM in your server...
Re: (Score:2)
That's what we ended up doing with our databases - did a bunch of comparisons and ended up sticking to 15K disks and maxing out RAM instead. Even at Rackspace prices we came out ahead on price/performance.
Re: (Score:2)
With a couple GB of data, just put it in RAM - you can get to 128GB cost-effectively, and if you're read-heavy you will end up with everything cached. If you're writing, just go all SSD; it's night and day - a single SSD pair easily outperforms a whole shelf of 15k drives.
Specific applications now, everything later (Score:2)
My company's experience (Score:2)
Of course business adoption is small (Score:2)
I recently was given the task of upgrading my development machine. We're a small company but management is happy to spend money on hardware if we need it.
I decided I'd prefer an SSD and yet when I looked at the big suppliers of office machines - Dell, HP, etc. none of them even offered SSD's as an option. SSD's only came into it when you started looking at the really high-end, £2,000+ workstations but there's no reason why this should be the case.
In the end, I just custom built the machine as it was t
Re: (Score:2)
We just buy a normal dell and toss the drive out when it arrives. Installing a hard drive is not difficult and you get to keep the NBD warranty on the rest of the machine.
Re: (Score:2)
I would agree with that, but the cost Dell was charging was higher than what I could pay for a custom built option with the same (or in fact, better) specs.
Re: (Score:2)
That makes sense.
We also do not buy one off machines for devs or really anyone. We just upgrade one of the hundreds of desktops we buy at a time.
More Common Than You Think... (Score:5, Interesting)
SSDs might not be used as primary storage, yet. The cost of using a lot of SSDs in a SAN is still too high. However, that doesn't mean that SSD technology is not being used. Many systems started using SSDs as read/write caches or high-speed buffers, etc. The PCIe SSD cards are popular in high-end servers. This is one way that Oracle manages to blow away the competition when benchmarks are compared. They put PCIe SSD cards into their servers and use them to run their enterprise database at lightning speeds! ZFS can use SSDs as read/write caches, although you had better battery-back the write cache!
Depending on a particular solution, a limited number of SSDs in a smaller NAS/iSCSI RAID setup can make sense for something that needs some extra OOMF! But I don't yet see large-scale replacement of traditional spinning-rust drives with SSDs. In many cases, SSDs only make sense for highly active arrays where reads and writes are very heavy. Lots of storage sits idle and isn't being pounded that hard.
Hot/Crazy (Score:2)
Two years on and this is still relevant: The Hot/Crazy Solid State Drive Scale [codinghorror.com].
I love SSD's in servers and they don't burn me because I always expect them to fail. Sure, one MLC SSD is fine for a ZFS L2ARC, because if it fails reads just slow down, but for a ZFS ZIL, that gets a mirror of SLC drives, because a failure is going to be catastrophic.
If I'm using Facebook's FlashCache, two drives get mirrored by linux md and treated as a cache device and smartd lets me know when one of them goes TU. Another a
Re: (Score:3)
Re: (Score:2)
But the real *issue* here is being able to actually go through the data looking for information. Storage of this much data has been a fairly easy problem to solve if you have money; finding a way to organize and search through huge data sets to give timely results is not so easy even if you have money.
Buying spindles and connecting them in huge RAID arrays is well understood. You just build what size you need and dump your data onto it. Yea, you will have to battle OS size limits on partitions and files