So this post came about as a result of me fishing for some information from a fellow Engineer/Architect @ another cloud provider, Kyle Bader (@mmgaggle). Basically, I’d seen a video about DreamObjects’ Ceph implementation, picked up on a mention of Coraid, and was intrigued.
Kyle and I exchanged a few tweets and he questioned why I would use Coraid behind an ObjectStore platform…. so I thought I’d put my thoughts together and get some feedback.
My default blog writing style is to document my thought process, which may be a little obtuse…. so bear with me.
So some background
Hypothetically…. I’ve been tasked with investigating some options around a Cloud Storage platform. Now if that seems a bit vague, that’s because it is. As is all too common, the goals and requirements set out at the beginning of an IT project are often a bit short on detail, perhaps in an effort to get some creative flair thrown in by the architects (and the details I can share are then restricted too)….. but I digress.
Basically from what I was able to determine, we need a new, complementary storage platform with the following characteristics:
- Scalable (Near infinitely)
- Presentable over public IP space (aka the ‘net)
- Ultimately usable by users in friendly / familiar ways (Like DropBox, FileShares, backup software etc)
- Very granular growth at all levels (bandwidth, availability, capacity)
- Low CAPEX entry point (almost none, ideally)
- Minimal OPEX to keep it running
- Utilize existing Dev and Ops teams to build / support it
- Presented from our existing Multiple Availability Zones (preferably from a global namespace)
Some of the use cases would be for Backup targets (and the ability to restore from it), filesharing etc.
So all in all, a pretty standard set of requirements/nice-to-have features for most would-be Storage-as-a-Service providers. Quite simply: “we want a bite of that S3 pie.” And either OpenStack Swift or Ceph is the obvious choice (in my humble opinion).
How are ObjectStore hardware-platforms usually Architected?
The purpose here is not to go into the ObjectStore software architecture, there’s a few posts around that already… Plus it leaves something for me to follow up on later.
So for the hardware: it’s pretty simple. Either:
- Buy a bunch of servers with lots of drive slots each, SuperMicro is a popular choice.
- Buy a few more-traditional servers and some DAS shelves (attached with SAS, usually)
[Diagram: Big Storage Nodes]
[Diagram: Traditional + DAS]
Ok… so what’s your idea then?
Well, the traditional approaches for ObjectStore assume the Headers and disk trays are effectively collapsed onto one layer (or damn close to it, with DAS)…
- What if you got the design a bit off, or your requirements change, and it’d be better if the drive-to-header ratio was slightly different?
- What if you wanted say… 8-drives per 2-core/10gbit header and there wasn’t a suitable hardware platform available?
- What if you wanted to change to a different object store, with slightly different hardware ‘sweet spot’?
- What happens when the next-big-thing hits and you’d really like to put some new headers in front of raw drives? Physically redeploy whole collapsed nodes, re-cable SAS?
…. This is all sounding like pre-virtualization picking of the ‘right pizza box’, per application.
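To make the ratio problem concrete, here’s a minimal sketch (the 480-drive pool and both ratios are made-up numbers, purely for illustration) of how the header count swings when the drives-per-header ratio changes:

```python
def headers_needed(total_drives, drives_per_header):
    """How many storage headers a given pool of raw drives requires."""
    return -(-total_drives // drives_per_header)  # ceiling division

# The same hypothetical 480-drive pool, re-ratioed without touching a drive:
print(headers_needed(480, 60))  # dense collapsed nodes: 8 headers
print(headers_needed(480, 8))   # the lean 8-drives-per-header case: 60 headers
```

With collapsed nodes, moving between those two designs means physically redeploying hardware; with the drives abstracted behind something vDAS-like, it’s just a re-mapping.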
So for this platform, what I think would be great is a more scalable DAS tray. Let’s call it vDAS…
Ideally a vDAS candidate would be:
- Dead simple to provision and maintain.
- Cheap-ish (after all, the smarts of the platform are all done by the headers; why would we pay for useless features?)
- Easily able to assign raw drives to storage headers (whatever they may be).
- Low latency Header-to-Drive connection
- Purchasable with OPEX (without massive markup) would be nice too.
Well, there is a platform that I think fits the vDAS idea: Coraid. I’ve spoken to the local reps (who are really friendly in Sydney, btw) and I really think they’re onto something here. Providing the storage @ layer 2 with AoE really keeps the complexity down, and it can run really efficiently over a converged 10GBase-T network etc. It has all the characteristics of a good vDAS platform.
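To show just how little protocol there is in AoE, here’s a rough Python sketch of the fixed 10-byte AoE header (shelf/slot addressing carried straight in an Ethernet frame, EtherType 0x88A2). The field layout is my reading of the public AoE spec, so treat it as illustrative rather than a reference implementation:

```python
import struct

AOE_ETHERTYPE = 0x88A2  # registered EtherType for ATA over Ethernet

def aoe_header(shelf, slot, command, tag=0):
    """Build the fixed AoE header that follows the Ethernet header.

    Layout (big-endian): ver/flags (1B) | error (1B) | major aka shelf (2B)
    | minor aka slot (1B) | command (1B) | tag (4B) = 10 bytes total.
    """
    ver_flags = 0x10  # protocol version 1 in the high nibble, no flags
    return struct.pack(">BBHBBI", ver_flags, 0, shelf, slot, command, tag)

# A request addressed to drive 3 on shelf 1 (command 1 = query config):
hdr = aoe_header(shelf=1, slot=3, command=1)
print(len(hdr))  # the entire protocol header is just 10 bytes
```

That a whole target-addressing scheme fits in a few fields of a layer-2 frame is exactly why it stays simple: no TCP/IP stack, no iSCSI session state, just drives on a flat Ethernet segment.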
And taking a step back: it seems to me the storage industry trend is to do the smart things in software. Increasingly, a lot of clever stuff is also being done @ the hypervisor, for example SSD caching w/ PernixData. If hypervisor-level storage features start to take over the features and possibly the IO…. maybe that paves the way for commoditized back-end storage too? vDAS?
Looking back in time a little: when NFS was originally positioned to compete against FC/iSCSI, how was it done? NFS presented out from commodity headers, then headers w/ SAS DAS trays….. and that’s pretty much where the hardware innovation ended?
So that’s my thought. Commodity vDAS ftw (maybe!?)! What does everyone think?
In talking about it further with colleagues… I think I can sum up my thoughts above.
Try not to think of Coraid as a dumbed down SAN
- If you directly compare it with a NetApp, EMC, Nimble or whoever else…. on features alone (whatever the must-have presentation-level feature of the day is), then it may lose out.
- What happens when a new presentation level feature comes out sometime in the future…. rip and replace the whole platform? Do without?
Think of it as an infinitely scalable, easy to manage set of DAS trays
- Put whatever presentation headers in front of it you like (now, or as something better emerges later)
- There is loads of power in that.
- What is the main reason for replacing SANs? I can’t speak for everyone, but I have ripped and replaced more SANs than I care to remember. For the most part, the drives and shelves were perfectly serviceable; if they were abstracted from the headers and just swapping the headers was a serious option…. why not derive extended ROI from them?