New 7400 just for ESX cluster, CPG ideas

Richard Siemers
Site Admin
Posts: 1333
Joined: Tue Aug 18, 2009 10:35 pm
Location: Dallas, Texas

New 7400 just for ESX cluster, CPG ideas

Post by Richard Siemers »

We just received, but have not yet installed, a new 7400 with 4x controllers and 6x shelves worth of disks (4 add-on shelves, 2 internal to the controllers): 144x 10K SAS 300GB and 24x 920GB SSD, with the AO license included. This system will be dedicated solely to our production ESX farm of 17 hosts.

The VMware admin and I are still contemplating our CPG strategy.
The default strategy is to create all datastores in the SAS CPG:
SAS_CPG: 10K 300GB, R5 (3+1), cage safe + AO Tier 1
SSD_CPG: 920GB, R5 (3+1), cage safe + AO Tier 0

My hesitation is that I would prefer new writes to land on SSD and then de-stage cold blocks to SAS nightly. My thinking is that new data is hot for about a day and becomes less "used" the older it gets. What worries me is what happens if the SSD fills up before the nightly AO job can pull chunklets down to SAS, or if we have enough hot data that AO chooses NOT to move blocks down to SAS, leaving little to no room for the VVs to grow inside the SSD CPG. I should have about 15 TB of usable SSD, which is a lot more than our daily change rate for sure. I would love to hear some community input and ideas on the topic.
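
For a quick back-of-the-envelope on the headroom question (the 1 TB/day change rate below is just a placeholder assumption, and sparing/metadata overhead is ignored), something like this rough Python sketch:

Code:
# Rough headroom math for the SSD tier: RAID 5 (3+1), ignoring sparing and metadata overhead.
ssd_count = 24
ssd_size_tb = 0.92            # 920 GB drives
raid5_efficiency = 3 / 4      # 3 data + 1 parity per set

raw_tb = ssd_count * ssd_size_tb
usable_tb = raw_tb * raid5_efficiency
print(f"Raw SSD:    {raw_tb:.1f} TB")
print(f"Usable SSD: {usable_tb:.1f} TB")    # roughly in line with the ~15 TB estimate

# Placeholder daily change rate; swap in your real number.
daily_change_tb = 1.0
days_of_headroom = usable_tb / daily_change_tb
print(f"Days of new writes the SSD CPG can absorb between AO runs: {days_of_headroom:.0f}")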
Richard Siemers
The views and opinions expressed are my own and do not necessarily reflect those of my employer.
Darking
Posts: 77
Joined: Wed Mar 05, 2014 10:55 am

Re: New 7400 just for ESX cluster, CPG ideas

Post by Darking »

Personally, I've been a bit hesitant about using the SSDs as the base tier for my volumes.

The Hitachi SSD drives are not exactly high endurance compared to SLC, in that they are only rated for 2 drive writes per day over a five-year period. Then again, that is roughly 2 TB of data written every day on each disk, so whether it becomes an issue will vary.
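
For what it's worth, a quick sanity check on that endurance figure (the array-wide daily write rate here is just an assumed example, and RAID/mirroring write amplification is ignored):

Code:
# Endurance check: 920 GB drives rated at 2 drive-writes-per-day (DWPD) over 5 years.
drive_size_tb = 0.92
dwpd = 2
rated_years = 5

tbw_per_drive = drive_size_tb * dwpd * 365 * rated_years
print(f"Rated endurance per drive: ~{tbw_per_drive:,.0f} TB written over {rated_years} years")

# Assumed example: 5 TB/day of host writes to the SSD tier, spread evenly over 24 drives.
ssd_count = 24
assumed_tier_writes_tb_per_day = 5.0
per_drive_per_day = assumed_tier_writes_tb_per_day / ssd_count
print(f"Assumed per-drive writes: {per_drive_per_day:.2f} TB/day "
      f"vs. rated {drive_size_tb * dwpd:.2f} TB/day")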

The AO recommendations are clear though: do not put thinly provisioned volumes in an SSD CPG. So if you want to base your volumes on SSD, you would have to use fully provisioned volumes.

Regarding where writes are placed, I am fairly certain HP will need to address this sooner or later, since competitors (Compellent...) have their systems put all newly written data on the fastest tier, and it makes perfect sense in a tiering policy to do so.
BryanW
Posts: 71
Joined: Sat May 03, 2014 2:01 pm
Location: Dallas, TX

Re: New 7400 just for ESX cluster, CPG ideas

Post by BryanW »

Richard,
This is Bryan over at Match.com. What is the I/O profile of the ESX VMs? Are they running databases or other highly I/O-intensive processes? We have used R5 (7+1) 10K SFF or 15K LFF as tier 1 (with 8 or 16 enclosures; we used 5+1 when we were running 12) on both our F-class and 7400s, with SSD R5 as tier 0. It has always worked well for us, but I suppose it could depend on how much your hot spots change throughout the day.

You can also make your AO schedules more aggressive when you are first provisioning applications, or even kick them off manually, if you are concerned about not having enough SSD out of the gate. I do wish there were an option to set an adjustable percentage of new chunklets to automatically deploy into tier 0. Hit me up if you want to talk more.
Last edited by BryanW on Thu Sep 11, 2014 9:35 am, edited 1 time in total.
Bryan W
Senior Architect/Manager of System Infrastructure, Dallas TX
https://www.linkedin.com/in/bryanlwhite
hdtvguy
Posts: 576
Joined: Sun Jul 29, 2012 9:30 am

Re: New 7400 just for ESX cluster, CPG ideas

Post by hdtvguy »

Unless you are doing VDI or running some heavy I/O VMs, I would not write to SSD by default. First, AO is weak at best; you may not get data out fast enough, or the right data back in. We currently have no SSD in our V400, which handles 800 VMs plus numerous SQL and Oracle systems, but we moved our VMs from R1 to R5 across the board since we find 3PAR R5 (3+1) is almost as fast as R1. We try never to land data on the top tier (other than for databases) because AO often cannot keep up.

What we are doing is standing up a new 7400 with SSD and 10K drives only, and that will be for all databases (SQL VMs and Oracle LPARs).
Cleanur
Posts: 254
Joined: Wed Aug 07, 2013 3:22 pm

Re: New 7400 just for ESX cluster, CPG ideas

Post by Cleanur »

BTW, Compellent's architecture was designed to ingest data to spinning disk as quickly as possible, hence the top tier was always SAS RAID 10. With the introduction of SSD, however, this had some unexpected consequences, since all net new writes were forced to the top tier, which was now low-capacity SSD. Given the limited SSD write optimization and the 100% write workload, that top tier needs to be SLC, which in turn has a relatively high cost and low capacity. That being the case, there is a very real potential that incoming data could overrun the SSD tier, and it's actually the reason they now propose a mix of SLC and MLC: the relatively small SLC tier ingests data and demotes it quickly to the larger MLC tier to get around this architectural limitation.

Some things are not as cut and dried as they appear; Compellent's SLC/MLC write/read-optimized tiering isn't so much a feature as a fix :-)
Richard Siemers
Site Admin
Posts: 1333
Joined: Tue Aug 18, 2009 10:35 pm
Location: Dallas, Texas

Re: New 7400 just for ESX cluster, CPG ideas

Post by Richard Siemers »

Speaking of Compellent... all writes default to going into RAID 10, even if you only have one tier of disk. Once a day it runs its rebalance/tiering mojo and converts that day's RAID 10 writes into RAID 5. This is separate from the licensed auto-tiering features that move data between disk types. I was able to force a true RAID 5 storage profile by enabling some advanced options, but got my hand slapped with a best-practice violation.

In 3PAR terms, it would be like a single CPG set to RAID 5, but writing all new data to RAID 10 LDs, then every night moving that data to RAID 5 LDs and compacting the now-empty RAID 10 LDs. I believe the custom ASICs inside the 3PAR hardware mitigate the need for this.
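
To put rough numbers on why landing everything in RAID 10 first matters (purely illustrative, assuming 1 TB/day of new writes):

Code:
# Illustrative only: extra raw capacity consumed by landing a day's writes in RAID 10
# before the nightly conversion, versus writing straight into RAID 5 (3+1).
daily_new_writes_tb = 1.0     # assumed daily change rate
raid10_overhead = 2.0         # mirrored: 2x raw per usable TB
raid5_overhead = 4 / 3        # 3+1: 4 chunklets stored per 3 of data

raw_r10 = daily_new_writes_tb * raid10_overhead
raw_r5 = daily_new_writes_tb * raid5_overhead
print(f"Raw space while sitting in RAID 10:   {raw_r10:.2f} TB")
print(f"Raw space after conversion to RAID 5: {raw_r5:.2f} TB")
print(f"Temporary extra raw space needed:     {raw_r10 - raw_r5:.2f} TB per day")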

It's something to be aware of if you're doing write benchmarks on a Compellent. Also notable is that CML MPIO is active/passive failover between controllers, similar to CLARiiON.
Richard Siemers
The views and opinions expressed are my own and do not necessarily reflect those of my employer.
Cleanur
Posts: 254
Joined: Wed Aug 07, 2013 3:22 pm

Re: New 7400 just for ESX cluster, CPG ideas

Post by Cleanur »

To give them their due, Compellent's ingest method is pretty fast even on spinning disk, but yes, the controllers are active/passive ALUA like VNX/NetApp et al. Strangely, though, like the others :-) they always insist to potential customers that it's true active/active, implying symmetric access, which it definitely is not.
afidel
Posts: 216
Joined: Tue May 07, 2013 1:45 pm

Re: New 7400 just for ESX cluster, CPG ideas

Post by afidel »

Cleanur wrote: To give them their due, Compellent's ingest method is pretty fast even on spinning disk, but yes, the controllers are active/passive ALUA like VNX/NetApp et al. Strangely, though, like the others :-) they always insist to potential customers that it's true active/active, implying symmetric access, which it definitely is not.

As long as it's ALUA I don't care, because the system handles the black magic. Then again, I'm sitting at low single-digit CPU usage on my controllers, so maybe it matters more to folks who are actually pushing their arrays hard.
Cleanur
Posts: 254
Joined: Wed Aug 07, 2013 3:22 pm

Re: New 7400 just for ESX cluster, CPG ideas

Post by Cleanur »

That's the point: with ALUA, the MPIO stack actually handles the magic, or lack thereof. The controllers provide rudimentary failover protection, but individual controllers still own specific LUNs, so I/O for a given LUN can only be processed by its owning controller. Each LUN has an active-optimized and an active-non-optimized path, which means manual load balancing between controllers. It's not really comparable to a symmetric active-active access model, but ALUA is a bit of a slippery term that allows the claim of A/A when it's actually active/passive per LUN, a bit like an MS failover cluster.
Richard Siemers
Site Admin
Posts: 1333
Joined: Tue Aug 18, 2009 10:35 pm
Location: Dallas, Texas

Re: New 7400 just for ESX cluster, CPG ideas

Post by Richard Siemers »

The topic becomes more important when you have high-throughput/high-IOPS hosts with HBAs faster than your storage front-end ports. You don't want a situation where one busy host can saturate one or more of your shared front-end ports on the storage. Fanning out is the solution for that: one host adapter is zoned to two storage ports, and round robin spreads the traffic across the two.

If these were active-passive, it becomes an extra layer of management to balance the number of active connections evenly across your FE ports, much like on CLARiiON, where it was a manual process to balance the preferred owner of each LUN.
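
A toy example of the difference, with made-up numbers (a host HBA that can push more than one front-end port can absorb):

Code:
# Toy example (made-up numbers): one busy host offering 12 Gb/s behind 8 Gb front-end ports.
# Round robin across a fan-out of 2 splits the load; a single active path concentrates it.
host_load_gbps = 12.0
fe_port_limit_gbps = 8.0
fanout_ports = 2

per_port_rr = host_load_gbps / fanout_ports
print(f"Round robin across {fanout_ports} ports: {per_port_rr:.1f} Gb/s per port "
      f"(limit {fe_port_limit_gbps} Gb/s each)")

per_port_single = host_load_gbps
print(f"Single active path: {per_port_single:.1f} Gb/s offered to one port, "
      f"saturated at {fe_port_limit_gbps} Gb/s")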
Richard Siemers
The views and opinions expressed are my own and do not necessarily reflect those of my employer.