VMworld 2014/vSphere Storage Best Practices: Next-Gen Storage Technologies

STO2496 vSphere Storage Best Practices: Next-Gen Storage Technologies

"This VMware Technical Communities Session will present a technical best practices with emerging storage technologies for vSphere. The storage industry is experiencing a high level of innovation that will influence your next datacenter refresh. Storage industry experts present this session in a vendor neutral perspective on best practices with storage infrastructure technologies spanning host-based acceleration, all-flash, and hyper-converged. Vaughn and Chad focused on delivering a deep technical session and have invited Rawlinson Rivera of VMware to join and expand the knowledge transfer. This session will present best practices that span connectivity, performance, availability and failure domains, data protection and automation."


 * Rawlinson Rivera - Sr. Technical Marketing Architect, VMware, Inc
 * Chad Sakac - SVP, Global Systems Engineering, EMC
 * Vaughn Stewart - Chief Technical Evangelist, Pure Storage

Pure Storage
Flash Array by Pure Storage | Enterprise Flash Array - http://www.purestorage.com/

"Pure Storage released a flash memory product called FlashArray" (Pure Storage - Wikipedia - http://en.wikipedia.org/wiki/Pure_Storage)

Simplicity
Keep it simple; it will save you headaches

Best Storage Practice Documents
Read them. Each vendor may have different methods and policies, so don't apply the same everywhere.

Understand Cloud I/O
aka Shared Virtual Infrastructures

The "I/O Blender"
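
The blender effect can be illustrated with a small simulation. This is only a sketch under assumed conditions (eight hypothetical VMs, each reading its own region strictly sequentially, with the shared datastore seeing their requests in random interleaved order); real hypervisor scheduling is more involved.

```python
import random


def sequential_fraction(requests):
    """Fraction of requests whose block address directly follows the previous one."""
    if len(requests) < 2:
        return 0.0
    hits = sum(1 for prev, cur in zip(requests, requests[1:]) if cur == prev + 1)
    return hits / (len(requests) - 1)


def blend(streams, seed=0):
    """Interleave per-VM streams in random order, keeping order within each stream."""
    rng = random.Random(seed)
    cursors = [0] * len(streams)
    remaining = [i for i, stream in enumerate(streams) if stream]
    blended = []
    while remaining:
        i = rng.choice(remaining)
        blended.append(streams[i][cursors[i]])
        cursors[i] += 1
        if cursors[i] == len(streams[i]):
            remaining.remove(i)
    return blended


# Assumption: 8 VMs, each issuing 1,000 consecutive block reads in its own region.
vm_streams = [list(range(vm * 1_000_000, vm * 1_000_000 + 1000)) for vm in range(8)]
single_vm = sequential_fraction(vm_streams[0])
shared = sequential_fraction(blend(vm_streams))
print(f"one VM alone: {single_vm:.2f} sequential, blended: {shared:.2f} sequential")
```

Each guest is perfectly sequential on its own, but the array sees a stream that is mostly random.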

Know your workloads and average I/O sizes
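
Average I/O size falls out of two numbers most monitoring tools already report, throughput and IOPS. A minimal sketch, with hypothetical function names and made-up sample figures:

```python
def average_io_size_kib(throughput_mib_s, iops):
    """Average I/O size in KiB from measured throughput (MiB/s) and IOPS."""
    if iops <= 0:
        raise ValueError("iops must be positive")
    return throughput_mib_s * 1024 / iops


def blended_io_size_kib(workloads):
    """IOPS-weighted average I/O size for a list of (iops, io_size_kib) pairs."""
    total_iops = sum(iops for iops, _ in workloads)
    if total_iops <= 0:
        raise ValueError("total iops must be positive")
    return sum(iops * size for iops, size in workloads) / total_iops


# Sample figures are illustrative only, not measurements from the session.
print(average_io_size_kib(400, 12800))            # one datastore: MiB/s and IOPS
print(blended_io_size_kib([(8000, 8), (2000, 64)]))  # small-block plus large-block load
```

The weighted form matters on shared datastores, where a few large-block workloads can pull the average well away from what any single guest issues.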

Hybrid Storage
All flash storage arrays differ

Benchmarking
What is your objective? Know what you are testing for.

Absurd Testing - just silly

Data sets have an active I/O band and a cold band
 * spikes and repeat data are common

Principles:
 * don't let vendors steer you too much
 * talk to other customers and system integrators
 * benchmark over time
 * lots of different workloads - NOT A SINGLE GUEST, SINGLE DATASTORE!
 * benchmark mixed loads
 * use SLOB or Iometer - but these are still artificial workloads
 * hard to generate sufficient IO from a host for modern flash storage
 * absolute performance is not the only design consideration
 * if it doesn't meet performance needs - it's a boat anchor

Architecture Matters:
 * Always benchmark performance
 * always benchmark resiliency, availability and data management features
 * recommend testing with actual data

Best Practices
Vendor best practices seem to break each other

Always separate guest VM traffic from storage and VMkernel networks

Jumbo Frames
Avoid Jumbo Frames ("just my personal recommendation") - let the flames begin

Thin Provisioning
Thin Provisioning is not a data reduction technology!

It is just an on-demand allocation feature

Deduplication and compression are data reduction technologies

Inline vs post-processing - inline reduction happens as the data is written; post-processing works like a garbage collector and runs afterwards
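
The difference between thin provisioning and data reduction is easy to see in numbers. A minimal sketch, where the `Volume` fields and the sample values are assumptions for illustration:

```python
from dataclasses import dataclass


@dataclass
class Volume:
    provisioned_gib: float  # size presented to the host
    written_gib: float      # logical data the host has actually written
    stored_gib: float       # physical space used after dedupe/compression


def thin_unallocated_gib(vol):
    """Space thin provisioning has not had to allocate yet (nothing is reduced)."""
    return vol.provisioned_gib - vol.written_gib


def data_reduction_ratio(vol):
    """Logical written data divided by physical stored data."""
    if vol.stored_gib <= 0:
        raise ValueError("stored_gib must be positive")
    return vol.written_gib / vol.stored_gib


vol = Volume(provisioned_gib=1000, written_gib=300, stored_gib=100)
print(thin_unallocated_gib(vol))   # deferred allocation, not savings on data
print(data_reduction_ratio(vol))   # what dedupe and compression contribute
```

Counting the unallocated space as "savings" inflates any efficiency figure, so it helps to ask which of the two numbers a quoted ratio includes.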

UNMAP
T10 UNMAP in vSphere 5.5 still does not work the way people expect

It is still a manual process

5.1: vmkfstools -y

5.5: esxcli storage vmfs unmap

It only reclaims space from deleted VMs, not space freed inside guests
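
Because the reclaim is manual, it is a natural candidate for scripting. The sketch below only builds the vSphere 5.5 command line and does not run it; the datastore label `datastore1` and the helper name are hypothetical, and the reclaim unit (blocks per iteration) should be taken from the array vendor's guidance.

```python
import shlex


def build_unmap_command(volume_label, reclaim_unit=None):
    """Build the argv for `esxcli storage vmfs unmap` on one VMFS datastore."""
    if not volume_label:
        raise ValueError("volume_label is required")
    cmd = ["esxcli", "storage", "vmfs", "unmap", f"--volume-label={volume_label}"]
    if reclaim_unit is not None:
        if reclaim_unit <= 0:
            raise ValueError("reclaim_unit must be positive")
        cmd.append(f"--reclaim-unit={reclaim_unit}")
    return cmd


# Hypothetical datastore name; print the command for review before running it on a host.
print(shlex.join(build_unmap_command("datastore1", reclaim_unit=200)))
```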

Storage QoS
Ensures priority of service for some applications when storage IOPS are exhausted

Operational complexity is why most people don't do it today

Inconsistent capabilities limit broad adoption

There is no QoS all the way down to the guest
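
One common way to express priority under contention is proportional shares. The sketch below is a generic share-based split, not any particular product's algorithm; the workload names and share values are made up.

```python
def allocate_iops(available_iops, shares):
    """Split the available IOPS across workloads in proportion to their shares."""
    total_shares = sum(shares.values())
    if total_shares <= 0:
        raise ValueError("at least one workload needs a positive share")
    return {name: available_iops * share / total_shares
            for name, share in shares.items()}


# Assumption: the backend is saturated at 10,000 IOPS and three workloads compete.
print(allocate_iops(10_000, {"db": 2000, "web": 1000, "test": 1000}))
```

Shares only matter once the backend is saturated; with headroom, every workload simply gets what it asks for.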

VMware Virtual SAN Integration
(see image)

Features:

enterprise features:
 * Network I/O Control
 * vMotion
 * Storage vMotion
 * DRS
 * HA

data protection:
 * linked clones
 * snapshots
 * VDP Advanced

disaster recovery:
 * vSphere Replication
 * vCenter Site Recovery Manager

cloud ops and automation:
 * vCenter Operations Manager
 * vCloud Automation Center

VMware Virtual SAN Best Practices
(see picture)

seek simplicity

network connectivity:
 * 10 GbE preferred
 * leverage distributed switch (NIOC)
 * L2 multicast

storage controller queue depth:
 * use a controller that supports a queue depth of 256 or higher
 * higher storage controller queue depth will help performance, resync, rebuild
 * pass-thru mode preferred

Disk and disk groups:
 * don't mix disk types in a cluster, for predictable performance
 * more disk groups are better than one

storage and flash sizing:
 * cache size translates directly into performance
 * start with the recommended flash (SSD) capacity of 10% of anticipated consumed storage
 * maintain utilization below 80%
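
The sizing guidance above can be turned into a quick check. This is a sketch that assumes the 10% figure is applied to anticipated consumed capacity and that utilization is simply used over raw capacity; the function names and sample numbers are hypothetical.

```python
def recommended_flash_gib(anticipated_consumed_gib, percent=10):
    """Starting-point flash capacity as a percentage of anticipated consumed capacity."""
    return anticipated_consumed_gib * percent / 100


def within_utilization_target(used_gib, raw_gib, limit_percent=80):
    """True if used capacity stays below the utilization limit."""
    if raw_gib <= 0:
        raise ValueError("raw_gib must be positive")
    return used_gib * 100 / raw_gib < limit_percent


# Illustrative figures only.
print(recommended_flash_gib(20_000))              # GiB of flash to start with
print(within_utilization_target(24_000, 40_000))  # 60% used
print(within_utilization_target(34_000, 40_000))  # 85% used
```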

cluster design
 * wider is better
 * include host failures scenarios as part of design
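
A back-of-the-envelope calculation shows why wider clusters tolerate host failures better. The sketch assumes identical hosts and that all data is rebuilt onto the survivors; the capacities are made up.

```python
def utilization_after_failures(used_gib, raw_gib_per_host, hosts, failures=1):
    """Capacity utilization once the failed hosts' data is rebuilt on the survivors."""
    surviving = hosts - failures
    if surviving <= 0:
        raise ValueError("no hosts left to hold the data")
    return used_gib / (surviving * raw_gib_per_host)


# Same 40,000 GiB raw and 24,000 GiB used, laid out narrow versus wide.
narrow = utilization_after_failures(24_000, 10_000, hosts=4)
wide = utilization_after_failures(24_000, 5_000, hosts=8)
print(f"4 hosts: {narrow:.0%} after one failure, 8 hosts: {wide:.0%}")
```

With the same raw capacity, the narrow cluster loses a quarter of its space to a single host failure, while the wide one loses an eighth.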

policies:
 * consider use of Object space reservation to sustain levels of performance

Go Wide!

Automation
Automate everything you can, infrastructure must be programmable

Automation - you should never have to do anything by hand more than once

Every array worth its salt has a RESTful API - use it

Don't hard-code to any vendor API - abstract via something open
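
One way to avoid hard-coding a vendor API is to write automation against a small interface and plug vendor adapters in behind it. The sketch below is hypothetical throughout: `StorageProvisioner`, `InMemoryProvisioner` and `provision_datastores` are illustrative names, and a real adapter would wrap the array's REST API or an open abstraction layer.

```python
from abc import ABC, abstractmethod


class StorageProvisioner(ABC):
    """Vendor-neutral interface that the automation code depends on."""

    @abstractmethod
    def create_volume(self, name: str, size_gib: int) -> str:
        """Create a volume and return its identifier."""


class InMemoryProvisioner(StorageProvisioner):
    """Stand-in backend for testing; a real adapter would call a vendor API here."""

    def __init__(self):
        self.volumes = {}

    def create_volume(self, name, size_gib):
        if name in self.volumes:
            raise ValueError(f"volume {name!r} already exists")
        self.volumes[name] = size_gib
        return name


def provision_datastores(provisioner, prefix, count, size_gib):
    """Workflow code: it only knows the interface, never the vendor."""
    return [provisioner.create_volume(f"{prefix}-{i:02d}", size_gib)
            for i in range(1, count + 1)]


backend = InMemoryProvisioner()
print(provision_datastores(backend, "ds", 3, 2048))
```

Swapping arrays then means writing one new adapter class, while the workflow code stays untouched.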

Don't get hung up on Puppet vs Chef vs Ansible vs ... - Just pick one and start

To make it easy, start with EMC ViPR controller (http://www.emc.com/getvipr) - free with community support