After writing up this post on FAST VP INYO Goodness last night, I went to delete the notes I had taken on the iPad and realised I'd forgotten one very cool new feature which may have slipped under most people's radar.

When creating a LUN from a FAST VP enabled pool in the current version of FLARE, you have the following options to select from.

  • Auto-Tier
  • Highest Available Tier
  • Lowest Available Tier

These, of course, are the tiering policies; selecting one determines which algorithm is used to distribute data through promotions and demotions of 1GB slices between storage tiers.

At LUN creation time I refer to these as "Initial Data Placement" policies, a term I've taken from one of the VNX best practice documents found on PowerLink. Each policy directly affects which storage tier the data is first allocated to.

The Highest and Lowest options are self-explanatory; Auto-Tier, unless I'm mistaken, uses an algorithm to distribute the data over all available storage tiers, which in my opinion increases the risk of performance issues before the pool has had sufficient time to warm up.

When you create a LUN, you'll find that Auto-Tier is actually the default selection; however, I always change this to Highest Available Tier to ensure that data starts off on the highest-performing disk, and once the migration of data is complete I switch the policy to Auto-Tier to let FAST work its magic.

But now… INYO introduces a new policy:

  • Start High, Then Auto-Tier

The introduction of this policy effectively means I no longer have to remember to do this manually. While some might think this is a non-event in terms of new features, to me it's a good example of how FAST VP is evolving based on feedback from partners and customers… and that I like.

Performance is all about data locality; if you're migrating data from an older storage array to a VNX, the last thing you want is for people to complain about performance. Although Auto-Tier is the default option when creating a LUN, Highest Available Tier is the policy recommended in the best practice documentation.

One of my favorite sessions at EMC World this year was titled “VNX FAST VP – Optimizing Performance and Utilization of Virtual Pools” presented by Product Manager Susan Sharpe.

As a technical person I always worry about having to sit through sessions which end up being too sales-focused, or that the presenter won't have enough technical knowledge to answer the "hard ones" come question time. This was not the case; it was evident that Susan has probably been around from the very beginning of FAST VP, and she answered every question thrown at her with ease.

I had hoped to take my notes from this session and write up a post about the changes to FAST VP, but as you would expect, Chad over at VirtualGeek was quick off the mark with this post covering the goodness to come with INYO.

Rather than post the same information, I decided to write about the three points which relate directly to FAST VP and share why, as someone who designs and implements VNX storage solutions, these changes were much needed and welcomed with open arms.

Mixed RAID VP Pools

Chad started off this section by saying this was the number one requested change, and I totally agree.

When you look at the best practice guide for VNX Block and File, the first thing that stands out in terms of disk configuration is that EMC recommends disks be added to a pool in a 4+1 RAID 5 configuration. However, when you add NL-SAS drives to the pool, you get a warning pop-up along the lines of "EMC strongly recommends RAID 6 be used for NL-SAS drives 1TB or larger when used in a pool".

So the problem is that, until INYO is released, you can't mix RAID types within a pool. That means following best practice when adding NL-SAS drives larger than 1TB requires making everything RAID 6, including those very costly SSDs, and your storage efficiency goes out the window.

Why the warning? In my opinion it comes down to the rebuild times associated with large NL-SAS drives. While the chance of a double drive failure within a RAID group during a rebuild is very low, it is potentially a lot higher than with the smaller, faster SAS drives. Never say never, right?
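To put some very rough numbers around that, here's a back-of-envelope sketch. The capacities and rebuild rates are hypothetical figures I've picked purely for illustration, not EMC numbers, but they show why the rebuild window on a big NL-SAS drive is the worry.

```python
# Back-of-envelope rebuild window comparison.
# Capacities and sustained rebuild rates below are hypothetical, for illustration only.
def rebuild_hours(capacity_gb, rebuild_mb_per_sec):
    """Rough time to rebuild a failed drive at a given sustained rate."""
    return (capacity_gb * 1024) / rebuild_mb_per_sec / 3600

drives = {
    "600GB 15K SAS":   (600, 100),   # smaller, faster drive
    "2TB 7.2K NL-SAS": (2000, 40),   # larger, slower drive
}

for name, (capacity_gb, rate) in drives.items():
    print(f"{name}: ~{rebuild_hours(capacity_gb, rate):.1f} hours exposed during rebuild")
```

The longer that rebuild runs, the longer a RAID 5 group sits with no parity protection, which is presumably why the pop-up pushes RAID 6 for the large NL-SAS drives.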

FAST VP Pool Automatic Rebalancing

As an EMC partner we have access to a really cool application which produces heat maps from CX/VNX NAR files and shows the utilization ratios of the private RAID groups within a pool (and much more). It was common to see one or more private RAID groups doing considerably more I/O than the others, and without any "rebalance" functionality it was difficult to remedy. (To be fair, this was typically seen on pools without FAST VP enabled.)

Now, with INYO, adding drives causes the pool to rebalance, and there may also be an automated/scheduled rebalance of pool data across all drives. This means that when a customer's heat map shows over-utilized private RAID groups, you can throw more disk at the pool and let the rebalance do its thing.
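If it helps to picture what a rebalance is conceptually doing, here's a toy sketch that simply spreads slices evenly across private RAID groups when new ones are added. This is purely illustrative and has nothing to do with the actual FAST VP algorithm, which weighs real I/O statistics rather than just slice counts.

```python
# Toy illustration of rebalancing slices across private RAID groups.
# NOT the FAST VP algorithm -- just a conceptual sketch based on slice counts.
def rebalance(slice_counts, new_groups=0):
    """Evenly redistribute slices after adding empty private RAID groups."""
    groups = slice_counts + [0] * new_groups
    total = sum(groups)
    base, remainder = divmod(total, len(groups))
    # A few groups get one extra slice when the total doesn't divide evenly.
    return [base + (1 if i < remainder else 0) for i in range(len(groups))]

# Three busy private RAID groups, then two new ones are added to the pool.
print(rebalance([900, 850, 875], new_groups=2))  # -> [525, 525, 525, 525, 525]
```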

Higher Core Efficiency

A number of times over the last few years I've encountered customers moving away from "competitor vendor X" to EMC storage who were used to much larger RAID groups, and it was sometimes a tough pill to swallow when they expected to get "X TB" and got considerably less after configuring the pool with 4+1 RAID 5 (which, to date, is still best practice).

Susan and Chad both mention that EMC engineering looked at the stats from customer VNX workloads and decided that 4+1 was rather conservative, so in order to drive better storage efficiency they are opening up support for the following parity configurations (I've put a quick usable-capacity comparison below the list).

  • 8+1 for RAID 5 (used with 10K/15K SAS or SSDs) 
  • 14+2 for RAID 6 (target is NL SAS)
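To put some numbers on the efficiency gain, here's a quick usable-capacity comparison. It's a minimal sketch: the 600GB drive size is just an example I've picked, and I'm ignoring hot spares, vault drives and drive right-sizing for simplicity.

```python
# Usable capacity vs parity overhead for the old and new private RAID group layouts.
# The 600GB drive size is an example only; hot spares, vault drives and
# right-sizing are ignored to keep the arithmetic simple.
DRIVE_GB = 600

layouts = {
    "RAID 5 4+1":  (4, 1),   # current best practice
    "RAID 5 8+1":  (8, 1),   # new in INYO, targeted at 10K/15K SAS and SSD
    "RAID 6 14+2": (14, 2),  # new in INYO, targeted at NL-SAS
}

for name, (data, parity) in layouts.items():
    total = data + parity
    usable_pct = 100.0 * data / total
    print(f"{name}: {data * DRIVE_GB} GB usable from {total} drives ({usable_pct:.1f}% usable)")
```

Going from 4+1 to 8+1 takes you from 80% usable to roughly 89%, which goes a long way towards closing the gap for customers who are used to much larger RAID groups.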

If you've come from a unified (Celerra) background, then 8+1 is nothing new and I don't expect it to cause too much concern. These additional parity configurations just make configuring a VNX that much more flexible and let us keep a larger range of people happy.

What you decide to use may depend on the number of DAEs you have, the expected workload, and whether you're lucky enough to also have FAST Cache enabled. The best piece of free advice I can give is "know your workload".

I'm really excited about these improvements and I think they are going to make a lot of people happy!

I've been working with VMware, storage and data protection for some time now, and over the years I've seen data protection products come and go.

A lot of these products/applications/appliances showed innovative, cutting-edge technology but often focused on one particular area or system type, which left the product lacking and missing the mark when it came to providing a complete solution.

What do I mean by "complete"? For example, a lot of products over the last few years focused entirely on virtual infrastructure, which, while very important, large customers found to be just another product to manage on top of their existing enterprise backup infrastructure.

This post is not a technical review of Actifio PAS (Protection, Availability and Storage); that will likely come later. At the moment I've just come out of a technical deep-dive session with the guys who work out of the Australian office and thought I'd share.

It's one thing to see a presentation or sit in on a technical deep-dive session, but until you've used, touched and prodded something, you really can't be sure. I'm really interested to hear from anyone who's using it purely to protect virtual infrastructure, as well as from people using it to protect both physical and virtual infrastructure.

Here’s a description from the Website: “Actifio’s Protection and Availability Storage (PAS) platform is the industry’s first solution optimized for managing copies of production data, resulting in the elimination of redundant silos of IT infrastructure and data management applications. By virtualizing the management and retention of data, Actifio transforms the chaos of multiple silos of infrastructure and point tools traditionally deployed for backup, disaster recovery, business continuity, compliance, analytics, and test and development into one, Service Level-driven, virtualized Protection and Availability storage device. Actifio PAS delivers a radically simple, application-centric, policy-driven solution that decouples the management of data from storage, network and server infrastructure, resulting in 10X reduction in costs.”

What's next? From what I saw yesterday it seems to do it all: backup, snapshots, deduplication, replication. Now it's time to dig deeper and find out if it really is the whole-package, "complete" solution I'm hoping it is. I have a follow-up call planned with the guys from Actifio next week and will update this post if anything interesting comes out of it.

One of the things which really got my attention yesterday was the support for VMware Site Recovery Manager 5.0. Me likey likey.

Check out the YouTube clips below.

It's been far too long since I posted! I thought it was time to get back into it, and what better way to do so than posting to say that I'll be attending EMC World in Las Vegas this year.

Anyone know of any blogger events or areas like you see at VMworld?

Just a quick post on a problem I had this week at a customer site running VMware Site Recovery Manager 5 with a mix of EMC Celerra and VNX.

I actually had everything up and running for a couple of weeks before I logged in again and noticed that, in the "Array Manager" section, both arrays showed a status of "Error", and when I browsed through to refresh the list of replicated datastores I received the error "SRA command 'discoverDevices' didn't return a response".

I logged a case with EMC and was supplied a new version of the enabler which is not yet available on PowerLink: "EMC_VNX_Replicator_Enabler_for_VNX_SRA_v5.0.11.zip".

Once this was installed, I performed a refresh under the devices tab and the errors vanished.

As noted above, everything was working perfectly for a couple of weeks before it broke, and it turns out what broke it was the VDM (Virtual Data Mover) replication I had set up after the SRM install. The old 5.0.5 version of the enabler does not filter VDM replication sessions, and I think it's fair to say that this breaks the SRA.

I would recommend that anyone running the 5.0.5 enabler or below request the 5.0.11 version from EMC and upgrade.

When it comes to zoning, everyone has their own way of doing things, but I thought I'd share a couple of the things I do to make my life easier down the road, and who knows, maybe yours too.

I'm going to concentrate mainly on the zoning of EMC arrays here, but more broadly I think it's worth noting that "single initiator, single target" is definitely the way to go.

In almost every site I go into, I will find alias names set up for a CLARiiON as shown below.

cx4_spa0, or often I'll see cx4_spa_port0

There is absolutely nothing wrong with this, and it gives you all the information you need, such as the array type, the storage processor and the port which is connected. This works well when you have a single array, but people often get stuck for additional alias names when they have a second array of the same type. There are of course a number of ways you can go about differentiating the arrays, but few keep things tidy and consistent.

The convention I use will depend on how the existing aliases are configured, e.g.

vnx_spa_port0 or vnx_spa0 or vnx_spa_port_0 (all of which are examples I have seen in the field). 

So what will I use? I will configure vnx_3ea13252_spa_0 or vnx_3ea13252_spa_port0, again depending on the existing configuration.

So where does the 3ea13252 come from? Well, every time you zone an array the WWN will be different, and that's the point. I've created a diagram below to show a typical CLARiiON/VNX WWN which you would expect to see when you go looking on the array, or on the switch after connecting it.

As soon as you see the first three octets of 50:06:01, you know it's an EMC CLARiiON array; the next octet identifies the storage processor and port, and the last four octets are the unique array ID assigned to the array.

While my alias might not be the most attractively constructed alias you've ever seen, it does give me a very clear path for adding additional VNX arrays to the same switched network, as the array ID will never be the same.

vnx_3ea13252_spa_port0

vnx_1ca19253_spa_port0
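To show how mechanical this makes things, here's a small sketch that pulls the array ID out of a front-end port WWPN and builds the alias. The mapping of the fourth octet to SP and port (0x60-0x67 for SP A ports 0-7, 0x68-0x6F for SP B) is from memory, so treat it as an assumption and verify it against your own array before relying on it.

```python
# Build a zoning alias from a CLARiiON/VNX front-end port WWPN.
# Assumption (from memory): fourth octet 0x60-0x67 = SP A ports 0-7,
# 0x68-0x6F = SP B ports 0-7 -- verify against your own array.
def clariion_alias(wwpn, prefix="vnx"):
    octets = wwpn.lower().split(":")
    if octets[:3] != ["50", "06", "01"]:
        raise ValueError("Doesn't look like a CLARiiON/VNX front-end WWPN")
    sp_port = int(octets[3], 16)
    sp = "spa" if sp_port < 0x68 else "spb"
    port = sp_port - (0x60 if sp == "spa" else 0x68)
    array_id = "".join(octets[4:])  # last four octets = unique array ID
    return f"{prefix}_{array_id}_{sp}_port{port}"

print(clariion_alias("50:06:01:60:3e:a1:32:52"))  # -> vnx_3ea13252_spa_port0
print(clariion_alias("50:06:01:60:1c:a1:92:53"))  # -> vnx_1ca19253_spa_port0
```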

If I come back and need to add yet another array, or make zoning changes, it's really clear and hard to get wrong. Most of all I like it because it's tidy and consistent. Maybe it's a touch on the obsessive-compulsive side, and one might argue that port descriptions can be used to get around this, but I have a golden rule of never trusting port descriptions, as they are often out of date (which means wrong).

This is not EMC best practice, this is just my own way of doing things which works well for me. 🙂

This week I've been in Sydney, Australia, doing EMC RecoverPoint training, and it seems two big things have happened while I've been here.

Firstly, vSphere 5.0 has been released; I opened my email Monday morning to find a ton of marketing emails from VMware about the launch.

Secondly, it seems Lady GaGa is in Sydney at the same time I am.

I was sitting in my hotel room the other night watching some TV when I heard screaming and shouting; it sounded like a large group of people (a mob). At the time I figured it must have been Lady GaGa leaving her hotel, causing mass fan hysteria.

Well, today, after a mate texted me asking about VMware's new licensing scheme, I rushed back to the hotel to download the licensing guide, and to be honest I'm now not sure the screaming was GaGa fans. It may well have been VMware customers walking down the street with pitchforks and machetes, protesting the new licensing scheme.

I have had a really quick read of the document and, I'll be honest, my first impressions were quite cynical, but to be fair I'm going to read over it properly and try to get my head around it.

The reason I am initially cynical about the change is that over the last couple of years I have seen a common trend where customers (myself included) replaced ESX host hardware with the same number of physical CPU sockets but doubled or tripled the amount of memory per host. This came about when the Nehalem CPUs were released; the result was more logical cores and more VMs that could be run per core.


In the meantime, if you're interested in seeing a couple of good examples of how the changes might not be as bad as everyone says, check out this post by Gabe of Gabes Virtual World.

NetWorker 7.6 SP2 with support for VMware VADP has finally become GA (generally available), and as with any new build, the first thing I do is download the documentation to see what's new or changed.

After downloading all the available docs from PowerLink, I opened the document titled "Licensing Guide", which you would think should cover VADP licensing, right? Wrong!

Next I looked at the "EMC NetWorker VMware Integration Guide". I was hopeful this would contain the information I was after, but after reading the section "Licensing NetWorker support for VMware" I felt very much like I did when I saw the movie Mulholland Drive for the first time.

However, today I was glad to receive an email from a colleague who found the following post on the NetWorker community site by Eric Carter at EMC, which helps clear up some of the misleading/confusing statements regarding VADP licensing in the 7.6 SP2 documentation.

Hopefully it won't be too long until we see updated documentation on PowerLink. (I'm also hoping the VADP licensing information gets added to the Licensing Guide.)


Well, it's finally arrived: NetWorker 7.6 SP2 with support for VADP is now GA.

I've got a number of posts about this coming up, but for the moment, just a quick one to say it's here and to list some of the published features.

It's fair to say that existing NetWorker customers have been waiting for VADP support for a considerable time now, so it's great to finally see it arrive. However, without putting a damper on things, I would suggest customers do not rush off and upgrade without reading the release notes and considering that there are still a number of open escalations in this build logged for issues with VADP.

VMware: vStorage API for Data Protection (VADP)

  • Integrated with vStorage API for Data Protection (VADP)
  • Change block tracking (CBT) support (file-based)
  • Single-step image-level backups & restores
  • Support for file level recovery from Windows VMs

 Flexible Single Step VM Recovery

  • Recover to the original virtual machine, to a new virtual machine, or configure a new virtual machine
  • Recover the VM to the same/different vCenter, ESX Server, or ESX datastore

 NetWorker VADP proxy options

  • Physical or virtual servers
  • Windows Server 2008 R2, Windows Server 2008 or Windows Server 2003

DD Boost or Avamar Grid deduplication backups

  • DD Boost on VADP Proxy with Dedicated NetWorker Storage Node
  • Support for Global Encryption and Compression Directives for  NTFS image and file level backups
  • VMware VADP and In-Guest backups of the same UNIX/Linux based virtual machine

I've been working on a storage project over the last few months and ran into a problem with MirrorView/S which turned out to be a bug in the CLARiiON FLARE code. I thought I'd write a quick post about it in case anyone comes across the same problem.

The Problem:

The symptoms I saw were as follows:

  • Enabling the first mirror of a LUN per SP showed no issues; the LUN replicated and showed as being in a consistent state.
  • Enabling additional mirrors of LUNs on the same SP caused the initial mirror to resynchronize (but it would never complete).
  • Enabling additional mirrors caused the hosts' average read/write queue times to shoot through the roof, causing huge performance problems.

Fix:

The arrays I was working on were running FLARE version 04.30.000.5.004, which needed to be upgraded to 04.30.000.5.517. After the upgrade I initiated a sync on all mirrors and everything started working as expected.