Rethinking Your vCloud Director Provider vDC Strategy with Vblock

There are two spiking trends in the market that I see today. The first is that Enterprises and Service Providers are looking to build some sort of cloud infrastructure. I know everyone has a different definition of "cloud infrastructure", but I'm referring to companies that are focused higher in the stack. Speeds and feeds are great, new hardware is great, but at the end of the day those are capital expenditures and make IT departments a cost center. The goal of a cloud infrastructure is to start your ROIC, Return on Invested Capital, as soon as possible so the business can gain value. Even as you keep acquiring new equipment, it still takes weeks or months for it to be implemented as part of your production cloud. I say it takes this long because a true cloud solution is not based on virtualization alone; you also need an orchestration strategy that automates the provisioning of new resources. Picture this: Cluster A only has 10% of free RAM available, and a warning is sent to the cloud admin to investigate the situation. The cloud admin finds out that workloads cannot be dispersed to another cluster, so an order for 2 blade servers is placed. Once those new blade servers arrive, the cloud admin inserts them into free chassis slots. From there, automation takes over. The chassis recognizes the new blades, applies a pre-defined service profile, auto-deploys ESXi, and the ESXi configuration tasks take over to add the new blades to the cluster for HA and DRS resources, along with mounting datastores and configuring networks. The cluster is now at 30% free RAM. That is called cloud automation, and many companies see the light today using products such as Cisco IA, vCloud Director, and vCenter Orchestrator, or even custom code. This is the new wave of innovation that developers and admins need to be prepared for.
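
The arithmetic behind the Cluster A scenario can be sketched in a few lines of Python. This is only an illustration, not how Cisco IA, vCloud Director, or vCenter Orchestrator do it. The `Cluster` class, the `blades_needed` helper, the 7-blade cluster size, and the 96GB of RAM per blade are all assumptions, picked so the numbers land on the 10% and 30% from the story.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical view of a cluster made of identical blades."""
    blades: int
    ram_per_blade_gb: float
    used_ram_gb: float

    def free_ram_fraction(self, extra_blades: int = 0) -> float:
        """Fraction of RAM left free if extra_blades identical blades are added."""
        total_gb = (self.blades + extra_blades) * self.ram_per_blade_gb
        return 1.0 - self.used_ram_gb / total_gb


def blades_needed(cluster: Cluster, target_free: float, max_blades: int = 64) -> int:
    """Smallest number of blades to add so free RAM reaches target_free."""
    if not 0.0 <= target_free < 1.0:
        raise ValueError("target_free must be in [0, 1)")
    for extra in range(max_blades + 1):
        # Small tolerance so floating point noise does not order a spare blade.
        if cluster.free_ram_fraction(extra) >= target_free - 1e-9:
            return extra
    raise RuntimeError("target not reachable within max_blades")


# Assumed example: 7 blades with 96GB each, 90% of RAM in use (10% free).
cluster_a = Cluster(blades=7, ram_per_blade_gb=96, used_ram_gb=0.9 * 7 * 96)
print(blades_needed(cluster_a, target_free=0.30))  # 2 blades under these assumptions
```

In a real deployment the warning threshold and the target would come from your monitoring and capacity policy; the point is only that the "order 2 blades" decision is a calculation an orchestrator can make for you.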

 

 

 

The second spiking trend is the need for standardization. Standardization is a key component to moving forward with a cloud strategy. When your orchestration stack has been written or configured to talk to a standard set of components, the orchestration will keep working without much hassle. Problems arise when companies bring in 10 different vendors to accomplish 1 thing, such as multiple storage, networking, and compute vendors, including multiple hypervisors. Every time you introduce a new variable to your environment, your orchestration software needs to be updated, and that drains a lot of hours. In addition, what if you standardize and your hardware goes EOL? More than likely the vendor comes out with a next generation replacement. With newer hardware come new features, new APIs, and perhaps a completely new way the hardware can be used. Again, you're stuck updating your orchestration software or re-coding to keep up. This is where Vblock comes in. Vblock makes standardization easy because we only have 3 vendors in the stack, and they will never change: EMC storage, Cisco networking and compute, and VMware as the hypervisor. Building your cloud stack on best-of-breed components that have been pre-engineered and pre-tested should be a simple decision. In addition, you can acquire a Vblock and start realizing ROIC much faster than with traditional approaches because VCE guarantees production-capable workloads on day 1. Read an older blog post called Finding the True Value in Vblock to see how you can go from a 6-12 month cycle of acquisition to production down to as little as 30 days. UIM plays a critical role here as well because it contains the API set which allows orchestration components to talk and automate. UIM is the tool that allows VCE to change hardware components underneath (such as a VNX to VNX.next) without needing orchestration software to be re-written. 
That universal API allows orchestration software to use the same APIs as it always has and without having to re-write for brand new hardware. That's huge! You can interchange all the pieces underneath, but the message still stays the same to the orchestrator. A layer of Foundation Management is key in any integrated hardware stack. UIM v3.0 released a new feature where a service offering, which is a vCenter Cluster object consisting of blades and datastores, can now be presented to vCloud Director 1.5 via the vCloud APIs as a Provider vDC. Companies are looking to VCE and Vblock for these reasons because it makes standardization, automation, and acquisition easy moving forward.

 

Before I continue, let's discuss what a Provider vDC (Virtual Data Center) is in relation to vCloud Director. Within vCloud Director, you need to provide resources for organizations to consume. These resources are presented as Provider vDCs, and everyone will have a different Provider vDC strategy. The cloud admin's responsibility is to provide appropriate SLAs as they relate to a Provider vDC. Most people are familiar with Gold, Silver, and Bronze approaches, and we will use those going forward for our examples. To simplify, a Provider vDC can be a cluster or clusters of servers with associated datastores. A standard best practice is to treat a cluster of servers and its associated datastores as one tier of service, and not to use resource pools or share server clusters among different tiers. It's also important to note that a good vSphere design is still crucial to a good vCloud design.
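
To make the "one cluster, one tier" practice concrete, here is a minimal Python sketch of how you might model Provider vDCs on paper and check a design for clusters shared between tiers. The `ProviderVdc` record, the `clusters_shared_across_tiers` helper, and every cluster and datastore name are hypothetical; none of this is the vCloud API.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class ProviderVdc:
    """Hypothetical record: one service tier backed by clusters and datastores."""
    name: str
    tier: str
    clusters: Tuple[str, ...]
    datastores: Tuple[str, ...]


def clusters_shared_across_tiers(pvdcs: Iterable[ProviderVdc]) -> Dict[str, List[str]]:
    """Return each cluster that backs more than one tier, with the tiers involved."""
    tiers_by_cluster: Dict[str, set] = {}
    for pvdc in pvdcs:
        for cluster in pvdc.clusters:
            tiers_by_cluster.setdefault(cluster, set()).add(pvdc.tier)
    return {c: sorted(t) for c, t in tiers_by_cluster.items() if len(t) > 1}


# Made-up design in which Silver and Bronze wrongly share cluster-b.
design = [
    ProviderVdc("pvdc-gold", "gold", ("cluster-a",), ("ds-gold-01",)),
    ProviderVdc("pvdc-silver", "silver", ("cluster-b",), ("ds-silver-01",)),
    ProviderVdc("pvdc-bronze", "bronze", ("cluster-b",), ("ds-bronze-01",)),
]
print(clusters_shared_across_tiers(design))  # {'cluster-b': ['bronze', 'silver']}
```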

 

 

So how does Vblock play into a Provider vDC strategy for vCloud Director? I'm going to list out different ways I've seen companies use Vblock for vCloud Pilots and their on-going cloud strategy.

 

The simplest way to tie an SLA to a Provider vDC is based on the types of disk in the Vblock. This should be relatively easy because everyone can understand the differences in performance between EFD/SSD, Fibre Channel/SAS, and SATA drives. Assigning an appropriate SLA is simple because we know Gold would be aligned with EFD/SSD, Silver with FC/SAS, and Bronze with SATA based on performance characteristics. The con of this method is that it is hard to estimate how much of each type you are going to need, which leads to wasted costs. If you fail to figure out your tenants' needs, then you will end up over- or under-purchasing for a particular tier. Perhaps you wasted a ton of money on Gold EFD/SSD drives and you don't have a single tenant that wants to pay for that sort of premium. The wasted costs are risky.
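
The over- and under-purchasing risk is easy to put numbers on. The sketch below is a rough illustration only: the tier-to-disk mapping comes from the paragraph above, while the `tier_gaps` helper and all capacity figures are made up.

```python
# Tier-to-disk mapping as described above.
DISK_FOR_TIER = {"gold": "EFD/SSD", "silver": "FC/SAS", "bronze": "SATA"}


def tier_gaps(purchased_tb, demanded_tb):
    """Per tier, report capacity sitting idle and capacity you are short of."""
    gaps = {}
    for tier in DISK_FOR_TIER:
        bought = purchased_tb.get(tier, 0.0)
        wanted = demanded_tb.get(tier, 0.0)
        gaps[tier] = {
            "idle_tb": max(bought - wanted, 0.0),
            "shortfall_tb": max(wanted - bought, 0.0),
        }
    return gaps


# Hypothetical numbers: nobody wants to pay for Gold, Silver is oversubscribed.
purchased = {"gold": 10.0, "silver": 40.0, "bronze": 50.0}
demanded = {"gold": 0.0, "silver": 45.0, "bronze": 30.0}
for tier, gap in tier_gaps(purchased, demanded).items():
    print(tier, DISK_FOR_TIER[tier], gap)
```

With these made-up figures, all 10TB of the most expensive disk sits idle while Silver tenants are 5TB short, which is exactly the corner you want to avoid.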

 

A second form is still in relation to disks, but instead builds multiple tiers by using multiple types of RAID groups in the Vblock. This is going to be tough to standardize because there are lots of different RAID offerings, and you could once again back yourself into a corner of wasting money on unused disk. Different applications may warrant RAID 5 vs RAID 6 vs RAID 1+0 for performance characteristics. Now you have to decide where to spend your money on types of disks. An example would be setting a Gold tier to SAS/FC in RAID 1+0, Silver to SAS/FC in RAID 5, BronzePlus to SATA in RAID 5, and Bronze to SATA in RAID 6. As you may be able to tell, the tiers differ not only in the performance characteristics of the media, but also in the level of drive redundancy they offer. I opted not to include an EFD/SSD tier because a key thing to remember is keeping everything simple. I could just as easily add in EFD/SSDs and more RAID offerings on all tiers of media to make a multitude of offerings. The goal in the end is to keep costs in mind and find that sweet spot for ROIC. Going with RAID types as a differentiating factor might not be the most efficient because the applications hosted within vCloud Director (at this point in time) are probably not critical enough to warrant all this thought. Sticking with 1 standard RAID type and moving forward may be a better position to make sure you aren't over- or under-allocating resources.

 

The third form still uses types of media as an approach, but it focuses on EMC technology. There are two technologies that can be packaged with a Vblock: the first is FastCache, and the other is FAST. FastCache is a mandatory component on all Vblock 300 models and is sold with a bare minimum of 100GB of EFD. FastCache, in its simplest form, is an extension of the read/write cache, used by the array to move *hot* blocks of data that have been accessed at least 3 times to the EFD drives. FastCache is an array-wide technology and it can't be turned off for specific datastores. FastCache throws a wrench in figuring out your chargeback model: if you go with a disk-based SLA approach, tenants in a Bronze offering could be getting EFD-type response times even though they only paid for SATA. At the same time, tenants in Silver or Gold may only be getting what they paid for, depending upon the workload across the array and the hot blocks that got lucky enough to move up. A way to work around this is to spread the cost of FastCache among all tiers as a standard fixed cost and present it as an added bonus that tenants may or may not be able to leverage. FastCache is still very much recommended to help the performance of the array in general. The second technology is Fully Automated Storage Tiering (FAST). FAST can shape your Provider vDC strategy based on types of disks because you can offer this technology on individual datastores. FAST allows tiers of disks configured in the same RAID type to be aggregated into a single LUN/datastore, and an algorithm determines where a block will be stored. I can put SSD, FC, and SATA all into a single datastore, and hot blocks will be moved to a higher tier of disk, while unused blocks can be moved to a lower tier. If those lower-tier blocks start seeing action, then they can potentially move up a tier or 2, depending on how often the algorithm runs. 
FAST allows cloud admins to offer up multiple kinds of disk-based SLA offerings. As an example, Gold would equal 30% EFD and 70% FC, giving Gold tenants more room to "burst" into EFD while not paying a premium for EFD drives in the short term. A Silver SLA could be tied to 5% EFD, 70% FC, and 25% SATA, which gives tenants an offering that allows little burstability but will deliver good performance when needed. A BronzePlus offering could be 25% FC and 75% SATA, allowing tenants to burst into FC-type performance while still keeping costs minimal. Round it all out with a simple Bronze offering made of 100% SATA and no FAST to offer a predictable performance tier. This strategy gives the cloud provider greater options for the level of service they can offer to their tenants while also saving money on expensive EFD drives. The only downside to a FAST offering is that you can't guarantee tenants a predictable IO pattern or level of performance. vCloud Director sees all datastores in a Provider vDC as equal, and if multiple tenants use the same FAST datastore, they will be competing for those higher-grade tiers based on their workload.
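
To see how these mixes translate into cost and capacity, here is a small Python sketch. The percentages are the example mixes from this section; the relative cost figures (SATA = 1, FC = 3, EFD = 10) are placeholders I made up for illustration and are not real pricing, and both helper functions are hypothetical.

```python
# Tier mixes from the example above, as whole percentages of each datastore.
FAST_MIXES = {
    "Gold": {"EFD": 30, "FC": 70},
    "Silver": {"EFD": 5, "FC": 70, "SATA": 25},
    "BronzePlus": {"FC": 25, "SATA": 75},
    "Bronze": {"SATA": 100},
}

# Hypothetical relative cost per GB (SATA = 1). Replace with your own numbers.
RELATIVE_COST_PER_GB = {"EFD": 10, "FC": 3, "SATA": 1}


def blended_cost_per_gb(mix):
    """Weighted average cost per GB for a datastore built from this mix."""
    if sum(mix.values()) != 100:
        raise ValueError("mix percentages must add up to 100")
    return sum(pct * RELATIVE_COST_PER_GB[disk] for disk, pct in mix.items()) / 100


def capacity_split_gb(mix, datastore_gb):
    """How many GB of each disk type sit behind a datastore of the given size."""
    return {disk: datastore_gb * pct / 100 for disk, pct in mix.items()}


for tier, mix in FAST_MIXES.items():
    print(tier, blended_cost_per_gb(mix), capacity_split_gb(mix, 2000))
```

Under those placeholder costs, Gold comes out at 5.1 per GB against 10 for a pure EFD tier, which is the saving this strategy is after. A blended number like this is also a reasonable starting point for a chargeback rate, with the caveat above that FAST cannot guarantee which tenant's blocks land on which tier.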

 

Of course, we can stretch this same type of thinking to servers. Perhaps you still have a single SAN, but you are refreshing or expanding your compute cluster. You can use old servers to create a VMware cluster that perhaps runs older dual- or quad-core procs and gets assigned a Silver or Bronze SLA, and give Gold to newer hex-core servers that have greater clock speeds and RAM densities. Both clusters still rely on the same backend datastores, but your differentiating factor is going to be the processing power that is given to your VMs.

 

 

I would encourage everyone to stay away from tying an SLA to Fibre Channel vs NFS. Both solutions are great and they both achieve what is needed. Instead, think about how you would tie SLAs to connections on 1Gb vs 10Gb NFS and 4Gb vs 8Gb FC. If there is a mixed environment, you could have 1Gb IP = Bronze, 4Gb FC = Silver, and 8Gb FC = Gold or 10Gb NFS = Gold. The battle of block vs file based storage will never end, so stay neutral about how you tie an SLA to a type of network medium. In addition to speed, take reliability into account. What type of switch or fabric switch is in the middle? Are the fabric switches redundant, and if a switch is lost, what is the impact on throughput?

 

 

Now that we have looked at a few types of Provider vDC approaches, let's start thinking a bit bigger. Basing your Provider vDC on disk is good for use in a single POD or a single Vblock because it can easily be managed. The great thing about vCloud Director is that it gives the cloud provider freedom and control over the Provider vDC offering. Many companies have older VMware farms, or somewhat new VMware farms, and are looking either for a refresh or to expand. We can now use vCloud Director Provider vDCs in a POD approach instead of thinking down at a granular disk level. Say you have a collection of Dell R!!!! servers connected to two Cisco 3560s via 1Gb iSCSI to a Hitachi array. You also have a few clusters of HP DL380 G4 servers connected through a single Cisco 4507R to a NetApp FAS6080 via 10Gb NFS. You have now also purchased a new Vblock 300HX. The first key thing you should be pointing out is the lack of standardization, but that aside, let's continue. For simplicity's sake, we will say each POD has a single cluster of 8 servers and datastores of only FC storage. From this we can derive a few differentiating factors. First, the servers keep getting newer and we can tie appropriate SLAs to them. Second and third, the connection medium is capable of more throughput and becomes more reliable and redundant. POD 1 has 1Gb connections on 2 Cisco 3560s, of which only 1 is used for the uplink to overcome STP. POD 2 has much better throughput using a 10Gb connection but falls short of *true* redundancy because the 4507R is a single-chassis solution, although it does have 2 supervisor engines. POD 3, the Vblock, utilizes 8Gb FC and 10Gb NFS storage for maximum throughput and is fully redundant by utilizing Virtual Port Channels between Nexus 5548UP switches and a redundant FC network using MDS 9148s. An important thing to note is that this whole time, the backend storage stayed the same. 
Sure, the storage processors are fresher on the newer arrays, but it's still the same old 300GB FC disks spinning in RAID 5 delivering the same amount of IOPS. This is an example of thinking in a POD-based approach because all of this equipment is still usable to a cloud provider, and they can use older hardware as a service tier to still realize profits. Perhaps the Hitachi array is out of its service period and no warranty will ever be available. You can still use it as an as-is service tier for developer usage because if something happens to it, you met your SLA because it's as-is.
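
The redundancy differences between the three PODs can be compared with a deliberately simple model: what throughput is left after you lose one whole switch chassis? The link speeds below come from the example; the `Pod` record, the `throughput_after_one_failure` helper, and the counts of active and standby paths are my own simplifying assumptions (for instance, that both FC fabrics in the Vblock carry traffic at the same time).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pod:
    """Hypothetical summary of a POD's storage connectivity."""
    name: str
    link_gbps: float
    active_paths: int   # switches carrying traffic in normal operation
    standby_paths: int  # switches that only take over after a failure


def throughput_after_one_failure(pod):
    """Return (normal, degraded) throughput in Gb/s after losing one active switch."""
    if pod.active_paths < 1:
        raise ValueError("a POD needs at least one active path")
    normal = pod.link_gbps * pod.active_paths
    # A standby path, if there is one, replaces the failed active path.
    surviving = pod.active_paths - 1 + (1 if pod.standby_paths > 0 else 0)
    return normal, pod.link_gbps * surviving


pods = [
    Pod("POD 1 (two 3560s, one uplink blocked by STP)", 1, 1, 1),
    Pod("POD 2 (single 4507R chassis)", 10, 1, 0),
    Pod("POD 3 (Vblock, two FC fabrics)", 8, 2, 0),
]
for pod in pods:
    print(pod.name, throughput_after_one_failure(pod))
```

Under these assumptions POD 1 survives at its full (slow) 1Gb, POD 2 drops to nothing if the chassis itself fails, and POD 3 keeps half of its throughput, which lines up with how the tiers were ranked above.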

 

Let's get back to standardization. Say a cloud provider has standardized on Vblocks as a building block approach for their cloud strategy because their orchestration software knows how to talk Vblock. How would they go about creating different service offerings? All Vblocks are built with 100% redundancy, they all have the same type of switching medium, and they all have the same hypervisor. An easy way to differentiate them is to go back to disks and how we tied SLAs to the different technologies. You could also differentiate them on the types of blades in a Vblock. Perhaps you reserve B200s with 96GB of RAM as your performance blades and assign the B200s with 192GB of RAM as your standard blades. You can get greater densities on the 192GB blades, therefore packing more tenants in tightly, while still maintaining the same level of vSphere 5 licensing cost. The only thing left as a cloud admin would be to provide appropriate types of storage to each offering that best suit performance vs standard.
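
The density argument is simple arithmetic. In this rough sketch the 96GB and 192GB blade sizes come from the example above, while the `vms_per_blade` helper, the 4GB VM size, and the 25% of RAM held back for failover headroom are assumptions chosen just to make the numbers concrete.

```python
def vms_per_blade(blade_ram_gb, vm_ram_gb, reserved_fraction=0.0):
    """Whole VMs that fit in a blade's RAM after holding back a reserve."""
    if vm_ram_gb <= 0:
        raise ValueError("vm_ram_gb must be positive")
    if not 0.0 <= reserved_fraction < 1.0:
        raise ValueError("reserved_fraction must be in [0, 1)")
    usable_gb = blade_ram_gb * (1.0 - reserved_fraction)
    return int(usable_gb // vm_ram_gb)


# Assumed 4GB VMs with 25% of RAM kept free for failover headroom.
for blade_ram_gb in (96, 192):
    print(blade_ram_gb, vms_per_blade(blade_ram_gb, vm_ram_gb=4, reserved_fraction=0.25))
```

With those assumptions a 96GB blade holds 18 VMs and a 192GB blade holds 36, so the standard tier packs in twice the tenants per blade while each VM on the performance tier shares its CPU with half as many neighbors.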

 

Another option would be to have one of the Vblocks in Datacenter 1 replicated to another Vblock in Datacenter 2. From here, you can still provide the same levels of tiering based on disks and servers in each Vblock, except the tiers backed by the replicated Vblock's datastores can have a "Plus" appended to their names because the workloads are now protected by replication.

 

 

This can be done on a smaller scale as well in a single Vblock by offering up multiple types of datastores, but having select datastores be replicated to a smaller Vblock that sits in another datacenter. This gives the consumer lots of choices and the freedom to choose an SLA that meets their needs.

 

 

Many people are in the very early stages of their cloud strategy and how they want it to develop. A main reason why cloud providers come to VCE is the ease of acquisition. Cloud providers don't feel the need to look under the covers of a Vblock because all they want are resources to be consumed by their cloud. A cloud provider will already have a specification laid out, and once that Vblock gets delivered, their orchestration software points to a few IPs and gets to work. You need resources? There they are, completely configured and ready to go in 30-45 days. Multiple teams aren't spending weeks tying pieces together; instead, the Vblock is rolled onto the floor ready to be consumed by their cloud. The orchestration software already knows what's in the Vblock and can start carving out Provider vDCs or adding to existing Provider vDCs. I've said it a thousand times, but standardization and automation are key to a cloud strategy. Once you define a chargeback model, Vblocks become cloud resources that can be consumed by tenants based on the SLAs you have already pre-defined, and automation will take care of the rest.
