Successful ROI/TCO modeling of hyperconverged infrastructure

[Author Note: This article is targeted toward channel partners, but is also applicable to anyone wishing to champion Nutanix / Dell XC hyperconvergence within their organizations.]

With 32 different manufacturers now offering hyperconverged infrastructure (HCI) solutions, customers are increasingly asking channel partners to help them choose between HCI and conventional 3-tier infrastructure (centralized storage + storage fabric + servers). Skillfully guiding customers through the financial modeling process can help them better evaluate the differences between the legacy and next-generation technologies.

Faster Horses

Henry Ford famously said that if he had asked his customers what they wanted, they would have told him, “faster horses.” As organizations increasingly virtualized their datacenters, they encountered problems such as manufacturer finger-pointing when troubleshooting, and difficulty ordering and standing up the compute, storage and networking components in a reasonable time frame.

Wild Horses - pic 1

In response to these issues, every leading storage manufacturer came out with what they called a “converged infrastructure” solution including HP Matrix, Vblock, Flexpod, IBM PureFlex, Hitachi Unified Compute Platform, EMC VSPEX and so on. But these solutions lack any true innovation and, for that matter, any infrastructure convergence. They are simply faster horses.

I wrote an article a few months ago titled, The 10 ways Nutanix is Uberizing the datacenter. Suppose we were able to go back in time 30 years and approach a would-be taxi passenger standing in the rain fruitlessly trying to hail a cab.

We could tell her that in the future a company called Uber would use new technologies such as the Internet, smartphones and GPS to transform her transportation experience. Future rides would be simple, predictable and pleasant. Lacking the context to understand these new technologies, she would likely be skeptical that Uber could do this.

Uber cab - pic 2

Nutanix partners often encounter this same type of challenge with people in IT, Purchasing and Finance who are used to looking at the datacenter through a 3-tier lens. Their first impulse is naturally to evaluate hyperconverged solutions in the same manner that they have long used for analyzing their conventional infrastructure purchases.

But metrics such as cost per gigabyte and even acquisition cost are often irrelevant or misleading when evaluating hyperconvergence. A Total Cost of Ownership (TCO) or Return on Investment (ROI) analysis, depending upon the use case, provides a far better framework for evaluating a major technology decision.

Counter-intuitively, while a TCO or ROI analysis will inevitably show a lower cost for HCI than 3-tier, this is not its primary purpose. The objective of taking customers through the financial modeling process is to give them the context to understand the full implications of Nutanix technology. In this way, they can appreciate why Nutanix is not just a faster horse, and how it is going to transform their experience of managing IT.

Challenges of 3-Tier Infrastructure

I recommend that you start the TCO/ROI analysis process by explaining the inherent financial penalties of 3-tier infrastructure. When customers purchase a SAN, for example, they typically try to predict the workloads they will need 3 – 5 years down the road, and then purchase an array with enough headroom to expand storage capacity to hopefully meet those requirements.

If a SAN-purchaser guesses wrong and under-buys, the organization faces a massive forklift upgrade that’s very expensive, complex and time-consuming. Wikibon estimates that the cost just to migrate to a new array is 54% of the cost of a new array.

As a result, customers typically buy SANs with “room to grow”. But this extra capacity requires a large investment (the “I”) which then reduces the ROI. And as shown in Table 1, this excess capacity starts depreciating on the first day it is installed.

Depreciation - pic 3

Table 1: Depreciation Expense from Purchasing Excess Capacity Up-Front

The excess capacity also requires more rack space, power and cooling even as it sits idle. And as the customer utilizes the capacity over the years, the technology becomes increasingly out of date when compared with the new equipment of the day. This equates to inferior performance, fewer capabilities and features, and more rack space, power and cooling expense than would be realized with newer technology.
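To make the depreciation penalty concrete, here is a minimal Python sketch. The purchase price, the utilization ramp and the straight-line depreciation method are all illustrative assumptions, not figures taken from Table 1.

```python
def idle_capacity_depreciation(purchase_cost, utilization_by_year):
    """Straight-line depreciation attributable to capacity that sits idle.

    purchase_cost: up-front cost of the array (hypothetical figure).
    utilization_by_year: fraction of purchased capacity in use each year.
    """
    years = len(utilization_by_year)
    annual_depreciation = purchase_cost / years
    # The idle share of each year's depreciation has no workload behind it.
    return [annual_depreciation * (1.0 - used) for used in utilization_by_year]


# Hypothetical example: a $500,000 array that fills up gradually over five years.
wasted = idle_capacity_depreciation(500_000, [0.2, 0.4, 0.6, 0.8, 1.0])
for year, amount in enumerate(wasted, start=1):
    print(f"Year {year}: ${amount:,.0f} depreciated on idle capacity")
print(f"Total: ${sum(wasted):,.0f}")
```

With these placeholder inputs, roughly 40% of the purchase price is depreciated before the capacity it paid for is ever used. Substitute the customer's own purchase price and growth forecast.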

Mnemonic: Don’t lose on price, a 5-year analysis is nice. Nutanix won’t always cost less up-front than a 3-tier alternative. This is why it is important for the customer to evaluate alternatives over an extended period – typically five years.

The Advantages of Moore’s Law for HCI

HCI provides customers with the exact opposite experience of 3-tier. Unlike a SAN which requires a large up-front investment and then quickly becomes old technology, Nutanix lets customers start as small as three nodes, and then seamlessly scale out as needed – even one node at a time. This both enhances the ROI and eliminates the risk of over-buying.

Since Nutanix storage clusters are completely separate and removed from the virtualization clusters, they are not subject to the VMware size limitations. And scale is not limited to a single cluster; a customer can have several clusters, all managed with Prism Central. In this manner, Nutanix also eliminates the much more punishing risk of under-buying.

As customers expand their Nutanix environments by purchasing additional nodes, they bring the latest technology into their environment in terms of CPU, memory, disk and flash. This increases the workload density per node, resulting in a lower cost per workload.

Table 2 below shows an example of a typical VDI customer migrating 5,000 PC users to VDI in conjunction with their 5-year refresh rate. Each year rather than getting new PCs, 1,000 users have their devices locked down or receive zero-clients and are migrated to virtual desktops.

Moores Law example - pic 4

Table 2: Impact of Moore’s Law on Number of Nutanix Nodes per 1,000 VMs

In year one, the customer starts off with eight nodes to handle the first 1,000 users. But Moore’s Law means that hardware continues to get faster. We saw increases in density just from moving from the Intel Ivy Bridge to Haswell chips ranging from around 20% to 80%.

Because of Moore’s Law, we estimate a conservative annual density increase in VMs per node of 20%. This means that for year 2, the customer only needs six more nodes to handle the next 1,000 users. And by year 5, the customer only needs three more nodes to handle the last 1,000 users.
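The node counts above can be approximated with a short Python sketch. It assumes that density compounds at 20% per year from the year-one baseline and that fractional nodes are truncated. Truncation is an assumption, chosen only because it matches the six and three nodes quoted above. The function name is hypothetical.

```python
def nodes_per_wave(users_per_wave, first_wave_nodes, annual_density_gain, years):
    """Nodes purchased each year to host equal-sized waves of users.

    Density (users per node) compounds annually. Fractional nodes are
    truncated, which is an assumption made to match the quoted figures.
    """
    base_density = users_per_wave / first_wave_nodes  # users per node in year 1
    purchases = []
    for year in range(years):
        density = base_density * (1 + annual_density_gain) ** year
        purchases.append(int(users_per_wave / density))
    return purchases


purchases = nodes_per_wave(1000, 8, 0.20, 5)
print(purchases)           # nodes bought in years 1 through 5
print(sum(purchases))      # total nodes over five years
print(8 * len(purchases))  # total if density never improved
```

A more conservative sizing would round each purchase up rather than down, which adds about one node in each later year. Either way, the total stays well below the 40 nodes implied by a flat year-one density.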

This is a very powerful financial argument that is key to helping customers begin to understand why Nutanix hyperconvergence is not just a faster horse.

Mnemonic: Make sure you know how the environment will grow. Ask the customer what average annual percentage growth she expects in her virtualized server or desktop environment over the analysis period. Be sure to factor in density improvements when projecting the Nutanix cost.

The Tesla Effect 

If you own a Tesla and you want to accelerate faster, corner better or, most recently, enable automated parallel parking, you download a new version of the Tesla software to your car. While the hardware remains the same, your car is in many respects like a new vehicle.

The same type of thing is true for Nutanix. Customers can non-disruptively apply the latest Nutanix OS to their existing nodes, which will then perform better and have more capabilities and features. As Tim Buckholz wrote after performing an analysis, just going from Nutanix OS 3.1 to 4.1 showed an average increase in performance of around 50%. Nutanix has seen a 5x increase in performance resulting from software from 2012 to today.

Tesla Effect - pic 5

As another example, consider Nutanix’s recently announced erasure coding. Customers applying upgrades to their older nodes will see capacity increases of around 60%.

From a financial perspective, the Tesla Effect means that Nutanix customers can grow their environments without purchasing as many new nodes. The higher workload densities, increased capacity and updated capabilities and features help optimize their investments in the original nodes.

These software-redefined enhancements are another significant differentiator of Nutanix from proprietary SANs. An array utilizes firmware that is tightly coupled with the underlying hardware. As time marches on, the existing performance, capacity and capability continue to decline in comparison to newer technology.

Mnemonic: The Tesla Effect gains increased respect. Bring up the “Tesla Effect” as a way to differentiate Nutanix from 3-tier competitors as well as add still further justification for incorporating improvements in density as part of the analysis.

Other Game-Changing Differentiators

Showing how Nutanix slashes CapEx and associated rack space, power and cooling costs over a multi-year period, while eliminating all risk of under- or over-buying, provides the foundation for proving Nutanix is not just a faster horse. But the TCO/ROI analysis process also provides the opportunity to showcase many of the other Nutanix game-changing capabilities.

Multi-hypervisor Management

Both Gartner and IDC indicate that over half of enterprise customers now run two or more hypervisors.

Nutanix helps to significantly mitigate the multi-hypervisor management challenges by providing a single pane of glass, Prism, for managing and backing up multiple hypervisors. The financial modeling can highlight the potential savings from utilizing the optimal hypervisor for different workloads.

hypervisor - pic 6

Mnemonic: Nutanix is best at the multi-hypervisor test. Discuss the new standard of multi-hypervisor environments and how Nutanix changes the game with new capabilities in management and mobility.

Acropolis Hypervisor

It can be difficult to quantify the benefits Acropolis Hypervisor can bring in terms of simplified management, better scalability, enhanced security and bridging to public cloud. But licensing savings are easy to calculate as part of a financial analysis – especially in use cases such as test/development, branch office, big data, VDI, DevOps and so on. These savings can easily run to many millions of dollars.

Mnemonic: Acropolis cost is legacy loss. Identify potential areas where Acropolis hypervisor can save the customer money, now or in the future, and incorporate them into the analysis as appropriate.

Administrative Savings

Nutanix changes the game in terms of IT administration. Partners and customers commonly say that Nutanix’s management interface is the most intuitive in the industry. Prism Central dashboards aggregate multi-cluster hardware, VM and health statistics into a single management window.

Nutanix also utilizes extensive automation and rich system-wide monitoring for data-driven efficiency combined with REST-based programmatic interfaces for integration with datacenter management tools. Rich data analytics such as Cluster Health enable administrators to receive alerts in real time as the Nutanix system monitors itself for potential problems, investigates and determines root cause, and then proactively resolves issues to restore system health and maintain application uptime.

The Prism management and unsurpassed analytics capabilities, combined with all the goodness of HCI, result in tremendous administrative savings. Sometimes it is easy to quantify these savings – such as when management of the VMs is outsourced. In many cases, Nutanix either eliminates the requirement for outsourced management entirely, or reduces the cost significantly because of the slashed effort.

In cases where internal staff time is utilized for managing the environment, administrative savings can be more difficult to project. A recent extensive IDC study (available for download from the Nutanix Web site) of 13 Nutanix and Dell XC customers shows average 5-year IT time savings and productivity improvements of $183,720 per 100 users.

Mnemonic: Outsourcing fees mean an ROI breeze. If the customer is currently outsourcing VM monitoring / administration, ensure they understand how vastly simpler that task becomes with Nutanix, and incorporate a reduced or eliminated cost if appropriate.

Risk Mitigation – User Productivity Benefits

Most SANs use RAID technology which was invented in 1987 and is archaic by today’s standards. Loss of a storage controller can cut available performance in half. Losing two drives in a RAID 5 configuration, user errors, power failures and many other issues can cause unplanned downtime.

Nutanix keeps multiple copies of data and metadata both local to the VM running the active workload as well as throughout the cluster. In the event of failure, MapReduce technology is leveraged to deliver non-disruptive and quick rebuilds.

The Nutanix Distributed File System is designed for hardware failure and is self-healing. Always-on operation includes detection of silent data corruption and repair of errors around data consistency, automatic data integrity checks during reads, and automatic isolation and recovery during drive failures.

Downtime, whether planned or unplanned, can be very expensive for an organization. IDC estimates that a minute of datacenter downtime costs US $7,900 on average. The IDC study referenced above reflects average decreases in unplanned downtime of 98% and in planned downtime of 100%. This equates to 5-year savings of $43,825 per 100 users.
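As a rough illustration of how downtime reductions translate into dollars, the Python sketch below applies the IDC per-minute cost and the reported percentage reductions to a customer's baseline. The baseline minutes are hypothetical, and applying the same per-minute cost to planned downtime is an assumption.

```python
COST_PER_MINUTE = 7_900  # IDC average cost of datacenter downtime, in US dollars


def annual_downtime_savings(unplanned_minutes, planned_minutes,
                            unplanned_reduction=0.98, planned_reduction=1.00,
                            cost_per_minute=COST_PER_MINUTE):
    """Yearly savings if downtime falls by the percentages in the IDC study.

    Assumes planned and unplanned minutes cost the same, which may
    overstate the value of avoided planned downtime.
    """
    avoided_minutes = (unplanned_minutes * unplanned_reduction
                       + planned_minutes * planned_reduction)
    return avoided_minutes * cost_per_minute


# Hypothetical baseline: 120 unplanned and 240 planned minutes per year.
savings = annual_downtime_savings(120, 240)
print(f"Estimated annual savings: ${savings:,.0f}")
```

Use the customer's own outage history and cost per minute where they are available. The industry average is only a starting point.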

Mnemonic: Put downtime to bed and make 3-tier sellers see red. Discuss typical reductions in downtime with Nutanix and quantify and incorporate as part of the analysis if appropriate.

Business Productivity Benefits

It is particularly difficult to quantify the business benefits realized from improved IT agility such as reduced development cycles for applications and services and subsequent faster user access to applications and application updates. A more scalable business model, higher sales and greater flexibility are some of the resulting benefits. IDC says the average 5-year quantified value of higher employee productivity and revenue is $200,275 per 100 users.
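The three IDC per-100-user figures quoted in this article can be scaled to a customer's user count with a small Python sketch. Linear scaling and simply adding the three categories together are both assumptions. The category labels and the 2,500-user example are hypothetical.

```python
# Five-year IDC figures per 100 users, as quoted in this article (US dollars).
IDC_VALUE_PER_100_USERS = {
    "it_time_savings": 183_720,
    "downtime_savings": 43_825,
    "business_productivity": 200_275,
}


def project_five_year_value(users):
    """Scale the per-100-user figures linearly to a given user count."""
    scale = users / 100
    projected = {name: value * scale
                 for name, value in IDC_VALUE_PER_100_USERS.items()}
    projected["total"] = sum(projected.values())
    return projected


projected = project_five_year_value(2_500)  # hypothetical customer size
for name, value in projected.items():
    print(f"{name}: ${value:,.0f}")
```

Because these are averages across 13 customers, present them as a range-finding estimate, not a forecast for one organization.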

Support

Manufacturer support is something particularly hard to quantify, but is an important differentiator to emphasize when engaged in a Nutanix financial modeling exercise. Our customers and partners validate again and again that Nutanix takes support to a whole new level. Nutanix has a 90 Net Promoter Score and is the two-time winner of the Omega Northface Award for “Delivering World Class Customer Service.”

TCO vs. ROI

Many organizations use the terms “TCO” and “ROI” interchangeably, but they are very different. Use a TCO analysis when a customer is considering migrating from an existing virtualized infrastructure to Nutanix, or is comparing a new (or refreshed) 3-tier architecture against Nutanix. Use an ROI analysis when comparing staying with a status quo environment (whether physical or virtual) against making the investment to migrate to Nutanix Web-scale.

A financial company, for example, was running a Vblock 320 for a mixture of XenApp and server VM workloads – and they were getting ready to purchase a second unit. But after learning about Nutanix, they became very intrigued with the simplicity and capability for things such as one-click upgrades. They also requested a TCO analysis comparing the cost of purchasing a second Vblock to an equivalent Nutanix solution.

Table 3 below shows the results of the analysis.  This is presented in a yearly cash flow format (which is typically the way finance folks like to see it).

TCO - pic 7

Table 3: Five Year TCO Results

While the Nutanix configuration was less expensive up-front than the Vblock, this is not always the case when comparing against 3-tier infrastructure. However, when incorporating projected upgrade costs over a 5-year period along with variables such as rack space, power, cooling, administrative costs, fibre channel cabling, etc., Nutanix should always blow away the competition.

The financial company ended up purchasing Nutanix and reported a 10% – 20% improvement in performance over the Vblock. The CIO commented, “And then there is the management simplicity — the Nutanix systems have required almost no support so far.”

Mnemonic: When competing vs. status quo, use ROI, not TCO. Generate a TCO analysis if competing against another new solution or an ROI analysis if competing against a status quo environment.
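The distinction between the two analyses can be sketched in a few lines of Python using the yearly cash flow format described above. Every dollar figure here is a hypothetical placeholder, and the ROI formula is the simplest common form (net savings divided by investment), not the exact method behind Table 3.

```python
def total_cost_of_ownership(yearly_cash_flows):
    """Sum of cash outflows; index 0 is the up-front purchase."""
    return sum(yearly_cash_flows)


def return_on_investment(status_quo_cost, investment_cost):
    """Simple ROI: savings versus the status quo, relative to the investment."""
    return (status_quo_cost - investment_cost) / investment_cost


# Hypothetical figures: up-front cost followed by five years of costs
# (support, power, cooling, rack space, administration, upgrades).
three_tier = [400_000, 60_000, 60_000, 210_000, 60_000, 60_000]
hci = [300_000, 40_000, 90_000, 40_000, 90_000, 40_000]

# TCO: compare two new alternatives side by side.
tco_three_tier = total_cost_of_ownership(three_tier)
tco_hci = total_cost_of_ownership(hci)
print(f"3-tier TCO: ${tco_three_tier:,}  HCI TCO: ${tco_hci:,}")

# ROI: compare the investment against the cost of staying put.
status_quo_five_year_cost = 900_000  # hypothetical
roi = return_on_investment(status_quo_five_year_cost, tco_hci)
print(f"ROI vs. status quo: {roi:.0%}")
```

The TCO comparison answers which new option costs less. The ROI calculation answers whether to move at all.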

Analysis Scope

A customer typically will consider Nutanix for a specific use case or department. This is a great starting point, but for purposes of the financial analysis, I recommend expanding the scope.

If the request is for departmental VDI, for example, suggest looking at the potential economic savings from virtualizing the entire user base (or whatever percentage of that user base is reasonable to virtualize over the next five years). If the request is for a cost comparison vs. a particular server use case, expand the scope to consider all virtualized servers. And then incorporate backup and DR in order to highlight the game-changing capabilities Nutanix provides in areas such as metro availability and Cloud Connect.

Expanding the analysis scope enables the customer to better evaluate the proposed smaller initiative within the context of a big-picture scenario. This in turn enables both better decision-making and often more optimized deployment of resources when the initiative moves forward.

Mnemonic: Don’t just hope; expand the analysis scope. Expand the scope of the analysis to include as many users, VMs and use cases as makes sense to enable a big-picture context for the initiative.

Before/After ROI Picture or TCO Comparison Picture

This Visio diagram was generated by Dave Hunter, Director of IT for Empire Life, named the “Best Life Insurance Company in Canada in 2014” by World Finance Magazine. The drawing shows the huge rack space savings that Dave achieved through consolidation of mainframe, physical servers and virtualization hosts to a Nutanix environment. Whether looking at ROI or TCO, a picture can help highlight the extraordinary space savings Nutanix enables.

Empire Life - pic 9

Mnemonic: A representative picture should be an analysis fixture.

The Ideal Analysis Results

A TCO or ROI analysis is most successful when the numerical comparison between Nutanix Web-scale and legacy 3-tier is no longer the primary evaluation criterion for the customer. The process of taking the customer through the analysis makes it clear that Nutanix will not only be far more beneficial for the organization overall, but that it will change her datacenter management experience in terms of simplicity, predictability, scalability and resiliency.

See Also:

IDC Study on TCO & ROI of Nutanix vs. Traditional Infrastructure. Download from Nutanix Web site.

I, for one, Welcome the Rise of the Infrastructure Endgame Machines. 08/20/2015. Trevor Pott. The Register.

Empire Life Saves 60% in Infrastructure Costs and 16:1 Reduction in Datacenter Space. 07/09/2015. Jeff Babcock (video). Nutanix YouTube.

Nutanix Customers Weigh in on “Invisible Infrastructure” and Overcoming IT Bottlenecks. 06/12/2015. Jon Reed. Diginomica.

Nutanix Beating EMC, Says It’s Cutting Customer IT Costs 62%. 11/21/2014. Peter Cohan. Forbes.

 

Thanks to @vmmike130 for editing.

Hyperconvergence players

[Author note: This post has been updated and moved to By The Bell http://bythebell.com/2016/01/hyperconverged-players-index.html]

While, according to IDC (via SiliconANGLE), “Nutanix generated 52 percent of all global hyperconverged revenue during the first half of 2014”, many other legacy datacenter players and startups have introduced hyper-converged infrastructure (HCI) offerings. The following is a list of all the known (to me) hyperconvergence players:

1. Atlantis Computing: Atlantis HyperScale
2. Breqwatr: All-flash appliance
3. Cisco: Investment in Stratoscale. Selling arrangements with Maxta & SimpliVity
4. Citrix: Sanbolic
5. Datacore: Datacore Hyper-Converged Virtual SAN
6. Dell: Dell XC (Nutanix OEM) & EVO:RAIL
7. EMC: VSPEX Blue, ScaleIO & VxRack (VCE)
8. Fujitsu: EVO:RAIL
9. Gridstore: Private cloud in a box
10. HPE: StoreVirtual & EVO:RAIL
11. Hitachi Data Systems: Unified Compute Platform 1000 for VMware EVO:RAIL
12. HTBase: HTVCenter
13. Huawei: FusionCube
14. Idealstor: Idealstor IHS
15. IBM: Announced HCI Strategy
16. Lenovo: Nutanix OEM. EVO:RAIL. Selling arrangements with StorMagic, Maxta and SimpliVity
17. Maxta: Hyper-Convergence for Open Stack
18. NetApp: NetApp Integrated VMware EVO:RAIL Solution
19. NIMBOXX: Hyperconverged Infrastructure Solutions
20. NodeWeaver: NodeWeaver Appliance Series
21. Nutanix: Xtreme Computing Platform
22. Pivot3: Enterprise HCI All-Flash Appliance
23. Pure Storage: Possible HCI solution coming
24. Rugged Cloud: HCI
25. Scale Computing: HC3
26. SimpliVity: Omnicube (hardware-assisted SDS)
27. Sphere3D: V3 VDI
28. Springpath: Independent IT Infrastructure
29. Starwind: Starwind Hyper-Converged Platform
30. Stratoscale: The Data Center Operating System
31. StorMagic: SvSAN
32. Supermicro: EVO:RAIL
33. VMware: EVO:RAIL, VSAN, EVO:RACK
34. Yottabyte: yStor
35. ZeroStack: ZeroStack Cloud Platform

Hypervisor myopia limits the promise of a software-defined datacenter

Myopia pic

I hear again and again from customers that they’d like to move to the cloud. Although the economics might not justify migration today, they want to eventually be free of the challenges in acquiring, provisioning and managing infrastructure.

Public cloud offers potential benefits, but reducing infrastructure complexity should not be counted among them. Hyper-converged infrastructure (HCI) can provide the simplicity of public cloud in customers’ own datacenters. And by facilitating a hybrid cloud strategy determined by workload needs, it can enable the same type of agility, efficiency and risk management as public cloud.

Seamless infrastructure requires not just abstraction of storage, but abstraction of cloud computing. Infrastructure should be intelligent enough to run applications on the most appropriate platforms whether on-premise or public cloud. This requires an HCI vision that goes far beyond dependency upon a single hypervisor.

The Hypervisor is no Longer the Center of the IT Universe

If you are running a virtualized datacenter, the odds are that you already have more than one hypervisor.

A September 2014 IDC market analysis states, “Over half of the enterprises (51%) have more than one type of hypervisor installed…VMware still leads the pack in terms of installed production deployments, but Microsoft is closing the gap. Other hypervisors are increasing their share, primarily by stealing from VMware’s historically predominant share.”

This statistic is corroborated by a Gartner poll showing that by July of 2014, 48% of VMware customers were already using Hyper-V as their secondary hypervisor. The poll also said that Microsoft’s share of new virtualized workloads is gaining.

Tightly integrating HCI with the kernel of a single hypervisor may bind a customer to the manufacturer’s product suite, but it disregards the trends of openness, agility and choice (not to mention resulting in a much fatter hypervisor). Senior Wikibon analyst, Steve Chambers, recently poked fun at this type of hypervisor myopia by comparing how the datacenter solar system would have looked pre and post Copernicus.

Copernicus pic

Operating System Centricity

A VMware spokesperson for its Storage and Availability group recently stated, “The harsh market reality is that there’s just not a lot of demand for non-vSphere-based hyperconverged solutions…I would argue that it’s hard to compete with features that are simple extensions of the hypervisor.”

IDC Mkt Share pic

This argument resembles the one Microsoft used to make in the late 2000s, “Virtualization is simply a role within the Windows operating environment.” Many industry analysts believed the messaging and told VMware that it needed to be more price competitive.

“If I were VMware, I would be looking to lower my prices.”

    -Laura DiDio, an analyst with ITIC. (Reuters, July 6, 2009).

 

Despite the analyst warnings and all of Microsoft’s marketing muscle, VMware continued to dominate the industry for years. IT leaders knew that virtualization could save them a vast amount of money – but only if it worked flawlessly. An IT manager would look pretty foolish telling her users that all of the VMs might be down, but the company saved several thousand dollars on a less expensive hypervisor.

Today, Microsoft has significantly decreased its operating system centricity. It has also reversed its opposition to open source and has made more contributions to the Linux code than any other vendor. The company no longer pitches virtualization based upon lower cost, but instead emphasizes enterprise-class virtualization, IT agility and flexibility.

Hyper-V still lags vSphere in management, and Microsoft has not developed the virtualization focus and community support that VMware has built over the years. But customers understand that Microsoft is striving to give them what they want, and they’re bringing Hyper-V into their datacenters.

Hypervisor Dependency is Contrary to a Software-Defined Datacenter

The term “software-defined datacenter” (SDDC) was coined by VMware’s former CTO, Steve Herrod, but it’s taken on a life of its own. Multi-hypervisor demand belies the concept of SDDC as merely an extension of vSphere.

VMware’s NSX team understands this new reality. VMware promotes multi-hypervisor support as a “key feature…instrumental to the value NSX delivers.” In response to Cisco claims of hypervisor dependency, VMware fired back that some NSX environments don’t use VMware hypervisors at all.

A software-defined datacenter demands more than a single hypervisor HCI strategy. What if, for example, customers determine that KVM-based HCI enhances availability and performance of running containers in production? Or perhaps they want to run Hyper-V to lower the cost of their Citrix VDI environment. Or maybe deploying a combination of KVM and vSphere optimizes the application lifecycle from test/dev to production.

Multi-hypervisor HCI not only gives customers choice, it can also provide them with superior capabilities. Nutanix, for example, increases flexibility by supporting not just multiple hypervisors, but multiple versions of hypervisors. And these versions can run on the same cluster, potentially even in multiple datacenters.

Separating the operating system from the hypervisor enables non-disruptive 1-click upgrades of the Nutanix operating system, the hardware firmware and even the hypervisor – whether ESX, Hyper-V or KVM. Storage layer releases are more frequent and bring improved performance, security and functionality each time.

The Future of the Software-Defined Datacenter

If customers had a way to efficiently run and manage multiple hypervisors in the same environment – and to seamlessly meet their business and application needs; if they had training and certifications geared to a multi-hypervisor datacenter; if they had community support for their efforts to optimize performance while reducing cost – then the rapidly growing landscape of multi-hypervisor environments would undoubtedly accelerate faster still.

Nutanix Next pic

At Nutanix.NEXT in Miami in June, Nutanix is unveiling our Act II. We will reveal our plans to take multi-hypervisor capabilities to a new level. I hope that I will see you and your customers there to participate in the future of the software-defined datacenter.

 

Thanks to Prabu Rambadran (@_praburam), Steve Dowling, Payam Farazi (@farazip) and Angelo Luciani (@AngeloLuciani) for suggestions and edits.

The 10 reasons why Moore’s Law is accelerating hyper-convergence

SAN manufacturers are in trouble.

IDC says that array vendor market share is flat despite continued massive growth in storage.

pre 1 IDC mkt share

Hyper-convergence (HC) contributes to SAN manufacturer woes. The March 23, 2015 PiperJaffray research report states, “We believe EMC is losing share in the converged infrastructure market to vendors such as Nutanix.”

One of the most compelling advantages of HC is the cost savings. This is particularly evident when evaluated within the context of Moore’s Law.

Moore’s Law – Friend to Hyper-Convergence, Enemy to SAN

Moore’s Law, which states that the number of transistors on a processor doubles roughly every two years (often quoted as every 18 months), has long powered the IT industry. Laptops, the World Wide Web, iPhone and cloud computing are examples of technologies enabled by ever faster CPUs.

Moore’s Law in Action (via Imgur)

Innovative CPU manufacturing approaches such as increasing the number of cores, photonics and memristors should continue the Moore’s Law trajectory for a long time to come. The newly released Intel Haswell E5-2600 CPUs, for example, show performance gains of 18% – 30% over the Sandy Bridge predecessor.

Here are the 10 reasons why Moore’s Law is an essential consideration when evaluating hyper-convergence versus traditional 3-tier infrastructure:

1.  SANs were built for physical, not virtual infrastructure.

Virtualization is an example of an IT industry innovation made possible by Moore’s Law. But while higher-performing servers, particularly Cisco UCS, helped optimize virtualization capabilities, arrays remained mired in the physical world for which they were designed. Even all-flash arrays are constrained by the transport latency between the storage and compute which does not evolve as quickly.

The following image from Chad Sakac’s post, VMware I/O queues, “micro-bursting”, and multipathing, shows the complexity (meaning higher costs) of supporting virtual machines with a SAN architecture.

2 - sakac

HC: Hyper-convergence was built from the ground up to host a virtualized datacenter (“Hyper” in hyper-convergence refers to “hypervisor”, not to “ultra”). The image below from Andre Leibovici’s post, Nutanix Traffic Routing: Setting the Story Straight, shows the much more elegant and efficient access to data enabled by HC.

 3 - Andre

2.  Customers are stuck with old SAN technology even as server performance quickly improves.

A SAN’s firmware is tightly coupled with the processors; new CPUs can’t simply be plugged in. And proprietary SANs are produced on an assembly line basis in any case – quick retooling is not possible. When a customer purchases a brand new SAN, the storage controllers are probably at least one generation behind.

HC: HC decouples the storage code from the processors. As new nodes are added to the environment, customers benefit from the performance increases of the latest technology in CPU, memory, flash and disk.

Table 1 shows an example of an organization projecting a 20% increase in server workloads per year. The table also reflects a 20% density increase of VMs per Nutanix node – conservative by historical trends.

Fourteen nodes are required to support 700 VMs in Year 1, but only 8 more nodes are needed to support the 1,452 workloads in Year 5. And the total rack unit space required increases only 50% – from 8U to 12U.

4 Nodes
Table 1:  Example of decreasing number of nodes required to host increasing VMs
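The projection behind this example can be reproduced with a short Python sketch. It assumes workloads and per-node density both compound at 20% per year, and that new nodes are bought only for each year's added VMs. The rounding rule is an assumption: rounding to the nearest whole node gives the 8 additional nodes quoted above, while rounding every purchase up is more conservative. The function name is hypothetical.

```python
import math


def nodes_added_per_year(initial_vms, initial_nodes, vm_growth,
                         density_gain, years, rounding=round):
    """Return (final VM count, list of nodes added in years 2..years)."""
    density = initial_vms / initial_nodes  # VMs per node in Year 1
    total_vms = float(initial_vms)
    added = []
    for _ in range(years - 1):
        new_vms = total_vms * vm_growth  # this year's additional workloads
        total_vms += new_vms
        density *= 1 + density_gain      # newer nodes host more VMs each
        added.append(rounding(new_vms / density))
    return total_vms, added


total_vms, added_nearest = nodes_added_per_year(700, 14, 0.20, 0.20, 5)
_, added_ceil = nodes_added_per_year(700, 14, 0.20, 0.20, 5, rounding=math.ceil)

print(round(total_vms))                 # VMs in Year 5
print(added_nearest, sum(added_nearest))  # nearest-node rounding
print(added_ceil, sum(added_ceil))        # rounding each purchase up
```

Because growth and density improve at the same rate here, each year needs the same fractional number of new nodes (about 2.3). That is why the choice of rounding rule matters to the total.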

3.  A SAN performs best on the day it is installed. After that it’s downhill.

Josh Odgers wrote about how a SAN’s performance degrades as it starts scaling. Adding more servers to the environment, or even more storage shelves to the SAN, reduces the IOPS per virtualization host. Table 2 (from Odgers’ post) shows how IOPS decrease per server as additional servers are added to the environment.

5 Odgers

Table 2: IOPS Per Server Decline when Connected to a SAN

HC: As nodes are added, storage controllers (which are virtual), read cache and read/write cache (flash storage) all scale either linearly or better (because of Moore’s Law enhancements).
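The scaling difference can be illustrated with a few lines of Python. This is a deliberately simplified sketch: the two IOPS constants are hypothetical placeholders, not figures from Odgers’ post or from any product specification, and the model ignores caching, network and workload effects.

```python
SAN_CONTROLLER_IOPS = 100_000   # hypothetical ceiling of the array's controllers
HC_NODE_IOPS = 20_000           # hypothetical IOPS contributed by each HC node

def san_iops_per_host(hosts):
    """A fixed controller ceiling is shared by every attached host."""
    return SAN_CONTROLLER_IOPS / hosts

def hc_cluster_iops(nodes):
    """Each node adds its own virtual storage controller and flash."""
    return HC_NODE_IOPS * nodes

for n in (4, 8, 16, 32):
    print(f"{n:>2} hosts: SAN {san_iops_per_host(n):>8,.0f} IOPS/host, "
          f"HC {hc_cluster_iops(n) / n:>8,.0f} IOPS/node")
```

In this model the SAN figure halves every time the host count doubles, while the per-node HC figure stays flat because the controller count grows with the cluster.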

4.  Customers must over-purchase SAN capacity.

When SAN customers fill up an array or reach the limit on controller performance, they must upgrade to a larger model to facilitate additional expansion. Besides the cost of the new SAN, the upgrade itself is no easy feat. Wikibon estimates that the migration cost to a new array is 54% of the original array cost.

In order to try to avoid this expense and complexity, customers buy extra capacity/headroom up-front that may not be utilized for two to five years. This high initial investment cost hurts the project ROI. Moore’s Law then ensures the SAN technology becomes increasingly archaic (and therefore less cost effective) by the time it’s utilized.

Even buying lots of extra headroom up-front is no guarantee of avoiding a forklift upgrade. Faster growth than anticipated, new applications, new use cases, purchase of another company, etc. all can, and all too frequently do, lead to under-purchasing SAN capacity. A Gartner study, for example, showed that 90% of the time organizations under-buy storage for VDI deployments.

HC: HC nodes are consumed on a fractional basis – one node at a time. As customers expand their environments, they incorporate the latest in technology. Fractional consumption makes under-buying impossible. On the contrary, it is economically advantageous for customers to only start out with what they need up-front because Moore’s Law quickly ensures higher VM per node density of future purchases.

5.  A SAN incurs excess depreciation expense

The extra array capacity a customer purchases up-front starts depreciating on day one. By the time the capacity is fully utilized down the road, the customer has absorbed a lot of depreciation expense along with the extra rack space, power and cooling costs.

Table 3 shows an example of excess array/controller capacity purchased up front that depreciates over the next several years.

6 Excess Depreciation

Table 3:  Excess Capacity Depreciation

HC: Fractional consumption eliminates the requirement to buy extra capacity up-front, minimizing depreciation expense.
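One simple way to put a number on excess depreciation is to take straight-line depreciation and charge each year's unused share of the array to "excess". The sketch below does that. The purchase price, five-year life and utilization ramp are hypothetical placeholders, not the values from the customer example, and straight-line depreciation is an assumption (a customer's finance team may use a different schedule).

```python
def excess_depreciation(purchase_price, life_years, utilization_pct_by_year):
    """Depreciation attributable to unused capacity, year by year.

    Assumes straight-line depreciation; utilization is given in whole percent.
    """
    annual = purchase_price / life_years
    return [annual * (100 - pct) / 100 for pct in utilization_pct_by_year]

# Hypothetical array: $500K, 5-year life, filling up gradually.
wasted = excess_depreciation(500_000, 5, [40, 55, 70, 85, 100])
for year, amount in enumerate(wasted, start=1):
    print(f"Year {year}: ${amount:,.0f} depreciated on idle capacity")
print(f"Total: ${sum(wasted):,.0f}")
```

With these placeholder inputs, 30% of the purchase price is depreciated before the capacity it paid for is ever used.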

6.  SAN “lock-in” accelerates its decline in value

The proprietary nature of a SAN further accelerates its depreciation. A Nutanix customer, a mortgage company, had purchased a Vblock 320 (list price $885K) one year before deciding to migrate to Nutanix. A leading refurbished-equipment specialist was only willing to give them $27,000 for their one-year-old Vblock.

While perhaps not a common problem, in some cases modest array upgrades are difficult or impossible because of an inability to get the required components.

HC: An HC solution utilizing commodity hardware also depreciates quickly due to Moore’s Law, but there are a few mitigating factors:

  • In a truly software-defined HC solution, enhancements in the OS can be applied to the older nodes. This increases performance while enabling the same capabilities and features as newer nodes.
  • Since an organization typically purchases nodes over time, the older nodes can easily be redeployed for other use cases.
  • If an organization wanted to abandon HC, it could simply vMotion/live migrate VMs off of the nodes, erase them and then re-purpose the hardware as basic servers with SSD/HDDs ready to go.

Tesla

7.  SANs Require a Staircase Purchase Model

A SAN is typically upgraded by adding new storage shelves until the controllers, or the array or expansion cabinets, reach capacity. A new SAN is then required. This is an inefficient way to spend IT dollars.

It is also anathema to private cloud. As resources reach capacity, IT has no option but to ask the next service requestor to bear the burden of required expansion. Pity the business unit with a VM request just barely exceeding existing capacity. IT may ask it to fund a whole new blade chassis, SAN or Nexus 7000 switch.

Table 4 shows an example, based upon a Nutanix customer, of a comparison in purchasing costs of a SAN vs. HC – assuming a SAN refresh takes place in year 4.

8 staircase purch

 Table 4: Staircase Purchase of a SAN vs. Fractional Consumption of HC

HC: The unit of purchase is simply a node which, in the case of an HC solution such as Nutanix, is self-discovered once attached to the network and then automatically added to the cluster. Fractional consumption makes it much less expensive to expand private cloud as needed. It also makes it easier to implement meaningful charge-back policies.
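The shape of the two purchasing curves is easy to sketch with cumulative sums. Every dollar figure and node count below is a hypothetical placeholder rather than data from the customer in Table 4. The only number taken from this article is the Wikibon estimate that migrating to a new array costs 54% of the original array price, which is applied in the refresh year.

```python
from itertools import accumulate

ARRAY_PRICE = 400_000        # hypothetical SAN purchase price
SHELF_PRICE = 50_000         # hypothetical storage shelf added in a growth year
MIGRATION_PCT = 54           # Wikibon estimate cited earlier in this article
NODE_PRICE = 30_000          # hypothetical HC node price

# SAN: buy up front, add shelves, then a forklift refresh in year 4.
refresh = ARRAY_PRICE + ARRAY_PRICE * MIGRATION_PCT // 100
san_spend = [ARRAY_PRICE, SHELF_PRICE, SHELF_PRICE, refresh, SHELF_PRICE]

# HC: buy only the nodes needed each year (hypothetical counts).
hc_spend = [count * NODE_PRICE for count in (10, 3, 3, 3, 3)]

san_total = list(accumulate(san_spend))
hc_total = list(accumulate(hc_spend))
for year, (san, hc) in enumerate(zip(san_total, hc_total), start=1):
    print(f"Year {year}: SAN ${san:,}  HC ${hc:,}")
```

The SAN series jumps in the refresh year while the HC series climbs in even steps, which is the staircase effect in miniature. Swap in a customer's real quotes before drawing any conclusions from it.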

8.  SANs have a Much Higher Total Cost of Ownership

When evaluating the likely technology winner, bet on the economics. This means full total cost of ownership (TCO), not just product acquisition.

SANs lock customers into old technology for several years. This has implications beyond just slower performance and fewer capabilities; it means higher ongoing operating costs for rack space, power, cooling and administration. Table 5 shows a schematic from the mortgage company mentioned above that replaced a Vblock 320 with two Nutanix NX-6260 nodes.

9 vblock 320 tco

Table 5: Vblock 320 vs. Nutanix NX-6260 – Rack Space

Rack space, power and cooling costs are easy to calculate based upon model specifications. They, along with costs of associated products such as switching fabrics, should be projected for each solution over the next several years.

Administrative costs need to also be considered, but they are typically more difficult to gauge. They can also vary widely depending upon the type of compute and storage infrastructure utilized.

Some of the newer arrays, such as Pure Storage, do an excellent job at simplifying administration, but even Pure still requires storage tasks related to LUNs, zoning, masking, FC, multipathing, etc. And this doesn’t include all the work administering the server side. Here’s my recent post comparing upgrading firmware between Nutanix and Cisco UCS.

Table 6 shows the 5-year TCO chart for the mortgage customer including a conservative estimate of reduced administrative cost.

10 Cumm TCO

Table 6: TCO of Vblock 320 vs. Nutanix NX-6260

HC: In addition to slashed costs for rack space, power and cooling, HC is managed entirely by the virtualization team – no need for specialized storage administration tasks.
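The rack space, power, cooling and administration costs discussed in this section can be rolled into a cumulative TCO projection with a small helper like the one below. Treat it as a sketch: the rate card, acquisition prices, rack units, power draw and admin hours are all hypothetical placeholders, not the mortgage customer's figures, and cooling is modeled as a flat fraction of power cost, which is my simplifying assumption.

```python
HOURS_PER_YEAR = 24 * 365

def annual_opex(rack_units, avg_kw, admin_hours, rates):
    """Yearly operating cost from rack space, power, cooling and admin time."""
    space = rack_units * rates["per_rack_unit"]
    power = avg_kw * HOURS_PER_YEAR * rates["per_kwh"]
    cooling = power * rates["cooling_factor"]   # assumption: fraction of power cost
    admin = admin_hours * rates["admin_hourly"]
    return space + power + cooling + admin

def cumulative_tco(acquisition, opex_per_year, years):
    """Acquisition cost plus accumulated operating cost at the end of each year."""
    return [acquisition + opex_per_year * y for y in range(1, years + 1)]

# Hypothetical rate card and solution profiles.
RATES = {"per_rack_unit": 300, "per_kwh": 0.12,
         "cooling_factor": 0.5, "admin_hourly": 75}
three_tier = cumulative_tco(800_000, annual_opex(42, 12.0, 800, RATES), 5)
hc = cumulative_tco(300_000, annual_opex(4, 2.0, 200, RATES), 5)

for year, (a, b) in enumerate(zip(three_tier, hc), start=1):
    print(f"Year {year}: 3-tier ${a:,.0f}  HC ${b:,.0f}")
```

Keeping the rate card separate from the solution profiles makes it easy to show a customer which inputs came from their own facilities costs and which came from model specifications.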

9.  SANs have a higher risk of downtime / lost productivity

RAID is, by today’s standards, an ancient technology. Invented in 1987, RAID still leaves a SAN vulnerable to failure. In some configurations, such as RAID 5, two lost drives can mean downtime or even data loss.

Both disks and RAID sets are getting larger. Disk failures require longer rebuilds, increasing both the risk to performance and the risk of another failure taking out the set.

And regardless of RAID type, a failed storage controller cuts SAN performance in half (assuming two controllers). Lose two controllers, and it’s game over.

11 BEarena tweet

Sometimes unexpected events such as a water main breaking on the floor directly above the SAN can create failure. And firmware upgrades, in addition to being a laborious process, carry additional risk of downtime. Then there’s human error. Array complexity makes this a realistic concern.

As demands on the array increase over time, the older SAN technology becomes still more vulnerable to disruption or outright failure. Even temporary downtime can be very expensive.

HC: Rather than RAID striping, an HC solution such as Nutanix includes replication of virtual machines onto two or three nodes. A lost drive or even entire node has minimal impact as the remaining nodes rebuild the failed unit non-disruptively in the background. And the more nodes that are added to the environment, the faster the failed node is restored in the background.
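Two of the points above reduce to simple arithmetic: the share of controller capacity lost when one controller fails, and how rebuild time shrinks as more nodes share the work. The sketch below is an idealized model, assuming rebuild work divides evenly across the surviving nodes. It is not a description of how Nutanix or any array actually schedules a rebuild, and the data size and per-node rate are hypothetical.

```python
def capacity_lost_fraction(controllers):
    """Share of controller capacity lost when one of N controllers fails."""
    return 1 / controllers

def rebuild_hours(data_tb, surviving_nodes, tb_per_hour_per_node):
    """Idealized rebuild time when every surviving node does an equal share."""
    return data_tb / (surviving_nodes * tb_per_hour_per_node)

print(f"Dual-controller SAN loses {capacity_lost_fraction(2):.0%} on one failure")
print(f"16-node cluster loses {capacity_lost_fraction(16):.1%} on one failure")

# Hypothetical: 8 TB to re-protect, 0.5 TB/hour of rebuild work per node.
for cluster_size in (4, 8, 16, 32):
    hours = rebuild_hours(8, cluster_size - 1, 0.5)
    print(f"{cluster_size:>2} nodes: about {hours:.1f} hours to re-protect the data")
```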

10.  Downsizing Penalty

Growth is not the only source of SAN inefficiency; downsizing can be a problem as well. Downsizing can result from decreased business, but also from a desire to move workloads to the cloud. The high cost and fixed operating expenses of a SAN are difficult to justify for reduced workloads.

HC: Customers can sell off or redeploy their older, slower nodes. This minimizes rack space, power and cooling expenses by only running the newest, highest-performance nodes. The software-defined nature of HC makes it easy to add new capabilities such as Nutanix’s “Cloud Connect” which enables automatic backup to public cloud providers.

The Inevitable Transition from SANs to HC

SANs were designed for the physical world, not for virtualized datacenters. The reason they proliferate today is that when VMware launched vMotion in 2003, it mandated, “The hosts must share a storage area network”.

But Moore’s Law marches relentlessly on. Hyper-convergence takes advantage of faster CPU, memory, disk and flash to provide a significantly superior infrastructure for hosting virtual machines. It will inevitably replace the SAN as the standard of the modern datacenter.

Thanks to Josh Odgers (@josh_odgers), Scott Drummonds (@drummonds), Cameron Stockwell (@ccstockwell), James Pung (@james_nutanix), Steve Dowling and George Fiffick for ideas and edits.

10 Reasons why Nutanix leads the hyper-converged industry

When I started this blog site last October, hyper-converged infrastructure (HCI) was still a fringe technology. Just five months later, HCI has entered the mainstream. Rather than fielding questions about hyper-convergence, the inquiries I get today are much more often about what sets Nutanix apart from the rapidly growing pack of HCI players.

Nutanix isn’t just the HCI front-runner; it has a 52% market share. Here are my top ten differentiators as to why Nutanix leads the hyper-converged industry:

1.  A true distributed file system with roots to Google and GFS

Nutanix brings the same type of no-SAN distributed file system infrastructure developed by Google, and now utilized by all leading cloud providers, to the enterprise. The commodity hardware, “share nothing” Nutanix Distributed File System (NDFS) model enables capabilities such as self-healing and data locality that uniquely position Nutanix for enterprise requirements.

 2.  Google-like transparency

Google published papers in 2003 about how it redefined infrastructure with GFS including innovative technologies like MapReduce and NoSQL. Nutanix is similarly transparent about how our technology works. No secrets, no politics, no misleading claims. (As an example, see www.nutanixbible.com).

3.  Passionate employees

There are 1,648 billionaires on the planet, but only 190 VCDXs. Eleven of them work at Nutanix with more coming. These folks can pretty much work wherever they want. They’re joining Nutanix because of the technology, the culture and the opportunity to be part of revolutionizing the virtualized datacenter.

4.  Dedication to channel partners

Nutanix has been a channel focused company from day one. We strive to help our partners to not just be competent Nutanix resellers, but rather to be leaders in the new era of Web-scale HCI and associated cloud integration. We work with them to achieve differentiation, customer trust and expanded skill sets with programs such as Breakaway, Authorized Consulting Partner, and the recently launched NPX certification.

The learning, of course, goes both ways. We listen to our partners about how we can better serve their customers both today and in the future.

5.  A focus on an uncompromisingly simple customer experience

FW 9

Nutanix doesn’t just sell products, we deliver a customer experience. Nutanix simplifies the lives of IT administrators. Nutanix’s HTML5 Prism UI, for example, is beautifully designed and exceptionally intuitive and easy to use. Or consider firmware upgrades. Even the most advanced 3-tier architectures still require painful and risky firmware upgrades. But Nutanix changes the game with one-click, non-disruptive upgrades for all components including hypervisors.

6.  Industry-leading customer support

Most manufacturers claim that they emphasize customer service. Our customers and partners validate again and again that Nutanix takes support to a whole new level. Nutanix has a +88 Net Promoter Score and is the two-time winner of the Omega Northface Award for “Delivering World Class Customer Service.”

7. A furious pace of innovation

Nutanix’s engineering department doesn’t have a lot of ex-storage folks. Instead, engineers with backgrounds from companies such as Google, Facebook and Twitter build massively scalable, very simple and low-cost infrastructure. It’s a completely different mindset, and it leads to rapid development in response to customer requests.

While some innovations, such as the industry’s first HCI all-flash node, incorporate hardware form factors, most are delivered strictly via software (Tesla-style). Recent examples include Metro Availability (for active/active datacenters), encryption, MapReduce Deduplication, Cloud Connect, shadow volumes, Plugin for Citrix XenDesktop, etc.

8.  Multi-hypervisor support

Some HCI manufacturers claim to be hypervisor agnostic though they only work with one hypervisor. Nutanix started with VMware in 2011, began supporting KVM in 2012 and then Hyper-V in 2013.

9.  Dell validation

While Nutanix isn’t the only HCI producer to work with legacy manufacturers, we go far beyond just providing reference architecture. Out of all the HCI players, Dell approached Nutanix to establish a true OEM partnership. The Dell XC Converged Appliances, powered by Nutanix software, were painstakingly vetted by both manufacturers to ensure the same high standards of quality, simplicity and support that customers receive with Nutanix-branded appliances.

10. A Web-Scale mindset

The legacy storage manufacturers treat HCI as a storage option in a large line card. Nutanix eats, breathes and sleeps HCI as not only a vastly superior platform for hosting a virtualized datacenter, but as the inevitable future.

Nutanix’s success with HCI has already shaken up the industry as almost every large legacy storage manufacturer now has, or has announced, an HCI solution. And Nutanix’s efforts resonate with customers who tend to have a passion for the technology rivaling that of Nutanix’s own employees.  Nutanix’s first customer conference, Nutanix.NEXT, this June in Miami is chock full of customer presentations. I have six different customers speaking on my ROI panel alone.


Thanks to @vmmike130, @Sandeep_NTNX, @farazip and @vEd_NYC for edits and suggestions.


EMC, Pure and NetApp weigh in on Hyper-converged infrastructure

Nearly every leading legacy and startup datacenter hardware player has, or has announced, a Hyper-Converged Infrastructure (HCI) solution. But how do they really see HCI?

Yesterday provides some clues: An article from The Register discusses declining array sales; a blog post from EMC President of Global Systems Engineering, Chad Sakac, covers the new VCE HCI announcements; and a post from Pure Storage Chief Evangelist, Vaughn Stewart, makes a case for why HCI won’t replace storage arrays.

Disk Array Disarray

Chris Mellor’s article in The Register, Disk array devastation: New-tech onslaught tears guts from trad biz, reveals what is perhaps a significant reason that the storage manufacturers are entering the HCI market: “An EMC chart shows a steep decline in legacy SAN drive array sales.” The article goes on to say that EMC sees the market moving “toward converged and hyperconverged systems, all-flash arrays and purpose-built back-up appliances.”

Sakac Tweet

Chad Sakac’s post, “A big day in converged infrastructure,” discusses how EMC’s Vblock is helping the company address the sea change in storage. The post was not clear (at least to me) about how Vblocks will incorporate HCI – but Sakac left no doubt that they will, “This is the experience of an ‘engineered system’ like a Vblock or a VxBlock – whether it’s converged, or hyper-converged.”

Sakac also references both VSPEX Blue and EVO:Rack – both of which, along with Vblock, are now part of EMC’s VSPEX converged infrastructure division.

Pure Storage

Vaughn Stewart, former Cloud Evangelist at NetApp, wrote an interesting post yesterday about HCI, Hyper-Converged Infrastructures are not Storage Arrays. Stewart starts off endorsing HCI, “I’m a Huge Fan of Hyper-Converged Infrastructures,” but then quickly changes course and relegates the technology to “the low end storage array market.”

Stewart goes on to outright bash HCI – making an argument that data mirroring on a virtual disk basis is inferior to RAID (a technology invented in 1987). Stewart also presents lots of calculations claiming low storage utilization and other supposed HCI limitations.

Vaughn tweet

I’m not going to address Stewart’s claims in this post; they may very well be applicable to other HCI players. They do not apply to Nutanix. Josh Odgers (aka FUDbuster) is writing a post in response to Vaughn’s piece.

Stewart made no mention in his article about Pure’s own apparent plans to introduce an HCI solution.

NetApp

Since NetApp’s Mike Riley wrote the post, VSAN and Hyper-Converged will Hyper-Implode, last June, it’s unfair to assume that it reflects NetApp’s current day perspective on HCI. On the other hand, even when NetApp unveiled ONTAP EVO:Rail a few months ago, the company made it clear that HCI, without NetApp storage, is not suitable for the enterprise.

Duncan Tweet

A Question of Mindset

Sakac, Stewart and Riley are among the most respected technologists in our industry. But they also work for array manufacturers and naturally see the world through the lens of protecting legacy business.

The tremendous gain in mind share of HCI is driving the storage players to enter the market. This further validates the technology even though the array manufacturers position HCI as a low-end alternative to disk or flash arrays.

Nutanix, on the other hand, eats, breathes and sleeps web-scale HCI in all that we do. It’s a question of mindset. The array manufacturers offer customers yet another storage option. Nutanix is revolutionizing the virtualized datacenter.


IBM jumps on the hyper-converged bandwagon

Last week’s announcements further show that HCI has gone mainstream.

One of the world’s largest and most storied legacy players, IBM, said it is investing $1 billion in SDS. This BusinessInsider article, In another brilliant move, IBM just budgeted $1 billion to take down EMC,  discusses IBM’s strategy. It also features Nutanix as the “poster child for this new market”.

In the introductory article to this blog site, I described how seven legacy datacenter manufacturers control $56 billion of the annual $73 billion server and storage market. Here’s an updated status of their participation in the HCI space:

HP:                         StoreVirtual & EVO:Rail

IBM:                      Announced HCI strategy

Dell:                       Dell XC (Nutanix OEM) & EVO:Rail

Oracle:

Hitachi:               EVO:Rail

Cisco:                    Teamed with Maxta & SimpliVity. Investment in Stratoscale.

EMC:                     VSPEX Blue & ScaleIO

NetApp:               ONTAP EVO:Rail

As IBM’s server business transitions to Lenovo, the Chinese giant should replace IBM on the list. Lenovo hasn’t yet announced an HCI offering – but undoubtedly it will.

Springpath

Springpath, another HCI start-up, came out of stealth mode last week. Formerly known as Storvisor, Springpath was founded by a couple of VMware veterans (maybe they decided to grab the new name since VMware spun off SpringSource to Pivotal?).

Springpath has $34 million in funding from Sequoia, Redpoint and other VCs. VMware’s Duncan Epping wrote a complimentary piece about the company on Yellow-Bricks, though Forbes was somewhat critical. I do think that their subscription model is intriguing.

FalconStor

Last week also saw an erroneous headline claiming that a small but 15-year-old IT company, FalconStor, had announced a hyper-converged solution. The new FreeStor is actually a “horizontal converged data services platform”, but it just goes to show how hyper-convergence has become so top of mind.

Other HCI Players

In addition to industry leader Nutanix and the large legacy players and their partners mentioned above, the other manufacturers who have, or who have announced, HCI solutions include:

Atlantis Computing

Citrix Sanbolic (recent acquisition)

DataCore

Huawei (partnered with DataCore)

NIMBOXX

Pivot3

Pure Storage

Scale Computing

StorMagic

VMware VSAN