Battle of the Archiving Models - Part 1: On-premise vs. Hosted vs. Cloud Computing
This is the first post in a blog series, "Battle of the Archiving Models." To start it off, we will cover the basics: on-premise vs. hosted vs. cloud computing. Below is a general description of all three models and the benefits that each offers.
It’s been said that we now generate more data on a daily basis than all of the data combined from the beginning of time until 2003. Most of that is attributable to you and me and what we do every day. From our daily emails, Facebook posts, tweets on Twitter, and connections on LinkedIn to the files we download and the pictures we upload, it all amounts to an enormous data load.
We in IT have struggled for years to address the ever-growing impact of user-generated data. Email is a great example: within the past decade, the storage allocated to the average corporate mailbox has grown past 10 gigabytes—and the trend shows no sign of relenting. Organizations now want to provide users with “unlimited” mailboxes that could contain hundreds of gigabytes of data.
IT is also tasked with managing new data types. Priorities related to improved collaboration, data loss prevention, and intellectual property protection are driving us to move files from client storage (laptop and desktop hard drives) to IT-managed storage. At the same time, IT is being called upon to address the unmitigated rise of social media, blogging, and mobile communications as business tools.
Regulations and Their Consequences
The explosion in IT-managed data has occurred during a period when regulators are placing new requirements on the handling, retention, and disposition of content. For example:
The United States Federal Rules of Civil Procedure (FRCP) require that organizations of all sizes maintain data archives that are readily accessible in the event of litigation.
The Sarbanes-Oxley Act (SOX) requires that companies preserve a variety of correspondence (including email messages) for a period of seven years.
The Financial Industry Regulatory Authority (FINRA) and Securities and Exchange Commission (SEC) place numerous restrictions on financial services firms related to the management and preservation of email, instant messaging, and social media data.
The Health Insurance Portability and Accountability Act (HIPAA) requires that companies operating in the healthcare industry retain certain communications and documentation (which can include email messages and attachments) for a minimum of six years.
In addition to the examples cited above, industries ranging from energy to state and federal government to education and the non-profit sector all fall under specific regulations that govern how user-generated content should be managed, and the penalties for non-compliance have never been higher:
In 2004, the SEC fined Bank of America $10 million for failing to retain and produce emails in accordance with SEC regulations.
In 2008, non-compliance with FRCP mandates led a United States judge to award $29 million to a plaintiff in a suit against UBS Warburg.
In 2010, FINRA fined Piper Jaffray and MetLife Securities a combined $2M+ for email-related failures. While the amounts may be modest in the grand scheme of the financial services industry, consider the reputational impact to the companies, particularly after FINRA published press releases referring to “supervisory and reporting violations” as well as “investigations of broker misconduct.”
More Storage, More Problems
While the impact of fines and the publicity they generate should not be underestimated, they are dwarfed by the costs borne by all organizations as the storage of user-generated data places an ever-increasing drain on our budgets, focus, and productivity.
Organizations have tried to address the data explosion by buying more storage. But years of doing so have shown that on-premise storage systems cost considerably more than any single line item on a budget would ever reflect.
Beyond the direct and indirect costs of storage systems, maintaining large data stores also impedes the performance of our IT infrastructure and applications. Mail systems buckle under the weight of giant data stores. Network latency increases as backup windows span an ever-growing portion of the business day. Overall, businesses risk losing profits when critical applications are slow and unstable.
In order to address this “perfect storm” of unprecedented growth in unstructured data, most organizations have found that information archiving represents the only viable solution.
Archiving is the Answer
At a minimum, information archiving satisfies regulatory requirements and reduces the burden placed on IT applications, such as email. In most cases, archiving provides significant storage and infrastructure cost savings, and in some cases, it enables IT to redirect focus and resources away from infrastructure and toward value-added activities.
Archiving, then, is no longer a value-added service for IT; it is an essential component of the IT portfolio, required to tame skyrocketing storage costs while maintaining compliance. Now comes the next step: which archiving model do you choose—traditional on-premise archiving, hosted archiving, or cloud-powered archiving?
With the traditional on-premise model, archiving systems are located entirely within a business’s data center, and the business maintains responsibility for the installation, configuration, and operation of the archiving system and its underlying infrastructure. With on-premise systems, customers experience fairly rapid migration of legacy data—attributable in large part to the physical proximity of the archive system to the legacy data store.
The on-premise archiving model was the most popular model for early adopters of archiving solutions (particularly large financial services customers in the early 2000s). Due to the cost and complexity of the systems, which require investments in hardware, software, and storage as well as ongoing operations and support, adoption of this model has been waning as organizations are opting for a third-party archiving service.
In the hosted model, archiving systems are housed within an archiving vendor’s data center. Unlike the on-premise model, customers are not required to install, configure, or maintain the archiving system or its underlying infrastructure—the vendors manage these activities on behalf of the customer.
With this service, the customer only needs to be concerned with capacity management to the extent that it impacts pricing. Otherwise, hosted vendors shoulder the burden of capacity management. Customers can also focus on activities related to the archiving process and functionality, such as defining retention policies, searching for specific content, and exporting data for discovery.
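In practice, retention policies like those described above often reduce to simple rules mapping a content type to a retention period, with an override for legal holds. The sketch below is purely illustrative—the policy names, periods, and function are hypothetical, not any archiving vendor’s actual API:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention periods, loosely modeled on the regulations above.
RETENTION_POLICIES = {
    "email": timedelta(days=7 * 365),              # e.g., SOX: seven years
    "healthcare_record": timedelta(days=6 * 365),  # e.g., HIPAA: six years
}

def is_eligible_for_disposal(content_type: str,
                             archived_at: datetime,
                             on_legal_hold: bool = False) -> bool:
    """Return True if an archived item has aged past its retention period.

    Items on legal hold are never disposed of, regardless of age, and
    unknown content types are retained by default as a safe fallback.
    """
    if on_legal_hold:
        return False
    retention = RETENTION_POLICIES.get(content_type)
    if retention is None:
        return False
    return datetime.now(timezone.utc) - archived_at > retention
```

A real archiving service layers far more on top of this (dispositions are logged for audit, holds are scoped to custodians and matters), but the core decision each item faces is roughly this shape.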
The benefit of this solution is that it reduces IT complexity and offers cost savings relative to on-premise systems. It is also a fairly low-risk evolution of the legacy model in that (unlike cloud-powered archiving, discussed below) the archiving system leverages traditional infrastructure technologies. However, this solution carries many of the same issues as on-premise systems: capacity management, service availability, and large capital expenses.
Rather than operating their own infrastructure, cloud-powered archiving vendors build their applications to operate on top of cloud infrastructure from third parties, such as Amazon or Rackspace. In this model, neither the customer nor the archiving vendor operates physical infrastructure directly. The archiving vendor builds and maintains an archiving system (software layer) that is operated on top of cloud infrastructure.
Of the three archiving models, the cloud-powered approach best capitalizes on the cloud’s core value propositions of specialization, scale, and elasticity. The infrastructure vendor, the archiving service provider, and the business are each able to focus on their core competencies—operating data centers, developing archiving software, and running business processes, respectively. Likewise, the cloud vendor procures and operates infrastructure at tremendous scale, enabling it to offer the lowest prices in the market. Finally, cloud-optimized technologies such as Elasticsearch and Chef enable archiving vendors to scale availability and performance to match their customers’ real-time processing, bandwidth, and storage requirements.
Some Things are Certain in IT...
Moving forward, the volume of user-generated data will only continue to increase. The number of restrictions placed on the management of that data will also go up, along with the number of requests (and demands) for data to support litigation, compliance, and business intelligence. IT leaders need to be prepared for the convergence of these trends that, if left unaddressed, will drain the productivity of their teams, increase storage expenses, and put the reputations and financial viability of their organizations at risk.
For most organizations, the only way to effectively address the data explosion is with a robust and effective archiving system. Fortunately, customers have their choice of numerous vendors and at least three archiving models in the market today, each of which offers unique benefits. IT leaders should choose the archiving option that best suits their needs and budget, but this should be done relatively soon—before an audit, discovery request, or regulatory inquiry arises that makes them wish they had.