Messaging News Magazine

Stormy Daze: Preparing for Disaster Recovery

By Melisa LaBancz-Bleasdale

Companies across the globe inevitably face, and must plan for, technology failure, malicious human activity, and the unpredictable ravages of Mother Nature.

It has been two years since Hurricane Katrina made landfall on the Gulf Coast. Despite a relatively calm 2006, the uncertainty of the unknown is top of mind for many businesses worldwide. The devastation of Katrina raised a very sobering question: Are we prepared for the completely unexpected?

When talking about email disaster recovery, the word "disaster" connotes hurricanes, yet natural disasters are the least likely cause of email downtime. In a recent email availability survey commissioned by Illinois-based Neverfail, Osterman Research found that hardware failure ranked as the number one cause of downtime (61 percent). Number two was network failure (59 percent), and virtually tied for third place were power failure (45 percent) and application failure (44 percent). In 2006, a Neverfail-commissioned survey by Open Sky Research also found that hardware and network failure were cited as the top two reasons for IT failure. Adds Rhoda Phillips, research manager with IDC, "We are finding that while natural disasters occur, a larger portion of disasters are caused by human beings. There's a lot of human error."

Defining Disaster

So what does "disaster" mean for today's businesses? Steve Lewis, CEO of Teneros, points out that disasters are not merely incidents making front-page news. In business, disaster is defined as any event that causes a site to fail, people to stop working, and business processes to come to a halt. These events include power outages, network outages, manmade outages (whether accidental or maliciously induced) and, of course, storms.

Brian Mullins, director of corporate communications for Marathon Technologies, explains that although businesses and government agencies have established disaster recovery systems and processes in preparation for the "big one," the reality is that many are underprepared. "In the past year or so, organizations have come to realize that the same systems and processes currently in place are inefficient at addressing the more common, and more likely, mini-disasters, such as network failures and disk crashes."

How High is High Availability?

Today, high availability is the term du jour. Yet the concept of high availability has "been around forever," says Phillips. "It's a term that gets mixed in with a lot of solutions. Plus there are a number of ways to achieve it. But, are we doing anything differently with the concept of high availability than we did ten years ago? No. What is new about high availability are the companies focused on email recovery and archiving."

Andrew Barnes, senior VP of corporate development for Neverfail, agrees that high availability has been a well-used term in the IT industry for many, many years. "Typically, it is used in the context of on-line transaction processing systems, for example, ATM networks, financial systems, etc.," says Barnes. "It describes how IT tries to achieve 24x7 operations for critical systems. During recent years, email has come to be regarded as a critical system."

Barnes notes that Neverfail adopted the term high availability in its messaging several years ago to draw attention to the need for users to remain connected to email applications—even if the system is suffering an outage. "Our view is that no matter what the cause of downtime, email should be available to the business," states Barnes. "High availability and disaster recovery are very much related, allowing manual or automated failover to a secondary system whether the outage is planned (such as maintenance) or unplanned (such as a disaster). We like to think of the combination of the words as continuous availability. Failover should be seamless and enable users to continue working. For any type of planned or unplanned downtime, Neverfail provides the ability to seamlessly failover to a secondary server, while users remain connected to working applications."

Mullins points out that high availability may add undue complexity in the form of vendor solutions. "Failover high availability solutions are often complex to set up, test and maintain. They also can require application scripting or modifications to properly fail over. During the minutes it takes to fail over, Microsoft Exchange is not available to users, application state is lost, and data that was in transit during the disaster may also be lost." Mullins notes that Marathon's high availability approach allows email applications like Exchange to compute through the disruption, preventing loss of data and application state. "We provide fully automated fault handling and policy management. Once our software is installed, email just keeps going and going, and you don't have to modify your email application in any way."

Understanding Percentages

High availability is usually expressed as a percentage of uptime in a given year. The availability of a system is determined by tallying the minutes of unplanned downtime, dividing by the total number of minutes in a year, and subtracting that fraction from 100 percent. According to Lewis, IT teams concerned with high availability like to use these percentages to describe the reliability of systems. "It is often expressed in terms of the 'nines' of reliability," explains Lewis, "as a way to describe the amount of actual downtime a company will experience per year. For example: 3-nines is 99.9 percent annual availability or uptime, which translates to 8.76 hours of actual downtime per year." (See table opposite page.) Lewis notes that "Teneros appliances provide 5-nines of uptime for Microsoft Exchange, a target that dazzles IT people and that most vendors cannot offer. Downtime translates to lost productivity and revenue dollars, so high availability is a critical business metric."
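The arithmetic behind the "nines" can be sketched in a few lines of Python. This is only an illustration, not a tool from any vendor quoted here: the function names are hypothetical, and it assumes a 365-day year and counts unplanned downtime only. Under those assumptions, 3-nines works out to the 8.76 hours per year that Lewis cites.

```python
# Assumption: a non-leap year of 365 days (525,600 minutes).
MINUTES_PER_YEAR = 365 * 24 * 60


def availability(downtime_minutes: float) -> float:
    """Fraction of the year a system was up, given its unplanned downtime."""
    return 1.0 - downtime_minutes / MINUTES_PER_YEAR


def allowed_downtime_hours(nines: int) -> float:
    """Yearly downtime budget, in hours, for an availability of N nines."""
    unavailability = 10.0 ** -nines  # e.g. 3 nines -> 0.001 (99.9 percent up)
    return unavailability * MINUTES_PER_YEAR / 60.0


# Print the yearly downtime budget for 2-nines through 5-nines.
for n in range(2, 6):
    hours = allowed_downtime_hours(n)
    print(f"{n}-nines: {hours:.4f} hours/year ({hours * 60:.2f} minutes)")
```

Each additional nine cuts the downtime budget by a factor of ten, so the budget for 5-nines is measured in minutes per year, not hours.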

When "Good Enough" Isn't

If disaster recovery and high availability have long been a business mantra, can today's organizations weather the storm (whatever the storm may be) better than in the past? "Years ago when this whole disaster recovery push started, it wasn't really about hurricanes. When I started in the industry, a disaster was if Morgan Stanley's trading system went down. That's what IT managers were trying to prepare for," Phillips explains. "I believe that post 9/11 people are much more aware and sensitive to the problem of disaster." Since then, Phillips thinks the concept of risk management has been elevated. How do we prepare for risk? How do we define risk? "Disaster could be the unthinkable, or it could be a disgruntled employee. I think that companies are much more aware and they recognize how important it is to have some sort of business continuity plan in place," says Phillips.

Barnes agrees that organizations are more prepared to weather a disaster as a result of high-profile events such as Hurricanes Katrina and Rita, 9/11, and terrorist bombings. In early 2007, Neverfail commissioned another survey by Open Sky Research, which found that 59 percent of survey respondents reported having a high availability/disaster recovery plan in place. This is a 9 percent increase from the 2006 survey. "However," Barnes cautions, "having a plan is one thing, implementing solutions is another. That same research identified that although a majority of companies said they had a plan, 75 percent of organizations did not believe their plans and solutions were sufficient, with 68 percent expecting to address this issue."

While Lewis thinks some companies ignore the threat of disaster and local server failure, overall, organizations today have a better understanding that data needs to be backed up and protected. "This awareness is fairly high and is (at least moderately) well-addressed. In order to keep business processes running, the applications and data feeds that people use must also be protected and operational for business continuity to be a reality. The enterprise segment of the market, which has budget dollars for teams of IT people and equipment, has embraced the need for continuity. But the midmarket and SMEs have neither the personnel nor equipment budgets for continuity solutions."

Adds Phillips, "Business continuity can be expensive, so it's often a budgeting issue or a personnel issue that influences the decision-making process. Do companies have people with the expertise to install a business continuity system? Do IT managers tasked with researching, purchasing and implementing these systems understand what's available?"

Frequent Testing

Mullins points to a common disaster recovery weak spot plaguing today's companies: "A prevalent theme is that although organizations put the right systems and processes in place, they are not testing them on a regular basis." This failure to test can be devastating. Depending on the size and scope of the disaster, a company might have to operate from its emergency system for anywhere from a matter of hours to months. "If your system doesn't work, the losses could be incalculable," says Mullins.