disaster recovery


COMPUTERWORLD EXECUTIVE BRIEFINGS

EXECUTIVE GUIDES FOR STRATEGIC DECISION-MAKING

STRATEGIC INSIGHTS FROM THE EDITORS OF COMPUTERWORLD

Disaster Recovery & High-Availability IT

Strategies for keeping your systems working 24/7 and protecting corporate data.

INTRODUCTION
Continuity, Availability and Security

RECOVERY STRATEGIES
Rising From Disaster
Five Classic Mistakes
Realistic Testing
Synchronizing With Suppliers
E-mail Recovery

SAFE & SECURE
Storage Security
Long-Distance Data Replication
Advances in Tape Backup
Backing Up the Edge

EMERGING TECHNOLOGIES
Grid Storage
MAID Storage


INTRODUCTION

Continuity, Availability and Security

Computerworld editor in chief Don Tennant • Executive briefings editor Mitch Betts • Designer Julie Quinn • Design director Stephanie Faucher • Managing editor/production Michele Lee DeFilippo • Copy editors Bob Rawson, Eugene Demaître, Mike Parent, Monica Sambataro

Disaster recovery and data protection are like parenting: neither job is ever really finished. This report takes a wide-ranging look at high-availability IT and data protection in an effort to help you meet this never-ending, and interrelated, set of responsibilities.

Obviously, the goal is to provide the IT infrastructure and data that your business needs to operate (and better yet, thrive). That means avoiding infrastructure downtime; ensuring business continuity when there is unavoidable downtime (such as in disasters); and protecting the corporate data (such as backup and long-distance data replication).

Consider this: Some months after the Sept. 11, 2001, terrorist attacks, the CIO of a large Wall Street law firm, located only blocks from the collapsed World Trade Center towers, talked about the tremendous outpouring of sympathy and concern from hundreds of the attorneys’ clients in the 24 hours after the disaster.

Then Day 2 dawned, and the story changed. The clients who called wanted reassurances that their files were safe and that business would promptly get back on track. It was a reminder that even a major disaster has a short shelf life as an excuse in the business world. What matters most is the speed and effectiveness of recovery.

High Availability

IBM has gotten the message. IBM recently launched an initiative focused on ensuring that the iSeries and its other server lines are highly available, an area of increasing interest among users who can’t afford system downtime because of round-the-clock global supply chain demands.

“In many cases, our clients may be trying to do high availability but don’t have all the pieces put together to make it a truly resilient set of infrastructures,” says John Reed, the IBM executive who was recently picked to lead the development of the company’s High Availability Design Center. He says the center could lead to new products, services and business partnerships.

Part of the plan involves assembling best-practices guidance and tools, according to Reed. IBM will conduct system assessments, and help users define and develop high-availability architectures and run application benchmarks, he says.

Gerald Lake, a programmer/analyst at Sovereign Specialty Chemicals Inc.’s Buffalo operations, says increasing demands from outside auditors for IT redundancy have prompted his company to improve system availability. As part of a server consolidation project, Sovereign converted an iSeries machine located at a different facility from the one that houses its primary server into a backup system, Lake says.

But Lake is eyeing IBM’s plan warily. “The way IBM charges so heavily for everything, I think a lot of people are going to continue to do everything on their own,” he says.

Growing Importance

IT availability has always been an important IT function, but the stakes are even higher in an environment where IT is being used for real-time analysis and business transactions, the IT infrastructure continues to get more complex and regulators are keeping a close eye on business continuity. Some extremely IT-dependent companies have driven toward five-nines’ availability, or about five minutes of downtime per year. Most IT organizations operate at three- to four-nines’ availability, or about four to eight hours of downtime per year.

You should do your own downtime and vulnerability assessments, of course, but researchers at IDC say that, on average, IT organizations find that downtime has the following causes:

45% of downtime occurrences are caused by application-related failure, such as application software, database software, Web servers or middleware.

25% of downtime occurrences are caused by human operator error.

30% of downtime occurrences are caused by a hardware component failure, involving the network, a server or a desktop PC.

Causes of Human Error
IDC says 25% of IT downtime occurrences are caused by human operator error. The typical problems include the following:

• Complex or inadequate operational processes

• Lack of training

• Poor organizational structure and communications

• Overextended staff

Preventing unplanned downtime requires a combination of remedies, including more training for IT staff and pushing vendors for technologies that are better at identifying IT service problems that can disrupt the business.

“IT must recognize that downtime is caused by several factors, controllable and uncontrollable,” IDC says. “Technology process, human error, and external factors such as natural disasters, terrorist attacks and power failures present IT with difficult downtime scenarios.”

In fact, we really need to redefine the concept of disaster, says Mike Mulholland, co-founder of Evergreen Assurance Inc. in Annapolis, Md. It’s really any event that blocks access to corporate data and applications. By that definition, denial-of-service attacks and planned downtime are also IT disasters, he says.

Three Tiers of Protection

Improving from four-nines’ availability to five-nines’ availability can be extremely expensive, and sometimes not worth the effort, IDC notes. Are you overspending on disaster recovery? It seems like a ridiculous question. Newspaper headlines throw more risks, and regulators throw more requirements, in your face almost every day. But it’s possible to overspend on disaster recovery, especially if you listen to every vendor saying you must do x, y and z to comply with the Sarbanes-Oxley Act.
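To put the "nines" in concrete terms, the short sketch below converts an availability level into allowed downtime per year. It assumes a 365-day year and treats N nines as an availability of 1 minus 10 to the power of minus N; the function name is illustrative only. Taken literally, three nines (99.9%) allows about 8.8 hours of downtime a year, four nines about 53 minutes and five nines a little over five minutes, so the "four to eight hours" figure often quoted for typical IT shops sits near the three-nines end of that range.

```python
def downtime_hours_per_year(nines: int) -> float:
    """Allowed downtime, in hours per year, for an availability of N nines."""
    hours_per_year = 365 * 24          # 8,760 hours, ignoring leap years
    unavailability = 10 ** (-nines)    # three nines -> 0.001, five nines -> 0.00001
    return hours_per_year * unavailability


for nines in (3, 4, 5):
    hours = downtime_hours_per_year(nines)
    print(f"{nines} nines: {hours:.2f} hours, or {hours * 60:.1f} minutes, per year")
```

Each additional nine cuts the downtime budget by a factor of 10, which is one way to see why the last step is so expensive.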

Tim DeLisle, managing principal at Corigelan LLC, a disaster recovery consultancy in Chicago, says the way to avoid overspending is to establish three tiers of disaster recovery based on business requirements. It begins with the CIO asking business managers which few applications are truly critical and require recovery within 24 hours to keep the business afloat. You don’t have to mirror everything. The second tier of applications, which require recovery in 48 to 72 hours, may need only inexpensive tape backup, while the third tier may need nothing at all, DeLisle says.

All that the business executives and regulators really require is that you take prudent steps for business continuity. You don’t have to bankrupt the company.
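The three-tier idea can be sketched as a simple classification of applications by how long the business says it can do without them. This is only an illustration under stated assumptions: the function name and the sample applications are hypothetical, and the treatment of the gap between 24 and 48 hours (assigned here to the second tier) is an assumption, since DeLisle's description does not cover it.

```python
from typing import Optional


def recovery_tier(max_outage_hours: Optional[float]) -> int:
    """Map a business-stated outage tolerance to one of three recovery tiers.

    None means the business has stated no recovery deadline at all.
    """
    if max_outage_hours is None:
        return 3                      # may need nothing at all
    if max_outage_hours <= 24:
        return 1                      # truly critical: recover within 24 hours
    if max_outage_hours <= 72:
        return 2                      # assumption: anything up to 72 hours is tier 2
    return 3


# Hypothetical applications and tolerances, for illustration only.
applications = {"e-mail": 8, "payroll": 72, "archive search": None}
for name, tolerance in applications.items():
    print(f"{name}: tier {recovery_tier(tolerance)}")
```

The point of the exercise is that the tolerances come from business managers, not from IT; the code only records their answers.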

User Tips

In this report we provide plenty of cost-conscious tips and insider advice from IT managers who have faced disaster and recovered. Their experiences raise questions you should be able to answer. For starters:

• How strong is your disaster recovery documentation? What if the head of sales is the one who has to turn on the systems in the data center? “We fashion our document so anyone in the business should be able to restart an application,” says Elbert Lane, a lead software developer at Gap Inc. in earthquake-prone San Francisco.

• Which applications are really the most important ones to restore first? At most companies, it’s probably e-mail, not the SAP system or the Oracle database.

• How robust and ready are the plans at your suppliers, your outsourcers, your business partners? Who’s checking on them?

• What are your most critical access issues? Getting to the data, the systems or the people?

Disaster recovery is one test that IT can ace, without big budgets or expensive consultants. It’s a matter of common-sense planning, attention to process and doing your disaster homework.

Rules of Thumb

1. REDEFINE DISASTER. For businesses today, a disaster is any event that blocks access to corporate data and applications. It’s not just hurricanes and terrorism. It includes denial-of-service attacks and planned downtime, for example.

2. PRIORITIZE APPLICATIONS. Every company has specific applications that are critical to keeping the business running. Typical candidates for most businesses include e-mail and ERP systems. By prioritizing your applications, you can allocate your IT budget appropriately and protect what is most important to your company instead of spreading that budget across noncritical applications.

3. WEB-ENABLE APPLICATIONS. Whenever possible, mission-critical applications should be Web-enabled so employees can access them anywhere, anytime. In the event of a disaster that prevents staff from entering the office, Web-enabled applications allow employees, customers and partners to stay connected.

4. MOVE DATA 100 MILES AWAY. In preparation for regional disasters, keep your data at least 100 miles from your primary site. You should also replicate your data continuously to maintain complete data integrity.

5. AUTOMATE THE RECOVERY PROCESS. Whenever possible, automate disaster recovery processes to reduce bottlenecks and human error.

SOURCE: MIKE MULHOLLAND, EVERGREEN ASSURANCE INC., ANNAPOLIS, MD.
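The 100-mile rule in the list above is easy to check for any pair of sites. The sketch below uses the standard haversine formula for great-circle distance; the function names, the mean Earth radius constant and the sample coordinates (approximate city centers for New York, Philadelphia and Boston) are assumptions chosen for illustration, not sites named in this report. Straight-line distance is shorter than driving distance, so a site that is "100 miles away" by road may fall short on this measure.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def far_enough(primary, backup, minimum_miles=100.0):
    """True if the backup site is at least minimum_miles from the primary."""
    return great_circle_miles(*primary, *backup) >= minimum_miles


# Approximate (latitude, longitude) pairs, for illustration only.
new_york = (40.71, -74.01)
philadelphia = (39.95, -75.17)
boston = (42.36, -71.06)

print("Philadelphia far enough from New York:", far_enough(new_york, philadelphia))
print("Boston far enough from New York:", far_enough(new_york, boston))
```

With these coordinates, Philadelphia falls well short of 100 miles from New York while Boston clears it comfortably.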

Causes of Hardware Failure
IDC says that 30% of downtime occurrences are caused by a hardware component failure. This is typically caused by one of the following problems:

• Physical changes

• Incorrect configurations

• Overworked circuitry

• Lack of redundancy

True Costs
Be sure to analyze the true costs of IT downtime by including the following metrics:

• Revenue impact

• Reputation impact

• Financial performance (short- and long-term)

• Staffing costs

• Process impact

• Overtime

• Travel

• Stock-market impact

SOURCE: IDC, FRAMINGHAM, MASS., NOVEMBER 2004

RECOVERY STRATEGIES

Rising From Disaster

One key to keeping your business on its feet in a disaster is anticipating the sometimes cascading effects a catastrophe can have on your IT operation.

Take Miami-Dade County, for example. When a hurricane hit southern Florida in 1992, the county’s data center lost power. Diesel generators had overheated when well water ran out because high winds had broken water mains and lowered the water table. IT managers later had air-cooled generators installed.

One of the problems with disaster recovery, experts say, is that although most companies have plans for common scenarios (weather-related emergencies, headquarters lockouts and massive power outages), those plans aren’t regularly tested or communicated to end users. In fact, in a recent survey of 283 Computerworld readers, 81% of the respondents said their organizations have disaster recovery plans. But 71% of the respondents at companies with plans said the plans hadn’t been exercised in 2003.

It takes forethought to avoid a business shutdown during a disaster.

Experts and users agree that there are steps you can take to increase your chances of coming through the most common disasters unscathed.

Weather-Related Emergencies

“If you look at why facilities fail [during weather disasters], it’s all pretty predictable. They call it an act of God, and I call it an act of stupidity,” says Ken Brill, executive director of The Uptime Institute in Santa Fe, N.M.

Hurricanes threaten Miami-Dade County’s data center every year from June through November, yet IT managers still struggle with getting everyone to understand the importance of disaster planning. “The challenge we always have is to make sure the staff is completely involved and we have participation,” says Ruben Lopez, director of the enterprise technology services department for the county.

Miami-Dade County gives itself a 56-hour window to test its disaster recovery plan each year by cutting over to its alternate data center and restoring data. It uses the time to find deficiencies and later corrects them. “Business continuity and disaster recovery preparedness is all about figuring out what your deficiencies are and how you’re going to fix them. It’s not about how to get an A+ on paper,” says Joe Torres, disaster recovery coordinator for Miami-Dade County. He points out that it’s not the people he’s testing during a disaster recovery exercise but the plan, “because you can’t depend on the people being available.” “You’re going to give them a book with instructions, and they need to be able to follow that,” Torres says. One step Miami-Dade has taken in that direction is to consider call-tree software that could help employees contact key managers in an emergency.

Walter Hatten, senior vice president and technical services manager at Hancock Bank in Gulfport, Miss., has focused on consolidating his server farm and creating a redundant communications network for an area of the country that gets hit or brushed by a hurricane every three and a half years. The 100-branch bank, with headquarters on the Gulf of Mexico, is consolidating 500 servers onto a Linux-based mainframe to reduce recovery time in a disaster.

“Just the sheer magnitude of rebuilding 500 servers puts us at risk for not being able to do it quickly enough,” says Hatten, who chose Linux for its open standard and scalability. He says the mainframe will offer greater speed for recovery of data, reducing the amount of time it would take to restore data from days to hours.

Headquarters Lockouts

Maria Herrera is chief technology officer at Patton Boggs LLP, a Washington-based law firm with 400 attorneys specializing in international trade law. Because of the firm’s proximity to the U.S. Capitol building, one constant concern is a building lockout brought on by terrorist threats, she says.

Herrera has set up duplicate operating environments in several remote offices and has contracted with two disaster recovery vendors: SunGard Data Systems Inc. in Wayne, Pa., for server recovery and workstation services, and AmeriVault Corp. in Waltham, Mass., for data backup.

AmeriVault recently installed its CentralControl interface on desktops and an agent on each of Patton Boggs’ servers. After completing an initial full backup of all data, AmeriVault now performs daily incremental backups of deltas, or changes, to disaster recovery centers in Waltham and Philadelphia. In an emergency, data restores can be performed remotely, even from home, by administrators using a point-and-click function on a Web portal provided by AmeriVault, or data can be shipped on tape for large restores.
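The general idea of a full backup followed by incremental backups of deltas can be sketched in a few lines. This is not AmeriVault's method, which the article does not describe; it is a minimal file-level illustration that assumes changes are detected by comparing SHA-256 digests between two snapshots, and the function names are hypothetical. It does not track deleted files, and real products typically work at a finer grain than whole files.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def snapshot(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to a SHA-256 digest of its contents."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            digests[str(path.relative_to(root))] = digest
    return digests


def changed_since(previous: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """Return the files that are new or modified relative to the previous snapshot."""
    return sorted(name for name, digest in current.items()
                  if previous.get(name) != digest)
```

In use, the first snapshot corresponds to the initial full backup; each later run would call `changed_since` with the prior snapshot and copy only the files it returns.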

“Every month or couple of months, we access several documents and download them from AmeriVault to test the system,” says Herrera. During full testing, she spends 16 hours recovering full data sets. “We’re able to restore everything within the firm in about 10 hours,” she says.

Herrera also suggests involving all IT personnel in the disaster recovery testing process, because in an emergency, you never know who might be available to help. She has trained employees in all four satellite offices around the country on disaster recovery procedures.

SunGard also has several facilities where IT personnel and lawyers can meet to continue work in the event of a headquarters lockout, Herrera says.

Officials at Mizuho Capital Markets Corp., a subsidiary of the world’s second-largest financial services firm, Mizuho Financial Group Inc. in Tokyo, say that some of the most effective disaster recovery tools are the simplest.

For example, when a protest kept employees from entering the firm’s Times Square headquarters late last year, IT managers passed out laminated business cards with a directory of managers’ home phone numbers.

Doug Lilly, a senior telecommunications technologist at the Delaware Department of Technology and Information, says his agency has three data centers that support about 20,000 state employees. The department uses EMC Corp.’s Symmetrix Remote Data Facility to replicate data among the data centers. It also uses backup software from Oceanport, N.J.-based CommVault Systems Inc. as a central management tool.

“If this site were bombed . . . we’d have servers running to replace them, but we’d still have to restore data from tapes,” Lilly says. “CommVault’s software transfers between 60GB and 65GB of data per hour. It would be a few hours before we got people up online.”
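Lilly's "few hours" follows from simple arithmetic: restore time is roughly the amount of data divided by the transfer rate. The sketch below works that out for the 60GB to 65GB per hour he quotes. The 250GB data volume is a hypothetical figure chosen for illustration, since the article does not say how much data the department would need to restore, and the estimate ignores tape handling and verification time.

```python
def restore_hours(data_gb: float, throughput_gb_per_hour: float) -> float:
    """Estimate hours needed to restore data_gb at a steady transfer rate."""
    if throughput_gb_per_hour <= 0:
        raise ValueError("throughput must be positive")
    return data_gb / throughput_gb_per_hour


data_gb = 250  # hypothetical volume to restore, not a figure from the article
slowest = restore_hours(data_gb, 60)
fastest = restore_hours(data_gb, 65)
print(f"Restoring {data_gb}GB takes roughly {fastest:.1f} to {slowest:.1f} hours")
```

Running the same calculation against your own data volumes is a quick way to see whether tape restores fit within the recovery window the business expects.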

Lilly’s IT team also keeps a copy of disaster recovery procedures at home. “Team leaders notify everyone, and we carry cell phones and BlackBerries that are on redundant networks,” he says. “It’s a pretty unified messaging platform . . . that ties data, voice, fax and video into one application. They can get hold of us anytime, anywhere.”

Earthquake Law Requires IT Response

A California law that mandates earthquake-proof hospitals is sparking massive investments in IT infrastructure upgrades by health care companies in the state, starting with the hardening of data centers but also including the deployment of faster networks, wireless systems and other new technologies.

For example, Sacramento-based Sutter Health expects to spend the better part of $1 billion on technology upgrades at its 26 hospitals over the next 10 years as a result of the law, CIO John Hummel says. As the not-for-profit company rebuilds some of its facilities to comply with the law, it plans to invest in new bandwidth and storage capabilities in an effort to meet processing demands well into the future.

Mark Zielanzinski, CIO at El Camino Hospital in Mountain View, says his facility is building a data center and demolishing its existing one as part of an overhaul of its entire campus to meet the law’s requirements. The new data center is due to be fully operational by March 2005.

In addition, the data center reconstruction prompted a server consolidation and upgrade project, Zielanzinski says. El Camino Hospital is consolidating more than 150 smaller servers onto two Unisys Corp. ES 7000 systems, each of which can support up to 32 Intel processors. A matching set of servers is being installed at a new disaster recovery site 120 miles away, in more geologically stable Sacramento.

The law, known as the California Facilities Seismic Safety Act, was passed in 1994 after the Northridge earthquake struck north of Los Angeles and caused $3 billion in damage to 23 hospitals. But the measure is just now becoming an urgent matter for many health care companies, which must comply by 2008, or by 2013 if extensions are granted.

The California HealthCare Association estimates that it will cost $24 billion to earthquake-proof or rebuild a total of about 2,700 hospital buildings throughout the state. IT costs could account for $2.4 billion to $3.6 billion of that, says Gerard Nussbaum, a consultant at Kurt Salmon Associates Inc. in Atlanta.

Practical Tips

• Choose vendors that are proactive and don’t require prodding to upgrade or test your disaster recovery plan.

• Don’t test people; test your disaster recovery plan. People come and go. Make the plan easy to follow and use.

• After a disaster, don’t count on employees being willing to fly to alternate work sites.

• Distribute key disaster recovery personnel across many geographic locations.

• Turn disaster recovery data centers into active work sites.

• Disaster recovery plans are living, breathing things. Keep them up to date and make sure employees are well versed in them.

• Seek vendors with plenty of longevity and geographically dispersed offices for disaster recovery.

• Make sure portals to your outsourcing vendor are dedicated or have enough bandwidth to handle multiple companies seeking fast restores.

• Make sure that not just your vendor but you understand how to back up and restore systems.

• Verify that backup tapes can restore data.

• Train and involve all IT personnel in the disaster recovery process.

Massive Power Outages

Edward Koplin, an engineer at Jack Dale Associates PC, an engineering firm in Baltimore, says a lack of disaster testing is the No. 1 cause of data center failures during a blackout. Koplin suggests that companies test their diesel generators often and at full load for as long as they’re expected to be in use during a blackout.

The Uptime Institute’s Brill adds to that advice: Always prepare for a blackout with at least two more generators than needed, and test them by literally pulling the plug. “I would test it for as long as I expected it to work under load. I’d do that at least every two or three years. And I would run it in the summer,” Brill says.
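Brill's "two more than needed" rule reduces to a short calculation once the data center's peak load and the capacity of each generator are known. The sketch below is an illustration only: the function name and the load and capacity figures are hypothetical, and real sizing would also account for derating, cooling and fuel supply, which this ignores.

```python
import math


def generators_to_install(peak_load_kw: float, generator_capacity_kw: float,
                          spares: int = 2) -> int:
    """Generators needed to carry the peak load, plus the given number of spares."""
    if generator_capacity_kw <= 0:
        raise ValueError("generator capacity must be positive")
    needed = math.ceil(peak_load_kw / generator_capacity_kw)
    return needed + spares


# Hypothetical figures: a 900 kW peak load served by 500 kW generators.
print(generators_to_install(900, 500))
```

In this hypothetical case two generators carry the load, so the rule calls for four.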

Jim Rittas, a security administrator responsible for networking at Mizuho, says the company can now perform full data restores after blackouts or other disasters in an hour instead of two days because it now mirrors its data to a New Jersey office that’s also an active work site. “The other thing we did was diversify our Internet connections. Internet connections now flow in and out of New York and New Jersey, where we only had one in New York before,” Rittas says.

Needham, Mass.-based research firm TowerGroup recommends turning parts of disaster recovery or business continuity data centers into profit centers by going with an active/active operations model. Traditionally, companies have set up an active primary data center and unmanned backup site. An active/active model eliminates the need for IT staffers to relocate in a disaster because they’re permanently stationed at the disaster recovery site, which is also used to run active business applications.

Integrating disaster recovery IT assets and personnel into operations budgets across geographically dispersed data centers will also help blur the line between disaster recovery and operations spending.

It’s best to have a complete copy of your data in an alternate site at all times, “not just some of it,” says Wayne Schletter, associate director of global technology at Mizuho Capital Markets. “You don’t want to be piecing things together after something happens. You just want to be ready to go.”

Critical Success Factors

VIEW disaster recovery not as a tactical IT project but as a strategic IT asset. Both disaster recovery and business continuity planning are essentially event-triggered, nonstrategic insurance policies that offer no reasonable expectation of a return on investment.

TURN portions of your disaster recovery sites from a cost center into a profit center by splitting your business operations between your headquarters and the alternate data center. This eliminates the need for staff relocation and backup sites.

INTEGRATE disaster recovery IT assets and personnel into operations budgets across geographically dispersed data centers to blur the lines between what is disaster recovery and what is operational expense. In all cases, disaster recovery must be viewed as a mission-critical expense.

BUILD operational efficiency objectives into your disaster recovery strategies.

CURRENT IT wisdom dictates that IT departments use part of the capital freed up by driving down operational costs to invest in IT projects geared toward building strategic advantage. Instead, invest the resulting savings in IT resilience across critical lines of business.

SHARE the cost of disaster recovery with partners and fellow institutions. Disaster recovery isn’t a matter of competitive differentiation or advancement, but a matter of survival. It may make sense for companies to pool their assets and personnel to provide resilience capabilities for interconnected systems or collaborative technologies such as payments or check processing.

SOURCE: TOWERGROUP, NEEDHAM, MASS.

Six Tips for Continuity Planning

Here’s a strategy for the IT department to consider when determining its role in closing the business continuity gap:

1. Make a priority list of applications to optimize the recovery process, taking into account the resources required, cash flow and time frames. Breaking down business continuity in this way gives IT a reasonable framework that affords both IT and management the opportunity to work together to determine the appropriate amount of effort to spend on closing the gap to an acceptable level.

2. Address all physical and logical vulnerabilities to reduce the probability of disaster and ensure information integrity, including building access, physical security and firewalls. IT must show management how specific vulnerability zones could affect the financial side of the business. To truly understand vulnerabilities and risks, the IT department must lead the charge in finding the answer to this question: How far off are senior management’s expectations from the reality of IT availability?

3. Validate IT availability service levels, including recovery-time objectives, recovery-point objectives, system performance, information access and delivery, network performance and monitoring, and security, to enable more effective business-unit continuity planning. This allows the IT department to determine the business’s current capability in terms of compute utility restoration and lost data, in comparison to the business’s perceived or desired baseline and the operational, logistical and financial impact of the business’s current availability vs. its desired availability.

4. Coordinate, plan, document and practice within the business units the synchronization and reproduction of lost data/transactions and manual re-entry of data, taking into consideration the organization’s needs. For example, the business continuity gap will be larger when continuous availability is absent, which would be the case for financial services companies. If only best efforts are required, as might be the case for some manufacturing companies, the gap may be the smallest. Optimum points of availability could be one of the following:

• Best efforts (could take days or longer)

• Traditional recovery (hours to days)

• Transaction protection (minutes to hours)

• High availability (minutes)

• Continuous availability (always up and running with minimal information loss)

5. Validate access to transportable information among all business units, including remote/alternate facilities and return to home, a WAN/LAN, information exchange via e-mail and the Web site. While validating the information and the access to that information, organizations must not overlook the secure state of the infrastructure.

6. Implement a more effective management process to support the business continuity program, paying special attention to cross-training and staff rotation, program currency and accuracy, and distribution and access.

SOURCE: MICHAEL CROY, DIRECTOR OF BUSINESS CONTINUITY SOLUTIONS, FORSYTHE SOLUTIONS GROUP INC., SKOKIE, ILL.
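The comparison described in tip 3, current capability against the desired baseline, can be made concrete with two numbers per application: the recovery-time objective (how long until service is restored) and the recovery-point objective (how much recent data may be lost). The sketch below is one hedged way to express the resulting gap; the class and function names and the sample figures are hypothetical, not part of Croy's guidance.

```python
from dataclasses import dataclass


@dataclass
class ServiceLevel:
    recovery_time_hours: float    # RTO: time until the service is usable again
    recovery_point_hours: float   # RPO: age of the most recent recoverable data


def continuity_gap(desired: ServiceLevel, measured: ServiceLevel) -> ServiceLevel:
    """Shortfall of measured capability against the desired level, in hours.

    A value of 0 means the objective is already met.
    """
    return ServiceLevel(
        recovery_time_hours=max(
            0.0, measured.recovery_time_hours - desired.recovery_time_hours),
        recovery_point_hours=max(
            0.0, measured.recovery_point_hours - desired.recovery_point_hours),
    )


# Hypothetical example: the business wants recovery in 4 hours with at most
# 1 hour of lost data; the last test took 10 hours and restored day-old data.
gap = continuity_gap(ServiceLevel(4, 1), ServiceLevel(10, 24))
print(f"Recovery-time gap: {gap.recovery_time_hours} hours")
print(f"Recovery-point gap: {gap.recovery_point_hours} hours")
```

The measured figures should come from an actual recovery test rather than an estimate, which is the point of validating service levels in the first place.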

Recovery Strategies Computerworld Executive Briefings 7

DISASTER RECOVERY isan unpleasant task.And that makes it alow-priority projectin almost all compa-

nies, says Scott Lundstrom, ananalyst at AMR Research Inc.

“There are no users scream-ing over business continuity,”he says. “So given the firefight-ing nature of most IT organi-zations, [disaster recovery]never gets the resources it de-serves.”

Because disaster recoverytakes a back seat to other ITprojects, mistakes are boundto happen. We asked IT man-agers and other experts what’smost likely to be forgotten oroverlooked in disaster recov-ery planning. Here are the fiveclassics.

MISTAKE 1: Failing to do yourhomework. IT groups often neg-lect to ask users and line-of-business executives which ap-plications they need most.This leads to faulty assump-tions about disaster recoverypriorities. In particular, ITtends to assume that heavy-duty enterprise applicationsshould be restored first.

In reality, the most neededapplications may be muchmore basic — e-mail andscheduling tools such as Mi-crosoft Outlook, for example.How do you find out? Ask theusers. “The business itselfneeds a plan in case operations

are disrupted,” says ElbertLane, a lead software develop-er at San Francisco-based re-tailer Gap Inc. and a 20-yearveteran of disaster planning atseveral companies. “They’llneed procedures for doing pa-perwork, etc., so the questionis, How would they recover?That’s not just an IT issue, buta business [issue].”

The lesson: IT constantlyhears the term mission-criticalused in reference to CRM andERP software. But to find outwhich applications the usersreally want restored first, sim-ply ask them.

MISTAKE 2: Thinking it’s purely an IT issue. In a crisis, the performance of the IT staff may be the least of a company’s worries. “A common assumption is that disaster recovery and business continuity are synonymous,” says Don O’Connor, CIO at Southern California Water Co., a utility based in San Dimas. “They’re not.”

Even underprepared IT organizations have done some thinking about what to do when disaster strikes. But can the same be said of other groups? “In my experience, IT can respond relatively quickly,” O’Connor says. “The part that’s missing is the users.”

The lesson: Company officers need to understand that rebooting systems and recovering data is just one part of the problem. Disaster recovery plans need to include line-of-business managers and end users who, in a crisis, will run the business in the midst of adversity. “Too often, continuity is something we task IT with,” Lundstrom says. “It’s really a business issue.”

MISTAKE 3: Fighting the last war. If, as the saying goes, generals are always preparing to fight the last war, too many enterprises spend their disaster recovery budgets and energy preparing for the most recent catastrophic event. While understandable, this is self-defeating; disasters are, by their nature, well-nigh impossible to predict.

Five Classic Mistakes

RECOVERY STRATEGIES

Three Tips From: Dorian Cougias, CEO of Network Frontiers LLC in San Francisco and author of The Backup Book: Disaster Recovery From Desktop to Data Center (Schaser-Vartan Books, 2003).

TIP NO. 1: Figure out how to recover from “stupid-user tricks,” such as the user who accidentally drags an empty file directory on top of a very important file directory and wipes it out, or the janitor who disregards the “Don’t touch this switch” sign. Ask your help desk staffers to list the problems they’ve dealt with in the past 12 months.

TIP NO. 2: Have a disaster recovery plan for your e-mail system, the most-used system on the network. Consider a product like the Emergency Messaging System from MessageOne Inc. in Austin.

TIP NO. 3: Make sure each employee’s daily, weekly or monthly work procedures include disaster recovery practices, just like a sailor’s duties include checking the boat’s rigging and pumps before every excursion.

Recent history offers a compelling example. The Sept. 11, 2001, terrorist attacks on the World Trade Center devastated many New York-based financial services firms. Many wished they’d had nearby backup facilities, and they proceeded to build such facilities at great expense across the river in Jersey City, N.J. But Manhattan’s next major business-continuity crisis, the August 2003 blackout, took out electricity in Jersey City as well.

The lesson: While it’s sensible to consider certain broad crisis categories (terrorist or hacker attacks, earthquakes, fires and so on), don’t think you can anticipate future events. Plan not for specific crises, but rather for their effects. The Gap had servers located in the World Trade Center on Sept. 11, Lane says, but “we had set them up to fail-over to backups located in the South.”

MISTAKE 4: Overlooking the people. This is another lesson from Sept. 11: Top-notch backup equipment helps only if somebody is able to use it. “Some businesses had recovery data centers in Lower Manhattan,” says Carl Claunch, an analyst at Gartner Inc. However, he says, immediately following the collapse of the World Trade Center towers, “police wouldn’t let people in. The equipment was fine, but it just sat there unused.” This can happen if a building is quarantined, an elevator stuck or a major road closed.

The other part of this gotcha is the expertise of those who finally do access backup equipment. Too many companies, especially those that fudge their recovery exercises, count on IT heroics to pull them out of a crisis. However, as the Gap’s Lane says, “you never know if key personnel will be back.”

The lesson: This is where strong documentation comes in. “We fashion our document so anyone in the business should be able to restart an application,” Lane says. “You should be able to have somebody from the mail room start everything up.”

MISTAKE 5: Conducting phony-baloney practice drills. “Sure, companies do testing. But because full tests are so resource-intensive, they’re scheduled in advance,” Claunch says. The result: IT workers, driven by the natural desire to ace a test, cheat. “They prepare. They collect tools, review procedures,” he says. “Then, when a real disaster hits, blooey.”

This is a sticky problem for IT organizations stretched thin even before disaster planning is factored into their workloads. Lane says practices at the Gap are planned in advance. “We are a retailer; we need to support our stores” around the clock, he says.

The lesson: There is no easy answer here. Everybody concedes that surprise disaster tests are more effective, but performing one in a round-the-clock, e-business environment is a massive undertaking. Claunch suggests surprise tests of one IT subgroup at a time, leaving the rest of the staff to run operations. And some businesses use auditors to make sure IT workers don’t lean on prepared information.

Sweat the Small Stuff

When a crisis hits, IT staffers seeking to maintain or restore operations are often tripped up by the most basic items. Disaster planning analysts and experts say you need to think about things like the following:

ACCESS. Who has keys or access cards for the building? How do you get in if the electrical grid is shut down? What local public-safety officials (police, fire or town officials) can you turn to for help?

COMMUNICATION. In a crisis, IT staffers may need to contact corporate officers whose names they don’t even know. An emergency “telephone tree” that includes mobile numbers is a must.

LIGHT. At home, we’ve all felt stupid when a blackout hit and our flashlight batteries were dead. The same goes for the workplace; after all, backup generators fail, too.

PASSWORDS. Security is good, but in an emergency, even low-level staffers may need extraordinary systems access. Organizations need to put a crisis-only override in place.


IF YOU WANT to really test your disaster recovery plan, you have to get out from behind your desk and step out into the real world. Because in the real world, the backup site lost your tapes, your emergency phone numbers are out of date, and you forgot to order Chinese food for the folks working around the clock at your off-site data center.

“Unless it’s tested, it’s just a document,” says Joyce Repsher, product manager for business continuity services at Electronic Data Systems Corp.

How often should you test? Several experts suggest real-world testing of an organization’s most critical systems at least once a year. In the wake of Sept. 11 and with new regulations holding executives responsible for keeping corporate data secure, organizations are doing more testing than they did 10 years ago, says Repsher. An exclusive Computerworld online survey of 224 IT managers supports that assertion, indicating that 71% had tested their disaster recovery plans in the past year.

Desktop disaster recovery testing involves going through a checklist of who should do what in case of a disaster. Such walk-throughs are a necessary first step and can help you catch changes such as a new version of an application that will trigger other changes in the plan. They can also identify the most important applications, says Repsher, “before moving to the expense of a more realistic recovery test.”

Companies do desktop tests at different intervals. Fluor Fernald Inc., which is handling the cleanup of a government nuclear site in Fernald, Ohio, does both desktop and physical tests of its disaster response plans every three years “or anytime there’s a significant change in our hardware configuration,” says Jan Arnett, manager of systems and administration at the division of engineering giant Fluor Corp.

What’s Critical?

Determining which systems need a live test is also critical. Fluor Fernald schedules live tests on only about 25 of its most critical applications and then tests only one server running a representative sample of these applications, says Arnett. “We feel if we can bring one server up, we can bring 10 servers up,” he says, especially since the company uses standard Intel-based servers and networking equipment.

The most common form of live testing is parallel testing, says Todd Pekats, national director of storage alliances at IT services provider CompuCom Systems Inc. in Dallas. Parallel testing recovers a separate set of critical applications at a disaster recovery site without interrupting the flow of regular business. Costly and rarely done, the most realistic test is a full switch of critical systems during working hours to standby equipment, which Pekats says is appropriate only for the most critical applications.

Businesses that are growing or changing quickly should test their disaster recovery plans more often, says Al Decker, executive director of security and privacy services at EDS. He cites one firm that has grown eightfold since 1999, when its disaster plan called for the recovery of critical systems in 24 hours. Today, just mounting the tapes required for those systems would take four to 10 days, he says.
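Decker’s example can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes, purely for illustration, that recovery time scales linearly with the growth of the business; that model and the function names are hypothetical and do not come from EDS or from the firm he describes.

```python
def scaled_recovery_hours(baseline_hours: float, growth_factor: float) -> float:
    """Scale a baseline recovery time by a growth factor (assumed linear)."""
    if baseline_hours < 0 or growth_factor <= 0:
        raise ValueError("baseline must be >= 0 and growth factor must be > 0")
    return baseline_hours * growth_factor


def hours_to_days(hours: float) -> float:
    """Convert hours to 24-hour days."""
    return hours / 24


# Figures from the article: a 24-hour recovery plan and eightfold growth.
estimate_hours = scaled_recovery_hours(24, 8)
estimate_days = hours_to_days(estimate_hours)
print(f"{estimate_hours:.0f} hours, or {estimate_days:.0f} days")
```

Under that assumption the 24-hour plan stretches to 192 hours, or eight days, which sits inside the four-to-10-day range Decker cites. A real estimate would depend on tape counts, drive throughput and staffing rather than on a single multiplier.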

Realistic Testing

RECOVERY STRATEGIES

Ditch the Script

A disaster drill isn’t much good if everyone knows what’s coming. But too many organizations script disaster tests weeks ahead of time, ship special backup files to an off-site recovery center and even make hotel reservations for the recovery staff, says John Jackson, vice president of business resilience and continuity services at IBM in Chicago.

That eliminates messy but all-too-likely problems such as losing backup tapes in transit or discovering that a convention has booked all the hotel rooms in town. He advises telling the recovery staff, “We just had a disaster. . . . You can’t take anything out of the building. . . . You have to rely on the disaster recovery plan and what’s in the off-site recovery center.”

That makes the test more “exciting,” he acknowledges, but it also makes it a lot more useful.

Proper Balance

Deciding how realistic to make the test “is a balance between the amount of protection you want” and the cost in money, staff time and disruption, says Repsher. As an organization’s disaster recovery program matures, the tests of its recovery plans should become more challenging, adds Dan Bailey, senior manager at risk consulting firm Protiviti Inc. in Dallas. While the more realistic exercises provide more lessons about what needs improvement, he says, an organization just starting out with a rudimentary plan probably can’t handle a very challenging drill.

Never assume that everything will go as planned. That includes anything from having enough food or desks at a recovery site to having up-to-date contact numbers. Communications problems are common, but they’re easily prevented by having every staff member place a test call to everyone on their contact list, says Kevin Chenoweth, a disaster recovery administrator at Vanderbilt University Medical Center in Nashville.

Also, never assume that the data on your backup tapes is current or that your recovery hardware can handle your production databases. Arnett found subtle differences in the drivers and network configuration cards on his replacement servers that forced him to load an older version of his Oracle database software to recover his data.

Chenoweth or his staffers review each test with the affected business units and develop specific plans (with timelines) for fixing problems.

Finally, Chenoweth says, thank everyone for their help, especially if the test kept them away from home. “If you’ve got a good relationship, they’re more likely to be responsive” to the firm’s disaster recovery needs, he says.


BUSINESS-TO-BUSINESS dependencies create the opportunity for great benefits. But if a disaster strikes any company in the supply chain, the risks to all are equally great.

At Ryder System Inc., customers routinely vet their supply chain partners to ensure that they meet minimum standards for robustness and security. “If they can’t make the cut, we won’t do business with them,” says Chuck Lounsbury, senior vice president of sales and marketing at the Miami-based transportation, logistics and supply chain management services company. “We don’t want to jeopardize the capabilities of all the other companies involved.”

“It is a matter of working together,” adds Richard Arns, executive director of the Chicago Research & Planning Group, which spun off a post-Sept. 11 effort called the Security Board. A key lesson from the terrorist attacks, he says, is that organizations should enlarge their circle of preparedness.

But that message may not be getting through. An American Management Association survey conducted last year showed a sharp increase in the number of companies with crisis plans, drills or simulations. Yet only about a third of those companies reported having ongoing and backup emergency communications plans with their suppliers.

To make their operations truly disaster-resistant, IT managers should determine if business partners are ready to handle a disaster, experts say. Then they must work closely with those suppliers to achieve parity in their disaster recovery efforts and get their recovery times in sync. Here are some more tips:

TIP: Tighten SLA Language. A good starting point, says Roberta J. Witty, an analyst at Gartner Inc., is the language of the service-level agreement. SLAs are normally applied to IT providers but also offer a framework for talking about critical IT support from partners. But that’s only the beginning. Witty says IT managers should conduct an internal inventory assessment to determine which points outside the enterprise are critical to a company’s functions. They should then extend the process to suppliers.

“Have a conversation with them about what the risks are within their own supply chain,” she says. “You are outsourcing functions; maybe they are, too.” It may be worthwhile to line up backup suppliers for your outsourced services so you have more redundancy, and to encourage partners to do the same, says Witty. In any case, at each step in the supply chain (including with your internal operations, your outsourcers, your suppliers and their outsourcers and suppliers) there needs to be a credible recovery plan, she says, “or their disaster will become yours.”

And nothing beats testing. Whenever possible, it’s a good idea to include partners in your own tests and vice versa, Witty says.

TIP: Test ERP Connections. Jim Grogan, vice president of alliances at SunGard Data Systems Inc. in Wayne, Pa., says he’s seeing more clients embrace the ideal of the real-time enterprise. And enterprise applications, such as ERP software, that support that vision almost invariably have links outside the organization. “We encourage [clients] to do an information-availability study of their trading partners and suppliers, even if they have to foot the bill,” he says.

Most worrisome to Grogan is the fact that many organizations have entrusted key business processes to software, to the point that unaided humans would have difficulty handling those functions on their own.

Synchronizing With Suppliers

RECOVERY STRATEGIES

“Even a few years ago, you could count on someone being able to get on the phone and fix things,” he says. Likewise, Grogan notes, phone communication used to be planners’ first priority. But not anymore. “Now, everyone tells us that the first thing they need to get back in business with partners is e-mail,” he says.

At a granular level, Grogan says SunGard always looks for potential single points of failure within a supply chain, such as a server, switch or cable upon which many operations depend. Companies also need to coordinate their recovery plans because for many applications, particularly ERP, “systems are connected in real time with others that may have different recovery times or different recovery points, which can complicate efforts to get back to business,” he says.
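To see why mismatched recovery times and recovery points matter, consider a minimal sketch. It assumes that a chain of connected systems is only back in business once its slowest member has recovered; the class, the function names and every figure in the example are hypothetical and are not taken from SunGard.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RecoveryProfile:
    name: str
    recovery_time_hours: float   # time until the system is usable again
    recovery_point_hours: float  # age of the newest data that survives


def chain_recovery_time(profiles: List[RecoveryProfile]) -> float:
    """The chain is usable only when its slowest member is back."""
    if not profiles:
        raise ValueError("at least one profile is required")
    return max(p.recovery_time_hours for p in profiles)


def slowest_links(profiles: List[RecoveryProfile], target_hours: float) -> List[str]:
    """Names of partners whose recovery time misses the shared target."""
    return [p.name for p in profiles if p.recovery_time_hours > target_hours]


def recovery_point_gap(profiles: List[RecoveryProfile]) -> float:
    """Spread between freshest and stalest restored data, in hours."""
    points = [p.recovery_point_hours for p in profiles]
    return max(points) - min(points)


# Hypothetical figures for an ERP system and two trading partners.
chain = [
    RecoveryProfile("own ERP", recovery_time_hours=4, recovery_point_hours=0.25),
    RecoveryProfile("supplier", recovery_time_hours=24, recovery_point_hours=12),
    RecoveryProfile("logistics", recovery_time_hours=8, recovery_point_hours=1),
]
print(chain_recovery_time(chain))             # slowest recovery in the chain
print(slowest_links(chain, target_hours=8))   # who misses an 8-hour target
print(recovery_point_gap(chain))              # hours of data to reconcile
```

In this made-up chain a four-hour internal recovery buys little, because the supplier needs 24 hours, and the systems come back with data that differs in age by almost 12 hours and has to be reconciled.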

TIP: Secure Partner Communications. It’s also important to look at the security of business partner communications because glitches in that area could precipitate a disaster. Nick Brigman, vice president of strategy at RedSiren Inc., an IT security management firm in Pittsburgh, says it’s important to understand whether you’re connected to partners via a private network, a virtual private network or the Internet.

One of the best ways to enhance the security of that communication is to assign “least-privileged” accounts to partners that define the nature and even the volume of expected traffic, says Brigman. This not only eliminates potentially spurious communications, but it also provides a basis for detecting abnormal activities, he says.
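Brigman’s idea of an account that declares the nature and volume of expected traffic can be sketched as a simple profile check. This is an illustrative model only: the profile fields, the message-type names and the thresholds are assumptions, not a description of any RedSiren product.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class PartnerProfile:
    partner: str
    allowed_message_types: FrozenSet[str]  # the "nature" of expected traffic
    max_messages_per_hour: int             # the "volume" of expected traffic


def check_traffic(profile: PartnerProfile, message_type: str,
                  messages_this_hour: int) -> List[str]:
    """Return the reasons traffic looks abnormal (empty list if none)."""
    findings = []
    if message_type not in profile.allowed_message_types:
        findings.append(f"unexpected message type: {message_type}")
    if messages_this_hour > profile.max_messages_per_hour:
        findings.append(
            f"volume {messages_this_hour}/h exceeds expected "
            f"{profile.max_messages_per_hour}/h"
        )
    return findings


# Hypothetical partner that should only send orders and invoices.
supplier = PartnerProfile(
    partner="example-supplier",
    allowed_message_types=frozenset({"purchase_order", "invoice"}),
    max_messages_per_hour=200,
)
print(check_traffic(supplier, "invoice", 50))
print(check_traffic(supplier, "file_transfer", 5000))
```

Anything outside the declared profile is rejected or flagged, which is what gives the account its least-privileged character and provides a baseline against which abnormal activity stands out.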

Finally, John Jackson, vice president of IBM Business Continuity and Recovery Services, says business-to-business dependencies make it critical for companies to “get together and do a business impact analysis to determine how their individual recovery times could be made to mesh.”

“In some cases, companies find that they are doing far more than their partners, and their partners either have to catch up, or they need to consider spending less, since they won’t really get much benefit,” he says.

Communication infrastructure is the key, Jackson adds. Partners, especially smaller ones, may not have the knowledge needed to ensure robust and resilient performance. And they may just need help to get there.

Gaining Trust

Ensuring disaster resistance along the supply chain is labor-intensive, but there is hope that it might get easier.

Don Houser, a security architect at Nationwide Mutual Insurance Co. in Columbus, Ohio, has developed a technology called XOTA, or Extensible Organization Trust Assertion. Using XOTA, partners in a business relationship set standards for that relationship, such as the format and security requirements for message transmissions. That information is then embedded in a digital certificate.

“An organization would exchange that with their business partner, and it can then be graded for compliance with that organization’s standards or with contractual language in real time,” he says. The goal is to make it easier to set up and maintain communications and relationships among different organizations while meeting each organization’s specific needs.

Rich Mogull, an analyst at Gartner Inc. in Stamford, Conn., says XOTA seems to be a good start.

However, Mogull says Gartner advocates a more automated approach to the problem.

“What is really needed is a capability to automatically detect and analyze the compliance level of anyone with whom you are connecting,” he says. So far, however, the industry has taken little action in that direction, Mogull says.

Nationwide wants to bring XOTA to market through a consortium, according to Houser. “We already have several major companies signed up, and we have about 90 on the sidelines waiting to come aboard,” he says. Those interested include consulting, banking, financial and health care organizations as well as “pure-play software companies,” Houser says.

“We are currently building a proof of concept, and we hope to spin up the consortium by the third quarter,” adds Houser.


THE increasingly business-critical nature of e-mail is prompting some companies to take backup measures specifically designed to retain access to their e-mail systems in the event of a disaster.

Reinsurance company Max Re Ltd. in Hamilton, Bermuda, had taken such measures before Hurricane Fabian hit the island. And online business publication Forbes.com in New York was prepared when the massive blackout struck the Northeast. But the two companies took dramatically different approaches to the problem.

MAX RE LTD.

Max Re took a bare-bones, software-centric route, using the Emergency Messaging System (EMS) backup application from Austin-based MessageOne Inc. The software enabled the company to set up backup e-mail capabilities for 52 users in only a few hours, says Kevin Lohan, vice president of technology and systems at Max Re.

“Fabian came in at quite an inopportune moment,” Lohan says, noting that the company was still several months away from fully plotting its disaster recovery strategy.

Max Re is setting up a disaster recovery system in its Dublin offices for redundancy. But even when it’s completed, the system will take 12 to 24 hours to go live in an emergency, Lohan says. While Max Re’s other critical business systems might be able to wait that long, e-mail has to be back up much faster, Lohan says.

MessageOne EMS is Linux-based software that backs up users’ address books, contact lists and other critical information to provide instant access in an emergency if the main e-mail system goes down, says Mike Rosenfelt, a MessageOne spokesman. That data is hosted on MessageOne’s servers and can be accessed from any Internet-connected PC.

The service doesn’t back up old e-mail, cutting expenses for storage and bandwidth. “It’s a life-support system until you can go to recovery,” says Rosenfelt. Pricing for EMS runs between 80 cents and $8 per user per month, depending on the number of users.

FORBES.COM

Forbes.com, meanwhile, uses Microsoft Exchange backup services from Evergreen Assurance Inc. in Annapolis, Md. Its hardware-based approach provides full backup of all old messages, as well as address books and contact lists.

Evergreen uses dedicated servers that activate in 15 minutes following a service outage. These redundant e-mail servers reside in an Evergreen data center.

“Our customers are demanding that they have access to both their [old e-mail] and their [current e-mail] applications,” says company founder Michael Mulholland.

Michael Smith, chief technology officer at Forbes.com, says his 85 users had e-mail capability almost immediately after the blackout hit.

Evergreen’s fees begin at about $5,000 monthly for 250 users and can be up to $30,000 monthly for 5,000 users.
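The two price lists are easier to compare on a per-user basis. The short sketch below uses only the figures quoted in this article; the function names are hypothetical, and applying the full EMS price range to Max Re’s 52 users gives outer bounds only, since the actual rate depends on the number of users.

```python
from typing import Tuple


def per_user_monthly(total_monthly_dollars: float, users: int) -> float:
    """Monthly cost per user, in dollars."""
    if users <= 0:
        raise ValueError("users must be positive")
    return total_monthly_dollars / users


def monthly_bounds(users: int, low_cents: int, high_cents: int) -> Tuple[float, float]:
    """Lowest and highest monthly bill in dollars for a per-user price range."""
    return (users * low_cents / 100, users * high_cents / 100)


evergreen_entry = per_user_monthly(5_000, 250)     # entry tier: 250 users
evergreen_large = per_user_monthly(30_000, 5_000)  # top tier: 5,000 users
ems_low, ems_high = monthly_bounds(52, 80, 800)    # EMS range applied to 52 users

print(f"Evergreen: ${evergreen_entry:.2f} to ${evergreen_large:.2f} per user per month")
print(f"EMS for 52 users: ${ems_low:.2f} to ${ems_high:.2f} per month")
```

On these numbers Evergreen works out to about $20 per user per month at the entry tier and about $6 at the largest tier, against 80 cents to $8 for EMS, which reflects the difference between a full hardware-based backup and a bare-bones standby service.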

Overkill?

But both approaches may be overkill for some users, says Mike Gotta, an analyst at Meta Group Inc. in Pleasanton, Calif. “I’m not denying that e-mail is critical communication, but so is the telephone,” Gotta says.

For marketing companies or communications businesses, where “the bloodstream is information,” there’s a reasonable need, he says. But for manufacturing companies, getting factories up and running quickly is likely to be more critical, Gotta says.

“I’m just not sure that I’m in the camp that I can only conduct my business if I can get my e-mail back up,” he says.

E-mail Recovery

RECOVERY STRATEGIES


Safe & Secure Computerworld Executive Briefings 14

STORAGE SYSTEMS weren’t designed with security in mind. They started out as direct-attached, so if the host was secure, the storage was too. That’s all changed.

Fibre Channel storage networks often have multiple switches and IP gateways, allowing access from a myriad of points. Compound this with poor work by systems administrators, new data security laws and recent high-profile cases of consumer information theft, and the need for improved storage security becomes urgent.

But if systems administrators can’t follow the basic steps of network storage security, better tools may not help. That’s part of the reason why encryption is becoming the most widely adopted solution to the problem.

Misconfiguring logical unit number (LUN) zones and not maintaining network-access lists are two major causes of unauthorized access to storage networks, says Nancy Marrone, an analyst at The Enterprise Storage Group Inc. in Milford, Mass. Another common mistake administrators make is not bothering to change the device default password, according to Dennis Martin, an analyst at Evaluator Group Inc. in Greenwood Village, Colo.

Beyond the human failings, Fibre Channel itself isn’t a secure protocol. Through it, application servers can see every device on a storage-area network (SAN). Switch zoning and LUN masking on a storage array can restrict access to devices on a SAN. Zoning segregates a network node either by hard wiring at the switch port or by creating access lists around device world-wide names (WWN). Masking hides devices on a SAN from application servers either through software code residing on each device or through intelligent storage controllers that permit only certain LUNs to be seen by a host’s operating system.
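The two controls described above can be modeled in a few lines. The sketch is a conceptual illustration only, not any vendor’s switch or array interface: the WWN strings, zone name and LUN numbers are invented, and both lookups are written to deny by default.

```python
from typing import Dict, FrozenSet, Set


def can_communicate(zones: Dict[str, Set[str]], wwn_a: str, wwn_b: str) -> bool:
    """WWN zoning: two devices may talk only if some zone lists both."""
    return any(wwn_a in members and wwn_b in members for members in zones.values())


def visible_luns(masking: Dict[str, Set[int]], host_wwn: str) -> FrozenSet[int]:
    """LUN masking: a host sees only the LUNs explicitly granted to it."""
    return frozenset(masking.get(host_wwn, ()))


# Hypothetical world-wide names for two hosts and one storage array.
HOST_A = "10:00:00:00:c9:00:00:01"
HOST_B = "10:00:00:00:c9:00:00:02"
ARRAY = "50:06:01:60:00:00:00:0a"

zones = {"payroll_zone": {HOST_A, ARRAY}}  # access list built around WWNs
masking = {HOST_A: {0, 1}}                 # LUNs exposed to each host

print(can_communicate(zones, HOST_A, ARRAY))  # zoned together
print(can_communicate(zones, HOST_B, ARRAY))  # not in any shared zone
print(sorted(visible_luns(masking, HOST_A)))  # LUNs granted to host A
print(sorted(visible_luns(masking, HOST_B)))  # nothing granted to host B
```

Both tables have to be kept current by hand, and the number of entries grows with every host and LUN added, which is where the configuration errors described above tend to creep in.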

According to Marrone, managing access through LUN masking works on smaller SANs but becomes cumbersome on large SANs because of the extensive configuration and maintenance.

Encryption Makes Gains

Given these human errors and technology shortfalls, some users are turning to encryption.

Michelle Butler, technical program manager for the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign, manages three SANs: two with 60TB of capacity and one with 40TB. For her, security means that data needs to be encrypted, both when it’s in transit and when it’s stored on a disk, or “at rest.”

“There are some tools out there, but there are also some big gaping holes being left that so far don’t seem that interesting to hackers,” Butler says.

Nevertheless, the NCSA plans to buy Brocade Communications Systems Inc.’s newly released Secure Fabric operating system and Fabric Manager software. Butler says the products will allow her storage administrators to create network management access-control lists using public-key infrastructure (PKI) technology and device access-control lists based on WWN. The software also offers authentication and encryption for control information or management data on SAN devices.

Examples of the necessity of encryption abound. For instance, in January, a disk drive with 176,000 insurance policies was stolen from Guelph, Ontario-based Co-operators Life Insurance Co.

California Law

In response to events like this, California adopted a new law. SB 1386 requires any company that stores information about California residents to publicly divulge any breach of security affecting that data within 48 hours.

Storage Security

SAFE & SECURE

Top of Mind

IT managers rate the following storage topics as “extremely important” in the near future:

Disaster recovery/business continuity   54%
Storage security                        52%
Storage-area networks                   31%
Regulatory compliance                   27%

BASE: 91 IT MANAGERS; MULTIPLE RESPONSES ALLOWED.

SOURCE: COMPUTERWORLD’S IT LEADER RESEARCH PANEL, 2004

In addition, Sen. Dianne Feinstein’s (D-Calif.) office is developing a federal version of the bill, called the Database Security Breach Notification Act, that would provide similar protections to all U.S. residents. The only companies exempt from the California law and the proposed national legislation are those that encrypt data at rest.

Several newly released products address concerns posed by the recent legislation. Mississauga, Ontario-based Kasten Chase Applied Research Ltd. announced its Assurency Secure Networked Storage platform, agent-based software that provides stripped-down PKI-based authentication and encryption for networked storage devices. The company estimates that a complete encryption system is generally 7% to 10% of the cost of a SAN.

Other Vendors

Another company getting noticed is start-up appliance vendor Decru Inc. in Redwood City, Calif., which uses proprietary software to encrypt data on the storage array but uses the IPsec protocol on the application server to encrypt data while in transit. Its DataFort security appliances work for both SANs and network-attached storage (NAS).

Vormetric Inc. in Santa Clara, Calif., sells an appliance that supports SANs as well as both NAS and direct-attached storage devices and can be used to do high-speed encryption of data at the file system level, on a file-by-file basis. And NeoScale Systems Inc. in Milpitas, Calif., sells a product called CryptoStor FC that provides wire-speed, policy-based encryption for SAN and NAS data.

Although most currently available storage security technologies offer encryption, analysts say it’s important for users to make sure that the data is encrypted both at rest and while being transmitted across networks.

SAN-ity Check

Cathy Gilbert at American Electric Power Inc. isn’t too worried about security on her 2-year-old storage-area network (SAN). There are “very few people in our building that would actually know what to do” to reconfigure her Fibre Channel SAN, assuming they could reach it on its internal private network, which can be administered only from a locked room, says Gilbert, a senior IT architect at the Columbus, Ohio, energy producer.

She uses the built-in configuration capabilities of her EMC Corp. Symmetrix storage arrays, McData Corp. Intrepid Directors and McData Enterprise Fabric Connectivity Manager 6.0 software to control which servers can access which storage devices.

But protecting SANs will become more difficult, and more important, as customers begin deploying SANs more widely, to enable the money-saving consolidation of servers, applications and data. And as more SAN traffic migrates from the relatively unknown Fibre Channel protocol to IP, it will become vulnerable to the same well-known attacks used against the Internet and corporate networks.

Future Threats

SAN security will become a larger problem as companies cut costs by forcing different departments to share storage networks, says Wayne Lam, vice president at FalconStor.

In most companies, IT managers from one department don’t have the authority to manage data from other departments. But companies often need to commingle data from multiple departments on a single SAN to drive down their storage costs. “You can’t afford to have five islands of SANs,” he says.

More granular control over who can manage which portions of a SAN is one of the features customers ask for most frequently, says Kamy Kavianian, a product marketing director at Brocade Communications Systems Inc. in San Jose. He says customers also need the following:

• Stronger authentication to verify the identities of both administrators and devices.

• The ability to use a wider variety of methods, such as Telnet and Simple Network Management Protocol, to manage SANs.

• Encryption to protect SAN data from eavesdropping if it crosses public networks such as the Internet.

Identity SpoofingAuthentication — the ability toprove the identity of a personor device — becomes crucialas more users are able to tapinto SANs and as data frommore sources is commingledin corporate storage networks.Spoofing the identity of a per-son, or even of a device suchas a host bus adapter, is a realthreat, Lam says.

SAN Security Glossary

FABRIC: The hardware and software that connect a network of storage devices to one another, to servers and eventually to clients.

LUN MASKING: Using the Logical Unit Number (LUN) of a storage device, or a portion of a storage device, to determine which storage resources a server or host may see.

PORT: A physical connection on a storage switch that links that switch to storage devices, servers or other switches. Many SAN security techniques limit which devices a port can connect to or the manner in which it connects to those devices.

SPOOFING: Impersonating the identity of an individual (such as a storage administrator) or of a device (such as a storage switch) to gain unauthorized access to a storage resource.

TRUSTED SWITCH: A switch within a storage network that uses a digital certificate, key or other mechanism to prove its identity.

VSAN: A virtual SAN, which functions like a zone but uses a different layer of the Fibre Channel protocol to enforce which devices in the fabric can speak to other devices.

WORLD WIDE NAME: A unique numeric identifier for a device on a storage network, such as a disk array or a switch.

ZONE: A collection of Fibre Channel device ports that are permitted to communicate with each other via a Fibre Channel fabric.

Security Tips

MAINTAIN current network-access lists.

GET up to speed on the port-zoning method your vendor uses (they’re not all the same).

CHANGE default passwords on new hardware.

DESIGN the topology of a SAN with network security administrators.

ENCRYPT data both at rest and in transit.

CONSIDER carefully how you dispose of old hard drives and backup tapes.

Spoofing the identity of a device should be impossible because manufacturers give each device a unique WWN that identifies it to other parts of the storage network, says Lam. But manufacturers deliberately let customers change the WWN through an upgrade to the firmware in the device, he says. That makes it easier, for example, for a customer to replace a switch in a storage network without having to update every device that communicates with that switch’s new WWN.
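To see why a changeable WWN matters, it can help to sketch the two access checks the glossary describes, zoning and LUN masking. The Python below is a minimal illustration, not any vendor’s implementation: it assumes zone membership and LUN masks are both keyed on WWNs, and every WWN, zone name and LUN number in it is hypothetical.

```python
# Hypothetical WWNs, zone names and LUN numbers, for illustration only.
HOST_A = "10:00:00:00:c9:aa:aa:01"
HOST_B = "10:00:00:00:c9:bb:bb:02"
ARRAY_1 = "50:06:01:60:cc:cc:cc:03"

# A zone is a set of devices allowed to talk to one another.
# Assumption: membership is recorded by WWN rather than by switch port.
ZONES = {
    "finance_zone": {HOST_A, ARRAY_1},
    "hr_zone": {HOST_B},
}

# A LUN mask lists, per array, which LUNs each host WWN may see.
LUN_MASKS = {
    ARRAY_1: {HOST_A: {0, 1}},
}


def same_zone(wwn_a: str, wwn_b: str) -> bool:
    """Return True if at least one zone contains both devices."""
    return any(wwn_a in members and wwn_b in members for members in ZONES.values())


def visible_luns(host_wwn: str, array_wwn: str) -> set:
    """LUNs a host may see: it must share a zone with the array and be in its mask."""
    if not same_zone(host_wwn, array_wwn):
        return set()
    return set(LUN_MASKS.get(array_wwn, {}).get(host_wwn, set()))


print(visible_luns(HOST_A, ARRAY_1))  # zoned and masked in
print(visible_luns(HOST_B, ARRAY_1))  # not in the array's zone

# A device reprogrammed to present HOST_A's WWN passes both checks,
# because neither check can tell the two devices apart.
spoofed_wwn = HOST_A
print(visible_luns(spoofed_wwn, ARRAY_1))
```

Both checks trust the identifier a device presents, which is why the vendors quoted here are looking at keys and certificates to prove identity.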

Many vendors are planning key-based authentication to create “trusted” administrators with the authority to manage only a subset, such as a zone, of a corporate SAN. This might be overkill in small environments such as Alloy’s, but Tajudeen says, “I could see it being an issue if you have a larger set of administrators.”

Encryption may increase in importance as more SAN data migrates from Fibre Channel to IP and as storage over IP allows data to travel farther outside the data center than is possible with Fibre Channel. “It is nice to have certain types of data encrypted,” says Tajudeen, but only if the encryption isn’t too expensive and doesn’t exact too much of a toll on performance.

Building the Business Case

Storage managers must also get ready to explain the intricacies of SAN security to their less-technical peers, says John Webster, a senior analyst at Data Mobility Group Inc. in Nashua, N.H. Some pioneers looking to consolidate corporate data on SANs are facing tough questions from department heads worried about how their data will be kept separate from data generated by other business units, and from chief security officers worried about whether the SAN will be secure from outside threats.

First, “you’ve got to figure out how, or if, you can overcome” such objections, says Webster, and be prepared to defend your plan in understandable terms. “If you’re not prepared to answer them, you can be in trouble,” he says.


Long-Distance Data Replication

SAFE & SECURE

Even before the 9/11 terrorist attacks and the temporary shutdown of the nation’s airlines, IT managers were beginning to use the words disaster recovery and storage in the same sentence, especially in the financial services industry.

But afterwards, the marriage of the two disciplines seemed even more urgent. The potential for disasters seemed bigger, and disaster recovery plans that called for flying tapes across the country seemed naive.

The result is that the words disaster recovery are now actually driving many storage technology projects, as corporate IT managers look for ways to replicate data and send it to sites that are 10, 20 or even hundreds of miles from headquarters. Here are three technology strategies they’re using.

1. THE STORAGE SUBSYSTEMS APPROACH. When Lajuana Earwood began looking for a new disaster recovery system, she found she was alone. “No one else had done what we were trying to do. We wanted to mirror massive amounts of data over a very long distance, about 700 miles,” she says.

Earwood, director of mainframe systems at Norfolk Southern Railway Co. in Norfolk, Va., realized that the company’s business systems were too vulnerable. “At the time we were replicating a small subset of our data in real time,” says Earwood. “But it was not really enough to carry us through in the event of a major disaster.”

The data sets amounted to about 6TB of critical railroad, payroll and order entry information on two IBM mainframes, essentially the railroad’s IT hub. So Earwood sent out a bid request to all the major storage vendors; Hitachi Data Systems Corp. in Santa Clara, Calif., got the job. “IBM’s proposal would have required too much additional hardware,” says Earwood. “EMC’s solution gave only snapshots. HDS gave us the closest thing to real-time mirroring, and it required less hardware.”

And she liked the price. Along the way there was one major change in direction, though. The original plan was to replicate to a site in North Bergen, N.J., about 700 miles away. “But after Sept. 11, we realized that might not be such a good idea. The logistics of transporting personnel could effectively negate all our other efforts,” Earwood says. So Norfolk Southern decided instead to use a much closer backup center in Buckhead, Ga., for the mirrored mainframe data.

The sheer volume of data presented another challenge. “We wanted to put all our data in one consistency group,” says Earwood. A consistency group is a set of data that is shared by a number of critical applications, but in Norfolk Southern’s case all of the data is shared by all of the applications. It made sense on paper, but the execution pushed the technology envelope a bit too far.

“We are using HDS 9960 storage hardware, the HDS TrueCopy replication software and two OC3 network pipes,” says Earwood. “It all worked, but we kept hitting the ceiling on high-volume write transactions.”

The solution was to split the data into three consistency groups. But Earwood isn’t giving up on the original goal. “We are looking at new hardware from HDS that can handle more volumes. This might allow us to consolidate all our data back to one consistency group,” she says.

Earwood tests the system with a simulated disaster recovery almost every week. “We are almost down to a four-hour recovery time,” she says, “and we now feel that we can go to our board of directors and say that we have confidence in our disaster recovery system.”
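The Norfolk Southern figures lend themselves to a back-of-the-envelope check of what two OC3 pipes can carry. The sketch below is illustrative arithmetic only, not the railroad’s own planning: it assumes the standard OC-3 line rate of 155.52Mbit/sec., treats 6TB as 6 x 10^12 bytes, and ignores protocol overhead and compression, so real throughput would be lower.

```python
OC3_MBIT_PER_S = 155.52  # standard SONET OC-3 line rate


def link_bytes_per_second(links: int, mbit_per_s: float) -> float:
    """Aggregate raw capacity of several identical links, in bytes per second."""
    return links * mbit_per_s * 1_000_000 / 8


def hours_to_copy(total_bytes: float, bytes_per_s: float) -> float:
    """Time to move a data set at a sustained rate, in hours."""
    return total_bytes / bytes_per_s / 3600


rate = link_bytes_per_second(links=2, mbit_per_s=OC3_MBIT_PER_S)
full_copy_hours = hours_to_copy(total_bytes=6e12, bytes_per_s=rate)

print(f"Two OC3 pipes: about {rate / 1e6:.1f} MB/sec. at best")
print(f"Full copy of 6TB: about {full_copy_hours:.0f} hours")
```

Under these assumptions the links top out a little under 39MB/sec., and any burst of writes above that rate has to queue, which is consistent with the ceiling Earwood describes on high-volume write transactions.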

Business Continuity Tips

DECIDE on your recovery objectives before selecting technologies and spending money.

DON’T NEGLECT the people part of business continuity. The best data replication system in the world won’t help if your people aren’t trained and in place to take advantage of it.

LEVERAGE the infrastructure you already have. For example, if you have dark fiber in place, it might be cost-effective to go with a high-end SAN and Dense Wavelength Division Multiplexing for data replication.

CONSIDER that if a disaster occurs and you have to use the airlines to get to a remote site, your recovery time will increase, if you can fly at all.


2. HOST-BASED SOFTWARE. When Chadd Warwick, operations manager at Comprehensive Software Systems Inc., a financial software development house in Golden, Colo., went shopping for a new business continuity system, he wanted something a little more flexible than the hardware-based systems from vendors such as Hitachi and Hopkinton, Mass.-based EMC Corp.

He found it in Veritas Volume Replicator (VVR) software from Veritas Software Corp. in Mountain View, Calif. “We liked VVR,” says Warwick, “because it is a host-based, software solution.”

Because the software runs on the server instead of on the disk array, it’s independent of the storage hardware. “It meant we didn’t have to forklift a new hardware infrastructure in, which meant lots of savings for us,” says Warwick.

Warwick started using VVR in November 2001 as a beta tester and decided to stick with it. “This is block-level data replication so the software doesn’t need to know anything about the applications or the data. And the hardware independence is really nice,” he says. “You can actually restore to different hardware, so, in the event of a major disaster, we could even run down to Best Buy and pick up whatever machines we could find to get us up and running quickly.”
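The idea behind block-level replication can be shown in a few lines. The following Python is a toy model and not VVR’s actual design: the class names, the 4,096-byte block size and the in-memory queue are all assumptions made for illustration. It records every write as a block number plus raw bytes and replays the writes in order on a second volume, without ever interpreting what the bytes mean.

```python
from collections import deque

BLOCK_SIZE = 4096  # assumed block size, for illustration only


class Volume:
    """A toy block device: a fixed number of equally sized blocks."""

    def __init__(self, block_count: int) -> None:
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(block_count)]

    def write(self, block_no: int, data: bytes) -> None:
        if len(data) != BLOCK_SIZE:
            raise ValueError("writes must be exactly one block")
        self.blocks[block_no] = data


class ReplicatedVolume:
    """Applies each write locally and queues a copy for the remote site."""

    def __init__(self, primary: Volume) -> None:
        self.primary = primary
        self.pending = deque()  # writes not yet sent, oldest first

    def write(self, block_no: int, data: bytes) -> None:
        self.primary.write(block_no, data)
        self.pending.append((block_no, data))


def drain(source: ReplicatedVolume, replica: Volume) -> int:
    """Replay queued writes on the replica in their original order."""
    sent = 0
    while source.pending:
        block_no, data = source.pending.popleft()
        replica.write(block_no, data)
        sent += 1
    return sent


primary = Volume(block_count=8)
replica = Volume(block_count=8)
volume = ReplicatedVolume(primary)

volume.write(2, b"A" * BLOCK_SIZE)
volume.write(5, b"B" * BLOCK_SIZE)
volume.write(2, b"C" * BLOCK_SIZE)  # a later write to the same block wins

sent = drain(volume, replica)
print(sent, replica.blocks == primary.blocks)
```

Because the replica only ever sees block numbers and bytes, it can sit on any hardware that presents a volume of the same size, which is the hardware independence Warwick describes.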

Another advantage, Warwick says, is the absence of complex, proprietary network protocols. “VVR uses standard IP,” he says. Currently, the company replicates about 400MB to 1GB of data per day over T1 and T3 lines from the data center in Golden to a site in downtown Denver.
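A quick calculation shows why a daily change volume of that size fits comfortably on leased lines. The sketch assumes the standard T1 and T3 line rates (1.544Mbit/sec. and 44.736Mbit/sec.), decimal megabytes and gigabytes, and no protocol overhead, so actual times would be somewhat longer.

```python
T1_BITS_PER_S = 1_544_000    # standard T1 line rate
T3_BITS_PER_S = 44_736_000   # standard T3 line rate


def transfer_minutes(size_bytes: float, bits_per_s: float) -> float:
    """Minutes needed to send size_bytes over a link running at full line rate."""
    return size_bytes * 8 / bits_per_s / 60


low_t1 = transfer_minutes(400e6, T1_BITS_PER_S)
high_t1 = transfer_minutes(1e9, T1_BITS_PER_S)
high_t3 = transfer_minutes(1e9, T3_BITS_PER_S)

print(f"400MB over T1: about {low_t1:.0f} minutes")
print(f"1GB over T1: about {high_t1:.0f} minutes")
print(f"1GB over T3: about {high_t3:.0f} minutes")
```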

The software approach is usually cheaper than subsystems products, especially for replication over long distances, says Bob Guilbert, vice president of NSI Software Inc. in Hoboken, N.J., which competes with Veritas. “The subsystems products typically require dedicated fiber, and that can get very expensive.”

3. THE HYBRID: SAN OVER IP. Shimon Wiener, like many of his peers, started looking for a better way to protect his firm’s data after the terrorist attacks of Sept. 11, 2001. Wiener is the manager of the Internet and networking department at Mivtachim, a leading provider of pension insurance in Ramat Gan, Israel. It’s a part of the world where, unfortunately, disasters aren’t a rare occurrence.

The firm has two data centers: one with 600MB, in Ramat Gan, and the other with 400MB, about seven miles away in Tel Aviv. “We wanted to do a double replication,” says Wiener, “so if Tel Aviv goes down, Ramat Gan can take over, and vice versa.” But when Wiener went shopping, he was dismayed at the high prices. “We first looked at Compaq, IBM and EMC. None of them could do this without a Fibre Channel connection, which was very expensive,” he says.

Wiener finally found what he wanted from Dot Hill Systems Corp. in Carlsbad, Calif. Mivtachim wanted a storage-area network (SAN) at each site, and Dot Hill’s Axis Storage Manager software supports IP replication for SAN-based systems.

Beginning in January, the pension company installed the Dot Hill SANnet 7100 hardware in Ramat Gan and a second SANnet 7100 in Tel Aviv, and then started replicating between sites. Testing lasted about four weeks. “Initially, we had some problems getting the two systems synchronized,” says Wiener, “but we had good support and now we are very satisfied. We replicate about once a day.”

Although Wiener got the system for disaster recovery purposes, he says it had a significant side benefit: “The SAN also centralized our data so backups are much easier to manage.”


Advances in Tape Backup

SAFE & SECURE

Like many IT executives, Eric Eriksen, chief technology officer at New York-based Deloitte Consulting, would like tape to just go away. The added cost of managing tape backup systems, slow and unreliable restoration, cartridge inventorying and off-site storage headaches have him hoping that cheap disk drives may someday replace 50-year-old tape technology in the data center.

“We only need tape for cases when we can’t restore from disk. It’s a necessary evil,” he says.

Yet despite a drastic shift toward low-cost Advanced Technology Attachment disk arrays for backing up business data, there’s no end in sight to the use of tape in the data center, especially for archival storage. Administrators may complain, but tape still has an enormous installed base and remains 10 to 50 times less expensive than disk. It’s also very secure, since data stored off-line on removable media is physically inaccessible to hackers and viruses.

And vendors and analysts say evolutionary advances in the basic technology in midrange tape drive systems, improvements in management tools, and the emergence of combined disk/tape subsystems are likely to answer some user complaints and keep tape technology in data centers for at least another decade.

Bigger and Faster

Manufacturers of the three leading midrange tape drive technologies, digital linear tape (DLT), linear tape-open (LTO) and advanced intelligent tape (AIT), are preparing significant capacity and speed improvements. Advanced drives, including SuperDLT (SDLT), SuperAIT (S-AIT) and LTO Ultrium 2 (LTO-2), are the latest variations. Each uses half-inch tape and offers roughly five times the capacity and performance of standard DLT, AIT and LTO tapes.

For example, DLT was developed in 1986 and the average cartridge originally held about 96MB of data. SDLT today holds 160GB. Over the next decade, SDLT will grow to about 2.5TB native capacity with 250MB/sec. throughput. LTO, which derives its name from its open architecture, could grow to 10TB native capacity by 2011.

Vendors say 1TB tape cartridges could appear as early as next year. Tape manufacturers such as Quantum Corp., Certance LLC and Storage Technology Corp. expect tape to more than meet future needs. That’s a tall order, since the amount of data produced by the average enterprise is doubling every year, according to Gartner Inc.

To keep up, tape media will evolve to have more than 1,000 tracks and a thickness of 6.9 microns (about as thick as cellophane). And it will also work with drives that write on both sides of the tape, says Jeff Laughlin, director of strategy for the automated tape solutions unit at StorageTek in Louisville, Colo.

In contrast, StorageTek’s current high-end tape drive, the proprietary T9940B, uses 200GB, one-sided tape that has 576 tracks and is 9 microns thick. Laughlin expects transfer rates to keep up with the larger capacity tapes as well. “There’s more money being spent on tape media research than ever before in history. You’re going to see greater transfer rates at the head interface, transfer rates of 100GB/sec., 200GB/sec.,” he says.

Smarter

Emerging management software that can monitor the health of tape drives, Fibre Channel switch port connections to libraries and even the tape cartridges themselves will help ensure that users are able to restore from tape, more easily manage backups and predict problems and backup failures, vendors say. Advanced Digital Information Corp. (ADIC) and Quantum, for example, have recently introduced native management software tools on their tape library and drive technology.

ADIC sells all major tape cartridge technologies in its automated libraries and tape autoloaders, but Dave Uvelli, an executive director at the Redmond, Wash.-based company, says he believes cartridge formats and drive technologies are becoming irrelevant. Instead, ADIC is betting on new, intelligent tape library systems that will eventually provide detailed information on drives and tape, whether it’s related to a downed switch port, a stuck drive or a tape cartridge that’s reaching the end of its life.

One example of archival intelligence is ADIC’s Scalar i2000 tape library. The Scalar i2000 is designed to eliminate the need for an external library control server. Among other things, the system can send backup failure alerts via pager or e-mail, partition a library into multiple logical libraries and perform mixed media, performance and proactive system readiness checks.

San Jose-based Quantum also introduced DLTSage, a suite of predictive and preventative diagnostic tools that run on its SDLT tape drives to help ensure that backups have completed successfully. The applications can also tell administrators when drives have reached critical thresholds for capacity and predict where and when errors may occur.

Here Come the Hybrids

While disk-to-disk backup is already popular, during the coming year, manufacturers plan to introduce more hybrid systems that combine disk with tape libraries in storage-area networks for faster backups and restores and easier archiving. ADIC, for example, plans to introduce a combined tape/disk library this month.

“You won’t just have tape. One could imagine RAID-protected disk where I/Os from the backup job are completed at [wire] speed while the [library] robot, through management software, stages it on tape drives for archival,” says StorageTek’s Laughlin.
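The staging pattern Laughlin describes can be modeled in miniature. This Python sketch is purely illustrative and describes no shipping product: the class and method names are invented, the disk cache and tape are plain in-memory containers, and staging runs on demand rather than in the background.

```python
from collections import deque


class HybridLibrary:
    """Toy disk-plus-tape library: backups land on disk, then move to tape."""

    def __init__(self) -> None:
        self.disk_cache = deque()  # recent backup jobs, oldest first
        self.tape = []             # archival copies written by the stager

    def backup(self, job_name: str, data: bytes) -> None:
        """Accept a backup job; it is complete once it is on disk."""
        self.disk_cache.append((job_name, data))

    def stage_to_tape(self, max_jobs=None) -> int:
        """Move the oldest jobs from disk to tape; return how many moved."""
        staged = 0
        while self.disk_cache and (max_jobs is None or staged < max_jobs):
            self.tape.append(self.disk_cache.popleft())
            staged += 1
        return staged

    def restore(self, job_name: str):
        """Return (data, source), preferring the faster disk copy."""
        for name, data in self.disk_cache:
            if name == job_name:
                return data, "disk"
        for name, data in self.tape:
            if name == job_name:
                return data, "tape"
        raise KeyError(job_name)


library = HybridLibrary()
library.backup("monday", b"payroll")
library.backup("tuesday", b"orders")

staged = library.stage_to_tape(max_jobs=1)  # the oldest job goes to tape
print(staged, library.restore("monday")[1], library.restore("tuesday")[1])
```

The backup job never waits on a tape drive in this model, and recent restores come from disk while older data is still recoverable from tape.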

Ultimately, however, scalability and restorability will continue to be the key criteria to take into account when selecting tape systems, says Deloitte Consulting’s Eriksen. “We’re looking for a single solution that can cover everything, regardless of the needs we have,” he adds.
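On the scalability question, the capacity road map and the data growth estimate quoted in this article can be put side by side. The arithmetic below uses only those two figures (SDLT growing from 160GB to about 2.5TB over a decade, and enterprise data doubling every year) and assumes both hold steady for 10 years, which is a simplification rather than a forecast.

```python
YEARS = 10


def growth_factor(start: float, end: float) -> float:
    """How many times larger the end value is than the start value."""
    return end / start


def annual_rate(factor: float, years: int) -> float:
    """Compound annual growth rate implied by an overall growth factor."""
    return factor ** (1 / years) - 1


cartridge_factor = growth_factor(start=160, end=2500)  # capacities in GB
data_factor = 2 ** YEARS                               # doubling every year
cartridge_rate = annual_rate(cartridge_factor, YEARS)
cartridges_ratio = data_factor / cartridge_factor

print(f"Cartridge capacity grows about {cartridge_factor:.1f}x")
print(f"That is roughly {cartridge_rate:.0%} per year")
print(f"Data grows {data_factor}x, so cartridge counts rise about {cartridges_ratio:.0f}x")
```

If both trends held, a site would need many more cartridges at the end of the decade than at the start, which helps explain the emphasis on library automation and management software.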

Choosing the Right Format

Will the SDLT, LTO-2 or S-AIT tape drive technology you’re using today be around tomorrow? Most likely, vendors and analysts say, although some users are finding reasons to switch from one format to another.

Even with software advances in SDLT, more users are buying LTO-2 drives these days. Bob Abraham, an analyst at Freeman Reports in Ojai, Calif., says LTO-2 appeals to users because its open architecture offers a choice of vendors. Hewlett-Packard Co., IBM and Costa Mesa, Calif.-based Certance all manufacture LTO-2 products, whereas only Quantum produces DLT and SDLT drives.

In July, Quantum put self-diagnosing intelligence into its SDLT drives, a move that analysts say will help boost sales. Quantum also says it has plans for at least four more incarnations of SDLT, and the vendor has 31% of the overall tape market, more than any of its competitors.

But John Pearring, president of StorServer Inc. in Colorado Springs, a manufacturer that sells all three tape technologies, still gives LTO the edge. “LTO is open and makes more sense, and it’s 200GB native [vs. 160GB for the latest SDLT 320 drives],” he says.

Deloitte’s Eric Eriksen says he’s looking at moving from four HP tape libraries, with eight SDLT drives each, to a single HP or ADIC Scalar 10K tape library using LTO drives for greater capacity in a smaller footprint. He says his decision isn’t being driven so much by LTO-2’s openness, but by its compression rates and speeds, which, for the moment, exceed those of SDLT. He also says that the new LTO-2 libraries are more scalable than his older system.

“One of the things that’s important when we’re doing streaming across multiple tape drives is to be able to restore quickly,” he says, referring to LTO-2’s 200GB capacity and 35MB/sec. throughput.

And while LTO has a capacity and performance edge over SDLT today, analysts say the two tape technologies continuously leapfrog each other in capacity and throughput, so other factors may be more important.

SDLT and LTO-2 may be neck and neck in speeds and feeds, but Sony Electronics Inc.’s S-AIT leapfrogged both with the vendor’s introduction of a 500GB, 30MB/sec. drive in December, and it’s likely to remain ahead for some time, based on current SDLT and LTO road maps.

S-AIT also has the edge in pricing: S-AIT tape cartridges are $80, vs. $120 for LTO-2 and $130 for SDLT. Sony intends to develop and support S-AIT through at least a sixth generation, says Stephen Baker, vice president of storage solutions at Sony in San Jose. But S-AIT’s appeal has been limited because, as with SDLT, only one manufacturer produces the drives.
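The prices and capacities quoted in this sidebar translate into a simple cost-per-gigabyte comparison. The sketch below uses only figures given above: native capacities of 500GB, 200GB and 160GB, the cartridge prices, and the throughput numbers for S-AIT and LTO-2. No SDLT throughput is quoted here, so that entry is left empty rather than guessed, and 1GB is treated as 1,000MB.

```python
# Figures as quoted in the sidebar; None marks a value the text does not give.
FORMATS = {
    "S-AIT": {"capacity_gb": 500, "cartridge_usd": 80, "mb_per_s": 30},
    "LTO-2": {"capacity_gb": 200, "cartridge_usd": 120, "mb_per_s": 35},
    "SDLT": {"capacity_gb": 160, "cartridge_usd": 130, "mb_per_s": None},
}


def usd_per_gb(name: str) -> float:
    """Media cost per gigabyte of native capacity."""
    spec = FORMATS[name]
    return spec["cartridge_usd"] / spec["capacity_gb"]


def hours_to_fill(name: str):
    """Hours to stream one full cartridge, or None if no throughput is given."""
    spec = FORMATS[name]
    if spec["mb_per_s"] is None:
        return None
    return spec["capacity_gb"] * 1000 / spec["mb_per_s"] / 3600


for name in FORMATS:
    hours = hours_to_fill(name)
    fill = f"{hours:.1f} hours per cartridge" if hours is not None else "throughput not quoted"
    print(f"{name}: ${usd_per_gb(name):.2f} per GB, {fill}")
```

The same throughput figure also sets a floor on how long it takes to read a full cartridge back, which is the restore-time concern Eriksen raises.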


Backing Up the Edge

SAFE & SECURE

Mike Lucas, IT director at Hogan & Hartson LLP, had had enough. The Washington-based law firm was paying $30,000 a month to back up data on more than 400 servers located in 27 offices worldwide and store the tapes off-site. Lucas says he couldn’t stomach the cost of buying more tape drives to back up every new print, file or application server.

Along with the increasing costs, the tape-based infrastructure created administration issues, including the need to sometimes rely on nontechnical staffers to swap out tape cartridges in each remote office every night and take them off-site.

Then there were the software glitches. “We’d have trouble from time to time with a tape getting hung, having to do a reboot of a server during off hours. We were at risk of not having a backup,” Lucas says, adding that retrieving tapes for restoring data in an emergency could take more than a day.

Hit or Miss

Data protection executed at remote sites is often a hit-or-miss scenario because “no one knows if the backup actually happened or if a restore can occur,” says Arun Taneja, an analyst at Taneja Group Inc. in Hopkinton, Mass.

Those frustrations led Lucas to use a remote backup strategy that brings backup data into the data center, where it can be centrally managed. Vendors offer a variety of network-based schemes that pull data across a WAN to a central repository. These systems are simpler to manage and more cost-effective than local tape backups, analysts say.

Most include software and appliances that replicate data from branch offices to the data center, where it is backed up to a disk device and/or tape library. This model eliminates the need for media handling or IT support at remote sites and offers greater security, since backup data is centralized.

The increasing popularity of these systems is starting to affect sales of entry-level tape drives commonly used to back up direct-attached storage. IDC in Framingham, Mass., is forecasting a 20% decline this year as administrators increasingly decide not to back up branch servers locally.

The Options

Vendors offer several approaches to remote backup. Software such as Veritas Software Corp.’s Storage Replicator and CYA Technologies Inc.’s HotBackup first executes a complete backup of direct-attached storage on each remote server or network-attached storage appliance and then moves incremental or “delta” changes over the WAN to the data center.

Some organizations with branch offices that host multiple servers are choosing to first consolidate backups to a local disk-backup appliance before replicating data across the WAN. The appliance can complete server backups quickly across a LAN and then stream updates over the slower WAN connection to the data center, where the data can be archived to tape.

For workstation backups, some storage administrators are creating virtual drives on remote end-user PCs and mapping those to a file server back in the data center. To avoid performance problems over the WAN, administrators install a local data-caching appliance that gives users access to their files at LAN speeds while updates stream in the background to the back-end appliance in the data center.

Lucas contracted with DS3 Data Vaulting LLC, a service provider in Fairfax, Va., for his network backup system, which includes disk-based appliances and software from Asigra Inc. in Toronto. Asigra’s Televaulting DS-Client software runs on servers, desktops and laptops connected to each remote office LAN and automates the backup of about 3TB of compressed data from local backup appliances in 10 offices over the WAN to an AT&T data center. After completing an initial full backup, the remote appliance provides updates only for changed data blocks. It eliminates duplicate files, encrypts the data and compresses it at a 2-to-1 ratio before automatically sending it across the WAN on a scheduled basis. Lucas expects a two-year payback on his investment. The initial system installation in Hogan & Hartson’s central office cost about $13,000. He has deployed 10 offices to date and is continuing to roll out the technology.
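The “changed blocks only” approach described above can be sketched briefly. This Python example is an illustration under stated assumptions and not Asigra’s algorithm: it assumes fixed 4,096-byte blocks, detects changes by comparing SHA-256 digests, and compresses each changed block with zlib. Encryption and duplicate-file elimination, which the product also performs, are left out.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def split_blocks(data: bytes) -> list:
    """Cut data into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def block_digests(data: bytes) -> list:
    """Digest of every block, kept from the previous backup run."""
    return [hashlib.sha256(block).hexdigest() for block in split_blocks(data)]


def build_delta(previous_digests: list, current: bytes) -> dict:
    """Compressed copies of only the blocks that are new or changed."""
    delta = {}
    for index, block in enumerate(split_blocks(current)):
        digest = hashlib.sha256(block).hexdigest()
        if index >= len(previous_digests) or previous_digests[index] != digest:
            delta[index] = zlib.compress(block)
    return delta


def apply_delta(base: bytes, delta: dict, new_length: int) -> bytes:
    """Rebuild the current data at the central site from the last copy plus a delta."""
    block_count = -(-new_length // BLOCK_SIZE)  # ceiling division
    blocks = split_blocks(base)[:block_count]
    for index in sorted(delta):
        block = zlib.decompress(delta[index])
        if index < len(blocks):
            blocks[index] = block
        else:
            blocks.append(block)
    return b"".join(blocks)


monday = b"a" * BLOCK_SIZE + b"b" * BLOCK_SIZE + b"c" * BLOCK_SIZE
tuesday = b"a" * BLOCK_SIZE + b"X" * BLOCK_SIZE + b"c" * BLOCK_SIZE + b"d" * 100

delta = build_delta(block_digests(monday), tuesday)
restored = apply_delta(monday, delta, len(tuesday))
print(sorted(delta), restored == tuesday)
```

Only two of the four blocks cross the WAN in this example, and the central copy still ends up identical to the branch office’s data.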

Caching Up

Companies such as Actona Technologies Inc. (recently acquired by Cisco Systems Inc.), Riverbed Technology Inc., Disksites Inc. and Tacit Networks Inc. use appliances at both the remote site and the central data center for global file sharing. The appliances speed up access to shared files in part by removing the overhead associated with file-serving protocols such as the Common Internet File System and Network File System.

Mukesh Shah, director of network services at The Associated Merchandising Corp. (AMC) in Plainfield, N.J., is in charge of file-sharing operations among 40 remote locations in a worldwide network that includes data center hubs in Hong Kong and New Jersey.

AMC uses MetaFrame serverware from Fort Lauderdale, Fla.-based Citrix Systems Inc., which gives Windows XP PCs and wireless devices virtual, thin-client access to applications running on back-end servers. It also uses the New Jersey data center for global file sharing of Excel spreadsheets, Microsoft Word documents and other files.

But users in Asia and Europe were waiting more than two minutes for remote files to open. The system also lacked adequate file-locking safeguards for some shared files. Users were “quite unhappy,” Shah says. Eight months ago, Shah began piloting a caching appliance from South Plainfield, N.J.-based Tacit Networks in his New Jersey data center. File-access times dropped from an average of 122 seconds to 11 seconds on first access, and the end-user wait disappeared altogether on subsequent attempts once the file was loaded into the local appliance’s cache. “Tacit has a process where you can push files to a local cache on a scheduled basis,” Shah says. “So when users go to access the file, it’s already there.”

When users change and save the file back to the cache, it’s also saved on the main file-sharing server in New Jersey, where AMC staffers back it up. “All restores can be done centrally, whereas if we had to substitute the cache appliance with a file server, we’d have the complexity of backups and restores at the remote office level,” Shah says.
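The behavior Shah describes, scheduled pushes into a local cache plus saves that also reach the central server, resembles a write-through cache with prefetching. The Python below is a generic sketch of that pattern, not Tacit’s product: the class names are invented, both stores are in-memory dictionaries, and file locking is not modeled.

```python
class FileServer:
    """Stands in for the central file server; counts reads over the WAN."""

    def __init__(self) -> None:
        self.files = {}
        self.wan_reads = 0

    def read(self, path: str) -> bytes:
        self.wan_reads += 1
        return self.files[path]

    def write(self, path: str, data: bytes) -> None:
        self.files[path] = data


class CacheAppliance:
    """Branch-office cache: serves repeat reads locally, writes through."""

    def __init__(self, server: FileServer) -> None:
        self.server = server
        self.cache = {}

    def prefetch(self, paths) -> None:
        """Scheduled push: load files before anyone asks for them."""
        for path in paths:
            self.cache[path] = self.server.read(path)

    def read(self, path: str) -> bytes:
        if path not in self.cache:  # only a first access crosses the WAN
            self.cache[path] = self.server.read(path)
        return self.cache[path]

    def write(self, path: str, data: bytes) -> None:
        self.cache[path] = data
        self.server.write(path, data)  # the central copy stays authoritative


server = FileServer()
server.files["budget.xls"] = b"v1"
appliance = CacheAppliance(server)

appliance.read("budget.xls")   # first access goes to the central server
appliance.read("budget.xls")   # second access is served from the cache
appliance.write("budget.xls", b"v2")
print(server.wan_reads, server.files["budget.xls"])
```

Because every save lands on the central server, backups and restores only have to happen there, which is the operational benefit Shah points to.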

Outsourcing It All

Overworked IT organizations that don’t have the time or resources to set up a remote backup system can consider similar offerings from service providers.

Brian Asselin, IT director at Harborside Healthcare Corp., a Boston-based long-term care company, oversees operations for 55 locations and 8,500 employees, but he says he has only one IT person for each of the nine states in which facilities are located.

Harborside had been using direct-attached tape backup for its remote application servers, but Asselin says ensuring that backups occurred and performing restores were a nightmare.

“Our people working in the facilities are definitely technically challenged,” he says. “Logistically, it would be impossible to restore with the people I have.”

What’s more, the Health Insurance Portability and Accountability Act requires greater security around patient information than Harborside’s IT infrastructure can provide, Asselin says. “There’s just a slew of security that needs to be in place by 2005 for HIPAA,” he says. Instead of building a central data center where data could be replicated for disaster recovery purposes and further burdening his IT staff, Asselin chose service provider AmeriVault Corp. in Waltham, Mass., to host backup data storage and handle daily replication from the remote sites.

AmeriVault installed its CentralControl software on Harborside’s desktops and an agent on each of its servers. After completing an initial full backup of all data, the vendor performs daily, incremental, encrypted backups over the Internet to its disaster recovery centers.

Point-and-Click Portal

In an emergency, administrators at Harborside can perform data restores, even from home, using a point-and-click application on AmeriVault’s Web portal. Alternatively, data can be shipped on tape for large restores.

Asselin says AmeriVault has “processes and procedures” that are HIPAA-compliant, which relieves his staff from having to set up its own compliance program. And Asselin says he also reduced labor costs by outsourcing his remote backup and recovery architecture because “we don’t have to have people running around dedicated to the task of backup.”

But while the remote backup technology made processes more efficient, the outsourcing approach wasn’t necessarily cheaper.

“In terms of actual backup cost, it’s pretty much a wash. When you consider bandwidth and license payments for software, it’s pretty much even with other backup solutions,” Asselin says.

Tony Asaro, an analyst at Enterprise Storage Group Inc. in Milford, Mass., says the costs of edge network backup technologies are continuing to drop, and as large companies investigate using these systems, big vendors are stepping in with new products.

Asaro points to EMC Corp.’s entry-level Clariion AX100 array, which can be directly attached to its NetWin 110 NAS Gateway or bought as a preconfigured storage-area network with backup and storage management software for remote office backup.

And EMC’s Legato RepliStor replication software is bundled with switches from Brocade Communications Systems Inc.

“I don’t think it’s a fad,” Asaro says. “I think more people are going to adopt this technology because it’s cost-effective.”

Remote Backup Systems

PROS

• Ease backup management headaches by removing tape drives from branch offices and consolidating them in the data center.

• Afford greater security by centralizing backup data.

• Eliminate need for branch office staff involvement in backup processes such as tape rotations.

CONS

• Experience problems with replication throughput if WAN bandwidth is insufficient.

• Require additional software, adding complexity to the backup process.

• Require upfront investment in software and possibly hardware as well, depending on the system chosen.

Emerging Technologies Computerworld Executive Briefings 23

DEFINITION: Grid storage, analo-gous to grid computing, is a newmodel for deploying and managingstorage distributed across multiplesystems and networks, making ef-ficient use of available storage ca-pacity without requiring a large,centralized switching system.

We routinely talk about theelectrical power grid or thetelephone grid, and it’s prettyclear what we mean — a large,decentralized network withmassive interconnectivity andcoordinated management. Agrid is, in fact, a meshed net-work in which no single cen-tralized switch or hub controlsrouting. Grids offer almost un-limited scalability in size andperformance because theyaren’t constrained by the needfor ever-larger central switch-es. Grid networks thus reducecomponent costs and producea reliable and resilient struc-ture.

Applying the grid concept toa computer network lets usharness available but unusedresources by dynamically allo-cating and deallocating capaci-ty, bandwidth and processingamong numerous distributedcomputers. A computing gridcan span locations, organiza-tions, machine architecturesand software boundaries, of-fering power, collaborationand information access to con-nected users. Universities andresearch facilities are using

grids to build what amounts tosupercomputer capability fromPCs, Macintoshes and Linuxboxes.

After grid computing cameinto being, it was only a matterof time before a similar modelwould emerge for making useof distributed data storage.Most storage networks arebuilt in star configurations,where all servers and storagedevices are connected to a sin-gle central switch. In contrast,grid topology is built with anetwork of interconnectedsmaller switches that can scaleas bandwidth increases andcontinue to deliver improvedreliability and higher perform-ance and connectivity.

What Is Grid Storage? Based on current and pro-posed products, it appears thata grid storage system shouldinclude the following:

Modular storage arrays: Thesesystems are connected across astorage network using serialATA disks. The systems can beblock-oriented storage arraysor network-attached storagegateways and servers.

Common virtualization layer:Storage must be organized as asingle logical pool of resourcesavailable to users.

Data redundancy and availabili-ty: Multiple copies of datashould exist across nodes in

the grid, creating redundantdata access and availability incase of a component failure.

Common management: A sin-gle level of managementacross all nodes should coverthe areas of data security, mo-bility and migration, capacityon demand, and provisioning.

Simplified platform/managementarchitecture: Because commonmanagement is so important,the tasks involved in adminis-tration should be organized inmodular fashion, allowing theautodiscovery of new nodes inthe grid and automating vol-ume and file management.

Three Basic Benefits

Applying grid topology to a storage network provides several benefits, including the following:

RELIABILITY. A well-designed grid network is extremely resilient. Rather than providing just two paths between any two nodes, the grid offers multiple paths between each storage node. This makes it easy to service and replace components in case of failure, with minimal impact on system availability or downtime.

PERFORMANCE. The same factors that lead to reliability also can improve performance. Not requiring a centralized switch with many ports eliminates a potential performance bottleneck, and applying load-balancing techniques to the multiple paths available offers consistent performance for the entire network.

SCALABILITY. It’s easy to expand a grid network using inexpensive switches with low port counts to accommodate additional servers for increased performance, bandwidth and capacity. In essence, grid storage is a way to scale out rather than up, using relatively inexpensive storage building blocks.
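The reliability argument can be checked with a small Python sketch that compares a star layout with a grid of small switches when any one switch fails. Both topologies here are made-up examples, not layouts taken from a product: the star has one central switch, and the grid is a ring of four switches with the server and the array each attached to two of them.

```python
from collections import deque


def reachable(links, start, goal, failed=frozenset()):
    """Breadth-first search that ignores any failed switch."""
    if start in failed or goal in failed:
        return False
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in links.get(node, ()):
            if nxt not in failed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def undirected(pairs):
    """Build an adjacency map from a list of two-way links."""
    links = {}
    for a, b in pairs:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links


# Star: every device hangs off one central switch.
star = undirected([("server", "core"), ("array", "core")])

# Grid: four small switches in a ring; server and array each attach to two.
grid = undirected([
    ("sw1", "sw2"), ("sw2", "sw3"), ("sw3", "sw4"), ("sw4", "sw1"),
    ("server", "sw1"), ("server", "sw2"),
    ("array", "sw3"), ("array", "sw4"),
])

star_survives = reachable(star, "server", "array", failed={"core"})
grid_survives = all(
    reachable(grid, "server", "array", failed={sw})
    for sw in ("sw1", "sw2", "sw3", "sw4")
)
```

In this toy layout, the star loses the server-to-array path when its one switch fails, while the grid keeps a path open whichever single switch is removed.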

Grid Storage

EMERGING TECHNOLOGIES


Emerging Technologies Computerworld Executive Briefings 24

Vendor Offerings

The biggest player in the grid storage arena seems to be Hewlett-Packard Co., the first major storage vendor to deliver a grid storage product. HP’s StorageWorks Grid architecture stores information in numerous individual “smart cells,” which are standardized, modular and intelligent devices. HP’s smart cells will be smaller than the monolithic storage usually found in storage-area networks. They will be based almost entirely on low-cost, commodity hardware and provide a basic storage unit that customers can configure and change as needed to run many different tasks.

According to an HP technical white paper, smart cells contain a CPU and, optionally, cache memory in addition to storage devices (disk, optical or tape drives). The cells are interconnected to form a powerful, flexible, peer-to-peer storage network. All smart cells have a set of common software installed in them, but each can be given a specific function (or personality) by loading appropriate operational software for tasks such as capacity allocation, policy and reporting, block or file serving, archiving and retrieval, or auditing and antivirus services. Administrators can change the functions of specific smart cells to deliver different types of services as business needs change. And it’s easy to expand the grid by simply adding modules, which are automatically detected and incorporated into an appropriate domain.

Smart Cells

Smart cells are more capable than single-purpose disk arrays or tape libraries. Because the data path is completely virtualized, any smart cell can manage any I/O operation. Smart cell software maintains consistency between smart cells and ensures data redundancy and reliability. In the HP StorageWorks grid, all components are integrated to present a single system image for administration as a single entity. The system is designed from the ground up to be self-managing: tasks traditionally associated with storage resource management are performed by the utility itself, with no human involvement. The only time an administrator needs to know anything about individual smart cells is when failed hardware must be repaired or replaced. Even then, the single-system-image software provides fault isolation and failure identification to simplify maintenance.

So far, HP has announced four StorageWorks Grid products:

• Document Capture, Retention and Retrieval, using multifunction printers, scanners and digital senders to simplify the conversion of paper-based records to digital form.

• Sharable File System, a self-contained file server that distributes files in parallel across clusters of industry-standard, Linux-based server and storage components.

• Reference Information Storage System (RISS), an all-in-one archive and retrieval system for storing, indexing and rapidly retrieving e-mail and Microsoft Office documents, available in 1TB and 4TB configurations.

• StorageWorks XP12000, a new enterprise-class disk array providing a two-tier storage architecture with single-system image management.

Nasdaq Stock Market Inc. has deployed RISS in an effort to comply with regulations about e-mail archiving. The product also provides New York-based Nasdaq with a strategic foundation for managing its overall information life cycle.

“HP’s StorageWorks Grid is a visionary architecture for an intelligent, scalable, reliable and agile storage platform,” says Joseph Zhou, a senior analyst at D.H. Brown Associates Inc. in Port Chester, N.Y. Zhou notes that the grid leverages HP’s storage and server technology expertise, as well as research done at HP Labs, to create a new storage environment that delivers long-promised capabilities in novel and very useful ways. An open question, Zhou says, is whether HP will eventually converge its development toward a single unified grid encompassing both servers and storage.

Other Players

Oracle Corp. has announced partnerships with several storage providers (including EMC Corp., HP and Hitachi Ltd.) to simplify storage management for customers and offer support for the new Oracle Database 10g. Oracle and its partners plan to enhance features to automate the storage administration and provisioning tasks, thereby freeing up database and storage administrators for more productive work.

In Europe, New York-based Exanet Inc. has launched its ExaStore Grid Storage 2.0 software-based product built around grid computing. It clusters all storage system resources, including RAID arrays, servers, controllers and cache memory, into a single unified network-attached storage resource with a single namespace operating in a heterogeneous environment using several storage protocols. The Exanet product is aimed at applications for digital media and premedia, media streaming, oil and gas, digital video animation, medical imaging and others with performance, high-volume and high-availability requirements.


IN THE ONGOING STRUGGLE to automate and speed data backups and restores, storage administrators are increasingly turning to Advanced Technology Attachment disk subsystems. Now two vendors are pitching the idea of using specialized ATA disk backup appliances as an alternative to robotic tape autoloaders for handling large volumes of archival storage. Both are using specialized ATA disk array technology to lower the cost per gigabyte of disk-based storage and extend the life of backup disk drives, making them more attractive for archival and near-line storage.

The vendors, Longmont, Colo.-based start-up Copan Systems Inc. and Santa Clara, Calif.-based Exavio Inc., claim that this new technology, dubbed MAID, for massive arrays of idle disks, is competitive with tape and offers faster and more reliable access to data. MAID systems use arrays of ATA disk drives that power down when idle in an effort to extend media life. By spinning up only when they write or read data, the arrays use less power, mitigating heat issues and allowing drives to be packed more densely into the system. Idle disk drives require about 10 seconds to spin up, but once online, they provide faster access to archived data than tape does.

Although powering up disks as needed can extend useful life, disks that remain inactive for long periods tend to develop problems spinning up. To avoid this, MAID arrays can periodically power up all drives to relubricate the mechanics, Copan says. Drives are hot-swappable, and the systems support RAID for fault tolerance. Prices range from $3 to $5 per gigabyte, depending on the configuration, the amount of redundancy and total capacity.
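The power-down behavior can be pictured with a toy Python model of a single drive. This is only a sketch under stated assumptions: the `IdleDisk` class and the 300-second idle timeout are invented for illustration, and only the roughly 10-second spin-up figure comes from the text above.

```python
SPIN_UP_SECONDS = 10  # approximate spin-up time quoted in the article


class IdleDisk:
    """Toy model of one drive in a MAID shelf; time is passed in explicitly."""

    def __init__(self, idle_timeout=300):  # timeout is an assumed value
        self.idle_timeout = idle_timeout
        self.spinning = False
        self.last_access = None
        self.spin_ups = 0

    def tick(self, now):
        # Power the drive down once it has been idle for the full timeout.
        if self.spinning and now - self.last_access >= self.idle_timeout:
            self.spinning = False

    def access(self, now):
        """Return the delay, in seconds, before data starts to flow."""
        self.tick(now)
        delay = 0
        if not self.spinning:
            self.spinning = True
            self.spin_ups += 1
            delay = SPIN_UP_SECONDS
        self.last_access = now + delay
        return delay


disk = IdleDisk(idle_timeout=300)
# Requests arrive at 0, 60 and 1,000 seconds.
delays = [disk.access(t) for t in (0, 60, 1000)]
```

The first and last requests pay the spin-up delay because the drive is powered down when they arrive; the second request finds the drive still spinning.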

Steve Curry, architect for storage operations at Yahoo Inc. in Sunnyvale, Calif., is considering buying Copan’s Revolution 200T MAID array to cut the use of some 350 tape drives by half. By doing so, he hopes to improve reliability. “We see [one or two tape drive] failures every day. To us, it’s not super-unreliable, but it still has mechanical properties and does break down, which requires manual intervention,” Curry says.

Archiving to MAID

Today Yahoo ships archival tapes to an underground storage facility run by Boston-based Iron Mountain Inc. Curry wants to locate a MAID array at the backup facility and archive to it directly using a Fibre Channel or Fibre Channel-over-IP link. “From our calculations, it’s looking like it’s doable. We are just waiting for someone to build a product that works as advertised,” he says.

Copan’s 200T, announced last month, emulates a virtual tape library. It will scale to 224TB and restore 2.4TB of data per hour, about five times faster than tape access speeds, while keeping only one in every four drives powered up and online at any one time. The basic 56TB configuration, which includes 224 7,200 rpm, 250GB Serial ATA disk drives mounted in a single rack, will ship in the third quarter and sell for $196,000, or about $3.50 per gigabyte, according to Aloke Guha, Copan’s chief technology officer.

Exavio’s ExaVault array is primarily marketed as a device for near-line storage and streaming of multimedia content, although the company claims that the array can also emulate a tape backup system. ExaVault, available now, uses 300GB, 5,400 rpm parallel ATA disk drives arranged in a single rack with one controller and a Fibre Channel or Gigabit Ethernet interface. Configurations range from 3TB to 120TB. A basic unit including a controller and 3.6TB of storage is $27,700; additional modules are $6,600 per terabyte, says Kevin Hsu, Exavio’s director of marketing and product management.
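The per-gigabyte figures follow from simple arithmetic, which this short Python sketch reproduces from the numbers quoted above. It assumes decimal units (1TB = 1,000GB). The powered-drive count and the full-restore time are derived here for illustration and are not vendor claims.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes, as drive capacities are quoted

# Copan Revolution 200T, basic configuration (figures from the article)
copan_drives = 224
copan_drive_gb = 250
copan_price = 196_000

copan_capacity_gb = copan_drives * copan_drive_gb   # 56,000GB, i.e. 56TB
copan_per_gb = copan_price / copan_capacity_gb
copan_powered = copan_drives // 4                   # one drive in four spinning

# Exavio ExaVault (figures from the article)
exavault_base_per_gb = 27_700 / (3.6 * GB_PER_TB)
exavault_addon_per_gb = 6_600 / GB_PER_TB

# Derived: time to restore a full 224TB system at 2.4TB per hour
full_restore_hours = 224 / 2.4

print(f"Copan: {copan_capacity_gb / GB_PER_TB:.0f}TB at ${copan_per_gb:.2f}/GB, "
      f"{copan_powered} drives powered")
print(f"ExaVault: ${exavault_base_per_gb:.2f}/GB base, "
      f"${exavault_addon_per_gb:.2f}/GB per added module")
print(f"Full restore: about {full_restore_hours:.0f} hours")
```

On these numbers the Copan configuration works out to the quoted $3.50 per gigabyte, while the ExaVault base unit, which includes the controller, comes to roughly $7.69 per gigabyte and its add-on modules to $6.60.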

MAID Storage

AT A GLANCE

Massive Arrays of Idle Disks (MAID)

WHAT IT IS: Low-cost disk-based backup and archiving appliances that power down idle disks to extend media life. Lower power requirements and less heat allow for more compact, lower-cost designs.

PROS: Faster and more reliable than tape libraries.

CONS: Cost and portability. At $3 to $5 per gigabyte, MAID still costs more than tape libraries. Disk media aren’t well suited for off-site storage.

Despite MAID’s advantages, digital tape libraries remain the cheaper form of storage, at about $1.25 to $4.50 per gigabyte, according to Fred Moore, president of Horison Information Strategies in Boulder, Colo. The low cost of tape and the fact that tape cartridges can be easily removed and stored off-site are the medium’s most attractive features. In contrast, the individual disk drives that make up MAID appliances are bulkier and more fragile.

Hsu acknowledges that MAID systems cost more per gigabyte than tape libraries but argues that they are less expensive to run overall. “Terabyte for terabyte, tape is cheaper than MAID. If you look at total cost of ownership ... you have to look at robotics, manpower, replacing the tape heads, maintenance costs. MAID is cheaper,” he says.

Robert Amatruda, an IDC analyst, disagrees, saying that tape still provides a lower total cost of ownership overall. “You’re looking at a lot less money. It’s still a compelling solution,” he says.

Both Exavio and Copan are developing portable versions of their systems. Copan, for example, is working on special shockproof disk enclosures that could be transported off-site. Drives would be stored remotely in a Revolution 200T shell chassis that would spin up the drives periodically to keep them conditioned for use.

But Amatruda eyes such portability designs with skepticism. “You drop some of that stuff and there could be data integrity issues,” he says. “At the end of the day, disk and tape will play a complementary role.”

© Copyright 2005, Computerworld Inc., Framingham, Mass.
