Remote Data Management & Backup Best Practices A Signiant White Paper

Upload: jinishkg

Post on 18-Nov-2014


Page 1: Remote Data Management and Backup Best Practices - Signiant

Remote Data Management & Backup Best Practices

A Signiant White Paper


Table of Contents

Abstract
Understanding the Challenges of Remote Data
Key Considerations for Managing Remote Data
Additional Requirements for Remote Data Backup
The Case for Archiving
The Central Policy/Consolidated Approach to Managing Remote Data
    Disk-to-Disk Consolidated Backup
    Consolidated Archive
Remote Data Management with Signiant Mobilize
A Best Practices Guide to Managing Remote Data
Summary

© Copyright Signiant Inc. Page 2


Abstract

The increasing risk from unprotected user files and remote data (data stored outside the data center) is causing companies to re-evaluate their current remote backup processes. Managing remote data poses unique challenges, including variable networks, diverse computing platforms and a lack of trained IT staff at remote locations. Further, traditional methods of managing remote data tend to be costly, unreliable and manually intensive, and often require redundant equipment and effort. Advanced remote data management and movement technology, such as that incorporated into Signiant Mobilize™, now makes it possible to cost-effectively solve the challenges of managing data at remote offices. This paper discusses the issues, requirements and approaches to effective remote data management, with specific emphasis on remote data protection and backup. Also included is a Best Practices Guide to help assess your remote data management requirements.

Understanding the Challenges of Remote Data

Protecting remote data and managing the exchange of data between corporate locations and remote offices is neither trivial nor cheap. IT administrators in companies with remote offices often spend a significant amount of their time managing backup, data management and data transfer requirements for those offices. Even so, critical requirements such as backups and disaster recovery may not be adequately covered. Often, central IT staff must rely on non-technical staff in remote locations to change backup tapes, initiate processes and take other actions they are neither trained nor compensated to perform. As a result, companies report that as many as 60% of their remote backup procedures may fail on a given night. This represents operational, litigation and compliance risks that few companies can afford.

When problems do occur, recovery can be tedious and take days to complete, assuming the data was adequately backed up. Central IT personnel may need to have tapes shipped from the remote site, catalog the volume, search for the files needed and then ship the files back for restore.

Online alternatives such as Consolidated Backup, Disk-Based Backup and Centralized Archive can improve backup and speed recovery, and have been shown to improve reliability and overall data protection while significantly lowering costs. These methods are discussed further in this paper; however, there are a number of issues that must be considered as you evaluate these new approaches and the technologies to implement them.


Key Considerations for Managing Remote Data

To manage remote data effectively, you must consider and address a number of specific functional, technology and environmental factors that do not necessarily come into play in the data center. These factors include:

Central Policy-based Control: To efficiently control data at remote sites, enterprises must be able to set up and implement central policies. That means setting a rule once and having that directive implemented throughout the enterprise, rather than managing activities individually at different sites with multiple separate platform-specific policies and tools. While many products claim “central control” capability, they in fact require administrators to establish a unique connection to each remote node to set policy. This approach consumes hours of administration time, not only during initial setup but each time a business requirement changes. Some technologies, such as Signiant Mobilize, provide a “set it and forget it” approach that automates the communication of policy to the remote node and provides integrated notification if something does not proceed according to policy.

WAN Network Bandwidth Utilization: Any solution that addresses remote data must take into account bandwidth restrictions, as well as a range of network conditions. Remote locations frequently have varying bandwidth that must be shared among multiple applications and users at particular times. For this reason, remote data management and movement solutions should have features that enable efficient use of available bandwidth, such as byte-level differential data transfer, bandwidth scheduling and throttling, multi-streaming and compression. Network overhead, the information a product sends over the network in addition to the data being transferred, is also an important consideration: the less overhead, the better. Finally, since some remote connections will likely be impaired during some processes, the ability to restart at the point of failure is critical, as is the ability to re-route information flow to alternate network connections or paths.

Security and Data Integrity: When moving data over networks, data security is always a major concern. Networks are always susceptible to intrusion, particularly in remote locations where there are fewer IT controls. As a result, any remote data solution should authenticate all sending and receiving nodes prior to any data transfer, and encrypt data during transmission. Moreover, it should use a single firewall port and minimize firewall rules. The ability to ensure that data is received with 100% integrity is also an important consideration. One of the biggest points of failure for remote tape backups is that data is corrupted on the tape and is therefore non-recoverable. With best-in-class disk-to-disk backup technologies, data accuracy can be 100% guaranteed.

Remote Process Automation and Application Interfacing: To minimize or eliminate the need for manual effort at remote locations, the management solution must be able to automate processes and interface with remote applications to access data. For example, when backing up applications like Exchange or SQL Server, it is preferable to use native backup routines. Therefore, the remote data solution must be able to integrate with the application and invoke the native backup package as part of the backup process. Similarly, for applications such as SAP and Oracle, data must be accessed through the application, rather than directly at the database, filesystem or disk level, to maintain data and application consistency. In addition, other custom or script-based processes may be required before, during or after data transmission. The remote management solution should automate these as part of the overall remote backup process.

Heterogeneous System Support: A company with multiple remote locations will commonly have a variety of computing platforms and applications at those locations to support varying business applications and processes. It is therefore important to choose a solution that can work within a heterogeneous environment. While this may seem a basic requirement, many products today only work within homogeneous (single platform) environments.

Point-in-time vs. Continuous Replication: Continuous replication products continuously monitor a filesystem, capture changes as they happen, and either replicate them immediately or cache the information for bulk transfer at a later time. While these products are ideal for continuous replication between a small number of systems for business continuity purposes, they are not ideal for periodic processes such as backup and archive. Point-in-time replication products are more appropriate for such periodic processes and in general will be far more network efficient. Continuous replication provides protection against device failure, while point-in-time replication protects against data loss from accidental deletion, corruption, and threats such as viruses.
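The restart-at-point-of-failure and data integrity requirements described in this section can be illustrated with a minimal Python sketch. It assumes a local file copy stands in for the network transfer, and the function names `sha256_of` and `resume_copy` are hypothetical; this is not a description of how Signiant Mobilize or any other product implements these features.

```python
import hashlib
import os


def sha256_of(path, chunk_size=64 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def resume_copy(source, target, chunk_size=64 * 1024):
    """Copy source to target, continuing from the bytes already present.

    Illustration only: a real transfer would also confirm that the
    partial target matches the source prefix before resuming.
    """
    already_sent = os.path.getsize(target) if os.path.exists(target) else 0
    with open(source, "rb") as src, open(target, "ab") as dst:
        src.seek(already_sent)  # skip what arrived before the failure
        for chunk in iter(lambda: src.read(chunk_size), b""):
            dst.write(chunk)
    # Verify end-to-end integrity by comparing digests of both files.
    return sha256_of(source) == sha256_of(target)
```

If an earlier attempt left a partial target file, calling `resume_copy` again appends only the missing bytes, and the digest comparison at the end reports whether the received file matches the original.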

Additional Requirements for Remote Data Backup

Beyond the basic remote data considerations listed in the previous section, there are some specific requirements for remote backup that become important as Consolidated and Disk-to-Disk Backup methods are considered. Backup at remote offices requires more than just writing the data to tape. Backup solutions must address data integrity and accuracy, ownership preservation, automatic operation, offsite storage for D/R and, of course, restoration.

When storing user files for backup, it is important that the integrity and ownership of the original data be preserved; the most important characteristic of a backup is that it can be restored with full integrity. Most backup-to-tape technologies fully preserve ownership and integrity, and the more sophisticated disk-to-disk and centralized backup technologies do as well; however, some technologies now being promoted for disk-to-disk and centralized backup do not. As you evaluate new technologies, be sure to check for this.

A backup must represent the true state of the data at the time of the backup. While tape backup software often has the capability to handle files left open at the time of backup, it is important that your disk-based backup mechanism also have options for handling open files (skip, transfer the open file, or create an error log entry).

Backup processes for remote offices should ideally require little or no local manual intervention and instead be a completely automated, “lights out” operation. Offsite data storage is a requirement in any total data protection program, but a local backup-to-tape process at remote sites always involves some manual work to load and unload backup media and move it to offsite storage. Alternatively, online backup, which transmits data to another location to be backed up either to disk or to tape, can eliminate the need for any redundant manual effort at the remote sites.
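One way to check the integrity and ownership preservation discussed above is to compare a file with its restored copy. The following Python sketch is an illustration under stated assumptions: the function names `file_fingerprint` and `restore_differences` are hypothetical, and the set of attributes compared (content hash, permission mode, owner, group, modification time) is an assumed minimum rather than a complete list.

```python
import hashlib
import os


def file_fingerprint(path):
    """Collect the attributes a restore is expected to preserve."""
    info = os.stat(path)
    with open(path, "rb") as handle:
        content_hash = hashlib.sha256(handle.read()).hexdigest()
    return {
        "sha256": content_hash,
        "mode": info.st_mode,
        "uid": info.st_uid,   # owner (always 0 on Windows)
        "gid": info.st_gid,   # group (always 0 on Windows)
        "mtime": int(info.st_mtime),
    }


def restore_differences(original, restored):
    """Return the names of attributes that differ after a restore."""
    before = file_fingerprint(original)
    after = file_fingerprint(restored)
    return sorted(key for key in before if before[key] != after[key])
```

An empty result means the restored file matches the original on every attribute checked; any name in the result, such as `uid` or `sha256`, points to something the backup technology under evaluation did not preserve.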


The Case for Archiving

An often-overlooked but critical component of remote data management and protection is archiving. Let’s face it, few of us have the time or interest to clean out our electronic files. Emails building up in Outlook inboxes and other files building up in private and shared directories contribute to the huge volume of data growing on remote storage. In a recent survey by Storage Magazine, users indicated that the single biggest reason for backup failure was that the quantity of data was too large to be backed up within the backup window.

The fact is that most user files and email data are seldom re-opened after the first three days following creation or receipt. Statistics show that if a file hasn’t been accessed in 90 days, there’s a 90%+ probability that it will never be accessed again. Meanwhile, it consumes valuable storage resources. The problem is that since we can’t predict what data we will need in the future, we hold on to all of it. The cost-effective approach to long-term retention is to move older data to lower-cost storage (archive), while maintaining reasonably easy retrieval capabilities.

A second key factor driving the need for archiving is federal document regulation, such as SEC Rule 17a-4, Sarbanes-Oxley and hundreds of others, that requires many companies to retain all communication and documentation for specific time periods. For a distributed enterprise with many remote offices, ensuring compliance with these regulations can be a challenge. So, cost and legal requirements are compelling companies to ensure that employee messages are archived or, at the very least, moved to lower-cost, longer-term media. As with remote backup, ownership, security and data integrity are essential to any archival solution.

Consolidated archival automatically moves older or infrequently accessed data from remote production systems to a central, often lower-cost ATA disk, while leaving transparent access capability for the remote user. It is rapidly gaining acceptance as the most viable approach to archiving remote data.

The Central Policy/Consolidated Approach to Managing Remote Data

Rather than relying on individual backups and separate point processes for each remote site, and the staffing required for each, a more effective enterprise approach is to allow central IT staff to control remote data management and backup. This requires understanding the changing properties and characteristics of remote data. Solutions should be able to set policies pertaining to the data, automate processes to execute those policies on remote servers, and move data between remote or “edge” servers and central or “core” systems.

In this model, individual backup and archiving processes at the remote sites are replaced with a consolidated process that moves remote data to a hub site for backup or archive. This requires moving the pertinent data over the available networks in an efficient, secure and timely fashion, and therefore requires technology that can deal with the many issues associated with controlling and moving data among many sites and network connections. These issues are identified in the Key Considerations for Managing Remote Data section.

Centrally controlled, automated processes have been shown to decrease backup costs by as much as 75% due to the elimination of tapes, tape drives and offsite tape storage, and the elimination of redundant or inadequate staffing efforts at each location.
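The idea of setting a policy once and having it applied at every remote site can be sketched in a few lines of Python. Everything here is hypothetical and for illustration only: the `BackupPolicy` fields, the `jobs_for_sites` function, the site names and the schedule format are assumptions, not the policy model of any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupPolicy:
    """A rule defined once at the central site (hypothetical fields)."""
    name: str
    include_pattern: str
    schedule: str      # free-form text for illustration, e.g. "daily 01:00"
    destination: str   # central "core" storage location


def jobs_for_sites(policy, sites):
    """Expand one central policy into one job description per remote site."""
    return [
        {
            "site": site,
            "policy": policy.name,
            "include": policy.include_pattern,
            "schedule": policy.schedule,
            "target": f"{policy.destination}/{site}",
        }
        for site in sites
    ]
```

When a business requirement changes, only the single `BackupPolicy` definition is edited; the per-site jobs are regenerated from it instead of being maintained by hand at each location.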

Disk-to-Disk Consolidated Backup

Disk-to-disk backup is gaining popularity due to many factors, including the rapidly falling cost of disk storage, the elimination of physical limits, the relative unreliability of magnetic tapes, and the need for more ready access to data for restore. Implemented in a best practices model, disk-to-disk backup for remote data involves moving the data to be backed up over a network to a different location, because disk-to-disk backup performed at the same site does not provide the protection required for site-level disaster recovery. For any business with multiple remote locations, consolidating disk-to-disk backup brings operational and cost efficiencies plus enhanced data security and availability.

There are two primary Consolidated Backup architectures:
- Moving differential data to a central disk
- Consolidating backup images

Common to both are central control and automation, and the elimination of individual tapes, tape drives and offsite tape storage processes at each site.

In the first, more common approach to Consolidated Backup, data at remote sites is periodically analyzed to determine differential data (i.e. data that has changed) since the last backup process. A copy of this differential data is then moved to a central site, where it is stored on disk. Some state-of-the-art technologies can discern just the byte-level modifications of files to minimize the amount of data that needs to be transferred. Data can be stored as incremental packets (i.e. snapshots) of data or reconstructed at the central site to provide full, up-to-date copies of remote files. The latter alternative has the advantage of providing instant access to individual files if a remote file is accidentally deleted.

In the second approach, backups are run on remote servers with the output stored to a local disk. The resulting backup image is then transferred to disk at the central site. This works well for applications that have native backup or snapshot features that can be utilized in the Consolidated Backup process.

These approaches can also be used together. For example, backing up user files may be best performed with differential data transfer, while backing up Exchange data may be best performed using the consolidated backup image approach.
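The differential approach can be sketched in Python as follows. This is a simplified stand-in, not the byte-level technique any product uses: it compares fixed-size blocks by hash, so an insertion in the middle of a file shifts every later block and marks them all as changed. The block size and the function names `block_hashes`, `changed_blocks` and `apply_delta` are assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this illustration


def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash each fixed-size block of a byte string."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def changed_blocks(old_hashes, new_data, block_size=BLOCK_SIZE):
    """Return {block_index: bytes} for blocks that differ from the last backup."""
    delta = {}
    for index, digest in enumerate(block_hashes(new_data, block_size)):
        if index >= len(old_hashes) or old_hashes[index] != digest:
            start = index * block_size
            delta[index] = new_data[start:start + block_size]
    return delta


def apply_delta(old_data, delta, new_length, block_size=BLOCK_SIZE):
    """Rebuild the current file at the central site from the old copy plus delta."""
    # Truncate or zero-pad the old copy to the new length, then patch blocks.
    rebuilt = bytearray(old_data[:new_length].ljust(new_length, b"\0"))
    for index, block in delta.items():
        start = index * block_size
        rebuilt[start:start + len(block)] = block
    return bytes(rebuilt)
```

Only the hashes from the previous backup and the changed blocks need to cross the WAN; the central site applies the delta to its existing copy to reconstruct a full, up-to-date file.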


In both approaches, the backup data on disk at the core location can be further sent to tape if desired. Companies often choose to keep one or two days of backup data on disk for instantaneous access, with older data written to tape.

Consolidated Archive

Consolidated Archive involves identifying remote data that meets corporate archival policy and then automatically moving that data from remote drives and archiving it to a central disk. Archival policies determine what data should be archived and when, and often include criteria such as last date accessed, type of file, content, ownership and size of file.

A consolidated archival process has many benefits, such as:
- Reducing the amount of data to be backed up on a regular basis (shorter backup windows)
- Optimizing use of remote disk (better performance and cost)
- Ensuring compliance with data retention policies and regulations

An essential part of consolidated archive is some mechanism by which data can be retrieved from the archive by end users, preferably without the involvement of IT.

Central policy-based consolidated processes such as consolidated backup and archive provide an approach to managing data at remote offices that can significantly lower costs, reduce risk, improve data consistency, and ensure better compliance with corporate backup and retention policies. It is an approach that all businesses with remote offices should actively consider.
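An archival policy based on the criteria listed above (last access date, file type and file size) can be expressed as a small Python sketch. The function name `archive_candidates`, its parameters and the 90-day default are assumptions chosen for illustration; the sketch only selects files and does not move them or leave a retrieval marker behind.

```python
import os
import time


def archive_candidates(root, max_idle_days=90, extensions=None, min_size=0, now=None):
    """Yield paths under root that match a simple archival policy.

    Note: access times are unreliable on volumes mounted with options
    such as noatime, so treat st_atime as an approximation.
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # seconds per day
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            info = os.stat(path)
            # Optional file-type filter, e.g. extensions={".doc", ".pst"}
            if extensions and os.path.splitext(name)[1].lower() not in extensions:
                continue
            if info.st_size >= min_size and info.st_atime < cutoff:
                yield path
```

The resulting list of paths would then be handed to whatever mechanism moves the data to central storage.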

Remote Data Management with Signiant Mobilize

Signiant Mobilize™ is an enterprise-class software solution that enables organizations to centrally control and securely move data among remote and core locations. Highly scalable, Mobilize can handle up to thousands of locations across heterogeneous Windows, UNIX and Linux systems. Signiant’s patent-pending technology provides critical remote data capabilities such as policy-based central control, remote process automation, transport and data-level security, guaranteed data integrity, and the ability to deal effectively with many types of networks, including high-latency networks. Mobilize technology is in use by leading companies worldwide to manage, control and move remote data.

Mobilize Manager: The Mobilize Manager is the central control center for enterprise-wide remote data processes. Remote data processes are easily set up, scheduled, deployed and monitored through the Manager’s graphical user interface.

Mobilize Agents: Mobilize Agents are installed on all Windows, UNIX, Linux or NAS systems involved in the remote data processes. They are remotely installed through the Mobilize Manager and execute processes and data transfers based on instructions and rules sent from the Manager. Agents can handle multiple tasks, such as data consolidation, distribution or long-distance synchronization.

Mobilize Remote Data Solutions

To make it easier for companies to apply Mobilize technology to specific remote data problems, Signiant has developed a number of solution packages for the most prevalent of these problems. These solutions include:

Remote Data Discovery: To properly manage remote data, a good understanding of data inventories at remote sites is essential. The Remote Data Inventory solution automatically collects and reports on data characteristics at remote locations such as file types and sizes, ownership, file create/modified/access dates, file system size, capacity utilization and much more.

Consolidated Archive: This solution provides for rules-based archival of remote data to central systems while providing easy retrieval of the archived file. Mobilize will automatically archive files based on flexible policy, such as file type, size or access date. A unique file marker technology allows users to retrieve archived files simply by clicking on them.

Consolidated Backup: This solution efficiently consolidates data from multiple locations for a unified backup process. Mobilize identifies changes made to remote files since the last backup and, on a scheduled or event-driven basis, moves only the bytes of those files that have changed to the central site.
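To give a sense of what a remote data inventory collects, the following generic Python sketch summarizes file counts and total size by file type for a directory tree. It is an assumption-laden illustration, not the Mobilize solution: the function name `inventory` is hypothetical, and a fuller inventory would also record ownership, dates and capacity utilization.

```python
import os
from collections import defaultdict


def inventory(root):
    """Summarize file counts and total bytes per file extension under root."""
    summary = defaultdict(lambda: {"files": 0, "bytes": 0})
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            info = os.stat(os.path.join(folder, name))
            # Group files without an extension under a placeholder key.
            extension = os.path.splitext(name)[1].lower() or "(none)"
            summary[extension]["files"] += 1
            summary[extension]["bytes"] += info.st_size
    return dict(summary)
```

Run at each remote site, a report like this gives a first estimate of how much data of each type would need to be backed up or archived.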


A Best Practices Guide to Managing Remote Data

Best practices for managing and protecting remote data involve both understanding and implementing technology that supports remote automated processes. There are five primary steps toward implementing an enterprise-wide remote data management solution:

1. Identify and understand remote data and the network environment
2. Select a remote data management solution
3. Create policies for how remote data should be managed
4. Deploy centrally controlled automated processes to implement the policies
5. Monitor and adjust as business conditions change

The questions below are designed as a guideline for the first two steps of this five-step remote data management process.

Assess the current system:
- How effective are backup and D/R processes in remote sites?
- Which types of applications are you using for data transfers? (e.g. FTP, xcopy, robocopy, tftp, DFS/FRS, public folder replication)
- How much manual intervention is currently required?
- What are the failure rates for these systems?
- How much does it cost when they fail?
- How easy is it to adjust to new business requirements?

Determine your remote data management goals:
Knowing your goals for data movement will assist in developing the cost-recovery models to justify any purchases needed. Example goals include:
- Reduce backup failure rates and increase data protection
- Reduce mean time to restore for remote offices
- Ensure regulatory compliance for data management and retention
- Automate data transfers to and from remote sites

Determine the types of data that need to be moved:
- How much data is at the remote locations and what are the characteristics of the data? (size, file types, disk utilization, etc.)
- Which characteristics need to be maintained:
    i. Ownership preservation?
    ii. File system attributes?
    iii. Physical disk layout?
- What applications are running in the remote locations?
- Does my data management system need to integrate with particular vendor applications in “real time”?
- What data are users currently not backing up effectively? (What is my current exposure?)
- What types of data need to be sent to the remote office?


Determine the data movement volume:
Neither the time available nor the network bandwidth is infinite. You’ll need to crunch the numbers and come up with:
- What is the rate of change of the data on a day-to-day basis?
- How many sites need to be aggregated in the backup or archive consolidation system?
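A rough calculation can show whether a site’s daily changes fit within the backup window. The Python sketch below is a back-of-the-envelope estimate under stated assumptions: the function names are hypothetical, the default `usable_fraction` of 0.5 is an assumed share of the link available to backup traffic, decimal units are used (1 GB = 8,000 megabits), and protocol overhead, compression and contention at the central site are ignored.

```python
def transfer_hours(changed_gb_per_day, link_mbps, usable_fraction=0.5):
    """Estimate hours needed to move one day's changed data over a WAN link."""
    megabits = changed_gb_per_day * 8 * 1000  # GB -> megabits, decimal units
    seconds = megabits / (link_mbps * usable_fraction)
    return seconds / 3600


def sites_exceeding_window(sites, window_hours):
    """Return the sites whose daily changes will not fit the backup window.

    sites maps a site name to (changed_gb_per_day, link_mbps).
    """
    return [
        name
        for name, (changed_gb, link_mbps) in sites.items()
        if transfer_hours(changed_gb, link_mbps) > window_hours
    ]
```

With these illustrative figures, 2 GB of daily change over a 1.5 Mbps link with half the bandwidth available takes roughly 5.9 hours, which fits an 8-hour window, while 10 GB over the same link does not.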

Assess your current network:
- What is the available bandwidth to each remote location?
- What other applications are currently using this bandwidth? How much bandwidth do these applications require?
    o e.g. Active Directory replication, electronic mail, terminal services
- Can traffic be segregated using quality of service (QoS) applications? (i.e. Will you be able to dedicate bandwidth to certain applications?)
- Is the network traffic prone to bursts?
- How secure is the network? (e.g. Are encrypted VPNs in place to support confidential data transfers?)

Choose your solution:
Evaluate potential vendors against the required remote data capabilities:

Capability                                 Vendor 1    Vendor 2    Vendor 3
                                           Yes  No     Yes  No     Yes  No
Central Policy-Based Control
Network Efficiency
Remote Process Automation
Security
Support for Heterogeneous Environments
Point-in-Time Replication

- Can the vendor’s solution solve multiple remote data problems, such as backup, archive and distribution?
- Does the vendor have expertise with remote data application and integration?
    i. Will the vendor assist in assessing your requirements?
    ii. Will the vendor provide the tools to assess your data change and growth rates?


Summary

Many companies are re-evaluating their current backup processes, not only to ensure the proper protection of critical data, but also with the goal of lowering overall IT costs and safeguarding themselves from litigation and the penalties of regulatory non-compliance. Managing remote data effectively requires that you deal with network variability, dissimilar computing platforms, security needs and data integrity, and then implement process automation to overcome the lack of trained IT staff at remote locations.

The good news is that all this does not have to be hard or complex. Advanced remote data management and movement technology, such as that provided by Signiant Mobilize™, now makes it possible to cost-effectively solve the challenges of managing data at remote offices with a single unified approach. Understanding the issues, requirements and approaches to effective data protection for remote data, with specific emphasis on remote data backup and archive, is the first step in helping your company assess its remote data requirements.

For a 3-minute tour of Mobilize, go to www.mobilizetour.com or, for more information on Signiant Mobilize, contact us at [email protected] or 781-221-0022.

CORPORATE HEADQUARTERS
15 Third Avenue | Burlington, Mass. 01803 USA
Tel: 781-221-0022 | Email: [email protected]

CANADA OFFICE
515 Leggat Drive | Kanata, Ontario K2K 3G4
Tel: 613-599-2140

www.signiant.com