HP StoreOnce Backup systems Concepts and Configuration Guidelines
Abstract

If you are new to the HP StoreOnce Backup System, read this guide before you configure your system. It provides network and Fibre Channel configuration guidelines; describes StoreOnce VTL, NAS, StoreOnce Catalyst and Replication technology; and advises how to plan the workload placed on the HP StoreOnce Backup System in order to optimize performance and minimize the impact of backup, deduplication, replication and housekeeping operations competing for resources. The information in this guide is valid for both single-node and multi-node G3 StoreOnce Backup systems running software version 3.6.0 and later.

IMPORTANT: Always check http://www.hp.com/support/manuals for the most up-to-date documentation for your product.
HP Part Number: BB877-90913 Published: November 2013 Edition: 4
© Copyright 2011–2013 Hewlett-Packard Development Company, L.P.

Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license.

The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.

WARRANTY STATEMENT: To obtain a copy of the warranty for this product, see the warranty information website: http://www.hp.com/go/storagewarranty

Linear Tape-Open, LTO, LTO Logo, Ultrium and Ultrium Logo are trademarks of Quantum Corp, HP and IBM in the US, other countries or both. Microsoft, Windows, Windows NT, and Windows XP are U.S. registered trademarks of Microsoft Corporation. Intel and Itanium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. AMD is a registered trademark of Advanced Micro Devices, Inc.

Revision History

Revision 1
November 2011
This is the first edition, issued with the launch of the HP StoreOnce B6200 Backup system.

Revision 2
May 2012
This is the second edition, issued with the 3.3.x version of HP StoreOnce software.

Revision 3
May 2013
This is the third edition, issued with the 3.6.0 version of HP StoreOnce software. It has been expanded to include information about new models and also now contains generic information from the separate document HP StoreOnce Backup system best practices for VTL, NAS, StoreOnce Catalyst and Replication implementations.

Revision 4
October 2013
This is the fourth edition, issued with the 3.9.0 version of the HP StoreOnce software. It has been updated to include new models, HP StoreOnce 2700, 4500 and 4700 Backup.
Contents

1 Before you start..........................................................................................7
  Overview................................................................................................................................7
  HP StoreOnce Backup system models..........................................................................................7
  StoreOnce Catalyst targets for backup applications.......................................................................8
  NAS targets for backup applications...........................................................................................9
  Virtual Tape Library targets for backup applications......................................................................9
  Comparing StoreOnce Catalyst, NAS and Virtual Tape Library target devices...................................9
  Networking and Fibre Channel considerations...........................................................................10
  Licensing...............................................................................................................................11
  Security Features....................................................................................................................12
  For more information...............................................................................................................13
2 HP StoreOnce technology..........................................................................14
  Data deduplication.................................................................................................................14
    Key performance factors with deduplication that occurs on the StoreOnce Backup system............14
  VTL and NAS Replication overview...........................................................................................15
    Key performance factors with replication...............................................................................15
  Catalyst Copy and deduplication..............................................................................................15
  Housekeeping........................................................................................................................15
  Backup Application considerations............................................................................................16
    Multi-stream or multiplex, what do they mean?.......................................................................16
    Why multiplexing is a bad practice......................................................................................16
    Effect of multiple streams on StoreOnce Performance..............................................................18
    Data compression and encryption backup application features................................................21
3 Concepts specific to StoreOnce B6200 Backup system..................................22
  The HP StoreOnce B6200 Backup system...................................................................................22
  B6200 Basic Concepts............................................................................................................23
  Deployment choices................................................................................................................25
4 Networking considerations........................................................................27
  Common networking considerations..........................................................................................27
    Supported Ethernet configurations........................................................................................27
    Network bonding modes....................................................................................................28
  Network configuration in single-node StoreOnce Backup systems..................................................28
    General guidelines.............................................................................................................29
    Single port configurations...................................................................................................30
    Dual port configurations.....................................................................................................31
    Bonded port configurations (recommended)..........................................................................32
    10GbE Ethernet ports on StoreOnce Backup systems...............................................................33
    Network configuration for CIFS AD......................................................................................33
      Option 1: HP StoreOnce Backup system on Corporate SAN and Network SAN.....................34
      Option 2: HP StoreOnce Backup system on Network SAN only with Gateway.......................35
5 Network configuration in multi-node StoreOnce Backup systems......................36
  What is currently supported.....................................................................................................36
  What is not currently supported................................................................................................36
  Supported network configurations (templates).............................................................................36
    Gateway considerations.....................................................................................................37
    Template 1, uses 10 GbE and 1 GbE sub-nets........................................................................37
    Template 2, uses 1GbE network only....................................................................................38
    Template 3, uses 10GbE network only..................................................................................39
    Template 4, uses two 1GbE networks...................................................................................39
    Template 5, uses 1GbE network only....................................................................................40
6 Fibre Channel considerations.....................................................................41
  Port assignment for StoreOnce Backup systems with two Fibre Channel cards..................................41
  General Fibre Channel configuration guidelines..........................................................................42
    Switched fabric..................................................................................................................42
    Direct Attach (private loop)..................................................................................................44
    Zoning.............................................................................................................................44
      Use soft zoning for high availability.................................................................................45
  Diagnostic Fibre Channel devices.............................................................................................46
7 Configuring FC to failover in a StoreOnce B6200 Backup system environment....47
  Autonomic failover..................................................................................................................47
    What happens during autonomic failover?............................................................................47
    Failover with supported backup applications.........................................................................48
    Designing for failover.........................................................................................................49
  Key Failover FC zoning considerations......................................................................................49
  Fibre Channel port presentations..............................................................................................50
  Scenario 1, single fabric with dual switches, recommended..........................................................52
  Scenario 2, single fabric with dual switches, not advised.............................................................53
  Scenario 3, dual fabric with dual switches, recommended............................................................54
    What happens if a fabric fails?............................................................................................55
  Scenario 4, dual fabric with dual switches, not advised...............................................................56
8 StoreOnce Catalyst stores..........................................................................58
  StoreOnce Catalyst technology.................................................................................................58
    Key features......................................................................................................................58
  StoreOnce interfaces with StoreOnce Catalyst............................................................................59
  Catalyst Copy........................................................................................................................60
  Summary of Catalyst best practices...........................................................................................61
  StoreOnce Catalyst and the StoreOnce GUI...............................................................................62
    Maximum concurrent jobs and blackout windows...................................................................62
    Client access permissions....................................................................................................63
  More information....................................................................................................................64
9 Virtual Tape Devices.................................................................................65
  Overview..............................................................................................................................65
  Tape Library Emulation...........................................................................................................65
    Emulation types.................................................................................................................65
    Cartridge sizing.................................................................................................................66
    Number of libraries per appliance.......................................................................................66
  Backup application and configuration.......................................................................................66
    Blocksize and transfer size..................................................................................................66
  Rotation schemes and retention policy.......................................................................................67
    Retention policy.................................................................................................................67
    Rotation scheme.................................................................................................................67
  Summary of VTL best practices.................................................................................................68
10 NAS shares...........................................................................................70
  Operating system support........................................................................................................70
  Backup application support......................................................................................................70
  Shares and deduplication stores...............................................................................................70
    Maximum concurrently open files.........................................................................................71
    Maximum number of NAS shares.........................................................................................71
    Maximum number of files per NAS share and appliance........................................................71
    Maximum number of users per CIFS share.............................................................................72
    Maximum number of hosts per NFS share.............................................................................72
    CIFS share authentication....................................................................................................72
  Backup application configuration..............................................................................................72
    Backup file size..................................................................................................................73
    Disk space pre-allocation....................................................................................................74
    Block/transfer size.............................................................................................................74
    Concurrent operations........................................................................................................74
    Buffering...........................................................................................................................74
    Overwrite versus append....................................................................................................74
    Compression and encryption...............................................................................................75
    Verify...............................................................................................................................75
    Synthetic full backups.........................................................................................................75
  Summary of NAS best practices...............................................................................................75
11 Replication.............................................................................................77
  What is replication?................................................................................................................77
  StoreOnce VTL and NAS replication overview............................................................................78
  Replication usage models (VTL and NAS)..................................................................................78
  What to replicate...................................................................................................................81
  Appliance library and share replication fan in/fan out................................................................82
  Concurrent replication jobs......................................................................................................82
  Apparent replication throughput...............................................................................................82
  What actually happens in replication?......................................................................................83
  Limiting replication concurrency................................................................................................83
  WAN link sizing....................................................................................................................83
  Seeding and why it is required.................................................................................................84
    Replication models and seeding..........................................................................................85
  Controlling Replication...........................................................................................................86
    Replication blackout windows..............................................................................................87
    Replication bandwidth limiting.............................................................................................87
    Source appliance permissions..............................................................................................88
12 Seeding methods in more detail................................................................89
  Seeding over a WAN link.......................................................................................................89
  Co-location (seed over LAN)....................................................................................................92
  Floating StoreOnce seeding.....................................................................................................94
  Seeding using physical tape or portable disk drive......................................................................95
13 Implementing replication with the HP B6200 Backup system.........................98
  Active/Passive and Active/Active configurations.........................................................................98
  Many to One configurations..................................................................................................102
    Implementing floating StoreOnce seeding...........................................................................102
    Balancing Many-to-One replication....................................................................................103
    Replication and load balancing.........................................................................................104
14 Housekeeping......................................................................................106
  Housekeeping Blackout window.............................................................................................106
  Tuning housekeeping using the StoreOnce GUI.........................................................................106
15 Tape Offload.......................................................................................109
  Terminology.........................................................................................................................109
    Direct Tape Offload.........................................................................................................109
    Backup application Tape Offload/Copy from StoreOnce Backup system.................................109
    Backup application Mirrored Backup from Data Source.........................................................109
    Tape Offload/Copy from StoreOnce Backup system versus Mirrored Backup from Data Source...110
  When is Tape Offload required?............................................................................................110
    Catalyst device types........................................................................................................110
    VTL and NAS device types................................................................................................111
  HP StoreOnce Optimum Configuration for Tape Offload............................................................112
  Offload Considerations.........................................................................................................112
    VTL Cloning/Media Copy to Physical Tape.........................................................................112
    HP StoreEver Tape Libraries...............................................................................................113
    Backup Application..........................................................................................................113
  HP StoreOnce Optimum Tape Offload Configuration.................................................................113
16 Key parameters....................................................................................115
  StoreOnce B6200 Backup.....................................................................................................115
  StoreOnce 2700, 4500 and 4700 Backup...............................................................................116
  StoreOnce 2610/2620, 4210/4220 and 4420/4430 Backup...................................................117
About this guide........................................................................................120
  Intended audience................................................................................................................120
  Related documentation..........................................................................................................120
  Document conventions and symbols........................................................................................120
  HP technical support.............................................................................................................121
  HP websites.........................................................................................................................121
  Documentation feedback.......................................................................................................121
1 Before you start

Overview

The HP StoreOnce Backup System is a disk-based storage appliance for backing up host network servers or PCs to target devices on the appliance. These devices are configured as Network-Attached Storage (NAS), StoreOnce Catalyst or Virtual Tape Library (VTL) targets for backup applications. The total number of backup target devices provided by an HP StoreOnce Backup System varies according to model. These devices may be all StoreOnce Catalyst, all VTL, all NAS, or any combination of Catalyst, NAS and VTL devices. All HP StoreOnce devices automatically make use of StoreOnce deduplication, ensuring efficient and cost-effective use of disk space. A further benefit of StoreOnce Catalyst devices is that deduplication may be configured to occur on the Media Server (low bandwidth) or on the StoreOnce Backup system (high bandwidth), allowing the user to decide which makes most efficient use of available bandwidth.
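The disk savings come from storing each unique block of data only once. The sketch below is a minimal, illustrative model of content-addressed chunk storage; it is not HP's StoreOnce algorithm (which uses proprietary variable-size chunking and sparse indexing), but it shows why repeated backups of largely unchanged data consume little additional disk space. All class and variable names here are invented for illustration.

```python
import hashlib

class ChunkStore:
    """Toy content-addressed store: each unique chunk is kept only once."""
    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}          # sha256 digest -> chunk bytes
        self.logical_bytes = 0    # total bytes written by backups

    def write(self, data):
        """Store a backup stream; return the list of chunk digests (the 'recipe')."""
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)   # duplicate chunks stored once
            recipe.append(digest)
        self.logical_bytes += len(data)
        return recipe

    def physical_bytes(self):
        return sum(len(c) for c in self.chunks.values())

store = ChunkStore()
backup = b"A" * 8192 + b"B" * 4096        # first full backup
store.write(backup)
store.write(backup + b"C" * 4096)         # next backup: mostly unchanged data
ratio = store.logical_bytes / store.physical_bytes()
# 28672 logical bytes are backed up but only 3 unique chunks (12288 bytes)
# are stored, so the deduplication ratio for this toy workload exceeds 2:1.
```

In practice, deduplication ratios depend heavily on data type and change rate; fixed-size chunking like this understates what variable-size chunking achieves on real backup streams.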
HP StoreOnce Backup system models

The information in this guide relates to all G3 HP StoreOnce Backup system models. (G3 means that the systems are running StoreOnce software version 3.x.x.) These products are also sometimes referred to as multi-node or single-node Backup systems, which relates to the hardware configuration.

• The HP StoreOnce B6200 Backup system is the multi-node model and is composed of 1 to 4 couplets (the number depends on the customer-specific configuration). A couplet is a paired combination of two nodes (or servers) that are directly connected as a failover pair. Within a couplet, each node has access to the storage of the other node and can assume its role in the event of a node failure. A couplet provides a base amount of storage, and this can be expanded in increments by adding pairs of 12-disk storage shelves, up to a maximum of three pairs per couplet.

• All other HP StoreOnce Backup systems are single-node models and are composed of a single server. Some models support expansion using 12-disk storage shelves.
Table 1 StoreOnce B6200 models

Product model: HP StoreOnce B6200 Backup, EJ022A
Description: Rack system with up to four couplets. Each couplet contains two nodes (servers) and two 12-disk RAID storage controllers
Ports: 8 x 1GbE ports, 4 x 10GbE ports, 8 x FC ports
Storage expansion: Up to six 12-disk storage shelves per couplet (three per node)
Table 2 StoreOnce 2700, 4500, 4700 and 4900 models

Product model: HP StoreOnce 2700 8TB Backup, BB877A
Description: A single server with four 2TB hot-plug disks
Interfaces supported: iSCSI only
Ports: 4 x 1GbE ports
Storage expansion: Not supported

Product model: HP StoreOnce 4500 24TB Backup, BB878A
Description: A single server with twelve 2TB hot-plug disks
Interfaces supported: iSCSI and FC
Ports: 4 x 1GbE ports, 2 x 10GbE ports, 2 x FC ports
Storage expansion: One 12-disk expansion shelf, BB881A

Product model: HP StoreOnce 4700 24TB Backup, BB879A
Description: A head server unit with two 1TB disks and a pre-configured storage array with twelve 2TB disks
Interfaces supported: iSCSI and FC
Ports: 4 x 1GbE ports, 2 x 10GbE ports, 4 x FC ports
Storage expansion: Up to seven 12-disk expansion shelves, BB881A

Product model: HP StoreOnce 4900 48TB Backup, BB903A
Description: A head server unit with two 1TB disks and a pre-configured two-drawer disk enclosure with eleven 4TB disks and four hot spare disks
Interfaces supported: iSCSI and FC
Ports: 4 x 1GbE ports, 4 x 10GbE ports, 4 x FC ports
Storage expansion: Up to five 11-disk expansion kits for the disk enclosure, BB908A; one additional disk enclosure with 11+4 spare pre-configured storage, BB904A
Table 3 StoreOnce 2620, 4210, 4220, 4420 and 4430 models

Product model: HP StoreOnce 2620 Backup, BB852A
Description: A single server with four 1TB disks
Interfaces supported: iSCSI only
Ports: 2 x 1GbE ports
Storage expansion: Not supported

Product model: HP StoreOnce 4210 Backup, BB853A and BB854A
Description: A single server with twelve 1TB disks (see note below)
Interfaces supported: iSCSI (BB853A) or FC (BB854A)
Ports: 2 x 1GbE ports, 2 x FC ports
Storage expansion: One 12-disk expansion shelf

Product model: HP StoreOnce 4220 Backup, BB855A
Description: A single server with twelve 1TB disks
Interfaces supported: iSCSI and FC
Ports: 4 x 1GbE ports, 2 x FC ports
Storage expansion: One 12-disk expansion shelf

Product model: HP StoreOnce 4420 Backup, BB856A
Description: A single server with twelve 1TB disks
Interfaces supported: iSCSI and FC
Ports: 2 x 1GbE ports, 2 x 10GbE ports, 2 x FC ports
Storage expansion: One 12-disk expansion shelf

Product model: HP StoreOnce 4430 Backup, BB857A
Description: A single server with twelve 2TB disks
Interfaces supported: iSCSI and FC
Ports: 2 x 1GbE ports, 2 x 10GbE ports, 2 x FC ports
Storage expansion: Up to three 12-disk expansion shelves
NOTE: There are other legacy HP StoreOnce products, running StoreOnce software 2.x.x and 1.x.x. This guide should not be used with those products.
StoreOnce Catalyst targets for backup applications

NOTE: See StoreOnce Catalyst stores (page 58) for a more detailed description of StoreOnce Catalyst technology and best practices.

HP StoreOnce Catalyst is a unique interface and is fundamentally different from virtual tape or NAS. It provides the backup application with full control of backup and replication (called Catalyst Copy). HP StoreOnce Catalyst allows:

• Backup applications to back up data to a target store on the HP StoreOnce Backup system. Deduplication may occur on the media server or on the HP StoreOnce Backup system.

• Backup applications to copy jobs between HP StoreOnce Backup systems. All configuration is carried out from the backup application, which makes this an attractive alternative to using the replication function on the HP StoreOnce Backup system.

This function requires a backup application that supports HP StoreOnce Catalyst. For up-to-date details on supported applications, refer to http://www.hp.com/go/ebs.
For supported Symantec backup products, a plug-in application (HP OST 2.0) is required on each backup application media server that will use the Catalyst functionality. This can be downloaded from the Software Storage section of your StoreOnce product's drivers and software page on http://www.hp.com/support/downloads.

OST/Catalyst devices require a license at both source and target, but do NOT require an additional Catalyst Copy license as well. On the HP StoreOnce B6200 Backup system, Catalyst licensing is per couplet; if a system has multiple couplets and one couplet is not using Catalyst stores, you do not need a license for that couplet.

IMPORTANT: Much of the information that is displayed on the HP StoreOnce Catalyst pages is taken directly from the backup application. It is strongly recommended to give jobs and media server groupings names on the backup application that will also be meaningful when they are displayed on the StoreOnce Catalyst pages.
NAS targets for backup applications

NOTE: See NAS shares (page 70) for more detailed information about NAS shares and best practices.

Support for both CIFS and NFS protocols means that NAS target devices may be created as backup targets for both Windows and UNIX/Linux hosts, and may be used with most backup applications that back up to disk. NAS targets on an HP StoreOnce Backup System provide network file share access that is optimized for backup to disk. They should not be used for general-purpose file storage. There are no licensing requirements for backup to NAS targets, but a replication license is required when replicating NAS targets to another StoreOnce Backup system.
Virtual Tape Library targets for backup applications

NOTE: See Virtual Tape Devices (page 65) for more detailed information about VTL devices and best practices.

The backup target appears to the host as an Ultrium Tape Library and requires a backup application that supports backup to tape. The Tape Library emulation type is selected during initial configuration, and this determines the number of cartridge slots and embedded tape drives that may be configured for the device. Virtual Tape Libraries provide considerable flexibility for a variety of backup rotation schemes.

• The HP B6200 Backup system may only be configured with Fibre Channel (FC) devices.

• The single-node products may be configured with iSCSI and Fibre Channel (FC) devices, although some support only iSCSI.

NOTE: The HP D2DBS Generic Library emulation type provides the most flexibility in numbers of cartridges and drives. It is also clearly identified in most backup applications as a virtual tape library, and so is easier for supportability. If your backup application supports this emulation type, it is the recommended option.

There are no licensing requirements for backup to VTL targets, but a replication license is required when replicating VTL targets to another StoreOnce Backup system.
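When choosing an emulation type, it helps to estimate how many cartridge slots a backup rotation scheme will consume. The following worked example is a simple Grandfather-Father-Son (GFS) slot calculation; the retention figures and function name are illustrative assumptions, not HP recommendations.

```python
def gfs_slots(daily_retained=6, weekly_retained=4, monthly_retained=12):
    """Number of virtual cartridges needed for a Grandfather-Father-Son
    rotation that keeps the given counts of daily, weekly and monthly
    backups, assuming one cartridge per retained backup."""
    return daily_retained + weekly_retained + monthly_retained

# Example: keep 6 daily, 4 weekly and 12 monthly backups.
slots = gfs_slots()   # 22 cartridge slots
# Choose an emulation type whose slot count comfortably exceeds this,
# leaving headroom for growth.
```

A real sizing exercise would also account for cartridge capacity versus backup size and for multiple concurrent backup streams, each of which may need its own virtual drive.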
Comparing StoreOnce Catalyst, NAS and Virtual Tape Library target devices The following table summarizes some of the differences between StoreOnce Catalyst, NAS share and Virtual Tape device types.
Table 4 Comparing StoreOnce Catalyst, NAS and Virtual Tape device types

Virtual Tape (VTL)
  Key Features: Uses virtual tape drives and virtual slots to emulate physical tape libraries. HP StoreOnce also supports iSCSI VTL (not B6200).
  Best Used In: Enterprise FC SAN environments (some models do not support FC).
  Comments: Tried and tested, well understood with traditional backup applications. Uses Robot and Drives device type.

NAS (CIFS/NFS shares)
  Key Features: NAS shares can be easily configured and viewed by the operating system: CIFS shares for Windows, NFS shares for Unix.
  Best Used In: Specific environments that do not support tape emulation backup or prefer to back up directly to disk. In some cases the licensing may be lower cost for NAS shares as a backup target. Consider this device type for virtualized environments.
  Comments: This is a NAS target for backup - not recommended for random NAS file type access. Uses Basic Disk device type.

StoreOnce Catalyst (stores)
  Key Features: Backup software has total control over the HP StoreOnce appliance, providing source-based deduplication, replication control, improved DR etc.
  Best Used In: Environments that require a single management console for all backup and replication activities and the ability to implement federated deduplication*. Wherever possible HP recommends the use of HP StoreOnce Catalyst.
  Comments: May require additional plug-in components on Media Servers. Uses OpenStorage device type (Symantec) or Backup to Disk (HP Data Protector) device type.
* Federated deduplication is an HP term referring to the ability to distribute the deduplication load across Media servers. This feature is sometimes known as source-based deduplication or low bandwidth backup.
Networking and Fibre Channel considerations NOTE: See Networking considerations and Fibre Channel considerations for more information about Ethernet and FC configurations and best practices. The following table shows which network and Fibre Channel ports are present on each model of StoreOnce appliance. Correct configuration of these interfaces is important for optimal data transfer. A mixture of iSCSI and FC virtual libraries and NAS shares can be configured on the same StoreOnce appliance to balance performance needs if required. •
The HP StoreOnce B6200 Backup system does not support iSCSI virtual libraries.
•
The HP StoreOnce 2700, 2620 and 4210 iSCSI Backup systems do not support FC virtual libraries.
Table 5 Single node systems, ethernet and FC ports

Product/Model Name     Product Number   Ethernet Connection     Fibre Channel Connection
StoreOnce 2700         BB877A           2 x 1GbE                None
StoreOnce 4500         BB878A           4 x 1GbE, 2 x 10GbE     2 x 8Gb FC
StoreOnce 4700         BB879A           4 x 1GbE, 2 x 10GbE     4 x 8Gb FC
StoreOnce 2620         BB852A           2 x 1GbE                None
StoreOnce 4210 iSCSI   BB853A           2 x 1GbE                None
StoreOnce 4210 FC      BB854A           2 x 1GbE                2 x 8Gb FC
StoreOnce 4220         BB855A           2 x 1GbE                2 x 8Gb FC
StoreOnce 4420         BB856A           2 x 1GbE, 2 x 10GbE     2 x 8Gb FC
StoreOnce 4430         BB857A           2 x 1GbE, 2 x 10GbE     2 x 8Gb FC
For multi-node systems, the number of network and FC ports depends upon the number of couplets and racks installed. There is a minimum of one couplet and one rack.

Table 6 Multi node systems, HP B6200, ethernet and FC ports

Couplet               Node     Ethernet Connection     Fibre Channel Connection
Couplet 1 in Rack 1   Node 1   4 x 1GbE, 2 x 10GbE     4 x 8Gb FC
                      Node 2   4 x 1GbE, 2 x 10GbE     4 x 8Gb FC
Couplet 2 in Rack 1   Node 3   4 x 1GbE, 2 x 10GbE     4 x 8Gb FC
                      Node 4   4 x 1GbE, 2 x 10GbE     4 x 8Gb FC
Couplet 3 in Rack 2   Node 5   4 x 1GbE, 2 x 10GbE     4 x 8Gb FC
                      Node 6   4 x 1GbE, 2 x 10GbE     4 x 8Gb FC
Couplet 4 in Rack 2   Node 7   4 x 1GbE, 2 x 10GbE     4 x 8Gb FC
                      Node 8   4 x 1GbE, 2 x 10GbE     4 x 8Gb FC
Licensing Licensing requirements The HP StoreOnce Backup system has the following licensing requirements: •
Each storage shelf arrives with a license that must be loaded; further capacity upgrades also arrive with a license included.
•
There is no licensing required for VTL or NAS emulations.
•
VTL and NAS replication requires a license on the target site (B6200: per couplet, but only if VTL/NAS replication is used on that couplet).
•
OST/Catalyst devices require a license to be used at both source and target – but DO NOT require an additional replication license as well. B6200: Catalyst licensing is per couplet; if a couplet is not using Catalyst stores, you do not need the license.
•
Security features (Data at Rest Encryption and Secure Erase) require a security license.
NOTE: Licenses can only be applied from the StoreOnce CLI. See the HP StoreOnce CLI Guide for more information.
Types of licensing There are two types of licensing: •
Full license (not time limited)
•
Instant on (time limited to 90 days): This allows you to try out licensable functionality on StoreOnce hardware products before paying for a full license for features such as Replication Target, Catalyst, or the Security features of Data at Rest Encryption and Secure Erase. For more information on applying this type of license, see the HP StoreOnce backup system Installation and Configuration guide and the HP StoreOnce Backup system CLI Reference Guide.
Security Features The HP StoreOnce backup system offers two security features that can be applied using a Security license: Data at Rest Encryption and Secure Erase.
Data at Rest Encryption When enabled, the Data at Rest Encryption security feature protects data at rest on a stolen, discarded, or replaced disk from forensic attack. Data encryption is only available on Catalyst and VTL devices. When you create a new store or library (VTL or Catalyst), you have the option to enable encryption if the security features license has already been applied. Once enabled, encryption is automatically performed on the data before it is written to disk. Encryption cannot be disabled once it has been set for a library or Catalyst store. When you create an encrypted store or library, the key store is updated with the encryption key. This key store may be backed up and saved securely offsite in case the original key store is corrupted. However, be sure to keep only the latest version of the key store as a backup; the key store on the StoreOnce Backup system is updated each time you create a library or Catalyst store. The StoreOnce CLI command that backs up the key store also encrypts it, ensuring that it can only be decrypted by the HP StoreOnce backup system, should you need to restore it. Be very diligent about backing up your key store if you are creating encrypted stores or libraries! See the HP StoreOnce Backup system CLI Reference Guide for more information about the StoreOnce CLI commands for backing up and restoring key stores. NOTE: Each library or Catalyst store configured will use a different key. The StoreOnce software automatically tracks which key is relevant to which device in the Key Store File. Keys are automatically re-applied to the right device if the key store file is restored. IMPORTANT: B6200 systems: Every time that you expand storage by adding a couplet, you will need to restore your key store. Installing the additional couplet is an HP task, but you are responsible for ensuring that a Security license has been installed for the new couplet and for saving the existing key store.
Secure erase Secure Erase can be enabled for all store types. When enabled, this feature allows you to securely erase data that has been backed up as part of a regular backup job. The Secure Erase feature can only be enabled after store or library creation (edit the store or library to enable Secure Erase). All data written to disk once secure erase is enabled will be securely erased upon data deletion. For example, you may have unintentionally backed up confidential data and need to be sure that it has been securely erased. You must work with your backup application to trigger the secure erase, for example by forcing a format of a cartridge. The backup application sends the request to delete the data and the deletion is carried out as part of the Housekeeping function.
NOTE: If you need to immediately remove data, you must make sure your backup application is configured correctly. Rotation and retention policies may need to be revisited to ensure that the data is expired. Also, data that has been written to a store or library without secure erase enabled cannot be securely erased (there is no retroactive application of secure erase to already written data). Only chunks that are not referenced by any other items can be securely erased. If a chunk is referenced by another item which is not marked for secure erase, then that chunk will not be erased, securely or otherwise. You will need to utilize your backup application when enacting a secure erase on stores, shares, or libraries that have secure erase enabled. See the HP StoreOnce Backup system Guide for information on how to apply the Security license for these features.
For more information Separate chapters in this guide provide more background information about Network and Fibre Channel configuration; StoreOnce Catalyst, VTL and NAS devices and the specifications that are supported; and replication, housekeeping and tape offload. The following documents are also available: •
HP StoreOnce guides (PDF): There are separate guides for the single node and the multi node models. They describe how to use the StoreOnce GUI.
•
HP StoreOnce CLI Reference Guide (PDF): This guide describes the StoreOnce CLI commands and how to use them.
•
HP StoreOnce Linux and UNIX Configuration Guide (PDF): This guide contains information about configuring and using HP StoreOnce Backup systems with Linux and UNIX.
•
HP StoreOnce B6000 Series Backup system Installation Planning and Preparation Guide and Checklists (PDF): This guide is the site installation preparation and planning guide for the HP StoreOnce B6200 Backup system only. It contains checklists that should be completed prior to HP service specialists arriving on site to install the product.
•
HP StoreOnce 2620, 4210/4220 and 4420/4430 Backup system Installation and Configuration Guide (PDF): This guide is the installation and configuration guide for the single-node HP StoreOnce Backup systems.
•
HP StoreOnce Backup system Capacity Upgrade booklet (PDF): This guide describes how to install the capacity upgrade kits.
•
HP StoreOnce Backup systems Summary of Best practices for VTL, NAS, StoreOnce Catalyst and Replication implementations with sizing and application configuration examples (PDF): This document contains much of the same information as this guide, but it also contains worked examples for sizing a VTL, NAS and StoreOnce Catalyst solution and implementing StoreOnce Catalyst with supported backup applications.
You can find these documents from the Manuals page of the HP Business Support Center website: http://www.hp.com//manuals In the Storage section, click Storage Solutions and then select your product.
2 HP StoreOnce technology A basic understanding of the way that HP StoreOnce Technology works is necessary in order to understand factors that may impact performance of the overall system and to ensure optimal performance of your backup solution.
Data deduplication HP StoreOnce Technology is an “inline” data deduplication process. It uses hash-based chunking technology, which analyzes incoming backup data in “chunks” that average 4K in size. The hashing algorithm generates a unique hash value that identifies each chunk and points to its location in the deduplication store. Hash values are stored in an index that is referenced when subsequent backups are performed. When data generates a hash value that already exists in the index, the data is not stored a second time, but rather a count is increased showing how many times that hash code has been seen. Unique data generates a new hash code and that is stored on the appliance. Typically about 2% of every new backup is new data that generates new hash codes. With Virtual Tape Library and NAS shares, deduplication always occurs on the StoreOnce Backup system. With Catalyst stores, deduplication may be configured to occur on the media server (recommended) or on the StoreOnce Backup system.
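The hash-chunk-and-index flow described above can be sketched in a few lines of Python. This is a toy, in-memory model only: real StoreOnce deduplication uses variable-length chunks averaging about 4K and a purpose-built index, so the fixed-size chunking and SHA-256 hashing here are illustrative assumptions, not the product's algorithm.

```python
import hashlib

CHUNK_SIZE = 4 * 1024  # the guide quotes an average chunk size of ~4K


class DedupStore:
    """Toy in-memory model of an inline deduplication store."""

    def __init__(self):
        self.chunks = {}    # hash -> chunk data, stored once
        self.refcount = {}  # hash -> how many times the chunk has been seen

    def ingest(self, data: bytes) -> list:
        """Chunk a backup stream, store only unseen chunks.

        Returns the ordered list of chunk hashes: the 'recipe' needed
        to reconstruct the stream on restore."""
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            h = hashlib.sha256(chunk).hexdigest()
            if h not in self.chunks:            # unique data: store it
                self.chunks[h] = chunk
            self.refcount[h] = self.refcount.get(h, 0) + 1
            recipe.append(h)
        return recipe

    def restore(self, recipe: list) -> bytes:
        # Restore must reassemble the stream chunk by chunk, which is
        # one reason restores are typically slower than backups.
        return b"".join(self.chunks[h] for h in recipe)


store = DedupStore()
backup1 = b"A" * CHUNK_SIZE * 50                       # first full backup
backup2 = b"A" * CHUNK_SIZE * 49 + b"B" * CHUNK_SIZE   # ~2% new data
r1 = store.ingest(backup1)
r2 = store.ingest(backup2)
print(len(store.chunks))             # 2 unique chunks stored for 100 ingested
print(store.restore(r2) == backup2)  # True
```

The second backup adds only one new chunk to the store; everything else is recorded as an increased reference count, which is the behaviour the paragraph above describes.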
Key performance factors with deduplication that occurs on the StoreOnce Backup system The inline nature of the deduplication process means that it is a very processor and memory intensive task. HP StoreOnce appliances have been designed with appropriate processing power and memory to minimize the backup performance impact of deduplication. •
Best performance will be obtained by configuring a larger number of libraries/shares/Catalyst stores with multiple backup streams to each device, although this has a trade-off with overall deduplication ratio.
◦
If servers with lots of similar data are to be backed up, a higher deduplication ratio can be achieved by backing them all up to the same library/share/Catalyst store, even if this means directing different media servers to the same data type device configured on the StoreOnce appliance.
◦
If servers contain dissimilar data types, the best deduplication ratio/performance compromise will be achieved by grouping servers with similar data types together into their own dedicated libraries/shares/Catalyst stores. For example, a requirement to back up a set of exchange servers, SQL database servers, file servers and application servers would be best served by creating four virtual libraries, NAS shares or Catalyst stores; one for each server data type.
•
The best backup performance to a device configured on a StoreOnce appliance is achieved using somewhere below the maximum number of streams per device (the maximum number of streams varies between models).
•
When restoring data from a deduplicating device, the appliance must reconstruct the original un-deduplicated data stream from all of the data chunks contained in the deduplication stores. This can result in lower performance than that of the backup process (typically 80%). Restores also typically use only a single stream.
•
Full backup jobs will result in higher deduplication ratios and better restore performance. Incremental and differential backups will not deduplicate as well.
VTL and NAS Replication overview Deduplication technology is the key enabling technology for efficient replication because only the new data created at the source site needs to replicate to the target site once seeding is complete. This efficiency in understanding precisely which data needs to replicate can result in bandwidth savings in excess of 95% compared to having to transmit the full contents of a cartridge/share from the source site. The bandwidth saving will be dependent on the backup data change rate at the source site. There is some overhead of control data, known as manifest data, that also needs to pass across the replication link; in addition, any hash codes that are not already present on the remote site may also need to be transferred. Typically these “overhead components” are less than 2% of the total virtual cartridge/file size to replicate. Replication throughput can be “throttled” by using bandwidth limits as a percentage of an existing link, so as not to affect the performance of other applications running on the same WAN link.
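The bandwidth arithmetic behind these figures is easy to check. The sketch below assumes an illustrative 800 GiB cartridge, a 2% source change rate and the roughly 2% manifest overhead quoted above; none of these figures are product specifications.

```python
def replication_bytes(cartridge_bytes, change_rate, manifest_overhead=0.02):
    """Estimate data sent over the WAN for one replication job.

    change_rate       -- fraction of the backup that is new at the source
    manifest_overhead -- control data (manifests, missing hashes), ~2%
    """
    return cartridge_bytes * (change_rate + manifest_overhead)


full = 800 * 1024**3                      # an 800 GiB virtual cartridge (hypothetical)
sent = replication_bytes(full, change_rate=0.02)
saving = 1 - sent / full
print(f"{saving:.0%}")                    # 96% -- consistent with "in excess of 95%"
```

With a 2% change rate the link carries about 4% of the cartridge size (new data plus overhead), which is where the "in excess of 95%" saving comes from; a higher change rate reduces the saving proportionally.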
Key performance factors with replication Key factors for performance considerations with replication: •
Define your “seeding” (first replication) strategy before implementation – several methods are available depending on your replication model: active/passive, active/active or Many-to-One. See Seeding methods for more detail.
•
If a lot of similar data exists on remote office StoreOnce libraries, replicating these into a single target VTL library will give a better deduplication ratio on the target StoreOnce Backup system. Consolidation of remote sites into a single device at the target is available with VTL device types. (Catalyst targets can also be used to consolidate replication from various source sites into a single Catalyst store at a DR site.)
•
Replication starts when the cartridge is unloaded or the NAS share file is closed and when a replication window is enabled. If a backup spans multiple cartridges or NAS files, replication will start on the first cartridge/file as soon as the job spans to the second, unless a replication blackout window is in force.
•
Size the WAN link appropriately to allow for replication and normal business traffic, taking into account data change rates. A temporary increase in WAN speed may be desirable for the initial seeding process if it is to be performed over the WAN.
•
Apply replication bandwidth limits or apply replication blackout windows to prevent bandwidth hogging. The maximum number of concurrent replication jobs supported by source and target StoreOnce appliances can be varied in the StoreOnce Management GUI to also manage throughput and bandwidth utilization.
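The WAN-sizing advice in these bullets can be turned into a rough calculation. The figures used below (daily change volume, link speed, bandwidth-limit share) are hypothetical examples for illustration, not recommendations.

```python
def replication_hours(daily_change_gb, link_mbit, link_share=0.5,
                      manifest_overhead=0.02):
    """Hours to replicate one day's new data over a shared WAN link.

    link_share -- fraction of the link granted to replication via
                  bandwidth limits, leaving the rest for business traffic.
    """
    payload_mbit = daily_change_gb * (1 + manifest_overhead) * 8000  # GB -> Mbit
    seconds = payload_mbit / (link_mbit * link_share)
    return seconds / 3600


# e.g. 100 GB of changed data per day, a 100 Mbit/s link, half of it
# reserved for replication via a bandwidth limit:
hours = replication_hours(100, 100)
print(round(hours, 1))   # 4.5
```

If the result does not fit inside the available replication window, either the link must be upgraded, the bandwidth-limit share increased, or the blackout windows rearranged; the same arithmetic also shows why a temporary speed increase helps during seeding, when the "change" is the entire data set.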
Catalyst Copy and deduplication Catalyst Copy is the equivalent of Virtual library and NAS share replication. The same principles apply in that only the new data created at the source site needs to be copied (replicated) to the target site. The fundamental difference is that the copy jobs are created by the backup application and can, therefore, be tracked and monitored within the backup application catalog as well as from the StoreOnce Management GUI. Should it be necessary to restore from a Catalyst copy, the backup application is able to restore from a duplicate copy without the need to re-import data to the catalog database. The key performance factors are the same as the replication performance factors.
Housekeeping Housekeeping is an important process in order to maximize the deduplication efficiency of the appliance. If data is deleted from the StoreOnce system (e.g. a virtual cartridge is overwritten or erased), any unused chunks will be marked for removal, so space can be freed up (space reclamation). The process of removing chunks of data is not an inline operation because this would significantly impact performance. This process, termed “housekeeping”, runs on the appliance as a background operation. Housekeeping is triggered in different ways depending on device type and backup application: •
VTL: media on which the data retention period has expired will be overwritten by the backup application. The act of overwriting triggers the housekeeping of the expired data. If media is not overwritten (if the backup application chooses to use blank media in preference to overwriting), the expired media continues to occupy disk space.
•
NAS shares: Some backup applications overwrite with the same file names after expiration; others do an expiry check before writing new data to the share; others might do a quota check before overwriting. Any of these actions triggers housekeeping.
•
Catalyst stores: The backup application clean-up process, the running of which is configurable, regularly checks for expired backups and removes catalog entries. This provides a much more structured space reclamation process.
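The trigger-then-reclaim behaviour described in these bullets can be modelled with simple reference counts. This is a conceptual sketch only; the real housekeeping process is internal to the appliance and its scheduling is configurable.

```python
class HousekeepingModel:
    """Toy model of StoreOnce-style background space reclamation."""

    def __init__(self):
        self.refcount = {}    # chunk hash -> number of backups referencing it
        self.pending = set()  # chunks marked for removal, awaiting housekeeping

    def expire_item(self, recipe):
        """Called when a backup expires (cartridge overwrite, NAS file
        overwrite, or a Catalyst clean-up run). Deletion is not inline:
        chunks are only marked here and removed later in the background."""
        for h in recipe:
            self.refcount[h] -= 1
            if self.refcount[h] == 0:   # no other backup still needs it
                self.pending.add(h)

    def run_housekeeping(self):
        """Background pass that actually frees the space."""
        reclaimed = len(self.pending)
        for h in self.pending:
            del self.refcount[h]
        self.pending.clear()
        return reclaimed


hk = HousekeepingModel()
hk.refcount = {"a": 2, "b": 1}    # chunk "a" is shared by two backups
hk.expire_item(["a", "b"])        # one backup expires
reclaimed = hk.run_housekeeping()
print(reclaimed)                  # 1 -- only the unshared chunk is reclaimed
```

The shared chunk survives because another backup still references it, which mirrors the secure-erase caveat elsewhere in this guide: a chunk referenced by any other item is not removed when one item expires.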
See Housekeeping (page 106) for more information about configuring housekeeping and best practices.
Backup Application considerations “Multiplexing” data streams from different sources into a single stream in order to get higher throughput used to be a common best practice when using physical tape drives. This was a necessity in order to make the physical tape drive run in streaming mode, especially if the individual hosts could not supply data fast enough. But multiplexing is not required and is in fact a BAD practice if used with HP StoreOnce deduplication devices.
Multi-stream or multiplex, what do they mean? Multi-streaming is often confused with multiplexing; these are however two different (but related) concepts. Multi-streaming is when multiple data streams are sent to the StoreOnce Backup system simultaneously but separately. Multiplexing is a configuration whereby data from multiple sources (for example multiple client servers) is backed up to a single tape drive device by interleaving blocks of data from each server simultaneously, combined into a single stream. Multiplexing is a hangover from using physical tape devices, and was required in order to maintain good performance where source servers were slow because it aggregates multiple source server backups into a single stream. A multiplexed data stream configuration is NOT recommended for use with a StoreOnce system or any other deduplicating device. This is because the interleaving of data from multiple sources is not consistent from one backup to the next and significantly reduces the ability of the deduplication process to work effectively; it also reduces restore performance. Care must be taken to ensure that multiplexing is not happening by default in a backup application configuration. For example, when using HP Data Protector to back up multiple client servers in a single backup job, it will default to interleaving four client servers into a single multiplexed stream. This must be disabled by reducing the “Concurrency” configuration value for the tape device from 4 to 1.
Why multiplexing is a bad practice HP StoreOnce Backup systems rely on very similar repetitive data streams in order to de-duplicate data effectively. When multiplexing is deployed the backup data streams are not guaranteed to be similar, since the multiplexing can jumble up the data streams from one backup to the next backup in different ways – hence drastically reducing the deduplication ratios. There is no need for multiplexing to get higher performance – quite the contrary, because the best way to get performance from any HP StoreOnce Backup system is to send multiple streams in parallel. Sending only a single multiplexed stream actually reduces performance.
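A small simulation makes this concrete. Three hosts back up identical data on two consecutive nights; when the streams are multiplexed, the interleave order differs between nights, so the chunks seen by the (fixed-size, for simplicity) chunking process rarely repeat, whereas separate per-host streams produce identical chunks every night. The block sizes, host count and seeds below are arbitrary assumptions, not StoreOnce internals.

```python
import hashlib
import random

BLOCK = 4096  # illustrative block size; each chunk spans two blocks


def host_blocks(host, n=8):
    # Deterministic per-host data, unchanged between backup runs.
    return [bytes([host * 31 + i]) * BLOCK for i in range(n)]


def chunk_set(stream, size=2 * BLOCK):
    return {hashlib.sha256(stream[i:i + size]).hexdigest()
            for i in range(0, len(stream), size)}


def multiplexed_stream(seed):
    # Interleave blocks from three hosts; the interleave order depends on
    # job timing and so differs from night to night (modelled by the seed).
    blocks = [b for h in range(3) for b in host_blocks(h)]
    random.Random(seed).shuffle(blocks)
    return b"".join(blocks)


# Two nights of identical source data, multiplexed differently each night:
mux_overlap = len(chunk_set(multiplexed_stream(1)) &
                  chunk_set(multiplexed_stream(2)))

# The same two nights sent as one separate stream per host:
night1 = set().union(*(chunk_set(b"".join(host_blocks(h))) for h in range(3)))
night2 = set().union(*(chunk_set(b"".join(host_blocks(h))) for h in range(3)))
sep_overlap = len(night1 & night2)   # night 2 is byte-identical to night 1

print(sep_overlap, "of", len(night1), "chunks deduplicate without multiplexing")
print(mux_overlap, "chunks deduplicate with multiplexing")
```

Without multiplexing, every chunk from night 2 already exists in the store; with multiplexing, the jumbled interleave means night 2 stores mostly "new" chunks even though the source data did not change at all.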
The figure below shows a single backup job from multiple hosts, where the backup data is radically different from one backup job to the next. There is also only a single stream to the device on the StoreOnce Backup system. This configuration produces slow performance and poor deduplication ratios. Figure 1 Multiplexing produces slow performance and poor dedupe ratios
Instead of multiplexing data into a single stream and sending it to the HP StoreOnce Backup system, you should re-specify the multiplexed backup job to be either a single backup job using multiple devices or multiple jobs to separate devices. This will ensure better throughput and deduplication ratios. Finally, multiplexing creates SLOWER restores because data to an individual host has to be de-multiplexed from the data stored on the device. NOT using multiplexing will actually improve restore performance. The next figure shows the recommended configuration where single or multiple jobs are streamed in parallel with little change between backup jobs. There are multiple streams to devices on the HP StoreOnce Backup system, resulting in higher performance and good deduplication ratios.
Figure 2 Recommended configuration using multiple streams
Effect of multiple streams on StoreOnce Performance The following graph illustrates the relationship between the number of active data streams and performance; the appliance is assumed to be one of the larger models where more than 24 streams (if fast enough) can achieve best throughput. The throughput values shown are for example only. Along the X axis is the number of concurrent streams. A stream is a data path to a device configured on StoreOnce; on VTL it is the number of virtual tape drives, on NAS the number of writers, on Catalyst stores the number of streams. Along the Y axis is the overall throughput in MB/sec that the StoreOnce device can process – this ultimately dictates the backup window. As a backup window begins, the number of streams gradually increases and we aim to have as many streams running as possible to get the best possible throughput to the StoreOnce device. As the backup jobs come to an end, the stream count starts to decrease and so the overall throughput to the StoreOnce device starts to reduce. The StoreOnce device itself also has a limit which we call the maximum ingest rate. In this example it is 1000MB/sec. The > 24 streams value is calculated using “infinite performance hosts” to characterize the HP StoreOnce ingest performance. As long as we can supply around 24 data streams at the required performance levels we keep the StoreOnce device in its “saturation zone” of maximum ingest performance.
Figure 3 Relationship between active data streams and device configuration (VTLs shown)
Note 1: Stream source data rates will vary; some streams will be at 8 MB/sec, others at 50, and maybe some others at 200. This means that as the stream count increases, it will be the aggregate total of the streams that will drive the unit to saturation, which is the goal. Some of the factors that influence source data rate are the compressibility of data, the number of disks in the disk group that is feeding the stream, RAID type and others. Note 2: With 5 streams at 100MB/sec we do not reach the maximum throughput of the node (server), which can support 600MB/sec in this example. This is the maximum possible ingest rate of the device for a specific model based on 5 streams. This ingest rate is the maximum even if each stream is capable of 200MB/sec, because it represents the maximum amount of data the machine can process. Note 3: The number of streams available varies throughout the backup window. The curve representing backup streams increases as the backup jobs begin ramping into the appliance (to VTL, NAS share or Catalyst store target devices) and then declines towards the finish of the backup, when throughput rates decline as backup jobs complete. This highlights the importance of maintaining enough backup threads from sources to ensure that, while backups are running, sufficient source “data pump” is maintained to hold the StoreOnce device in saturation. Notes for the color-coded circles: In example 1 (red circle) we are supplying many more than 24 streams (100 actually) but they are all slow hosts and the cumulative ingest rate is 800 MB/sec (below our maximum ingest rate). In example 2 (green circle) we have some high performance hosts that can supply data at a rate higher than the StoreOnce maximum ingest rate, and so the performance is capped at 1000 MB/sec. In example 3 (blue circle) we have some very high performance hosts but can only configure 5 backup streams because of the way the data is constructed on the hosts. In this case the maximum ingest of the StoreOnce appliance is 600MB/sec but we can only achieve 500 MB/sec because that is as fast as we can supply the data (because of stream limitations). If we could re-configure the backups to provide more streams, we could get higher throughput. In example 4 (brown circle) we show a more realistic situation where we have a mixture of different hosts with different performance levels. Most importantly, we have 30 streams and a total throughput capability of 950 MB/sec, which puts us very close to the maximum ingest rate. The maximum ingest rates vary according to each StoreOnce model. Typically, on the larger StoreOnce units about 48 streams spread across the configured devices give the best throughput; more streams only help to sustain the throughput with each stream being throttled appropriately. For example, if 96 streams are configured, the throughput is still the same as if 48 streams were configured – it is just that each stream runs slower as resources are shared. Once we understand the basic streams versus performance concept we can start to apply best practices for the number of devices to configure. With these factors in mind we have recommended some VTL configurations for the above performance examples, which are illustrated in the following graph. Figure 4 Relationship between active data streams and device configuration (VTLs shown)
Note: In general, per device configured we get the best throughput between 12-16 streams, and the best throughput per appliance when we reach 48 streams or more. So, for 100 streams we could configure 6 devices with say 17 streams to each or 20 devices with 5 streams to each. 6 devices is preferable because: •
Fewer devices are easier to manage but we can still group similar data types into the same device
•
They provide best possible throughput when we have the higher stream count to a device
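The streams-versus-throughput behaviour discussed above reduces to a one-line model: aggregate ingest is the sum of the source stream rates, capped at the appliance's maximum ingest rate. The stream mixes below are assumptions chosen to reproduce the four circled examples from the text; they are not measured figures.

```python
def ingest_rate(stream_rates_mb_s, max_ingest_mb_s):
    """Aggregate ingest: the sum of the source stream rates, capped at
    the appliance's maximum ingest rate (its 'saturation zone')."""
    return min(sum(stream_rates_mb_s), max_ingest_mb_s)


# The four color-coded examples from the text (rates in MB/sec):
ex1 = ingest_rate([8] * 100, 1000)    # 100 slow hosts, below saturation -> 800
ex2 = ingest_rate([200] * 10, 1000)   # fast hosts, capped at max ingest -> 1000
ex3 = ingest_rate([100] * 5, 600)     # only 5 streams available         -> 500
ex4 = ingest_rate([8] * 10 + [50] * 10 + [37] * 10, 1000)  # mixed hosts -> 950
print(ex1, ex2, ex3, ex4)
```

The model also explains the 48-versus-96-streams remark: once the sum of stream rates exceeds the cap, adding streams cannot raise throughput; it only divides the same capped rate across more, slower streams.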
Data compression and encryption backup application features Both software compression and encryption will randomize the source data and will, therefore, not result in a high deduplication ratio for these data sources. Consequently, performance will also suffer. The StoreOnce Backup system will compress the data at the end of deduplication processing anyway, before finally writing the data to disk. For these reasons it is best to do the following, if efficient deduplication and optimum performance are required: •
Ensure that there is no encryption of data before it is sent to the StoreOnce appliance.
•
Ensure that software compression is turned off within the backup application. Not all data sources will result in high deduplication ratios; deduplication ratios are data type dependent, change rate dependent and retention period dependent. Deduplication performance can, therefore, vary across different data sources. Digital images, video, audio and compressed file archives will typically all yield low deduplication ratios. If this data predominantly comes from a small number of server sources, consider setting up a separate library/share/Catalyst store for these sources for better deduplication performance. In general, high change rates yield low dedupe ratios, whilst low change rates yield high dedupe ratios over the same retention period. As you might expect, multiple full backups yield high dedupe ratios compared to Full and Incremental backup regimes.
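A short experiment shows why pre-compressed (or encrypted) data deduplicates poorly. A one-byte change in the raw stream touches a single 4K chunk, but the same change ripples through the whole compressed output (deflate output is bit-packed, so everything after the change shifts), leaving far fewer matching chunks. zlib stands in here for any backup-application software compression; the record layout is an arbitrary assumption.

```python
import hashlib
import zlib

CHUNK = 4096


def chunk_hashes(data):
    return {hashlib.sha256(data[i:i + CHUNK]).hexdigest()
            for i in range(0, len(data), CHUNK)}


# 8192 distinct but compressible 16-byte records -> 32 raw 4K chunks
base = b"".join(b"record-%08d;" % i for i in range(8192))
edit = b"X" + base[1:]                       # a one-byte change at the start

# Raw streams: the change stays local; every other chunk still matches.
raw_overlap = len(chunk_hashes(base) & chunk_hashes(edit))   # 31 of 32

# Pre-compressed streams: the early change shifts the rest of the
# compressed bitstream, so chunks stop matching between backups.
zip_overlap = len(chunk_hashes(zlib.compress(base)) &
                  chunk_hashes(zlib.compress(edit)))

print(raw_overlap, zip_overlap)
```

The same argument applies to encryption, which by design makes the output look random: two backups of nearly identical source data share essentially no chunks once encrypted, so both the deduplication ratio and throughput suffer.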
3 Concepts specific to StoreOnce B6200 Backup system The HP B6200 Backup system provides up to 8 separate nodes in a single appliance with a single management GUI for all nodes, and failover capability across nodes within the same couplet as standard. The best practices for single-node StoreOnce Backup systems apply but significant thought must also be given to mapping customer backup requirements and media servers across devices located on up to 8 separate nodes. The preferred mapping approach is to segment customer data into different data types and then map the data types into different backup devices configured on the HP B6200 Backup system so that each backup device is its own unique deduplication store. This approach also improves deduplication ratio; similar data types mean more chance of redundant data. Network and Fibre Channel configuration requires particular care to support HP autonomic failover, which is a unique enterprise class feature of the HP B6200 StoreOnce Backup system. This is described in more detail in the relevant chapters of this guide. This chapter describes basic concepts only.
The HP StoreOnce B6200 Backup system The Enterprise StoreOnce B6200 Backup System is a deduplication backup appliance supporting Catalyst, VTL and NAS target devices, which provides scale-up and scale-out performance with a capacity of up to 512 TB and throughput of up to 28 TB/hour. The architecture uses high levels of redundancy supported by 2-node couplets that allow autonomic failover to the other node in a couplet should a failure on one node occur. Any backups will restart automatically after failover. Figure 5 Base couplet
1 Disk array controller (node B)
2 Node B
3 Node A
4 Disk array controller (node A)
The whole appliance is managed by a single graphical user interface (GUI) and also supports a command line interface (CLI). The HP B6200 Backup System is replication compatible with existing HP StoreOnce Backup Systems and can support a fan-in of up to 384 concurrent replication streams (up to 48 per node).
B6200 Basic Concepts Figure 6 HP StoreOnce B6200, basic concepts
The above diagram shows the basic concepts of the HP B6200 StoreOnce architecture – understanding the architecture is key to successful deployment. •
Node: This is the basic physical building block and consists of an individual server (HP ProLiant server hardware).
•
Couplet: This consists of two associated nodes and is the core of the failover architecture. Each couplet has a common disk storage sub-system achieved by dual controller architecture and cross-coupled 6Gbps SAS interfaces. Each node has access to the storage subsystem of its partner node.
•
Service set: This is a collection of software modules (logical building blocks) providing VTL/NAS and replication functions. Each service set can have Virtual Tape (VT), NAS and replication configurations.
•
Management Console: This consists of a set of software ‘agents’, each agent running on a node, only one of which is active at any one time. Each node is in communication via an internal network. The Management Console provides a virtual IP address for the Management GUI and CLI. If node failure is detected, any agent may become active (first one to respond). Only ONE active Management Console agent is allowed at any one time.
•
Cluster: This is a collection of 1 to n couplets. For the HP StoreOnce B6200 Backup System n=4. This means that a 4-couplet, 8-node configuration is the largest permitted configuration.
•
Failover: This occurs within a couplet. Service sets for VTL/NAS/Replication will run on the remaining node in the couplet.
•
Failback: This is a manual process to restart a node after recovery/repair.
•
VIF: This is a Virtual network Interface. Network connections to the HP B6200 Backup System are to virtual IP addresses. The network ports of the nodes use bonded connections and each bonded (2 ports into 1 entity) interface has one virtual IP address. This means that if a physical port fails, the other port in the bonded pair can be used as the data channel because the VIF is still valid. This architecture eliminates a single point of hardware failure. The architecture does
not use LACP (Link Aggregation Control Protocol) so there are no specific network switch settings required. •
Storage shelf: This refers to the P2000 master controller shelf (one per node) or a P2000 JBOD capacity upgrade. JBODs are purchased in pairs and up to three pairs may be added to each couplet. They use dual 6Gbps SAS connections for resilience.
In reality, up to 128 TB of storage is shared between two nodes in a couplet. Depending on customer requirements and single node best practices, it is possible to have, for example, a service set on node 1 that consumes 100 TB and a service set on node 2 that consumes 28 TB of the available 128 TB. (However, this is not best practice and not recommended.) This architecture scales dramatically and the maximum configuration can be seen below. Figure 7 HP B6200 Backup system, maximum configuration
Note that data, replication and storage failover is always between nodes in the same couplet but the Management Console (GUI and CLI) can fail over to any node in the whole cluster. In all cases the deployment should center around what devices and services need to be configured on each node. In the following example Node 2 has failed and Service Set 2 has failed over to Node 1. Both service sets are running on Node 1, but backup and replication performance will be reduced. The Management Console that was active on Node 2 has moved to Node 3, Couplet 2. This is not significant; the Management Console becomes active on the first node to respond.
Figure 8 Showing Node 2, Service Set 2 failed over to Node 1
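The failover rules described above can be sketched as a toy model: service sets fail over only to the partner node within the same couplet. The node numbering and function names here are hypothetical illustrations, not part of the StoreOnce software:

```python
# Hypothetical sketch of the B6200 failover rules: nodes are paired
# (1,2), (3,4), ... into couplets, and a failed node's service sets
# move to its couplet partner only.

def couplet_of(node):
    """Couplet number for a 1-based node number."""
    return (node + 1) // 2

def fail_over(service_sets, failed_node):
    """Move the failed node's service sets to its couplet partner."""
    partner = failed_node + 1 if failed_node % 2 else failed_node - 1
    assert couplet_of(partner) == couplet_of(failed_node)
    new_map = dict(service_sets)
    new_map[partner] = sorted(new_map[partner] + new_map[failed_node])
    new_map[failed_node] = []
    return new_map

sets = {1: ["SS1"], 2: ["SS2"], 3: ["SS3"], 4: ["SS4"]}
print(fail_over(sets, 2))
# {1: ['SS1', 'SS2'], 2: [], 3: ['SS3'], 4: ['SS4']}
```

Note that in this model Service Set 2 can never land on Node 3 or Node 4: failover is strictly within the couplet, matching the behaviour shown in Figure 8.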
Deployment choices The very first deployment choice is how to use the external networking and Fibre Channel connections that the B6200 Enterprise StoreOnce Backup System presents. •
All GbE network connections are for NAS devices, replication and device management, and are bonded pairs to provide resiliency. There are 2 x 10 GbE ports and 4 x 1GbE ports on each node. See Network configuration in multi-node StoreOnce Backup systems (page 36) for more details.
•
The Fibre Channel connections (2 x 8 Gb/s) are for VTL devices and MUST be connected to a Fibre Channel switch; direct FC connect is not supported. The switch MUST support NPIV for the FC failover process to work. See Configuring FC to failover with the HP StoreOnce B6200 Backup system (page 47) for more details.
The following diagram illustrates network and Fibre Channel external connections to a two-rack system.
Figure 9 HP B6200 Backup system, customer connections
4 Networking considerations The network configuration for the StoreOnce single-node systems and the StoreOnce multi-node systems has some significant differences. Since the StoreOnce multi-node product supports autonomic failover, more care is required in the network configuration to ensure optimal operation of the autonomic failover feature. The network configuration is done as part of the HP installation process for the multi-node system. Be sure to read the appropriate section in this chapter for your model. But first there are some considerations that are common to all models.
Common networking considerations The Ethernet ports are used for data transfer to iSCSI VTL devices, StoreOnce Catalyst and CIFS/NFS shares and also for replication data transfer and management access to the StoreOnce Web and CLI Management Interfaces. NOTE:
The HP StoreOnce B6200 Backup system does not support iSCSI VTL devices.
Configured backup devices and the management interfaces are all available on all network IP addresses configured for the appliance. In order to deliver best performance when backing up data over the Ethernet ports it will be necessary to configure the appliance network ports, and also backup servers and network infrastructure to maximize available bandwidth to the StoreOnce device.
Supported Ethernet configurations The HP StoreOnce Backup system supports a wide range of network configurations. The following list summarizes configuration details. •
An Ethernet connection is required for backing up to NAS shares, Catalyst stores or iSCSI VTL devices (not B6200), and replication/Catalyst copy activities, and for all StoreOnce management tasks via the GUI or CLI.
•
The HP StoreOnce Backup system supports IPv4 only.
•
DHCP and static IP addressing are supported on single-node systems.
•
The multi-node B6200 system supports static IP only.
•
For single node products, networking parameters are contained within a network configuration file.
◦
For ease of installation, a default configuration file is supplied with the StoreOnce Backup system. As long as LAN port 1 of the appliance is connected to a DHCP-enabled 1GbE network switch, the HP StoreOnce Backup system will be immediately active on the network after installation. The user then has the option of continuing to use the default configuration file or creating and activating an additional configuration file that is tailored to their exact networking requirements. NOTE:
◦
100 Base-T Ethernet will limit performance.
Users who do not have a DHCP-enabled 1GbE network must create and activate a network configuration file before their system can become active on the network. This network configuration file may use any available Ethernet port, but one must always be connected, even if you are only using the FC ports to back up and restore data to the HP StoreOnce Backup system. This is because the network is used to access the StoreOnce Management Console remotely; it is also used for replication.
•
For the HP B6200 Backup system, networking parameters are specified within the selected network template.
•
All StoreOnce Backup systems have two 1GbE Ethernet ports. Most also have two additional 10GbE ports. Network bonding is supported on pairs of 1GbE and 10GbE ports, as described in the next section.
Network bonding modes Each pair of network ports on the appliance can be configured either on separate subnets or in a bond with each other (1GbE and 10GbE ports cannot be bonded together). From software release 3.6.x onwards three bonding modes are supported: •
Mode 1 (Active/Backup) This is the simplest bonding mode; it allows network traffic via one active port only and requires no specific extra switch configuration. It is recommended for simple network connections.
•
Mode 4 (IEEE 802.3ad Dynamic Link Aggregation) This bonding mode is also known as LACP and requires a special external switch configuration. It provides a link aggregation solution, increasing the bond's physical bandwidth, but can only work if all the ports in the bond are connected to one switch. It is recommended when:
◦
The customer wants to increase throughput to the StoreOnce appliance
◦
Trunks between switches on the customer network already use LA mode
The LACP protocol only works when it is configured on both ends of the physical connection. •
Mode 6 (Active Load Balancing) This mode provides a load balance solution. It does not require specific external switch configuration, but does require the switch to allow ARP negotiation. It can be used in a 2–switch configuration.
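The guidance above can be summarized as a small decision helper. This is an illustrative sketch, not an HP utility; the parameter names are assumptions made for the example:

```python
# Illustrative helper encoding the bonding-mode guidance above:
# Mode 1 for simple connections, Mode 4 (LACP) only when all bond
# ports share a single LACP-configured switch, Mode 6 when load
# balancing is wanted without LACP (e.g. across two switches).

def recommend_bonding_mode(single_switch, switch_lacp, want_throughput):
    if want_throughput and single_switch and switch_lacp:
        return 4   # IEEE 802.3ad dynamic link aggregation
    if want_throughput:
        return 6   # active load balancing; works in a 2-switch setup
    return 1       # active/backup: simplest, no switch configuration

print(recommend_bonding_mode(True, True, True))    # 4
print(recommend_bonding_mode(False, False, True))  # 6
print(recommend_bonding_mode(True, False, False))  # 1
```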
Network configuration in single-node StoreOnce Backup systems NOTE: Please refer to the HP StoreOnce 2700, 4500 and 4700 Backup system Installation and configuration guide or HP StoreOnce 2620, 4210/4220 and 4420/4430 Backup system Installation and configuration guide for detailed information about installing the StoreOnce Backup system and configuring it on a network. It contains information about the network wizard and network configuration files that is not included in this guide. This section explains network configurations for the following single-node Backup systems, which are installed and configured by the customer: •
HP StoreOnce 2700
•
HP StoreOnce 4500
•
HP StoreOnce 4700
•
HP StoreOnce 2620
•
HP StoreOnce 4210/4220
•
HP StoreOnce 4420/4430
The multi-node StoreOnce B6200 Backup system has additional networking considerations, which are described separately in the next section of this chapter. Installation and configuration of these systems is normally carried out by HP service specialists.
General guidelines Single node StoreOnce appliances have a factory default network configuration where the first 1GbE port (Port 1/eth0) is enabled in DHCP mode. This enables quick access to the StoreOnce CLI and Management GUI for customers using networks with DHCP servers and DNS lookup because the appliance hostname is printed on a label on the appliance itself. The default bonding mode is Mode 1. Mode 6 bonding provides port failover and load balancing across the physical ports. There is no need for any network switch configuration in this mode. This network bonding mode requires that the same switch is used for each network port or that spanning tree protocol is enabled, if separate switches are used for each port. If external switch ports are configured for LACP (Mode 4) bonding and Mode 6 bonding is required, then LACP must be un-configured. Network configuration on StoreOnce Backup systems is performed via the CLI Management interface. For detailed information about configuring network bonding modes, please refer to the HP StoreOnce 2620, 4210/4220 and 4420/4430 Backup system Installation and Configuration guide.
Single port configurations The example shows the simplest configuration of a single subnet containing just one 1GbE network port. Generally, this configuration is likely to be used: •
Only if the network interface is required only for management of the appliance or
•
Only if lower performance and resiliency for backup and restore are acceptable.
A single 10GbE port could also be configured in this way (on 4420/4430 appliances), providing both a backup data interface and management interface. This could deliver good performance; however, bonded ports are recommended for resiliency and maximum performance. Figure 10 Network configuration, single port mode
Dual port configurations This example describes configuring multiple subnets in separate IP address ranges for each pair of network ports. A maximum of 4 separate subnets can be configured on a StoreOnce 4420 or 4430 appliance (2 x 1GbE and 2 x 10GbE). Use this mode: •
If servers to be backed up are split across two physical networks which need independent access to the appliance. In this case, virtual libraries and shares and Catalyst stores will be available on both network ports; the host configuration defines which port is used.
•
If separate data (“Network SAN”) and management LANs are being used, i.e. each server has a port for business network traffic and another for data backup. In this case, one port on the appliance can be used solely for access to the StoreOnce Management Interface with the other used for data transfer.
Figure 11 Network configuration, dual port mode
In the case of a separate network SAN being used, configuration of CIFS backup shares with Active Directory authentication requires careful consideration; see Network configuration for CIFS AD for more information.
Bonded port configurations (recommended) If two network ports are configured within the same subnet they will be presented on a single IP address and will be bonded using one of the bonding modes as described in Network bonding modes. This configuration is generally recommended for backup data performance and also for resiliency of both data and management network connectivity. It should be noted that when using bonded ports the full performance of both links will only be realized if multiple host servers are providing data, otherwise data will still use only one network path from the single server. Figure 12 Network configuration, bonded
10GbE Ethernet ports on StoreOnce Backup systems 10GbE Ethernet is provided as a viable alternative to the Fibre Channel interface for providing maximum iSCSI VTL performance and also comparable NAS performance. 10GbE ports also provide good performance when using StoreOnce Catalyst low and high bandwidth backup as well as Catalyst copy or VTL/NAS replication between appliances. When using 10GbE Ethernet it is common to configure a “Network SAN”, which is a dedicated network for backup that is separate to the normal business data network; only backup data is transmitted over this network. Figure 13 Network configuration with 10GbE ports
As well as CIFS and NFS shares, the devices configured could equally be Catalyst stores. When a separate network SAN is used, configuration of CIFS backup shares with Active Directory authentication requires careful consideration; see the next section for more information.
Network configuration for CIFS AD When using CIFS shares for backup on a StoreOnce device in a Microsoft Active Directory environment, the appliance CIFS server may be made a member of the AD Domain so that Active Directory users can be authenticated against CIFS shares on the StoreOnce Backup system. However, in order to make this possible the AD Domain controller must be accessible from the StoreOnce device. Broadly there are two possible configurations which allow both: •
Access to the Active Directory server for AD authentication and
•
Separation of Corporate LAN and Network SAN traffic
Option 1: HP StoreOnce Backup system on Corporate SAN and Network SAN In this option, the StoreOnce device has a port in the Corporate SAN which has access to the Active Directory Domain Controller. This link is then used to authenticate CIFS share access. The port(s) on the Network SAN are used to transfer the actual data. This configuration is relatively simple to configure: •
On StoreOnce devices with only 1GbE ports: Two subnets should be configured with one port in each. The ports are connected and configured for either the Corporate LAN or Network SAN. In this case one data port is “lost” for authentication traffic, so this solution will not provide optimal performance.
•
On StoreOnce devices with both 10GbE and 1GbE ports: the 10GbE ports can be configured in a bonded network mode and configured for access to the Network SAN. One or both of the 1GbE ports can then be connected to the Corporate LAN for authentication traffic. In this case optimal performance can be maintained – see below.
The backup application media server also needs network connections into both the Corporate LAN and Network SAN. Figure 14 HP StoreOnce Backup system on Corporate SAN and Network SAN
Option 2: HP StoreOnce Backup system on Network SAN only with Gateway In this option the StoreOnce appliance has connections only to the Network SAN, but there is a network router or Gateway server providing access to the Active Directory domain controller on the Corporate LAN. In order to ensure two-way communication between the Network SAN and Corporate LAN the subnet of the Network SAN should be a subnet of the Corporate LAN subnet. Once configured, authentication traffic for CIFS shares will be routed to the AD controller but data traffic from media servers with a connection to both networks will travel only on the Network SAN. This configuration allows both 1GbE network connections to be used for data transfer but also allows authentication with the Active Directory Domain controller. The illustration shows a simple Class C network for a medium-sized LAN configuration. Figure 15 HP StoreOnce Backup system on Network SAN only with Gateway
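The Option 2 requirement above, that the Network SAN address range falls inside the Corporate LAN subnet, can be checked with Python's standard ipaddress module. The address ranges below are examples only, not defaults:

```python
# Verify that the Network SAN is a subnet of the Corporate LAN,
# as required for gateway routing in Option 2 (example addresses).
import ipaddress

corporate_lan = ipaddress.ip_network("192.168.0.0/16")
network_san = ipaddress.ip_network("192.168.10.0/24")

# subnet_of() is True when every Network SAN address is routable
# within the Corporate LAN's address range.
print(network_san.subnet_of(corporate_lan))  # True
```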
5 Network configuration in multi-node StoreOnce Backup systems The multi-node StoreOnce B6200 Backup system has specific networking considerations to ensure support for autonomic failover. Installation and configuration of these systems is normally carried out by HP service specialists. Before installation the customer is asked to read the HP StoreOnce B6000 Series Backup system Installation Planning and Preparation Guide and complete the checklist that specifies the information that the engineer needs to configure the system. Each couplet is a paired combination of two nodes that are directly connected in failover pairs. If one node fails, the system is designed to fail over to the other node without any external interaction from the customer. The B6200 Series Backup system uses a concept called a Virtual Network Interface (VIF) to make this possible. In very simple terms: •
The physical IP addresses relate to the physical ports that are used to connect the HP B6200 Backup system to the customer's network
•
The Virtual Network Interface (VIF) addresses are the IP addresses that the customer uses to connect to the B6200 Management Console (GUI and CLI) and to target backup and replication jobs. Because these are never directly linked to a physical port they continue to function correctly in the event of node failure.
For a more detailed discussion of how VIFs and IP addresses are used see the HP StoreOnce Backup system installation preparation and planning guide.
What is currently supported •
IPv4 is supported.
•
DNS is supported.
•
A maximum of two sub-nets is supported, which can be used as follows: one sub-net for data (NAS shares, Catalyst stores and replication) and one sub-net for management.
•
For software revision 3.3.0 and greater, up to two gateways are supported. If you wish to configure two sub-nets with only one external gateway, make sure that the gateway is on the same sub-net as the network that requires access to remote sites.
•
Three network bonding modes are supported, as described in Network bonding modes.
•
NAS shares, Catalyst stores and replication data use the same Ethernet channel.
•
If you wish to use your Ethernet channel for replication only, the only supported backup option is to create VTL libraries on Fibre Channel.
•
The network configuration applies to all nodes in the cluster. For example, you cannot have separate network configurations for each rack in a two-rack system.
What is not currently supported •
IPv6 is not supported.
•
DHCP is not supported.
•
There is no support for VTL on Ethernet using the iSCSI protocol.
Supported network configurations (templates) Network ports are bonded to ensure high availability. (But there is no network bonding between 1Gb and 10Gb ports.)
The optimum configuration is to use the 10GbE ports for data (NAS shares and Catalyst stores) and all replication traffic, and the 1GbE ports for the B6000 Management Console. However, this requires two sub-nets and may not be available to all users. Five network configurations are supported, each configured using one of the supplied network templates, illustrated on the following pages. You must decide which template you intend to use prior to installation.
Gateway considerations When the network is configured, you are prompted to provide an IP address for a default gateway, which will be used to route management and data traffic to and from an external network. Two of the templates support customer sites with two sub-nets and allow you to route data traffic on one network and management traffic on the second. For software revision 3.3.0 and greater, for these templates you are given the opportunity to provide the IP address of a second gateway and specify which type of traffic it should take. There are up to three choices when configuring the network (using the StoreOnce CLI): •
No external gateways: If the customer wants to use the HP StoreOnce B6200 Backup System in a totally isolated network environment, they should not configure an external gateway (the HP engineer will simply skip this step during configuration)
•
One external gateway: This is the standard configuration with templates 2, 3 and 5, where data and management traffic are routed to the same subnet, so only one external gateway is required. When used with templates 1 and 4, the customer must select which subnet (data or management) will have the ability to communicate with the external gateway.
•
Two external gateways (applies to templates 1 and 4 only and for software revision 3.3.0 and greater): This configuration allows both management and data subnets to communicate with the external network. The IP addresses of the two gateways are provided during the network configuration. The customer must select whether data or management traffic will use the 'default' gateway.
NOTE: In this context, 'default' is purely a mechanism to allow the user to specify a gateway for one type of traffic; the second gateway is automatically used for the other type of traffic. Examples •
Customer has a 10GbE and a 1GbE network and only wants data to be accessible remotely: Use Template 1, but configure one gateway only to the 10GbE network for data
•
Customer has a 10GbE and a 1GbE network and wants both data and management to be accessible remotely: Use Template 1 and configure two gateways. Decide whether data or management should use the 'default' gateway.
•
Customer has a 10GbE or a 1GbE network only and wants data and management to be accessible remotely: Use Template 2, 3 or 4, as appropriate and configure one gateway.
•
Customer has a 10GbE or a 1GbE network only and does not want data or management to be accessible remotely: Use Template 2, 3 or 4, as appropriate and do not configure a gateway.
NOTE: For software revision 3.3.0 and greater: if the customer wants to have only two 1GbE network connections from each node in a couplet to their Ethernet switches, they should use Template 5 instead of Template 2 in the above examples.
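The examples above can be summarized as a small selection helper. This sketch is illustrative only; it covers the example scenarios listed, not every supported template, and the parameter names are assumptions:

```python
# Illustrative helper (not HP software) summarizing the template
# examples above for software revision 3.3.0 and greater.

def choose_template(has_10gbe, has_1gbe, two_ports_per_node=False):
    """Pick a B6200 network template from the available networks."""
    if has_10gbe and has_1gbe:
        return 1            # separate data and management sub-nets
    if has_10gbe:
        return 3            # 10GbE only, shared data/management
    if two_ports_per_node:
        return 5            # only two 1GbE connections per node
    return 2                # 1GbE only, shared data/management

print(choose_template(True, True))         # 1
print(choose_template(True, False))        # 3
print(choose_template(False, True))        # 2
print(choose_template(False, True, True))  # 5
```

Template 4 (two separate 1GbE networks) is not captured here because the examples above do not cover it; it would need an extra parameter.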
Template 1, uses 10 GbE and 1 GbE sub-nets Template 1 supports users who have a 10GbE network and a 1GbE network and wish to use separate sub-nets for data and management. The gateway must be in the same sub-net as the network that is being used to connect to remote sites. This is normally the data path network.
The default bonding mode for this template is Mode 1, Active/Backup, on both sub-nets. The recommended IP address range is 25 in total: 16 contiguous on the data sub-net; 1 + 8 contiguous on the management sub-net. NOTE: When using a 10GbE network, you must provide the correct SFPs for your environment. They are not supplied with the product. Figure 16 Template 1 cabling to customer's networks
1 and 2
bonded 10 GbE ports, normally to customer's data sub-net
3 and 4
bonded 1 GbE ports, normally to customer's management sub-net
blue cables
internal cabling, should not be removed
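The Template 1 address plan above (16 contiguous data addresses plus 1 + 8 contiguous management addresses, 25 in total) can be sketched with the standard ipaddress module. The subnets shown are examples only, not defaults:

```python
# Sketch of the recommended Template 1 address plan: 16 contiguous
# addresses on the data sub-net, 1 + 8 contiguous addresses on the
# management sub-net (example subnets, 25 addresses in total).
import ipaddress

data_subnet = ipaddress.ip_network("10.0.1.0/24")
mgmt_subnet = ipaddress.ip_network("10.0.2.0/24")

data_ips = list(data_subnet.hosts())[:16]   # 16 contiguous for data
mgmt_ips = list(mgmt_subnet.hosts())[:9]    # 1 + 8 contiguous for management

print(len(data_ips) + len(mgmt_ips))  # 25
```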
Template 2, uses 1GbE network only Template 2 supports users who have a 1GbE network only. The same network is used for data and management. The default bonding mode for this template is Mode 6, Active Load Balancing. The recommended IP address range is 17 in total: 1 for management and 16 contiguous for data. NOTE: For software revision 3.3.0 and greater: customers who do not have a sufficient number of physical Ethernet ports to support Template 2 should use Template 5 instead. Figure 17 Template 2 cabling to customer's networks
1, 2, 3 and 4
bonded 1 GbE ports to customer's data and management network
blue cables
internal cabling, should not be removed
Template 3, uses 10GbE network only Template 3 supports users who have a 10GbE network only. The same network is used for data and management. The default bonding mode for this template is Mode 1, Active/Backup. The recommended IP address range is 17 in total: 1 for management and 16 contiguous for data. NOTE: When using a 10GbE network, you must provide the correct SFPs for your environment. They are not supplied with the product. Figure 18 Template 3 cabling to customer's networks
1 and 2
bonded 10 GbE ports to customer's data and management network
blue cables
internal cabling, should not be removed
Template 4, uses two 1GbE networks Template 4 supports users who have two 1GbE networks. One 1GbE network is used for data; the other is used for management. The gateway must be in the same sub-net as the network that is being used to connect to remote sites. This is normally the data path sub-net, for example used for remote/external site replication traffic. The default bonding mode for this template is Mode 1, Active/Backup, on both sub-nets. The recommended IP address range is 25 in total: 16 contiguous on the data sub-net; 1 + 8 contiguous on the management sub-net. Figure 19 Template 4 cabling to customer's networks
1 and 3
bonded 1 GbE ports to customer's management sub-net
2 and 4
bonded 1 GbE ports to customer's data sub-net
light blue cables
internal cabling, should not be removed
Template 5, uses 1GbE network only For software revision 3.3.0 and greater: Template 5 supports users who want to have only two 1GbE network connections from each node in a couplet to their Ethernet switches. The same network is used for data and management. It is primarily for customers who only use VTL as their backup targets and do not use StoreOnce Catalyst stores or NAS CIFS/NFS shares. VTL replication jobs will still run over the Ethernet connections. The default bonding mode for this template is Mode 1, Active/Backup. The recommended IP address range is 17 in total: 1 for management and 16 contiguous for data. Figure 20 Template 5 cabling to customer's networks
1 and 3
bonded 1 GbE ports to customer's data and management network
2 and 4
not used
blue cables
internal cabling, should not be removed
6 Fibre Channel considerations The HP StoreOnce B6200 Backup system supports switched fabric using NPIV only. All other HP StoreOnce Backup systems support both switched fabric and direct attach (private loop) topologies. Switched fabric using NPIV (N Port ID Virtualisation) offers a number of advantages and is the preferred topology for all StoreOnce appliances. NOTE: As with networking, the multi-node StoreOnce B6200 Backup system has specific FC considerations to ensure support for autonomic failover. This chapter contains generic information about FC support. See also Configuring FC to failover with the HP StoreOnce B6200 Backup system (page 47). •
Virtual library devices are assigned to an individual interface. Therefore, for best performance, configure both FC ports and balance the virtual devices across both interfaces to ensure that one link is not saturated whilst the other is idle.
•
Switched fabric mode is preferred for optimal performance on medium to large SANs since zoning can be used. Switched fabric mode is required for the HP B6200 Backup system.
•
Use zoning (by Worldwide Name) to ensure high availability.
•
When using switched fabric mode, Fibre Channel devices should be zoned on the switch to be only accessible from a single backup server device. This ensures that other SAN events, such as the addition and removal of other FC devices, do not cause unnecessary traffic to be sent to devices. It also ensures that SAN polling applications cannot reduce the performance of individual devices.
•
Either or both of the two FC ports may be connected to a FC fabric and each virtual library may be associated with one or both of these FC ports, but each drive can only be associated with one port. Port 1 and 2 is the recommended option in the GUI to achieve efficient load balancing. Only the robotics (medium changer) part of the VTL is presented to Port 1 and Port 2 initially, with the virtual tape drives defined being presented 50% to Port 1 and 50% to Port 2. This also ensures that in the event of a fabric failure at least half of the drives will still be available to the hosts. (The initial virtual tape drive allocation to ports (50/50) can be edited later, if required.)
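The default 50/50 drive presentation described above can be sketched as a simple round-robin assignment. The function and drive names here are hypothetical, for illustration only:

```python
# Hypothetical sketch of the default VTL presentation: the medium
# changer appears on both FC ports, while virtual tape drives are
# split evenly (round-robin) across the two ports.

def assign_drives(num_drives, ports=("Port 1", "Port 2")):
    """Round-robin virtual tape drives across the two FC ports."""
    return {f"Drive {i + 1}": ports[i % 2] for i in range(num_drives)}

mapping = assign_drives(4)
print(mapping["Drive 1"], mapping["Drive 2"])  # Port 1 Port 2
print(sum(1 for p in mapping.values() if p == "Port 1"))  # 2
```

With an even split, a fabric failure on one port still leaves half of the drives reachable, which is the resilience property the text describes.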
Port assignment for StoreOnce Backup systems with two Fibre Channel cards StoreOnce B6200, 4900 and 4700 models only: These models have two FC cards per server; there are four FC ports (B6200: available on each node). When creating a library you may select individual drives or FC ports 1&2 or FC ports 3&4. Be aware that port 1 and port 3 are on the first FC card; port 2 and port 4 are on the second FC card, so it is important to ensure that the system has been cabled correctly when connecting to the FC SAN. If you select one of the combined port options, both FC cards must be connected. Drives can only appear on one port, so when you choose a pair of ports, drives are automatically distributed evenly across both ports to ensure best performance and failover. After creating the library, it is possible to change the drive assignments using the edit function on the Interface Information tab for the selected library.
General Fibre Channel configuration guidelines NOTE: The illustrations in this section show single node products (HP StoreOnce 4500, 4700, 4210 and 4220, and 4420 and 4430). Additional configuration requirements apply with the HP StoreOnce B6200 Backup system to support autonomic failover. For examples that illustrate multi node products (HP StoreOnce B6200) see Configuring FC to failover with the HP StoreOnce B6200 Backup system.
Switched fabric A switched fabric topology utilizes one or more fabric switches configured in one or more storage area networks (SANs) to provide a flexible configuration between several Fibre Channel hosts and Fibre Channel targets such as the StoreOnce appliance virtual libraries. Switches may be cascaded or meshed together to form large fabrics. Figure 21 Fibre Channel, switched fabric topology
StoreOnce does not implement any selective virtual device presentation, so each virtual library will be visible to all hosts connected to the same fabric. It is recommended that each virtual library be zoned to be visible only to the hosts that require access. Unlike iSCSI virtual libraries, FC virtual libraries can be configured to be used by multiple hosts, if required.
Fibre Channel considerations
The illustration above shows the flexibility of the configuration:
• VTL1 is connected to FC Port 1 exclusively
• VTL3 is connected to FC Port 2 exclusively
• VTL2 is spread across FC Port 1 and FC Port 2. The medium changer is connected to both ports, whereas the drives by default are connected 50% to each port (2 each in this case). This mode is useful in high-availability SANs
Direct Attach (private loop)
A direct attach (private loop) topology is implemented by connecting the StoreOnce appliance ports directly to a Host Bus Adapter (HBA). In this configuration the Fibre Channel private loop protocol must be used.
NOTE: The HP StoreOnce B6200 Backup system is not supported in direct attach configurations.
Figure 22 Fibre Channel, direct attach (private loop) topology
Either of the FC ports on a StoreOnce Backup system may be connected to an FC private loop, direct attach topology. The FC port configuration of the StoreOnce appliance should be changed from the default N_Port topology setting to Loop. This topology only supports a single host connected to each private-loop-configured FC port. In private loop mode the medium changer cannot be shared across FC Port 1 and FC Port 2.
Zoning
Zoning is required only if a switched fabric topology is used; it ensures that servers, disk arrays, and tape libraries see only the hosts and targets that they need. Some of the benefits of zoning include:
• Limiting unnecessary discoveries on the StoreOnce appliance
• Reducing stress on the StoreOnce appliance and its library devices by polling agents
• Reducing the time it takes to debug and resolve anomalies in the backup/restore environment
• Reducing the potential for conflict with untested third-party products
• Ensuring that the StoreOnce FC diagnostic device is not presented to hosts
Zoning may not always be required for configurations that are already small or simple. Typically, the larger the SAN, the more zoning is needed. Use the following guidelines to determine how and when to use zoning.
• Small fabric (16 ports or less)—may not need zoning.
• Small to medium fabric (16 – 128 ports)—use host-centric zoning. Host-centric zoning is implemented by creating a specific zone for each server or host, and adding only those storage elements to be utilized by that host. Host-centric zoning prevents a server from detecting any other devices or servers on the SAN, and it simplifies the device discovery process.
• Disk and tape on the same pair of HBAs is supported, along with the coexistence of array multipath software (no multipath to tape or library devices on the HP StoreOnce Backup system, but coexistence of the multipath software and tape devices).
• Large fabric (128 ports or more)—use host-centric zoning and split disk and tape targets. Splitting disk and tape targets into separate zones helps to keep the HP StoreOnce Backup system free from discovering disk controllers that it does not need. For optimal performance, where practical, dedicate HBAs for disk and tape.
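Host-centric zoning as described above can be sketched as follows (host and target names are hypothetical; on a real fabric the zone members would be WWPNs configured on the switch):

```python
def host_centric_zones(host_targets):
    """Build one zone per host, containing only that host and the
    storage targets it needs. This keeps each server from discovering
    other servers' devices, as described in the zoning guidelines."""
    return {f"zone_{host}": sorted([host] + sorted(targets))
            for host, targets in host_targets.items()}

zones = host_centric_zones({
    "hostA": ["vtl1_robot", "vtl1_drive1"],
    "hostB": ["vtl2_robot", "vtl2_drive1"],
})
```

Each resulting zone contains exactly one host plus its targets; hostA never sees hostB or hostB's devices.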
Use soft zoning for high availability
The StoreOnce appliance allows the robot and tape drives to be presented to the different FC ports that are connected to the customer's fabric. The diagram below shows a way of utilizing this feature to add higher availability to your StoreOnce VTL deployment. When the VTL is created there is the option to present the device to Port 1, Port 2 or Port 1&2. If the customer chooses to present the VTL to Port 1&2, the following happens: the robot is presented to both Port 1 and Port 2, and 50% of the configured drives are presented to Port 1 and the other 50% to Port 2 (this can be changed if required). With this configuration, in the event of a fabric failure, the robot and 50% of the drives remain available for backup. The downside to this feature is that only a single 8Gb FC link is available for backups.
Figure 23 VTL Fibre Channel resiliency using WWN zoning
Use the StoreOnce Management GUI to find the WWPNs for use in zoning. The worldwide port names are on the VTL-Libraries-Interface Information tab.
Diagnostic Fibre Channel devices
For each StoreOnce FC port there is a Diagnostic Fibre Channel Device presented to the fabric. There is one per active physical FC port, which means there are two per HP StoreOnce Backup system or node that has two Fibre Channel ports. The Diagnostic Fibre Channel Device can be identified by the following example text (taken from a single-node system):
Symbolic Port Name: "HP D2D S/N-CZJ1440JBS HP D2DBS Diagnostic Fibre Channel S/N-MY5040204H Port-1"
Symbolic Node Name: "HP D2D S/N-CZJ1440JBS HP D2DBS Diagnostic Fibre Channel S/N-MY5040204H"
A virtual drive or loader would be identified by the following example text:
Symbolic Port Name: "HP D2D S/N-CZJ1440JBS HP Ultrium 4-SCSI Fibre Channel S/N-CZJ1440JC5 Port-0"
Symbolic Node Name: "HP S/N-CZJ1440JBS HP Ultrium 4-SCSI Fibre Channel S/N-CZJ1440JC5"
In the above, the S/N-CZJ1440JBS string should be identical for all devices. If this is Node Port 1, the string will be as above; if Port 2, the Symbolic Port Name string will end with "Port-2". Often the diagnostic device will be listed above the other virtual devices because it logs in first, ahead of the virtual devices. The S/N-MY5040204H string indicates the serial number of the QLC HBA, not the serial number of an appliance/node. At this time these devices are part of the StoreOnce VTL implementation and are not an error or fault condition. It is imperative that these devices be removed from any switch zone that is also used for virtual drives and loaders, to avoid data being sent to diagnostic devices.
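A discovery script that keeps diagnostic devices out of host zones could filter on the symbolic port name text shown above (a sketch; is_diagnostic_device is not a StoreOnce tool):

```python
def is_diagnostic_device(symbolic_port_name):
    """Return True for a StoreOnce diagnostic FC device, identified by
    the 'Diagnostic Fibre Channel' text in its symbolic port name, as
    shown in the example strings above."""
    return "Diagnostic Fibre Channel" in symbolic_port_name

discovered = [
    "HP D2D S/N-CZJ1440JBS HP D2DBS Diagnostic Fibre Channel S/N-MY5040204H Port-1",
    "HP D2D S/N-CZJ1440JBS HP Ultrium 4-SCSI Fibre Channel S/N-CZJ1440JC5 Port-0",
]
# Only non-diagnostic devices belong in zones used by hosts.
zone_members = [p for p in discovered if not is_diagnostic_device(p)]
```

Here only the Ultrium virtual drive survives the filter; the diagnostic device is excluded from the zone.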
7 Configuring FC to failover in a StoreOnce B6200 Backup system environment
Autonomic failover
Autonomic failover is a unique enterprise-class feature of the HP B6200 StoreOnce Backup system. When integrated with various backup applications it makes it possible for the backup process to continue even if a node within a B6200 couplet fails. ISV scripts are usually required to complete this process. The failover process is best visualized by watching the video at: http://www.youtube.com/watch?v=p9A3Ql1-BBs
What happens during autonomic failover?
At a logical level, all the virtual devices (VTL, NAS and replication) associated with the failing node are transferred by the B6200 operating system onto the paired healthy node of the couplet. The use of Virtual IP addresses for Ethernet and NPIV virtualization on the Fibre Channel ports are the key technology enablers that allow this to happen without manual intervention.
• NAS target failover is via the Virtual IP system used in the HP B6200 Backup System – the service set simply presents the failed node's Virtual IP address on the remaining node.
• FC (VTL device) failover relies on the customer's fabric switches supporting NPIV, on NPIV being enabled, and on the zones being set up correctly. Here the situation is more complex as several permutations are possible.
NOTE: To prevent data corruption, the system must confirm that the failing node is shut down before the other node starts writing to disk. This can be seen in the video where the "service set" is stopping. At a hardware level the active cluster manager sends a shutdown command via the dedicated iLO3 port on the failing node. Email alerts and SNMP traps are also sent on node failure.
The HP B6200 Backup System failover process can take 15 minutes or more to complete. The total time for failover is dependent on the system integrity checks, which are an integral part of the failover process, and the need to validate the deduplicated data stored. The following figure illustrates the failover timeline.
Figure 24 Failover timeline
Failover with backup applications
Backup applications have no awareness of advanced features such as autonomic failover because they are designed for use with physical tape libraries and NAS storage. From the perspective of the backup application, when failover occurs, the virtual tape libraries and the NAS shares on the HP B6200 Backup System go offline and after a period of time they come back online again. This is similar to a scenario where the backup device has been powered off and powered on again. Each backup application deals with backup devices going offline differently. In some cases, once a backup device goes offline the backup application will keep retrying until the target backup device comes back online and the backup job can be completed. In other cases, once a backup device goes offline it must be brought back online again manually within the backup application before it can be used to retry the failed backups. In this section we briefly describe three popular backup applications and their integration with the autonomic failover feature. Information for additional backup applications will be published on the B6200 documentation pages when it is available.
• HP Data Protector 6.21: job retries are currently supported by using a post-exec script, which is available from the B6200 documentation pages.
• Symantec NetBackup 7.x: job retries are automatic, but after a period without a response from the backup device the software marks the devices as "down". Once failover has completed and the backup device is responding again, the software does not automatically mark the device as "up" again. A script that continually checks Symantec device status and ensures that backup devices are marked as "up" is available from HP on the B6200 documentation pages. With this script deployed on the NetBackup media server, HP B6200 Backup System failover works seamlessly. NetBackup can go back to the last checkpoint and carry on from there, if checkpointing has been enabled in the backup job. So, all the data backed up prior to failover is preserved and the job does not have to go right back to the beginning and start again.
• EMC NetWorker 7.x: VTL: job retries are automatically enabled for scheduled backup jobs. No additional scripts or configuration are required in order to achieve seamless integration with the HP B6200 Backup System. In the event of a failover scenario, the backup jobs are automatically retried once the HP B6200 Backup System has completed the failover process. EMC NetWorker also has a checkpoint facility that can be enabled; this allows failed backup jobs to be restarted from the most recent checkpoint. NAS: the combination of NetWorker and NAS is not supported with autonomic failover, and use could cause irrecoverable data loss.
It is strongly recommended that all backup jobs to all nodes be configured to restart (if any action to do this is required) because there is no guarantee which nodes are more likely to fail than others. It is best to cover all eventualities by ensuring all backups to all nodes have restart capability enabled, if required. Whilst the failover process is autonomic, the failback process is manual because the replacement or repaired node must be brought back on line before failback can happen. Failback can be implemented either from the CLI or the GUI interface. Restores are generally a manual process and restore jobs are typically not automatically retried because they are rarely scheduled.
Designing for failover
One node is effectively doing the work of two nodes in the failed-over condition. There is some performance degradation, but the backup jobs will continue after the autonomic failover. The following best practices apply when designing for autonomic failover:
• The customer must choose whether SLAs will remain the same after failover as before failover. If they do, the solution must be sized in advance to use only up to 50% of the available performance. This ensures there is sufficient headroom in system resources so that, in the case of failover, there is no appreciable degradation in performance and the SLAs are still met.
• For customers who are more price-conscious and where failover is an "exception condition", the solution can be sized for cost effectiveness. Here most of the available throughput is utilized on the nodes. In this case, when failover happens there will be a degradation in performance. The amount of degradation observed will depend on the relative "imbalance" of throughput requirements between the two nodes. This is another reason for keeping both nodes in a couplet as evenly loaded as possible.
• Ensure the correct ISV patches/scripts are applied and do a dry run to test the solution. In some cases a post-execution script must be added to each and every backup job/policy. The customer can configure which jobs will retry in the event of failover (which is a temporary condition) in order to limit the load on the single remaining node in the couplet by:
◦ Only putting the post-execution script to retry the job in the most urgent and important jobs, not all jobs. This is the method for HP Data Protector.
◦ Modifying the "bring device back on line" scripts to only apply to certain drives and robots – those used by the most urgent and important jobs. This is the method for Symantec NetBackup.
• Replication is also considered as a virtual device within a service set, and replication fails over as well as backup devices.
• For replication failover there are two scenarios:
◦ Replication was not running – that is, failover occurred outside the replication window – in which case replication will start when the replication window is next open.
◦ If replication was in progress when failover occurred, after failover has completed replication will start again from the last known good checkpoint (about every 10MB of replicated data).
• Failback (via CLI or GUI) is a manual process and should be scheduled to occur during a period of inactivity.
• All failover-related events are recorded in the Event Logs.
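The checkpoint-based replication restart described above (a checkpoint roughly every 10MB of replicated data) can be modelled as simple arithmetic (a sketch, not the actual StoreOnce mechanism):

```python
CHECKPOINT_BYTES = 10 * 1024 * 1024  # checkpoint roughly every 10 MB

def resume_offset(bytes_replicated):
    """Byte offset that replication resumes from after a failover: the
    last completed checkpoint. Simplified model of the behaviour
    described above; the real checkpointing is internal to StoreOnce."""
    return (bytes_replicated // CHECKPOINT_BYTES) * CHECKPOINT_BYTES
```

For example, if 25MB had been replicated when failover occurred, replication resumes from the 20MB checkpoint; at most one checkpoint interval of data is re-sent.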
Key Failover FC zoning considerations
The same considerations apply when configuring Fibre Channel as when configuring the network. Care must be taken to ensure there is no single point of failure in switch or fabric zoning that will negate the failover capabilities of the HP B6200 Backup System and its autonomic failover ability. Conformance to the following rules will help to ensure successful failover:
• Fibre Channel switches used with HP StoreOnce must support NPIV. For a full list see: http://www.hp.com/go/ebs.
• Use WWPN zoning (rather than port based).
• In a single fabric configuration, ensure the equivalent FC ports from each B6200 node in a couplet are presented to the same FC switch; see Scenario 1.
• In a dual fabric configuration, ensure the equivalent FC ports from each B6200 node in a couplet are presented to the same fabric. However, they should present to separate switches within the fabric. See Scenario 3.
• Ensure the D2D diagnostic device WWNs (these will be seen in the switch name server and are associated with the physical ports) are not included in any fabric zones and, therefore, not presented to any hosts.
Fibre Channel port presentations
When you create a virtual tape library on a service set you specify whether the VTL should be presented to:
• Port 1 and 2
• Port 3 and 4
• Port 1
• Port 2
• Port 3
• Port 4
Port 1 and 2 (or Port 3 and 4) is the recommended option to achieve efficient load balancing. Only the robotics (medium changer) part of the VTL is presented to Port 1 and Port 2 initially, with the virtual tape drives defined being presented 50% to Port 1 and 50% to Port 2. This also ensures that in the event of a fabric failure at least half of the drives will still be available to the hosts. (The initial 50/50 virtual tape drive allocation to ports can be edited later, if required.) So, to create a library you need:
• 1 WWN for the robotics
• A number of WWNs for your drives, one per drive, depending on the required number of drives
Although the universal configuration rule is a maximum of 255 WWNs per port, the HP B6200 Backup System applies a maximum of 120 WWNs per port and up to 192 drives per library. This is to ensure fabric redundancy and to enable failover to work correctly. For example, should Port 1 fail in any of the selected configurations, the WWNs associated with its service set will not exceed 120 and can be failed over safely to Port 2. To summarize:
• To create a library on one port only, the maximum number of devices is 120, of which 1 WWN is required for the robotics, so the total number of drives available is 119.
• To create a library on Ports 1 and 2 (or Ports 3 and 4), the maximum number of drives is 96 per port (but this configuration is not recommended). This is a B6200 library limit and not a WWN limit.
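The WWN arithmetic above can be captured in a small check (constant and function names are illustrative; the limits are those stated for the B6200):

```python
MAX_WWNS_PER_PORT = 120   # B6200 failover-safe limit (not the fabric's 255)
MAX_DRIVES_PER_PORT_ON_TWO_PORT_LIBRARY = 96  # B6200 library limit

def max_drives_on_one_port():
    """One WWN is consumed by the robotics (medium changer), so a
    single-port library can expose at most 119 drives."""
    return MAX_WWNS_PER_PORT - 1

def library_fits_one_port(num_drives):
    # drives + 1 robotics WWN must stay within the per-port limit
    return num_drives + 1 <= MAX_WWNS_PER_PORT
```

This reproduces the summary: 119 drives is the single-port maximum, and a 120-drive library would exceed the 120-WWN limit once the robotics WWN is counted.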
The following table illustrates various FC port configurations with VTL devices and the impact that the choice of FC ports has on the validity of the configuration.
NOTE: This table will be updated to include Ports 3 and 4 in the next version of this guide.
Figure 25 VTL FC example port configurations
Scenario 1, single fabric with dual switches, recommended
Figure 26 illustrates the logical connectivity between the hosts and the VTLs and their FC ports. The arrows illustrate accessibility, not data flow.
FC configuration
• Multiple switches within a single fabric
• All hosts can see the robots over two separate switches
• Zoning by WWPN
• Each zone to include a host and the required targets on the HP B6200 Backup System
• Equivalent ports from each node can see the same switch
B6200 VTL configuration
• Default library configuration is 50% drives presented to Port 1, 50% presented to Port 2
• Up to 120 WWNs can be presented to Port 1 and Port 2
• On B6200 failover all WWNs of the failed node are automatically transferred to the corresponding port on the other node. This is transparent to the hosts.
Figure 26 VTLs presented to two ports and into dual switches in a single Fabric, recommended configuration
If FC switch 1 fails, Host A and Host B lose access to their backup devices. Hosts C and D still have access to the media changers and to 50% of the drives on VTL2 and 50% of the drives on VTL1. B6200 failover between nodes is enabled.
Scenario 2, single fabric with dual switches, not advised
The FC configuration is the same in this scenario, but the VTLs are presented to a single port. This configuration is not advised because it compromises the B6200 autonomic failover facility.
FC configuration
• Multiple switches within a single fabric
• All hosts can see the robots over two separate switches
• Zoning by WWPN
• Each zone to include a host and the required targets on the HP B6200 Backup System
• Equivalent ports on each node see different switches
B6200 VTL configuration
• B6200 nodes will not fail over if an FC port on the node is lost
• Up to 120 WWNs can be presented to the individual port
• Loss of a port or switch means that all access is lost to the VTLs that are dedicated to that port
• Library configuration is all drives presented entirely to a single port, either Port 1 or Port 2
Figure 27 Separate VTLs presented to separate ports and into different switches in different Fabrics, not recommended
If FC switch 1 fails, Host A and Host B lose access to their backup devices, even though B6200 failover is enabled, because the physical configuration introduces a single point of failure. Hosts C and D still have access to the media changers and to 100% of the drives on VTL3 and VTL4.
Scenario 3, dual fabric with dual switches, recommended
This FC configuration has added complexity because it has two fabrics. The arrows illustrate accessibility, not data flow.
FC configuration
• Dual fabrics
• Multiple switches within each fabric
• Zoning by WWPN
• Each zone to include a host and the required targets on the HP B6200 Backup System
• Equivalent ports from each node can see the same fabric, but are directed to different switches
B6200 VTL configuration
• Default library configuration is 50% drives presented to Port 1, 50% presented to Port 2. Robot appears on Port 1 and Port 2
• Up to 120 WWNs can be presented to Port 1 and Port 2
• On B6200 failover all WWNs of the failed node are automatically transferred to the corresponding port on the other node, which still has access to both fabrics. This is transparent to the hosts.
Figure 28 Complex configuration with ports of different VTLs being presented to different fabrics, recommended configuration
What happens if a fabric fails? If Fabric 1 fails in the previous configuration, all VTL libraries and nodes on the HP B6200 Backup System still have access to Fabric 2. As long as Hosts A, B and C also have access to Fabric 2, then all backup devices are still available to Hosts A, B and C. The following diagram illustrates existing good paths after a fabric fails. Figure 29 Complex configuration with ports of different VTLs being presented to different fabrics, Fabric 1 fails
Similarly, if Fabric 2 failed, all VTL libraries and nodes on the HP B6200 Backup System would still have access to Fabric 1. As long as Hosts D, E and F also have access to Fabric 1, then all backup devices are still available to Hosts D, E and F. The following diagram illustrates good existing paths after Fabric 2 fails.
Figure 30 Complex configuration with ports of different VTLs being presented to different fabrics, Fabric 2 fails
Scenario 4, dual fabric with dual switches, not advised
The FC configuration is the same as Scenario 3, but the VTLs are presented to a single port, which means they are tied to a single switch within a single fabric. This configuration is not advised because it compromises the B6200 autonomic failover facility.
FC configuration
• Dual fabrics
• Multiple switches within each fabric
• Zoning by WWPN
• Each zone to include a host and the required targets on the HP B6200 Backup System
• Equivalent ports from each node are connected to the same fabric, but are directed to different switches
• Each port is connected to only one switch within one fabric
B6200 VTL configuration
• Library configuration is all drives presented entirely to a single port, either Port 1 or Port 2
• Loss of a port, switch or fabric means that all access is lost to the VTLs that are dedicated to that port, switch or fabric
Figure 31 Complex configuration with ports of different VTLs being presented to different fabrics, not advised
8 StoreOnce Catalyst stores
StoreOnce Catalyst technology
HP StoreOnce Catalyst delivers a single, integrated, enterprise-wide deduplication algorithm. It allows the seamless movement of deduplicated data across the enterprise to other StoreOnce Catalyst systems without rehydration. This means that you benefit from:
• Simplified management of data movement from the backup application: tighter integration with the backup software to manage file replication centrally across the enterprise from the backup application GUI.
• Seamless control across complex environments: supporting a range of flexible configurations that enable the concurrent movement of data from one site to multiple sites, and the ability to cascade data around the enterprise (sometimes referred to as multi-hop).
• Enhanced performance: distributed deduplication processing using StoreOnce Catalyst stores on the StoreOnce Backup system and on multiple servers can optimize network loading and appliance throughput.
• Faster time to backup to meet shrinking backup windows: up to 100 TB/hour* aggregate throughput, which is up to 4 times faster than backup to a NAS target.
*Actual performance is dependent upon the specific StoreOnce appliance, configuration, data set type, compression levels, number of data streams, number of devices and number of concurrent tasks, such as housekeeping or replication.
All HP StoreOnce Backup systems can support Catalyst stores, Virtual Tape libraries and NAS (CIFS/NFS) shares on the same system, which makes them ideal for customers who have legacy requirements for VTL and NAS but who wish to move to HP StoreOnce Catalyst technology. HP StoreOnce Catalyst stores require a separate license on both source and target; VTL/NAS devices only require licenses if they are replication targets.
Key features
The following are the key points to be aware of with StoreOnce Catalyst:
• Optional deduplication at the backup server enables greater overall StoreOnce appliance performance and reduced backup bandwidth requirements. This can be controlled at backup session/job level.
• HP StoreOnce Catalyst enables advanced features such as duplication of backups between appliances in a network-efficient manner under control of the backup application.
• Catalyst stores can be copied using low-bandwidth links – just like NAS and VTL devices. The key difference here is that there is no need to set up replication mappings (required with VTL and NAS); the whole of the Catalyst copy process is controlled by the backup software itself.
• HP StoreOnce Catalyst enables space occupied by expired backups to be returned for re-use in an automated manner because of close integration with the backup application.
• HP StoreOnce Catalyst enables asymmetric expiry of data. For example: retain 2 weeks on the source, 4 weeks on the target device.
• HP StoreOnce Catalyst store creation can be controlled by the backup application, if required, from within Data Protector (not available with Symantec products).
• StoreOnce Catalyst is fully monitored in the Storage Reporting section of the StoreOnce Management GUI, and StoreOnce Catalyst Copy can be monitored on a global basis by using HP Replication Manager V2.1 or above.
• HP StoreOnce Catalyst is an additional licensable feature both on the StoreOnce appliance and within the backup software because of the advanced functionality it delivers.
• HP StoreOnce Catalyst is only supported on HP Data Protector 7.01, Symantec NetBackup 7.x and Symantec Backup Exec 2012.
StoreOnce interfaces with StoreOnce Catalyst
The following diagram shows the basic concept of a StoreOnce Catalyst store; it is a network-based (not FC-based) type of backup target that exists alongside VTL and NAS targets. The main difference between a Catalyst store and VTL or NAS devices is that the processor-intensive part of deduplication (hashing/chunking and compressing) can be configured to occur on either the media server or the StoreOnce appliance.
• If deduplication is configured to occur on the media server supplying data to the Catalyst store, this is known as low-bandwidth backup or source-side deduplication.
• If deduplication is configured to occur on the StoreOnce appliance where the Catalyst store is located, this is known as target-side deduplication or high-bandwidth backup. ALL the deduplication takes place on the StoreOnce appliance.
The low-bandwidth mode is expected to account for the majority of Catalyst implementations, since it has the net effect of improving the overall throughput of the StoreOnce appliance whilst reducing the backup bandwidth consumed. It can also be used to allow remote offices to back up directly to a central StoreOnce appliance over a WAN link for the first time. Catalyst stores are tolerant of high-latency links – this has been tested by HP. The net effect is the same in both cases – a significant reduction in bandwidth consumed by the data path to the backup storage target.
Figure 32 StoreOnce interfaces with StoreOnce Catalyst
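The bandwidth saving of low-bandwidth (source-side) deduplication can be illustrated with a toy model. Assumptions: fixed-size chunks and a SHA-1 chunk index; the real StoreOnce engine uses its own variable chunking and compression and is not exposed as a Python API.

```python
import hashlib

def low_bandwidth_backup(chunks, target_index):
    """Model of source-side deduplication: the media server hashes each
    chunk and sends only chunks the target store has not already seen.
    Illustrative only; shows why repeat backups consume little bandwidth."""
    bytes_sent = 0
    for chunk in chunks:
        digest = hashlib.sha1(chunk).hexdigest()
        if digest not in target_index:
            target_index.add(digest)
            bytes_sent += len(chunk)  # only new chunks cross the network
        # duplicate chunks cost only the hash exchange (ignored here)
    return bytes_sent

index = set()
first_backup = low_bandwidth_backup([b"A" * 1000, b"B" * 1000], index)
second_backup = low_bandwidth_backup([b"A" * 1000, b"B" * 1000], index)
```

The first backup sends all 2000 bytes; the identical second backup sends nothing, which is the effect that makes backup over a WAN link from a remote office practical.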
The deduplication offload into the media server is implemented in different ways in different backup applications.
• With HP Data Protector, the StoreOnce deduplication engine is embedded in the HP Data Protector Media Agent that talks to the Catalyst API.
• In Symantec products, HP has developed an OpenStorage (OST) plug-in for NetBackup and Backup Exec that creates the interface between Symantec products and the StoreOnce Catalyst store API.
Catalyst stores can also be copied using low-bandwidth links – just like NAS and VTL devices. The key difference here is that there is no need to set up replication mappings (required with VTL and NAS); the whole of the Catalyst copy process is controlled by the backup software itself. This is implemented by sending "Catalyst Copy" commands to the Catalyst API that exists on the source StoreOnce appliance. This simple fact – that the backup application controls the copy process and is aware of all the copies of the data held in Catalyst stores – solves many of the problems involved in Disaster Recovery scenarios involving replicated copies. No import is necessary because all entries for all copies of data already exist in the backup application's database.
Catalyst Copy
Catalyst Copy should not be considered in the same way as VTL and NAS replication, since there are effectively no hard constraints other than capacity on how many Catalyst stores can be copied (replicated) into a Catalyst store at a central site. Furthermore, because Catalyst copies are controlled by the backup application, multi-hop replication is possible using Catalyst devices. However, Catalyst blackout windows can be set on the StoreOnce appliance to dictate when the copy job actually happens, and bandwidth throttling can also be enforced to limit the amount of WAN link consumed by StoreOnce Catalyst copy; in this respect it is similar to NAS and VTL replication. Catalyst Copy has the following features:
• The copy job is configurable from within the backup application software.
• Several source Catalyst stores can be copied into a single target Catalyst store.
• Multi-hop copy is configurable via the backup software – Source to Target 1, then on to Target 2.
• One-to-many copy is also configurable, but the copies happen serially, one after the other.
• With the Catalyst agents running on remote office media servers, HP StoreOnce Catalyst technology has the ability to back up directly from remote sites to a central site, using what is known as low-bandwidth backup – essentially this uses HP StoreOnce replication technology.
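The serial behaviour of one-to-many copies listed above can be illustrated with a small sketch (run_catalyst_copies and copy_fn are hypothetical stand-ins; the real copy operation is driven by the backup application through the Catalyst API):

```python
def run_catalyst_copies(source, targets, copy_fn):
    """One-to-many Catalyst copy model: targets are processed serially,
    one after the other, as described above. copy_fn stands in for the
    backup-application-driven copy operation (hypothetical)."""
    completed = []
    for target in targets:
        copy_fn(source, target)   # each copy finishes before the next starts
        completed.append((source, target))
    return completed

log = []
order = run_catalyst_copies("storeA", ["storeB", "storeC"],
                            lambda s, d: log.append(d))
```

The copies complete strictly in order, storeB before storeC; there is no parallel fan-out.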
Figure 33 Catalyst copy models
Summary of Catalyst best practices HP StoreOnce Catalyst is a unique interface and is fundamentally different from virtual tape or NAS. It provides the backup application with full control of backup and replication (called Catalyst Copy). For this reason, best practices are dependent upon the backup application. See the separate document, HP StoreOnce Backup system Best Practices for VTL, NAS, StoreOnce Catalyst and Replication Implementations with sizing and application configuration examples for more details. Generic best practices for HP StoreOnce Catalyst implementations are: •
Ensure that the media servers where Catalyst low bandwidth backup is to be deployed are sized accordingly; otherwise, the implementation will not work well.
•
As with other device types, the best deduplication ratios are achieved when similar data types are sent to the same device.
•
Best throughput is achieved with multiple streams; the actual number per device/appliance varies by model. Because a Catalyst store can act as both a backup target and an inbound replication target, the maximum value applies to the two target types combined (although inbound copy jobs would not normally run at the same time as backups).
•
Although Catalyst copy is controlled by the backup software, the copy blackout window overrides the backup software scheduling. Check for conflicts.
•
The first Catalyst low bandwidth backup will take longer than subsequent low bandwidth backups because a seeding process has to take place.
•
If you are implementing multi-hop or one-to-many Catalyst copies, remember that these copies happen serially, not in parallel.
•
Ensure that the backup application clean-up scripts that regularly check for expired Catalyst items run at a frequency that avoids using excessive storage to hold expired backups (every 24 hours is recommended).
•
There are several specific tuning parameters dependent on the backup application implementation – please see the separate best practices document for more details.
StoreOnce Catalyst and the StoreOnce GUI Refer to the HP StoreOnce Backup system user guide or online help for detailed information about using the Catalyst functions on the StoreOnce GUI.
Maximum concurrent jobs and blackout windows As with VTL and NAS replication, each StoreOnce appliance supports a maximum number of concurrent jobs. In this case, the parameters are Outbound Copy jobs and Data and Inbound Copy jobs. Bear in mind that Catalyst stores can act as both inbound and outbound copy targets when used in multi-hop mode. See Key parameters (page 115). The concurrency settings for Catalyst are configured by selecting the StoreOnce Catalyst - Settings tab and Edit. Figure 34 StoreOnce Catalyst settings
Catalyst copy blackout windows are configured from the StoreOnce Catalyst-Blackout Windows tab. Figure 35 StoreOnce Catalyst blackout windows
The user can also configure bandwidth limiting from the StoreOnce Catalyst - Bandwidth Limiting Windows tab.
Figure 36 StoreOnce Catalyst Bandwidth limiting
Client access permissions Catalyst stores have a process that allows client access to be controlled. First, overall client access permission checking is enabled from the StoreOnce Catalyst – Settings tab. Figure 37 Enabling client access permission checking
Then each Catalyst store has a list of clients defined who are allowed to access it from the StoreOnce Catalyst – Stores – Permissions tab.
Figure 38 Setting access permissions for a store
More information HP StoreOnce Catalyst is a unique interface and is fundamentally different from virtual tape or NAS. It provides the backup application with full control of backup and replication (called Catalyst Copy). For this reason, best practices and configuration are dependent upon the backup application. See the separate document, HP StoreOnce Backup systems Best practices for VTL, NAS, StoreOnce Catalyst and Replication implementations with sizing and application configuration examples.
9 Virtual Tape Devices Overview Virtual Tape Devices are backup target devices on the HP StoreOnce Backup system to which the backup applications on the hosts write data. They appear to the host as a locally attached physical tape library but, physically, they use disk space on the HP StoreOnce Backup system which, following tape terminology, is referred to as slots or cartridges. Tape libraries provide considerable storage capacity and full support for tape rotation strategies. (It may be necessary to upgrade your backup application to support libraries.)
Tape Library Emulation Emulation types HP StoreOnce Backup systems can emulate several types of physical HP Tape Library device; the maximum number of drives and cartridge slots is defined by the type of library configured. The options available vary according to the HP StoreOnce Backup system model. Performance is not related to library emulation, other than in the ability to configure multiple drives per library and thus enable multiple simultaneous backup streams (multi-streaming operation). To achieve the best performance from the larger StoreOnce appliances, more than one virtual library will be required to meet the multi-stream needs. The appliance is provided with a pool of drives that can be allocated to libraries in a flexible manner, so many drives per library can be configured, up to a maximum defined by the library emulation type. The number of cartridges per library can also be configured. The table below lists the key parameters for all StoreOnce products. To achieve best performance, follow the recommended maximum concurrent backup streams per library and appliance in the table. As an example, while it is possible to configure 200 drives per library on a 4420 appliance, for best performance no more than 12 of these drives should be actively writing or reading at any one time. See also Key parameters (page 115). NOTE: The maximum number of virtual devices supported varies according to the product, and this number is split across VTL, NAS and Catalyst devices. The table illustrates maximum configurations for libraries and drives, but this number may be limited if you have already created NAS shares and Catalyst stores. The HP D2DBS emulation type and the ESL/EML types provide the most flexibility in numbers of cartridges and drives. This has two main benefits: •
It allows for more concurrent streams on backups which are throttled due to host application throughput, such as multi-streamed backups from a database.
•
It allows for a single library (and therefore Deduplication Store) to contain similar data from more backups, which then increases deduplication ratio.
The D2DBS emulation type has an added benefit in that it is clearly identified in most backup applications as a virtual tape library, and so is easier for supportability. It is the recommended option for this reason. There are a number of other limitations from an infrastructure point of view that need to be considered when allocating the number of drives per library. As a general point, it is recommended that the number of tape drives per library does not exceed 64, due to the restrictions below: •
For iSCSI VTL devices a single Windows or Linux host can only access a maximum of 64 devices. A single library with 63 drives is the most that a single host can access; configuring a single library with more than 63 drives will result in not all devices in the library being seen (which may include the library device itself). The same limitation could be hit with multiple libraries and fewer drives per library. •
A similar limitation exists for Fibre Channel. Although there is a theoretical limit of 255 devices per FC port on a host or switch, the actual limit appears to be 128 for many switches and HBAs. You should either balance drives across FC ports or configure fewer than 128 drives per library.
•
Some backup applications will deliver less than optimum performance if managing many concurrent backup tape drives/streams. Balancing the load across multiple backup application media servers can help here.
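The device-count arithmetic behind these restrictions can be sketched as follows. The limit values (64 devices per iSCSI host, 128 per FC port) come from the text above; the function names, and the assumption that the library robot presents as one additional device, are illustrative only:

```python
# Illustrative sketch: check a virtual library configuration against the
# per-host device limits described above. Limit values are from the text;
# function names are assumptions for illustration.

ISCSI_HOST_DEVICE_LIMIT = 64   # max devices one Windows/Linux iSCSI host can access
FC_PORT_DEVICE_LIMIT = 128     # practical limit for many FC switches and HBAs

def devices_visible(num_drives):
    """A library presents one robot (library) device plus one device per drive."""
    return num_drives + 1

def fits_iscsi_host(num_drives):
    return devices_visible(num_drives) <= ISCSI_HOST_DEVICE_LIMIT

def fits_fc_port(num_drives):
    return devices_visible(num_drives) <= FC_PORT_DEVICE_LIMIT

# A library with 63 drives (64 devices in total) is the most a single
# iSCSI host can see; 64 drives would push one device out of view.
print(fits_iscsi_host(63))  # True
print(fits_iscsi_host(64))  # False
```

The same check applies per FC port; balancing drives across ports raises the aggregate limit without exceeding the per-port figure.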
Cartridge sizing The size of a virtual cartridge has no impact on its performance and cartridges do not pre-allocate storage. It is recommended that cartridges are created to match the amount of data being backed up. For example, if a full backup is 500 GB, the next larger configurable cartridge size is 800 GB, so this should be selected. Note that if backups are to be offloaded to physical media elsewhere in the network, it is recommended that the cartridge sizing matches that of the physical media to be used.
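The sizing rule above can be sketched as a small selection routine. The list of configurable cartridge sizes below is an assumption for illustration only; the sizes actually offered depend on the emulation type and appliance model:

```python
# Illustrative sketch: pick the smallest configurable cartridge size that
# holds a full backup. The size list is assumed for illustration; check the
# sizes your library emulation actually offers.

CARTRIDGE_SIZES_GB = [100, 200, 400, 800, 1600, 3200]

def pick_cartridge_size(full_backup_gb):
    """Return the next larger configurable cartridge size for a backup."""
    for size in CARTRIDGE_SIZES_GB:
        if size >= full_backup_gb:
            return size
    return CARTRIDGE_SIZES_GB[-1]  # fall back to the largest available

print(pick_cartridge_size(500))  # 800, matching the example in the text
```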
Number of libraries per appliance The StoreOnce appliance supports the creation of multiple virtual library devices. If large amounts of data are being backed up from multiple hosts, or from multiple disk LUNs on a single host, it is good practice to separate these across several libraries (and consequently into multiple backup jobs). Each library has a separate deduplication “store” associated with it. Reducing the amount of data in, and the complexity of, each store will improve its performance. Creating a number of smaller deduplication “stores” rather than one large store which receives data from multiple backup hosts could have an impact on the overall effectiveness of deduplication. However, generally, the cross-server deduplication effect is quite low unless a lot of common data is being stored. If a lot of common data is present on two servers, it is recommended that they are backed up to the same virtual library. •
For best backup performance, configure multiple virtual libraries and use them all concurrently.
•
For best deduplication performance, use a single virtual library and fully utilize all the drives in that one library.
Backup application and configuration In general, backup application configurations for physical tape devices can be readily ported over to target a deduplicating virtual library with no changes; this is one of the key benefits of virtual libraries – seamless integration. However, considering deduplication in the design of a backup application configuration can improve performance, deduplication ratio, or ease of data recovery, so some time spent optimizing the backup application configuration is valuable.
Blocksize and transfer size As with physical tape, larger tape block sizes and host transfer sizes are of benefit. This is because they reduce the overhead of headers added by the backup application and by the transport interface. The recommended minimum is a 256 KB block size, and up to 1 MB is suggested if the backup application and operating system will support this. For HP Data Protector and EMC NetWorker software a block size of 512 KB has been found to provide the best balance of deduplication ratio and performance, and is the recommended block size for these applications. Some minor setting changes to upstream infrastructure might be required to allow backups with greater than 256 KB block size to be performed. For example, Microsoft’s iSCSI initiator
implementation, by default, does not allow block sizes that are greater than 256 KB. To use a block size greater than this you need to modify the following registry setting: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E97B-E325-11CE-BFC1-08002BE10318}\0000\Parameters Change the REG_DWORD MaxTransferLength to “80000” hex (524,288 bytes), and restart the media server – this will restart the iSCSI initiator with the new value.
Rotation schemes and retention policy Retention policy The most important consideration is the type of backup rotation scheme and associated retention policy to employ. With data deduplication there is little penalty for using a large number of virtual cartridges in a rotation scheme and therefore a long retention policy for cartridges because most data will be the same between backups and will therefore be deduplicated. A long retention policy provides a more granular set of recovery points with a greater likelihood that a file that needs to be recovered will be available for longer and in many more versions.
Rotation scheme There are two aspects to a rotation scheme which need to be considered: •
Full versus Incremental/Differential backups
•
Overwrite versus Append of media
Full versus Incremental/Differential backups The requirement for full or incremental backups is based on two factors: how often offsite copies of virtual cartridges are required, and the speed of data recovery. If regular physical media copies are required, the best approach is for these to be full backups on a single cartridge. Speed of data recovery is less of a concern with a virtual library appliance than it is with physical media. For example, if a server fails and needs to be fully recovered from backup, this recovery will require the last full backup plus every incremental backup since (or the last differential backup). With physical tape it can be a time-consuming process to find and load multiple physical cartridges; with virtual tape, however, there is no need to find all of the pieces of media and, because the data is stored on disk, the time to restore single files is lower due to the ability to seek randomly within a backup more quickly and to load a second cartridge instantly. Overwrite versus append of media Overwriting and appending to cartridges is also a concept where virtual tape has a benefit. With physical media it is often sensible to append multiple backup jobs to a single cartridge in order to reduce media costs; the downside of this is that cartridges cannot be overwritten until the retention policy for the last backup on that cartridge has expired. The diagram below shows a cartridge containing multiple appended backup sessions, some of which are expired and others that are valid. Space will be used by the StoreOnce appliance to store the expired sessions as well as the valid sessions; moving to an overwrite strategy will avoid this. With virtual tape, a large number of cartridges can be configured for “free” and their sizes can be configured so that they are appropriate to the amount of data stored in a specific backup. Appended backups are of no benefit because media costs are not relevant in the case of VTL.
Figure 39 Cartridges with appended backups (not recommended)
Our recommendations are: •
Target full backup jobs to specific cartridges, sized appropriately
•
Reduce the number of appends by specifying separate cartridges for each incremental backup
Taking the above factors into consideration, an example of a good rotation scheme where the customer requires weekly full backups sent offsite and a recovery point objective of every day in the last week, every week in the last month, every month in the last year and every year in the last 5 years might be as follows: •
4 daily backup cartridges, Monday to Thursday, incremental backup, overwritten every week.
•
4 weekly backup cartridges, Fridays, full backup, overwritten every fifth week
•
12 monthly backup cartridges, last Friday of month, overwritten every 13th month.
•
5 yearly backup cartridges, last day of year, overwritten every 5 years.
This means that in the steady state, daily backups will be small, and whilst they will always overwrite the last week, the amount of data overwritten will be small. Weekly full backups will always overwrite, but housekeeping has plenty of time to run over the following day or weekend, or whenever scheduled to run; the same is true for monthly and yearly backups. Total virtual tapes required in the above rotation = 25. Each backup job effectively has its own virtual tape.
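The cartridge count for this example rotation scheme can be tallied as a quick check:

```python
# The example rotation scheme above, expressed as a quick tally.
rotation = {
    "daily (Mon-Thu, incremental, overwritten weekly)": 4,
    "weekly (Friday full, overwritten every 5th week)": 4,
    "monthly (last Friday, overwritten every 13th month)": 12,
    "yearly (last day of year, overwritten every 5 years)": 5,
}

total_cartridges = sum(rotation.values())
print(total_cartridges)  # 25, as stated in the text
```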
Summary of VTL best practices NOTE: The HP StoreOnce B6200 Backup system supports only FC VTLs.
•
Make use of multiple network or Fibre Channel ports throughout your storage network to eliminate bottlenecks.
•
For FC configurations, split virtual tape libraries and drives across multiple FC ports (FC VTL is available on StoreOnce B6200, 4210 FC, 4220, 4420 and 4430 models).
•
Configure multiple VTLs and separate data types across them; for example SQL to VTL1, Filesystem to VTL2, and so on.
•
Configure larger “block sizes” within the backup application to improve performance.
•
Disable any multiplexing configuration within the backup application.
•
Disable any compression or encryption of data before it is sent to the StoreOnce appliance.
•
Best throughput is achieved with multiple streams, the actual number per device/appliance varies by model.
•
Schedule physical tape offload/copy operations outside of other backup, replication or housekeeping activities.
•
Tape drive emulation types have no effect on performance or functionality.
•
Configuring multiple tape drives per library enables multi-streaming operations per library for good aggregate throughput performance.
•
Do not exceed the recommended maximum concurrent backup streams per library and appliance if maximum performance is required.
•
Target the backup jobs to run simultaneously across multiple drives within the library and across multiple libraries. Keep the concurrent stream count high for best throughput.
•
Create multiple libraries on the larger StoreOnce appliances to achieve best aggregate performance
•
Configure dedicated, individual libraries for backing up larger servers.
•
Configure other libraries for consolidated backups of smaller servers.
•
Separate libraries by data type if the best trade-off between deduplication ratio and performance is needed
•
Cartridge capacities should be set either to allow a full backup to fit on one cartridge or to match the physical tape size for offload (whichever is the smaller)
•
Use a block size of 256 KB or greater. For HP Data Protector and EMC NetWorker software a block size of 512 KB has been found to provide the best balance of deduplication ratio and performance.
•
Disable the backup application verify operation for best performance.
•
Remember that virtual cartridges cost nothing and use very little space overhead. Don’t be afraid of creating “too many” cartridges. Define slot counts to match the required retention policy. The D2DBS, ESL and EML virtual library emulations can have a large number of configurable slots and drives to give the most flexibility in matching customer requirements.
•
Design backup policies to overwrite media so that space is not lost to a large expired media pool and backups with different retention periods do not share the same piece of media.
•
Reduce the number of appends per tape by specifying separate cartridges for each incremental backup; this improves replication performance and capacity utilization.
10 NAS shares NOTE: It is important to understand that the HP StoreOnce network share is intended to be used ONLY by backup applications that “back up to disk”. Do not use the NAS target device as a drag-and-drop general file store. The one exception to this rule is if you are using the NAS share to seed an appliance for replication.
Operating system Two interfaces are supported: •
a CIFS interface for Windows networks
• an NFS interface for Linux and UNIX networks See the HP StoreOnce Backup System user guide for more information about using the Web Management Interface to create and configure NAS shares as targets for backup applications. Refer to the UNIX and Linux Configuration Guide for more information about the NFS interface.
Backup application NAS shares may be used with most applications that back up to disk, including embedded applications such as Oracle RMAN and VMware VCB Agent. For the most up-to-date information about supported applications, refer to http://www.hp.com/go/ebs.
Shares and deduplication stores Each NAS share created on the StoreOnce system has its own deduplication “store”; any data backed up to a share will be deduplicated against all of the other data in that store, there is no option to create non-deduplicating NAS shares and there is no deduplication between different shares on the same StoreOnce appliance. Once a StoreOnce CIFS share is created, subdirectories can be created via Explorer. This enables multiple host servers to back up to a single NAS share but each server can back up to a specific sub-directory on that share. Alternatively a separate share for each host can be created. The backup usage model for StoreOnce has driven several optimisations in the NAS implementation which require accommodation when creating a backup regime: •
Only backup files larger than 24 MB will be deduplicated; this works well with backup applications because they generally create large backup files and store them in larger, configurable containers. Please note that simply copying (by drag and drop, for example) a collection of files to the share will not result in the smaller files being deduplicated.
•
There is a limit of 25000 files per NAS share; applying this limit ensures good replication responsiveness to data change. This is not an issue for most backup applications because they create large files, and it is very unlikely that there will be a need to store more than 25000 files on a single share.
•
A limit on the number of concurrently open files, both above and below the deduplication file size threshold (24 MB), is applied. This prevents overloading of the deduplication system and thus loss of performance.
When protecting a large amount of data from several servers with a StoreOnce NAS solution it is sensible to split the data across several shares in order to realise best performance from the entire system by improving the responsiveness of each store. Smaller stores have less work to do in order to match new data to existing chunks so they can perform faster. The best way to do this whilst still maintaining a good deduplication ratio is to group similar data from several servers in the same store. For example: keep file data from several servers in one share, and Oracle database backups in another share.
Maximum concurrently open files The table in Key parameters (page 115) shows the maximum number of concurrently open files per share and per StoreOnce appliance for files above and below the 24 MB dedupe threshold size. A backup job may consist of several small metadata/control files (that are constantly being updated) and at least one large data file. In some cases, backup applications will hold open more than one large file. It is important not to exceed the maximum concurrent backup operations. If these thresholds are breached, the backup application will receive an error from the StoreOnce appliance indicating that a file could not be opened and the backup will fail. The numbers of concurrently open files in the table do not guarantee that the StoreOnce appliance will perform optimally with this number of concurrent backups, nor do they take into account the fact that host systems may report a file as having been closed before the actual close takes place; this means that the limits provided in the table could be exceeded without realizing it. Should the open file limit be exceeded, an entry is made in the StoreOnce Event Log so the administrator knows this has happened. Corrective action is to reduce the number of concurrent backups that have caused too many files to be opened at once, for example by re-scheduling some of the backup jobs to take place at a different time.
Maximum number of NAS shares The total number of “devices” provided by a StoreOnce appliance is split between VTL, NAS and Catalyst devices. These devices may be all VTL, all NAS, all Catalyst, or any combination of the three device types. The following table illustrates some possible configurations. Table 7 Maximum number of virtual devices per service set VTL
NAS
Catalyst
Total number of devices
48
0
0
48 (maximum)
0
48
0
48 (maximum)
0
0
48
48 (maximum)
24
24
0
48 (maximum)
1
47
0
48 (maximum)
47
0
0
47 (less than maximum)
30
20
0
50 (more than maximum, you will not be able to create devices 49 and 50)
12
12
12
36 (less than maximum)
16
16
16
48 (maximum)
4
5
3
12 (less than maximum)
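The arithmetic of Table 7 can be expressed as a small validation sketch; the limit of 48 devices per service set is taken from the table above, and the function name is illustrative:

```python
# Illustrative sketch: validate a device mix against the per-service-set
# maximum from Table 7. The limit value is from the text; the function
# name is an assumption for illustration.

MAX_DEVICES = 48  # maximum virtual devices per service set (Table 7)

def config_status(vtl, nas, catalyst):
    """Describe a VTL/NAS/Catalyst device mix in the style of Table 7."""
    total = vtl + nas + catalyst
    if total > MAX_DEVICES:
        over = total - MAX_DEVICES
        return f"{total} (more than maximum, last {over} device(s) cannot be created)"
    if total == MAX_DEVICES:
        return f"{total} (maximum)"
    return f"{total} (less than maximum)"

print(config_status(24, 24, 0))   # 48 (maximum)
print(config_status(30, 20, 0))   # 50 (more than maximum, ...)
```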
Maximum number of files per NAS share and appliance The HP StoreOnce NAS implementation is optimized for use with backup applications. These applications create large backup files on the NAS share, which makes much more efficient use of deduplication than simply copying files of various sizes to a deduplicating share. To improve performance for small metadata files created by backup applications, and to allow random access to the header information at the beginning of backup files, the first 24 MB of any backed up file is not deduplicated. This non-deduplicated region is called the deduplication threshold. The HP StoreOnce Backup System imposes a limit on the number of files that can be stored on each NAS share. The limit is 25000 files, which still provides the ability to protect a large amount of data using a backup application. The limit is imposed in order to allow efficient use of data replication.
There are also limits on the number of open files greater than the deduplication threshold that are allowed per share and per appliance. These are the files that hold the backed-up data. Backup applications generally create a small number of additional files during a backup job in order to store configuration details and catalog entries. Some of these small files will generally be updated throughout the backup process and, in most instances, these files will be below the deduplication threshold. So, there is also a maximum number of open files that are the same size or smaller than the deduplication threshold that are allowed per appliance. See Key parameters (page 115).
Maximum number of users per CIFS share The maximum number of users that may be configured when using “User” or AD authentication for access to a CIFS share is 50. This maximum is the total number of users per CIFS server and also the maximum that can be allocated access to any single CIFS share. Different users may access a single NAS share simultaneously; however, a file within a share may only be opened by one user at a time.
Maximum number of hosts per NFS share The maximum number of host systems that may be configured to access an NFS share is 50. This maximum is the total number of hosts per NFS server and also the maximum that can be allocated access to any single NAS share. A file within a share may only be opened by one host at a time.
CIFS share authentication The StoreOnce device provides three possible authentication options for the CIFS server: •
None – All shares created are accessible to any user from any client (least secure).
•
User – Local (StoreOnce) user authentication.
•
AD – Active Directory authentication.
None – This authentication mode requires no user name or password authentication and is the simplest configuration. Backup applications will always be able to use shares configured in this mode with no changes to either server or backup application configuration. However, this mode provides no data security as anyone can access the shares and add or delete data. User – In this mode it is possible to create “local StoreOnce users” from the StoreOnce management interface. This mode requires the configuration of a matching local user on the backup application media server as well as configuration changes to the backup application services. Individual users can then be assigned access to individual shares on the StoreOnce appliance. This authentication mode is ONLY recommended when the backup application media server is not a member of an AD Domain. AD – In this mode the StoreOnce CIFS server becomes a member of an Active Directory Domain. In order to join an AD domain the administrator needs to provide the credentials of a user who has permission to add computers and users to the AD domain. After joining an AD Domain, access to each share is controlled by Domain Management tools, and domain users or groups can be given access to individual shares on the StoreOnce appliance. This is the recommended authentication mode if the backup application media server is a member of an AD domain; it is the preferred option. Refer to the “HP StoreOnce Backup system user guide” for more information about configuring authentication.
Backup application configuration The HP StoreOnce Backup system NAS functionality is designed to be used with backup applications that create large “backup files” containing all of the server backup data rather than applications that simply copy the file system contents to a share.
When using a backup application with StoreOnce NAS shares, the user will need to configure a new type of device in their backup application. Each application varies as to what it calls a backup device located on a StoreOnce appliance; for example, it may be called a File Library, Backup to Disk Folder, or even Virtual Tape Library. Most backup applications allow the operator to set various parameters related to the NAS backup device that is created; these parameters are important in ensuring good performance in different backup configurations. Generic best practices can be applied to all applications as follows.
Backup file size Backup applications using disk/NAS targets will create one or more large backup files per backup stream; these contain all of the backed-up data. Generally a limit will be set on the size this file can reach before a new one is created (usually defaulting to 4–5 GB). A backup file is analogous to a virtual cartridge for VTL devices, but default file sizes will be much smaller than a virtual cartridge size (e.g. a virtual cartridge may be 800 GB). In addition to the data files, there will also be a small number of metadata files, such as catalogue and lock files; these will generally be smaller than the 24 MB dedupe threshold size and will not be deduplicated. These files are frequently updated throughout the backup process, so allowing them to be accessed randomly without deduplication ensures that they can be accessed quickly. The first 24 MB of any backup file will not be deduplicated; for metadata files this means that the whole file will not be deduplicated, while for a backup data file only the first 24 MB will not be deduplicated. This architecture is completely invisible to the backup application, which is presented with its files in the same way as on any ordinary NAS share. It is possible that the backup application will modify data within the deduplicated data region; this is referred to as a write-in-place operation. This is expected to occur rarely with standard backup applications because these generally perform stream backups and either create a new file or append to the end of an existing file rather than accessing a file in the middle. If a write-in-place operation does occur, the StoreOnce appliance will create a new backup item that is not deduplicated; a pointer to this new item is then created so that when the file is read the new write-in-place item will be accessed instead of the original data within the backup file.
If a backup application were to perform a large number of write-in-place operations, there would be an impact on backup performance because of the random access pattern that write-in-place creates. Some backup applications provide the ability to perform “Synthetic Full” backups; these may produce a lot of write-in-place operations or open a large number of files all at once, so it is recommended that Synthetic Full backup techniques are not used. Generally, configuring larger backup container file sizes will improve backup performance and deduplication ratio because: 1. The overhead of the 24 MB dedupe region is reduced. 2. The backup application can stream data for longer without having to close and create new files. 3. There is a lower percentage overhead of control data within the file that the backup application uses to manage its data files. 4. There is no penalty to using larger backup files as disk space is not usually pre-allocated by the backup application. If possible, the best practice is to configure a container file size that is larger than the complete backup will be (allowing for some data growth over time), so that only one file is used for each backup. Some applications will limit the maximum size to something smaller than this, in which case using the largest configurable size is the best approach.
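The 24 MB deduplication threshold described above can be illustrated with a short sketch showing how much of a given file is subject to deduplication; the function name is illustrative:

```python
# Illustrative sketch of the 24 MB deduplication threshold: the first
# 24 MB of any file on a StoreOnce NAS share is not deduplicated, so
# small metadata files fall entirely below the threshold.

DEDUPE_THRESHOLD_MB = 24

def deduplicated_portion_mb(file_size_mb):
    """Return how much of a file (in MB) is subject to deduplication."""
    return max(0, file_size_mb - DEDUPE_THRESHOLD_MB)

# A small metadata file falls entirely below the threshold:
print(deduplicated_portion_mb(10))    # 0
# Of a 4096 MB backup container, everything past the first 24 MB deduplicates:
print(deduplicated_portion_mb(4096))  # 4072
```

This is why larger container files improve deduplication ratio: the fixed 24 MB non-deduplicated region becomes a smaller fraction of each file.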
Disk space pre-allocation Some backup applications allow the user to choose whether to “pre-allocate” the disk space for each file at creation time, i.e. as soon as a backup file is created an empty file is created of the maximum size that the backup file can reach. This is done to ensure that there is enough disk space available to write the entire backup file. This setting has no value for StoreOnce devices because it will not result in any physical disk space actually being allocated, due to the deduplication system. It is advised that this setting is NOT used because it can result in unrealistically high deduplication ratios being presented when pre-allocated files are not completely filled with backup data or, in extreme cases, it will cause a backup failure due to a timeout if the application tries to write a small amount of data at the end of a large empty file. This results in the entire file having to be padded out with zeros at creation time, which is a very time-consuming operation.
Block/transfer size
Some backup applications provide a setting for block or transfer size for backup data, in the same way as for tape devices. Larger block sizes are beneficial for NAS devices in the same way as they are for virtual tape devices, because they allow more efficient use of the network interface by reducing the amount of metadata required for each data transfer. In general, set the block or transfer size to the largest value allowed by the backup application.
Concurrent operations
For best StoreOnce performance it is important to either perform multiple concurrent backup jobs or use multiple streams for each backup (whilst staying within the limit of concurrently open files per NAS share). Backup applications provide an option to set the maximum number of concurrent backup streams per file device; this parameter is generally referred to as the number of writers. Setting this to the maximum values shown in the table below ensures that multiple backups or streams can run concurrently whilst remaining within the concurrent file limits for each StoreOnce share. The table in Key parameters (page 115) shows the recommended maximum number of backup streams or jobs per share to ensure that backups will not fail due to exceeding the maximum number of concurrently open files. Note, however, that optimal performance may be achieved at a lower number of concurrent backup streams. These values are based on standard "file" backup using most major backup applications. If backing up using application agents (e.g. Exchange, SQL, Oracle) it is recommended that only one backup per share is run concurrently, because these application agents frequently open more concurrent files than standard file-type backups.
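The sizing logic above can be sketched in a few lines (the open-file limit and files-per-stream figures below are hypothetical placeholders; the real per-model limits are in the Key parameters table):

```python
def max_safe_streams(open_file_limit: int, files_per_stream: int) -> int:
    """Largest writer count that stays within a share's limit on
    concurrently open files (integer division, rounding down)."""
    return open_file_limit // files_per_stream

# Plain file backups typically hold one file open per stream; application
# agents (Exchange, SQL, Oracle) may hold several, so far fewer streams fit:
print(max_safe_streams(16, files_per_stream=1))  # 16 streams
print(max_safe_streams(16, files_per_stream=8))  # 2 streams
```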
Buffering
If the backup application provides a setting to enable buffering for reads and/or writes, this will generally improve performance by ensuring that the application does not wait for write or read operations to report completion before sending the next write or read command. However, this setting could result in the backup application inadvertently causing the StoreOnce appliance to have more concurrently open files than the specified limits (because files may not have had time to close before a new open request is sent). If backup failures occur, disabling buffered writes and reads may fix the problem; in that case, reducing the number of concurrent backup streams and then re-enabling buffering will provide the best performance.
Overwrite versus append
This setting allows the backup application to either always start a new backup file for each backup job (overwrite) or continue to fill any backup file that has not reached its size limit before starting new ones (append).
NAS shares
Appended backups should not be used; there is no benefit to the append model because it does not save on the disk space used.
Compression and encryption
Most backup applications provide the option to compress the backup data in software before sending; this should not be used. Software compression has the following negative impacts:
1. Consumption of system resources on the backup server, with an associated performance impact.
2. Introduction of randomness into the data stream between backups, which reduces the effectiveness of StoreOnce deduplication.
Some backup applications now also provide software encryption; this technology prevents either the restoration of data to another system or interception of the data during transfer. Unfortunately it also has a very detrimental effect on deduplication, because encrypted data looks different in every backup, preventing the matching of similar data blocks. The best practice is to disable software encryption and compression for all backups to the HP StoreOnce Backup system.
Verify
By default most backup applications will perform a verify on each backup job, in which they read the backup data back from the StoreOnce appliance and check it against the original data. Due to the nature of deduplication, reading data is slower than writing because the data needs to be rehydrated; running a verify will therefore more than double the overall backup time. If possible, verify should be disabled for all backup jobs to StoreOnce, but trial restores should still happen on a regular basis.
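The impact of a verify pass can be estimated with simple arithmetic (the throughput figures below are illustrative assumptions, not measured StoreOnce numbers):

```python
def job_hours(size_gb: float, write_mb_s: float, read_mb_s: float,
              verify: bool) -> float:
    """Backup job duration in hours; a verify adds a full read pass,
    and rehydrating deduplicated data makes reads slower than writes."""
    write_h = size_gb * 1024 / write_mb_s / 3600
    verify_h = size_gb * 1024 / read_mb_s / 3600 if verify else 0.0
    return write_h + verify_h

# 1 TB job, assumed 200 MB/s write and 100 MB/s rehydrated read:
without = job_hours(1000, write_mb_s=200, read_mb_s=100, verify=False)
with_verify = job_hours(1000, write_mb_s=200, read_mb_s=100, verify=True)
print(round(without, 1), round(with_verify, 1))  # verify more than doubles it
```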
Synthetic full backups
Some backup applications have introduced the concept of a "Synthetic Full" backup where, after an initial full backup, only file- or block-based incremental backups are taken. The backup application then constructs a full system recovery for a specific point in time from the original full backup and all of the changes up to the specified recovery point. In most cases this model will not work well with a NAS target on a StoreOnce Backup system, for one of two reasons:
• The backup application may post-process each incremental backup to apply the changes to the original full backup. This performs a lot of random reads, writes and write-in-place operations, which is very inefficient for the deduplication system, resulting in poor performance and deduplication ratio.
• If the backup application does not post-process the data, it must perform a reconstruction operation when the data is restored. This needs to open and read a large number of incremental backup files, each containing only a small amount of the final recovery image, so the access is very random in nature and therefore a slow operation.
An exception to this restriction is the HP Data Protector Synthetic Full backup, which works well. However, the HP Data Protector Virtual Synthetic Full backup, which uses a distributed file system and creates thousands of open files, does not. Check with your backup application vendor or HP sales representative for more details.
Summary of NAS best practices
• Configure multiple shares and separate data types into their own shares.
• Adhere to the suggested maximum number of concurrent operations per share/appliance.
• Choose disk container backup file sizes in the backup software to meet the maximum size of the backup data. For example, if a full backup is 500 GB, set the container size to at least 500 GB. If this is not possible, make the backup container size as large as possible.
• Each NAS share has a 25,000 file limit; some backup applications create large numbers of small control files during backup to disk. If this is the case, it may be necessary to create additional shares and distribute the backup across multiple shares.
• Disable software compression, deduplication and synthetic full backups.
• Do not pre-allocate disk space for backup files within the backup application.
• Best throughput is achieved with multiple streams; the actual number per device/appliance varies by model.
• Configure bonded network ports for best performance.
• Monitor the number of files created in the share at regular intervals, as there is a 25,000 file limit per share.
• For CIFS shares the recommended implementation uses AD authentication (see later).
• For NFS shares there is a specific mount option which ensures all data is sent to the share "in order", enabling in-order delivery and the best deduplication ratio. The name of the mount option varies according to the operating system; some operating systems also require an update package to be installed to enable it. See the HP StoreOnce Linux and UNIX Configuration Guide for more details.
11 Replication
When considering replication you are likely to be synchronizing data between different models of HP StoreOnce Backup System. The examples in this section are not specific to a particular model of HP StoreOnce Backup System. Replication can take place between multi-node and single-node StoreOnce Backup Systems. The GUI for both refers to both replication targets and sources as appliances. It is important to understand the different meaning of appliance within the multi-node and single-node environments:
• In a single-node StoreOnce Backup System, the appliance is the physical device or server that contains the data to be replicated. All mapping is done using the physical IP addresses of the target and source StoreOnce Backup Systems.
• In a multi-node StoreOnce Backup System, the appliance is the service set that contains the data to be replicated. This means that each B6000 Backup System has at least two service sets that can be selected as appliances. All mapping is done using the virtual IP addresses of the target and source service sets.
Within this chapter the terms appliance and service set are synonymous.
What is replication?
Replication is a standard term used to describe a way of synchronizing data between hardware in two physical locations. It is the process of creating an exact match on the target appliance of the specified data from the source appliance. It is important to understand that no history is held; the target appliance always mirrors, as soon as possible, the current state of the data on the source appliance, which means that it is ready for use if the source share, library or appliance is unavailable. But it does not hold archive versions and is not an alternative to conventional backup with multiple restore points. A Configuration Wizard is provided to take you through the HP StoreOnce Replication configuration steps.
StoreOnce replication is a concept that is used with VTL and NAS devices. The equivalent concept for Catalyst stores is called Catalyst Copy. All three device types use a deduplication-enabled, low bandwidth transfer policy to replicate data from a device on a "replication source" StoreOnce Backup system to an equivalent device on another "replication target" StoreOnce Backup system. The fundamental difference is that the backup application controls Catalyst store copy operations, whereas all VTL and NAS replication is configured and managed on the StoreOnce Management GUI.
Replication provides a point-in-time "mirror" of the data on the source StoreOnce device at a target StoreOnce Backup system on another site; this enables quick recovery from a disaster that has resulted in the loss of both the original and backup versions of the data on the source site. Replication does not, however, provide any ability to roll back to previously backed-up versions of data that have been lost from the source StoreOnce Backup system.
For example, if a file is accidentally deleted from a server and therefore not included in the next backup, and all previous versions of the backup on the source StoreOnce Backup system have also been deleted, that file will also be deleted from the replication target device, because the target is a mirror of exactly what is on the source device. The only exception is a Catalyst device type, where the retention periods of data on the target can be different from (greater than, in most cases) the retention periods at the source – giving an additional margin of data protection. NOTE: For examples of using Catalyst Copy with specific backup applications, see HP StoreOnce Backup system Summary of Best Practices for VTL, NAS, StoreOnce Catalyst and Replication implementations with sizing and application configuration examples.
StoreOnce VTL and NAS replication overview
The StoreOnce Backup system uses a proprietary protocol for replication traffic over the Ethernet ports; this protocol is optimized for deduplication-enabled replication traffic. An item (VTL cartridge or NAS file) is marked ready for replication as soon as it is closed (or the VTL cartridge is returned to its slot). Replication works in a "round robin" process through the libraries and shares on a StoreOnce Backup system; when it reaches an item that is ready for replication it starts a replication job for that item, assuming the maximum number of replication jobs is not already underway. Replication first exchanges metadata between source and target to identify the blocks of deduplicated data that are different; it then synchronizes the changes between the two appliances by transferring the changed blocks or marking blocks for removal at the target appliance. Replication does trigger housekeeping on the target appliance.
Replication will not prevent backup or restore operations from taking place. If an item is re-opened for further backups or restore, replication of that item is paused, to be resumed later or cancelled if the item is changed. Replication can also be configured to occur at specific times (via configurable blackout windows) in order to optimize bandwidth usage and not affect other applications that might be sharing the same WAN link.
VTL and NAS replication is configured between devices using "Mappings"; it is not known to the backup software but is controlled entirely by the StoreOnce appliance. Catalyst Copy is controlled entirely by the backup software and has no Mappings within the device to configure. A data import process is necessary to recover data from a target NAS or VTL device, but with Catalyst no backup application import is required because the additional copies are already known to the backup software.
Replication usage models (VTL and NAS)
There are four main usage models for replication using StoreOnce VTL and NAS devices, shown below:
• Active/Passive – A StoreOnce system at an alternate site is dedicated solely as a target for replication from a StoreOnce system at a primary location.
• Active/Active – Both StoreOnce systems are backing up local data as well as receiving replicated data from each other.
• Many-to-One – A target StoreOnce system at a data center receives replicated data from many other StoreOnce systems at other locations.
• N-Way – A collection of StoreOnce systems on several sites act as replication targets for other sites.
The usage model employed has some bearing on the best practices that can be employed to provide best performance. The following diagrams show the usage models using VTL device types.
Figure 40 Active to Passive replication
Figure 41 Active to active replication
Figure 42 Many to One replication
Figure 43 N-way replication
In most cases StoreOnce VTL and StoreOnce NAS replication are the same; the only significant configuration difference is that VTL replication allows multiple source libraries to replicate into a single target library, whereas NAS mappings are 1:1 – one replication target share may only receive data from a single replication source share. In both cases a source library or share may only replicate into a single target. With VTL replication, however, a subset of the cartridges within a library may be configured for replication (a share may only be replicated in its entirety).
What to replicate
StoreOnce VTL replication allows a subset of the cartridges within a library to be mapped for replication rather than the entire library (NAS replication does not allow this). Some retention policies may not require that all backups are replicated; for example, daily incremental backups may not need to go offsite but weekly and monthly full backups do, in which case it is possible to configure replication to replicate only those cartridges that are used for the full backups. Reducing the number of cartridges that make up the replication mapping may also be useful when replicating several source libraries from different StoreOnce devices into a single target library at a data center, for example. Limited slots in the target library can be better utilized to take only replication of full backup cartridges rather than incremental backup cartridges as well. Configuring this reduced mapping does require that the backup administrator has control over which cartridges in the source library are used for which type of backup. Generally this is done
by creating media pools within the backup application and then manually assigning source library cartridges to the relevant pools. For example, the backup administrator may configure three pools:
• Daily Incremental, 5 cartridge slots (overwritten each week)
• Weekly Full, 4 cartridge slots (overwritten every 4 weeks)
• Monthly Full, 12 cartridge slots (overwritten yearly)
Replicating only the slots that will contain full backup cartridges saves five slots on the replication target device, which could be better utilized to accept replication from another source library. NOTE: The Catalyst equivalent of this requires the actual backup policies to define which backups to Catalyst stores are to be copied and which are not – so, for example, you could configure only full backups to be copied to Catalyst stores.
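The slot arithmetic for the example pools above can be checked in a few lines (pool names and slot counts are exactly those from the example):

```python
# Media pools from the example; only the full-backup pools are replicated.
pools = [
    ("Daily Incremental", 5,  False),
    ("Weekly Full",       4,  True),
    ("Monthly Full",      12, True),
]
total_slots = sum(slots for _, slots, _ in pools)
replicated  = sum(slots for _, slots, rep in pools if rep)
print(total_slots, replicated, total_slots - replicated)  # 21 16 5
```

Excluding the five incremental slots frees exactly five slots on the target library for another source.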
Appliance library and share replication fan in/fan out
Each StoreOnce model has a different level of support for the number of other StoreOnce appliances that can be involved in replication mappings with it, and for the number of libraries that may replicate into a single library on the device. The configuration settings are defined below.
Max Appliance Fan Out – The maximum number of target appliances that a source appliance can be paired with.
Max Appliance Fan In – The maximum number of source appliances that a target appliance can be paired with.
Max Library Fan Out – The maximum number of target libraries that may be replicated into from a single source library on this type of appliance.
Max Library Fan In – The maximum number of source libraries that may replicate into a single target library on this type of appliance.
Max Share Fan Out – The maximum number of target NAS shares that may be replicated into from a single source NAS share on this type of appliance.
Max Share Fan In – The maximum number of source NAS shares that may replicate into a single target NAS share on this type of appliance.
It is important to note that when utilizing a VTL replication fan-in model (where multiple source libraries are replicated to a single target library), the deduplication ratio may be better than that achieved by each individual source library, due to deduplication across all of the data in the single target library. However, over a long period of time the performance of this solution will be slower than configuring individual target libraries, because the deduplication stores will be larger and therefore require more processing for each new replication job.
Concurrent replication jobs
Each StoreOnce model has a different maximum number of concurrently running replication jobs when it is acting as a source or target for replication. When many items are available for replication, this is the maximum number of jobs that will be running at any one time; as soon as one item has finished replicating, another will start. For example, an HP 2620 may be replicating up to 4 jobs to a StoreOnce 4430, which may also be accepting another 44 source items from other StoreOnce systems. But the target concurrency for a 4430 is 96, so the target is not the bottleneck to replication performance. If the total number of source replication jobs is greater than 96, the StoreOnce 4430 will limit replication throughput and replication jobs will queue until a slot becomes available.
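The 2620/4430 example above reduces to a simple comparison (the concurrency figures are the ones quoted in the text):

```python
def target_is_bottleneck(total_source_jobs: int, target_limit: int) -> bool:
    """True when incoming jobs exceed the target's concurrency limit,
    so excess replication jobs queue until a slot frees up."""
    return total_source_jobs > target_limit

# 4 jobs from an HP 2620 plus 44 from other sources, against a
# StoreOnce 4430 target that supports 96 concurrent replication jobs:
print(target_is_bottleneck(4 + 44, 96))   # False -- target not the limit
print(target_is_bottleneck(120, 96))      # True -- jobs will queue
```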
Apparent replication throughput
In characterizing replication performance we use the concept of "apparent throughput". Since replication transfers only unique data between sites, there is a relationship between the speed at which the unique data is sent and the rate at which the whole backup is apparently replicated between sites. In all reporting on the StoreOnce Management GUI, the throughput in MB/sec is apparent throughput – think of this as the rate at which we are apparently replicating the backup data between sites.
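The distinction can be sketched as follows (the backup size, elapsed time and unique-data fraction are invented figures; only the definition of apparent throughput comes from the text):

```python
def apparent_mb_s(logical_backup_mb: float, elapsed_s: float) -> float:
    """GUI-style apparent throughput: whole backup 'replicated' per second."""
    return logical_backup_mb / elapsed_s

def wire_mb_s(logical_backup_mb: float, unique_fraction: float,
              elapsed_s: float) -> float:
    """Physical rate on the WAN: only the unique data actually moves."""
    return logical_backup_mb * unique_fraction / elapsed_s

# 100 GB backup replicated in one hour with an assumed 2% unique-data rate:
print(round(apparent_mb_s(102400, 3600), 1))    # ~28.4 MB/s apparent
print(round(wire_mb_s(102400, 0.02, 3600), 2))  # ~0.57 MB/s on the wire
```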
What actually happens in replication?
Assuming the seeding process is complete (seeding is when the initial data is transferred to the target device), the basic replication process works like this:
1. The source has a cartridge (VTL) or file (NAS) to replicate.
2. The source sends the target a "manifest": a list of all the hash codes it wants to send to the target (the hash codes are what make up the cartridge/file/item).
3. The target replies: "I have 98% of those hash codes already – just send the 2% I don't have."
4. The source sends the 2% of hash codes the target requested.
5. The VTL or NAS replication job executes and completes.
The bigger the change rate of the data, the more "mismatch" there will be and the higher the volume of unique data that must be replicated over the WAN.
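The manifest exchange described above can be sketched as a set operation (the hash strings here are toy placeholders for real block fingerprints):

```python
def replicate_item(manifest: set, target_hashes: set) -> int:
    """One replication job: the source offers its full hash manifest,
    the target requests only the hashes it lacks, and only those blocks
    cross the WAN. Returns the count of blocks actually transferred."""
    missing = manifest - target_hashes   # target's reply to the manifest
    target_hashes |= missing             # unique blocks sent and stored
    return len(missing)

target = {"h01", "h02", "h03", "h04"}
sent = replicate_item({"h01", "h02", "h03", "h04", "h05"}, target)
print(sent)  # 1 -- only the block the target did not already hold
```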
Limiting replication concurrency
In some cases it may be useful to limit the number of replication jobs that can run concurrently on either the source or target appliance. These conditions might be:
1. There is a requirement to reduce the activity on either the source or target appliance in order to allow other operations (e.g. backup/restore) more available disk I/O.
2. The WAN bandwidth is too low to support the number of jobs that may be running concurrently. It is recommended that a minimum WAN bandwidth of 2 Mb/s is available per replication job. If a target device can support, for example, 6 concurrent jobs, then 12 Mb/s of bandwidth is required for that target appliance alone. If there are multiple target appliances, the overall requirement is even higher.
Limiting the maximum number of concurrent jobs at the target appliance will therefore prevent the WAN bandwidth from being oversubscribed, which could otherwise cause replication failures or impact other WAN traffic. The Maximum Jobs configuration is available from the StoreOnce Management GUI on the Local Settings tab of the Replication – Configuration page. Other tabs on this page can be used to control the bandwidth throttling used for replication and the blackout windows that prevent replication from happening at certain times.
WAN link sizing
One of the most important aspects in ensuring that replication will work in a specific environment is the available bandwidth between the replication source and target StoreOnce systems. In most cases a WAN link will be used to transfer the data between sites, unless the replication environment is all on the same campus LAN. It is recommended that the HP Sizing Tool (http://h30144.www3.hp.com/SWDSizerWeb/default.htm) is used to identify the product and WAN link requirements, because the required bandwidth calculation is complex and depends on the following:
• Amount of data in each backup
• Data change per backup (deduplication ratio)
• Number of StoreOnce systems replicating
• Number of concurrent replication jobs from each source
• Number of concurrent replication jobs to each target
• Link latency (governs link efficiency)
As a general rule of thumb, however, a minimum bandwidth of 2 Mb/s per replication job should be allowed. For example, if a replication target is capable of accepting 8 concurrent replication jobs (HP 4220) and there are enough concurrently running source jobs to reach that maximum, the WAN link needs to provide 16 Mb/s to ensure that replication runs correctly at maximum efficiency; below this threshold, replication jobs may begin to pause and restart due to link contention. It is important to note that this minimum value does not ensure that replication will meet the performance requirements of the replication solution; a lot more bandwidth may be required to deliver optimal performance.
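The rule of thumb above is easy to encode (the 2 Mb/s figure and the 8-job HP 4220 example come from the text):

```python
MIN_MBPS_PER_JOB = 2  # recommended minimum per replication job

def min_wan_mbps(max_concurrent_jobs: int) -> int:
    """Minimum WAN bandwidth below which replication jobs may begin
    to pause and restart through link contention."""
    return max_concurrent_jobs * MIN_MBPS_PER_JOB

print(min_wan_mbps(8))  # HP 4220 target, 8 concurrent jobs -> 16 Mb/s
```

Remember this is a floor for stable operation, not a sizing target; the HP Sizing Tool remains the authoritative source for required bandwidth.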
Seeding and why it is required
One of the benefits of deduplication is the ability to identify unique data, which enables replication between a source and a target StoreOnce Backup system that transfers only the unique data identified. This process requires only low bandwidth WAN links, which is a great advantage to the customer because it delivers automated disaster recovery in a very cost-effective manner. The StoreOnce Management GUI reports bandwidth saving as a key metric of the replication process; in general it is around the 95–98% mark (depending on data change rate). However, before only unique data can be replicated between the source and target StoreOnce Backup systems, we must first ensure that each site has the same hash codes or "bulk data" loaded on it – this can be thought of as the reference data against which future backups are compared to see if the hash codes already exist on the target. The process of getting the same bulk data or reference data loaded on the StoreOnce source and StoreOnce target is known as "seeding". NOTE: With Catalyst the very first low bandwidth backup effectively performs its own seeding operation.
Seeding is generally a one-time operation which must take place before steady-state, low bandwidth replication can commence. Seeding can take place in a number of ways:
• Over the WAN link – although this can take some time for large volumes of data. A temporary increase in WAN bandwidth provision from your telco can often alleviate this problem.
• Using co-location, where two devices are physically in the same location and can use a GbE replication link for seeding (this is best for Active/Active and Active/Passive configurations). After seeding is complete, one unit is physically shipped to its permanent destination.
• Using a "floating" StoreOnce device which moves between multiple remote sites (best for many-to-one replication scenarios).
• Using a form of removable media (physical tape or portable USB disks) to "ship data" between sites.
The recommended way to accelerate seeding is co-location of the source and target systems on the same LAN whilst performing the first replicate. This process will obviously involve moving one or both of the appliances and will thus prevent them from running their normal backup routines. To minimize disruption, seeding should ideally only be done once; in this case all backup jobs that are going to be replicated must have completed their first full backup to the source appliance before a seeding operation commences. Once seeding is complete there will typically be a 90+% hit rate, meaning most of the hash codes are already loaded on the source and target and only the unique data will be transferred during replication. It is good practice to plan for seeding time in your StoreOnce Backup system deployment plan, as it can sometimes be very time-consuming or manually intensive work. The Sizing Tool calculates expected seeding times over WAN and LAN to help set expectations for how long seeding will take. In practice, a gradual migration of backup jobs to the StoreOnce appliance ensures there is not a sudden surge in seeding requirements but a gradual one, with weekends used to perform high-volume seeding jobs.
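A first-order estimate of WAN seeding time shows why co-location is often preferred for large volumes (the HP Sizing Tool gives the authoritative figure; the link-efficiency factor below is an assumption, and the data volume and link speed are invented examples):

```python
def seeding_days(data_tb: float, link_mbps: float,
                 link_efficiency: float = 0.7) -> float:
    """Days to push the initial bulk data over the WAN. link_efficiency
    is an assumed fraction of the raw link rate actually achieved after
    protocol overhead and latency -- tune it for your own link."""
    megabits = data_tb * 1024 * 1024 * 8            # TB -> megabits
    seconds = megabits / (link_mbps * link_efficiency)
    return seconds / 86400

print(round(seeding_days(5, 10)))  # 5 TB over a 10 Mb/s link: weeks, not days
```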
During the seeding process it is recommended that no other operations are taking place on the source StoreOnce Backup system, such as further backups or tape copies. It is also important to ensure that the StoreOnce Backup system has no failed disks and that RAID parity initialization is complete because these will impact performance. When seeding over fast networks (co-located StoreOnce devices) it should be expected that performance to replicate a cartridge or file is similar to the performance of the original backup.
Replication models and seeding
The diagrams in Replication usage models (VTL and NAS) (page 78) indicate the different replication models supported by HP StoreOnce Backup systems; the complexity of the replication model has a direct influence on which seeding process is best. For example, an Active/Passive replication model can easily use co-location to quickly seed the target device, whereas co-location may not be the best seeding method to use with a 50:1, many-to-one replication model. NOTE: HP StoreOnce Catalyst Copy seeding follows the same processes outlined below, with the added condition that for multi-hop and one-to-many replication scenarios the seeding process may have to occur multiple times.

Table 8 Summary of seeding methods and likely usage models

Technique: Seed over the WAN link
Best for: Active/Passive and many-to-one replication models with initial small volumes of backup data, OR gradual migration of larger backup volumes/jobs to StoreOnce over time.
Concerns: This type of seeding should be scheduled to occur over weekends wherever possible.
Comments: Seeding time over WAN is calculated automatically when using the Sizing Tool for StoreOnce. It is perfectly acceptable for customers to ask their link providers for a higher link speed just for the period when seeding takes place.

Technique: Co-location (seed over LAN)
Best for: Active/Passive, Active/Active and many-to-one replication models with significant volumes of data (> 1 TB) to seed quickly, where it would simply take too long to seed using a WAN link (> 5 days).
Concerns: This process can only really be used as a "one off" when replication is first implemented. It involves the transportation of complete StoreOnce units and may not be practical for large fan-in implementations (e.g. 50:1) because of the time delays involved in transportation.
Comments: Seeding time over LAN is calculated automatically when using the Sizing Tool for StoreOnce.

Technique: Floating StoreOnce
Best for: Many-to-one replication models with high fan-in ratios where the target must be seeded from several remote sites at once. Using the floating StoreOnce approach means the device is ready to be used again and again for future expansion where more remote sites might be added to the configuration.
Concerns: Careful control over the device creation and co-location replication at the target site is required. See example below.
Comments: This is really co-location using a spare StoreOnce. The last remote site StoreOnce can be used as the floating unit.

Technique: Backup application tape offload/copy from source and copy onto target
Best for: Suitable for all replication models, especially where remote sites are large (intercontinental) distances apart. Well suited to target sites that plan to have a physical tape archive as part of the final solution. Best suited to StoreOnce VTL deployments. Unlikely to be used when seeding HP StoreOnce B6200 Backup systems.
Concerns: Relies on the backup application supporting the copy process, e.g. media copy, "object copy", "duplicate" or "cloning".
Comments: Reduced shipping costs of physical tape media compared with actual StoreOnce units. Requires physical tape connectivity at all sites, AND media server capability at each site, even if only for the seeding process. Backup application licensing costs for each remote site may be applicable.

Technique: Use of portable disk drives – USB portable disks, such as the HP RDX series, can be configured as disk file libraries within the backup application software and used for "copies", or backup data can be dragged and dropped onto the portable disk drive, transported, and then dragged and dropped onto the StoreOnce target.
Best for: StoreOnce NAS deployments. Do not use when seeding HP StoreOnce B6200 Backup systems.
Concerns: Multiple drives can be used – single drive maximum capacity is currently about 3 TB.
Comments: USB disks are typically easier to integrate into systems than physical tape or SAS/FC disks. RDX ruggedized disks allow easy shipment between sites and are cost effective.
NOTE: Seeding methods are described in more detail in the next chapter.
Controlling Replication
In order to either optimize the performance of replication or minimize the impact of replication on other StoreOnce operations, it is important to consider the complete workload being placed on the StoreOnce Backup system. By default replication will start shortly after a backup completes; this window of time immediately after a backup may become very crowded if nothing is done to separate tasks. During this time the following are likely to be taking place:
• Other backups to the StoreOnce Backup system which have not yet finished
• Housekeeping of the current and other completed overwrite backups
• Possible copies to physical tape media of the completed backups
These operations will all impact each other's performance. Some best practices to avoid these overlaps are:
• Set replication blackout windows to cover the backup window period, so that replication will not occur whilst backups are taking place.
• Set housekeeping blackout windows to cover the replication period; some tuning may be required to set the housekeeping window correctly and allow enough time for housekeeping to run.
• Delay physical tape copies to run at a later time, when housekeeping and replication have completed – preferably at the weekend.
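These scheduling rules can be sanity-checked before configuring any blackout windows. The sketch below, with purely hypothetical window times, verifies that backup, replication and housekeeping periods do not overlap; it is a planning aid only and does not reflect any StoreOnce interface.

```python
def hours(window):
    """Expand a (start_hour, end_hour) daily window into the set of
    whole hours it covers; windows may wrap past midnight."""
    start, end = window
    if start <= end:
        return set(range(start, end))
    return set(range(start, 24)) | set(range(0, end))

def overlaps(a, b):
    """True if two daily windows share at least one hour."""
    return bool(hours(a) & hours(b))

# Hypothetical schedule: backups 20:00-02:00, replication 02:00-06:00,
# housekeeping 06:00-12:00 -- no two activities compete for resources.
assert not overlaps((20, 2), (2, 6))
assert not overlaps((2, 6), (6, 12))
assert not overlaps((20, 2), (6, 12))
```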
Replication blackout windows
The replication process can be prevented from running by using blackout windows, which may be configured using the StoreOnce GUI. Up to two separate windows per day, which may be at different times for each day of the week, can be configured. The best practice is to set a blackout window throughout the backup window so that replication does not interfere with backup operations. If tape copy operations are also scheduled, a blackout window for replication should also cover this time. Care must be taken, however, to ensure that enough time is left for replication to complete. If it is not, some items will never be synchronized between source and target and the StoreOnce Backup system will start to issue warnings about these items. The replication blackout window settings can be found on the StoreOnce Management Interface on the HP StoreOnce - Replication - Local Settings - Blackout Windows page.
Figure 44 Configuring replication blackout windows
Replication bandwidth limiting
In addition to replication blackout windows, the user can also define replication bandwidth limiting; this ensures that StoreOnce replication does not swamp the WAN with traffic if it runs during the normal working day. This enables blackout windows to be set to cover the backup window over the night-time period, while also allowing replication to run during the day without impacting normal business operation. Bandwidth limiting is configured by defining the speed of the WAN link between the replication source and target, then specifying a maximum percentage of that link that may be used. Again, however, care must be taken to ensure that enough bandwidth is made available to replication: at least the minimum (2 Mb/s per job) speed must be available, and more depending on the amount of data to be transferred in the required time. Replication bandwidth limiting is applied to all outbound (source) replication jobs from an appliance; the bandwidth limit set is the maximum bandwidth that the StoreOnce Backup system can use for replication across all replication jobs. The replication bandwidth limiting settings can be found on the StoreOnce Management Interface on the HP StoreOnce - Replication - Local Settings - Bandwidth Limiting page. There are two ways in which replication bandwidth limits can be applied:
• General Bandwidth limit – this applies when no other limit windows are in place.
• Bandwidth limiting windows – these can apply different bandwidth limits at different times of the day.
A bandwidth limit calculator is supplied to assist with defining suitable limits.
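The arithmetic behind such a calculator can be approximated in a few lines. This is a hedged sketch of the reasoning only – the link speed, limit percentage and job counts are hypothetical examples, and real WAN throughput will be lower due to protocol overhead.

```python
def replication_bandwidth_ok(link_mbps, limit_percent, concurrent_jobs):
    """Check that the configured share of the WAN link leaves at least
    the minimum 2 Mb/s per concurrent replication job."""
    available_mbps = link_mbps * limit_percent / 100.0
    return available_mbps / concurrent_jobs >= 2.0

def transfer_hours(data_gb, link_mbps, limit_percent):
    """Rough time to push `data_gb` of post-deduplication changed data
    through the limited link; protocol overhead is ignored."""
    available_mbps = link_mbps * limit_percent / 100.0
    seconds = data_gb * 8 * 1024 / available_mbps   # GB -> megabits
    return seconds / 3600.0

# Hypothetical 100 Mb/s WAN capped at 25% for replication:
assert replication_bandwidth_ok(100, 25, 8)        # 3.125 Mb/s per job
assert not replication_bandwidth_ok(100, 25, 16)   # only ~1.56 Mb/s per job
```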
Figure 45 Replication bandwidth settings
Source appliance permissions
It is good practice to use the Source Appliance Permissions functionality provided on the Replication - Partner Appliances tab to prevent malicious or accidental configuration of replication mappings from unknown or unauthorized source appliances. See the HP StoreOnce Backup system guide for information on how to configure Source Appliance Permissions. Note the following changes to replication functionality when Source Appliance Permissions are enabled:
• Source appliances will only have visibility of, and be able to create mappings with, libraries and shares that they have already been given permission to access.
• Source appliances will not be able to create new libraries and shares as part of the replication wizard process; instead, these shares and libraries must be created ahead of time on the target appliance.
Figure 46 Configuring source appliance permissions
12 Seeding methods in more detail
While the concepts described in this chapter apply to all HP StoreOnce Backup systems, the diagrams in this chapter illustrate single-node systems only. There is a separate chapter that illustrates how to implement replication and seeding with the HP StoreOnce B6200 Backup system.
Seeding over a WAN link
With this seeding method the final replication set-up (mappings) can be established immediately. In the active-passive replication model, WAN seeding of the first backup is, in fact, the first wholesale replication.
Figure 47 Seeding over a WAN link, active-passive replication model
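A rough feasibility check for WAN seeding can be made from the backup size and link speed. The sketch below makes simplifying assumptions (a uniform dedupe/compaction ratio for the first backup and a configurable share of the link); treat the result as an order-of-magnitude estimate only.

```python
def seed_days(first_backup_tb, compaction_ratio, wan_mbps, utilisation=1.0):
    """Estimate days to seed the first backup over a WAN link.
    `compaction_ratio` is the assumed dedupe/compression achieved on
    the first backup (low, because the data is mostly unique);
    `utilisation` models any bandwidth limit applied to the link."""
    unique_gb = first_backup_tb * 1024 / compaction_ratio
    effective_mbps = wan_mbps * utilisation
    seconds = unique_gb * 8 * 1024 / effective_mbps   # GB -> megabits
    return seconds / 86400.0

# Hypothetical: a 1 TB first backup with 2:1 compaction over a 10 Mb/s
# link takes roughly five days, which shows why WAN seeding suits
# small remote sites and fast links only.
```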
In the active-active replication model, WAN seeding after the first backup at each location is, in fact, the first wholesale replication in each direction.
Figure 48 Seeding over a WAN link, active-active replication model
In the Many-to-One model, WAN seeding of the first backup is, in fact, the first wholesale replication from the many remote sites to the Target site. Care must be taken not to run too many replications simultaneously or the Target site may become overloaded. Stagger the seeding process from each remote site.
Figure 49 Seeding over a WAN link, many-to-one replication model
Co-location (seed over LAN)
The following diagram illustrates co-location seeding from a remote site to the Data Center, in an Active/Passive replication model. The initial backup and replication take place at the remote site over the LAN. The StoreOnce appliance holding the replicated data is then transported to the Data Center.
Figure 50 Co-location seeding in an Active/Passive replication model
1. Initial backup
2. Replication over GbE link at remote site over LAN
3. Ship appliance to Data Center site
4. Re-establish replication with remote site over WAN
With this seeding method it is important to define the replication set-up (mappings) in advance so that, in the Many-to-One example for instance, the correct mapping is established at each site the target StoreOnce appliance visits, before the target StoreOnce appliance is finally shipped to the Data Center site and the replication is "re-established" for the final time.
The following diagram illustrates a many-to-one example.
Figure 51 Co-location seeding in a many-to-one replication model
1. Initial backup at each remote site
2. Replication to Target StoreOnce appliance over GbE link at each remote site over LAN
3. Move Target StoreOnce appliance between remote sites and repeat replication.
4. Finally take Target StoreOnce appliance to Data Center site.
5. Re-establish replication with remote sites over WAN
Floating StoreOnce seeding
In this model co-location takes place at many remote sites using a floating StoreOnce target. The StoreOnce target is transported between many sites and then taken to the Data Center. It is a useful model for large fan-in scenarios.
Figure 52 Seeding using floating StoreOnce Backup system
1. Initial backup at each remote site
2. Replication to floating StoreOnce Target appliance over GbE link at each remote site over LAN
3. Move floating Target StoreOnce appliance between remote sites and repeat replication.
4. Finally take floating Target StoreOnce appliance to Data Center site.
5. Establish replication from floating StoreOnce Target (now a Source) with Target StoreOnce at Data Center. Delete devices on floating Target StoreOnce appliance. Repeat the process for further remote sites until all data has been loaded onto the Data Center Target StoreOnce appliance. You may be able to accommodate 4 or 5 sites of replicated data on a single floating StoreOnce appliance.
6. Establish final replication with remote sites.
This “floating StoreOnce appliance” method is more complex because for large fan-in (many source sites replicating into single target site) the initial replication set up on the floating StoreOnce appliance changes as it is then transported to the data center, where the final replication mappings are configured. The sequence of events is as follows:
1. Plan the final master replication mappings from sources to target that are required and document them. Use an appropriate naming convention, e.g. SVTL1, SNASshare1, TVTL1, TNASshare1.
2. At each remote site perform a full system backup to the source StoreOnce appliance and then configure a 1:1 mapping relationship with the floating StoreOnce appliance, e.g. SVTL1 on Remote Site A -> FTVTL1 on floating StoreOnce (FTVTL1 = floating target VTL1).
3. Seeding remote site A to the floating StoreOnce appliance will take place over the GbE link and should take only a few hours.
4. On the Source StoreOnce appliance at the remote site DELETE the replication mappings – this effectively isolates the data that is now on the floating StoreOnce appliance.
5. Repeat steps 1-4 at Remote sites B and C.
6. When the floating StoreOnce appliance arrives at the central site, it effectively becomes the Source device to replicate INTO the StoreOnce appliance at the data center site.
7. On the floating StoreOnce appliance we will have devices (previously named FTVTL1, FTNASshare1) that we can see from the Web Management Interface. Using the same master naming convention as in step 1, set up replication, which will necessitate the creation of the necessary devices (VTL or NAS) on the StoreOnce 4220 at the Data Center site, e.g. TVTL1, TNASshare1.
8. This time when replication starts, the contents of the floating StoreOnce appliance will be replicated to the data center StoreOnce appliance over the GbE connection at the data center site and will take several hours. In this example Remote Site A, B and C data will be replicated and seeded into the StoreOnce 4220.
When this replication step is complete, DELETE the replication mappings on the floating StoreOnce appliance to isolate the data on it, and then DELETE the actual devices on the floating StoreOnce appliance, so the device is ready for the next batch of remote sites.
9. Repeat steps 1-8 for the next series of remote sites until all the remote site data has been seeded into the StoreOnce 4220.
10. Now set up the final replication mappings using the naming convention agreed in step 1. Go to the remote sites and configure replication again to the Data Center site, being careful to use the agreed naming convention at the data center site, e.g. TVTL1, TNASshare1. This time when we set up replication, the StoreOnce 4220 at the target site presents a list of possible target replication devices available to Remote Site A; in this example we would select TVTL1 or TNASshare1 from the drop-down list when configuring the final replication mappings. Because almost all the necessary data is already seeded on the StoreOnce 4220 for Remote Site A, the synchronization process happens very quickly.
NOTE: If using this approach with Catalyst stores, which do not rely on "mappings", the floating StoreOnce appliance can simply be used to collect all the Catalyst Items at the Remote sites if a consolidation model is to be deployed. If not, create a separate Catalyst store on the floating StoreOnce Appliance for each site.
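Because several of the steps above hinge on the naming convention, it can help to generate the mapping plan programmatically before starting. The sketch below derives the three naming stages used in the procedure (SVTL1 -> FTVTL1 -> TVTL1); the site list and dictionary layout are hypothetical bookkeeping, not a StoreOnce feature.

```python
def mapping_plan(sites, device="VTL1"):
    """Derive the three naming stages used in the floating-appliance
    procedure: source -> floating target -> final target."""
    return [
        {
            "site": site,
            "source": "S" + device,      # e.g. SVTL1 at the remote site
            "floating": "FT" + device,   # e.g. FTVTL1 on the floating appliance
            "final": "T" + device,       # e.g. TVTL1 at the data center
        }
        for site in sites
    ]

plan = mapping_plan(["A", "B", "C"])
assert plan[0] == {"site": "A", "source": "SVTL1",
                   "floating": "FTVTL1", "final": "TVTL1"}
```

Printing such a plan before shipping the floating appliance gives each site operator an unambiguous record of which device names to create and delete at each stage.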
Seeding using physical tape or portable disk drive
In this method of seeding we use a removable piece of media (such as LTO physical tape or a removable RDX disk drive acting as a disk library or file library*) to move data from the remote sites to the central data center site. This method requires the use of the backup application software and additional hardware to put the data onto the removable media.
* Different backup software describes "disk targets for backup" in different ways, e.g. HP Data Protector calls StoreOnce NAS shares "DP File Libraries", while Commvault Simpana calls them "Disk libraries".
Figure 53 Seeding using physical tape and backup application
1. Initial backup to StoreOnce appliance
2. Copy to tape(s) or disk using backup application software on the Media Server (or for NAS devices, simply use drag and drop to portable disk). This technique is not possible at Sites A & B unless a media server is present.
3. Ship tapes/disks to Data Center site.
4. Copy tapes/disks into target appliance using backup application software on Media Server (or for portable disks only use drag and drop onto NAS share on the StoreOnce target).
5. Establish replication.
Proceed as follows:
1. Perform a full system backup to the StoreOnce Backup system at the remote site using the local media server, e.g. at remote site C. The media server must also be able to see additional devices such as a physical LTO tape library or a removable disk device configured as a disk target for backup.
2. Use the backup application software to perform a full media copy of the contents of the StoreOnce Backup system to a physical tape or removable disk target for backup, also attached to the media server. In the case of removable USB disk drives the capacity is probably limited to 2 TB; in the case of physical LTO5 media it is limited to about 3 TB per tape, but multiple tapes are supported if a tape library is available. For USB disks, separate backup targets for disk devices would need to be created on each removable RDX drive because multiple RDX removable disk drives cannot be spanned.
3. The media from the remote sites is then shipped (or even posted!) to the data center site.
4. Place the removable media into a library or connect the USB disk drive to the media server and let the media server at the data center site discover the removable media devices. The media server at the data center site typically has no information on what is on these pieces of removable media, and we have to make the data visible to it. This generally takes the form of what is known as an "import" operation, where the removable media has to be registered into the catalog/database of the media server at the data center site.
5. Create devices on the StoreOnce Backup system at the data center site using an agreed convention, e.g. TVTL1, TNASshare1. Discover these devices through the backup application so that the media server at the data center site has visibility of both the removable media devices AND the devices configured on the StoreOnce Backup system.
6. Once the removable media has been imported into the media server at the data center site, it can be copied onto the StoreOnce Backup system at the data center site (in the same way as before at step 2) and, in the process of copying the data, we seed the StoreOnce Backup system at the data center site. It is important to copy physical tape media into the VTL device that has been created on the StoreOnce Backup system, and to copy the disk target for backup device (RDX) onto the StoreOnce NAS share device that has been created on the StoreOnce Backup system at the data center site.
7. Now set up the final replication mappings using the agreed naming convention. Go to the remote sites and configure replication again to the data center site, being careful to use the agreed naming convention at the data center site, e.g. TVTL1, TNASshare1. This time when we set up replication, the StoreOnce 4220 at the target site presents a list of possible target replication devices available to the remote site; in this example we would select TVTL1 or TNASshare1 from the drop-down list presented to remote site C when configuring the final replication mappings. Because almost all the necessary data is already seeded on the StoreOnce 4220, the synchronization process happens very quickly.
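The media capacities quoted above (about 3 TB per LTO5 tape, about 2 TB per USB/RDX disk, and no spanning across RDX drives) translate into a simple capacity-planning check, sketched below with hypothetical dataset sizes.

```python
import math

LTO5_TB = 3.0  # approximate capacity per cartridge, per the guide
RDX_TB = 2.0   # approximate capacity per removable USB/RDX disk

def tapes_needed(dataset_tb):
    """Tapes can span cartridges when a tape library is available."""
    return math.ceil(dataset_tb / LTO5_TB)

def rdx_seeding_possible(target_sizes_tb):
    """RDX drives cannot be spanned, so every backup target must fit
    on a single disk (one removable drive per target)."""
    return all(size <= RDX_TB for size in target_sizes_tb)

assert tapes_needed(7.5) == 3            # 7.5 TB needs three LTO5 cartridges
assert rdx_seeding_possible([1.8, 0.9])  # each target fits on one disk
assert not rdx_seeding_possible([2.5])   # a 2.5 TB target cannot use RDX
```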
The media servers are likely to be permanently present at the remote sites and data center site, so this makes good use of existing equipment. For physical tape drive/library connection at the various sites, SAS or FC connection is required. For removable disk drives such as RDX, a USB connection is the most likely choice because it is available on all servers at no extra cost. If the StoreOnce deployment is going to use StoreOnce NAS shares at source and target sites, the seeding process can be simplified even further by using the portable disk drives to drag and drop backup data from the source system onto the portable disk. Then transport the portable disk to the target StoreOnce site and connect it to a server with access to the StoreOnce NAS share at the target site. Perform a drag and drop from the portable disk onto the StoreOnce NAS share and this performs the seeding for you!
NOTE: Drag and drop is NOT to be used for day-to-day use of StoreOnce NAS devices for backup; but for seeding large volumes of sequential data this usage model is acceptable. Only HP Data Protector, Symantec NetBackup and Symantec Backup Exec support HP StoreOnce Catalyst – but Catalyst stores can be "copied" to tape or USB disk using object copy (Data Protector) or duplicate commands (NetBackup).
13 Implementing replication with the HP B6200 Backup system
The main difference with the HP B6200 StoreOnce implementation is that replication is part of the service set. Each service set (associated with Node 1, Node 2, and so on) can handle a maximum of 48 incoming concurrent replication jobs and can itself replicate OUT to up to 16 devices. If failover occurs, the replication load falls on the remaining service set. The replication traffic will pause during the failover process and restart from the last checkpoint when failover has completed. This means that replication continues without the need for manual intervention, but performance may deteriorate. Possible ways to improve this situation are:
• Dedicate a couplet as a replication target only (no backup targets). This will allow more resources to be dedicated to replication in the event of failover.
• Stagger the replication load across different nodes in different couplets. Try not to overload a couplet that is responsible for replication.
Active/Passive and Active/Active configurations
Figure 54 shows an ideal situation where:
• Site B nodes are acting as replication targets only. Performance is guaranteed and all we have to do is enable the replication windows and make allowances each day for housekeeping.
• The replication load at Site B is balanced across two nodes. In the event of failure of a node at Site B, replication performance will not be adversely affected, especially if the nodes at Site B are less than 50% loaded.
NOTE: If there are local backups at Site B as well, to VTL7 and NAS7, the arrangement shown in Figure 55 would be the best practice. Figure 55 shows local backup devices VTL7 and NAS7 at Site B on Couplet 1. We are still dedicating nodes to act as replication targets, but they are now on Couplet 2 only. Because the load on the nodes in Couplet 2 is now increased, should a node fail in Couplet 2 on Site B there may be noticeable performance degradation on replication. This is because a single node has to handle a much larger load than in Figure 54. Careful sizing of the nodes in Couplet 2 on Site B to ensure they are less than 50% loaded will ensure that even in failover mode replication performance can be maintained.
Figure 54 Using dedicated nodes for replication targets at the target site (Active/Passive replication)
Figure 55 Using dedicated nodes for replication targets at the target site for Active/Passive, along with backup sources at Site B
In Figure 56 we deliberately provide one node on each couplet that is dedicated to replication. This simplifies the management, and the loading and performance are easier to predict. The way the couplets are balanced also means that wherever a node fails over we do not lose all our replication performance. In the failover scenario the remaining node can still handle backup in one time window and replication in another time window, so the overall impact of a failed node is not that damaging.
Figure 56 Using dedicated nodes for replication targets in an Active/Active configuration
Many to One configurations
The other main usage model for the HP B6200 Backup system is in large-scale remote office deployments, where a fan-in of up to 384 replication jobs to a maximum-configuration HP B6200 Backup System is possible (one stream per device). The sources (remote offices) are more likely to be single-node HP StoreOnce Backup systems. For a large number of remote sites co-location is impractical; instead, the Floating StoreOnce option is recommended. Physical tape and seeding over a WAN link both have difficulties, due to capacity and bandwidth limitations.
Implementing floating StoreOnce seeding
This "floating StoreOnce" method is more complex because, for large fan-in (many source sites replicating into a single target site), the initial replication setup on the floating StoreOnce changes when it is transported to the Data Center, where the final replication mappings are configured. The sequence of events is as follows:
1. Plan the final master replication mappings from sources to target that are required and document them. Use an appropriate naming convention, such as SVTL1 (Source VTL1), SNASshare1, TVTL1 (Target VTL1), TNASshare1.
2. At each remote site perform a full system backup to the source StoreOnce and then configure a 1:1 mapping relationship with the floating StoreOnce device, such as: SVTL1 on Remote Site A -> FTVTL1 on floating StoreOnce, where FTVTL1 = floating target VTL1. Seeding times at remote site A will vary. If the StoreOnce at site A is an HP StoreOnce 2620 Backup system, seeding runs over a 1 GbE link and may take several hours. It will be faster if a model with 10 GbE replication links is used at the remote sites.
3. On the Source StoreOnce at the remote site DELETE the replication mappings – this effectively isolates the data that is now on the floating StoreOnce.
4. Repeat steps 1-3 at Remote sites B and C.
5. When the floating StoreOnce arrives at the central site, it effectively becomes the Source device to replicate INTO the HP B6200 Backup System at the Data Center site.
6. On the floating StoreOnce we will have devices (previously named FTVTL1, FTNASshare1) that we can see from the Management Console (GUI). Using the same master naming convention as in step 1, set up replication, which will necessitate the creation of the necessary devices (VTL or NAS) on the B6200 at the Data Center site, e.g. TVTL1, TNASshare1.
7. This time when replication starts, the contents of the floating StoreOnce will be replicated to the Data Center B6200 over the 10 GbE connection at the Data Center site and will take several hours. In this example Remote Site A, B and C data will be replicated and seeded into the B6200. When this replication step is complete, DELETE the replication mappings on the floating StoreOnce to isolate the data on it, and then DELETE the actual devices on the floating StoreOnce, so the device is ready for the next batch of remote sites.
8. Repeat steps 1-7 for the next series of remote sites until all the remote site data has been seeded into the HP B6200 Backup System.
9. Finally, set up the final replication mappings using the naming convention agreed in step 1. At the remote sites configure replication again to the Data Center site, but be careful to replicate to the correct target devices by using the agreed naming convention at the data center site, e.g. TVTL1, TNASshare1.
This time when we set up replication the B6200 at the target site presents a list of possible target replication devices available to the Remote Site A. So, in this example, we would select TVTL1 or TNASshare1 from the list of available targets presented to Remote Site A when we are configuring the final replication mappings. This time when the replication starts almost all
the necessary data is already seeded on the B6200 for Remote Site A and the synchronization process happens very quickly. In some scenarios where a customer has larger remote storage locations, the floating StoreOnce process can be used together with the smaller locations seeding over the WAN link. Another consideration is the physical logistics for customers with 100+ locations, some of them international. Floating StoreOnce and co-location will be difficult, so the only option is to schedule the use of increased-bandwidth connections, along with the infrastructure they require, and use the schedule to perform seeding in timed, phased slots.
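The timed, phased slots mentioned above can be planned with a trivial scheduler. The sketch below assigns each remote site to a seeding week given a number of available slots per week; the week-numbered slot policy is a hypothetical example, not a product feature.

```python
def phased_schedule(sites, slots_per_week):
    """Assign each remote site a seeding week so that WAN seeding
    happens in timed, phased slots rather than all at once."""
    return {site: i // slots_per_week + 1 for i, site in enumerate(sites)}

# Five hypothetical sites, two seeding slots per week:
schedule = phased_schedule(["s1", "s2", "s3", "s4", "s5"], slots_per_week=2)
assert schedule == {"s1": 1, "s2": 1, "s3": 2, "s4": 2, "s5": 3}
```

With 100+ locations this kind of listing makes it easy to see how many weeks the seeding campaign will run and which sites need their bandwidth upgrade booked for which slot.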
Balancing Many-to-One replication
For the many-to-one replication scenario, it is probably better to load balance the number of incoming replication sources across the available nodes, as shown in the diagram below. Figure 57 shows the many-to-one replication scenario where we have grouped remote sites (VTL and NAS) together into bundles and have them replicating into multiple dedicated replication target devices. The current recommendation with the HP B6200 Backup System is to keep the same relationship between remote site VTLs and replication target VTLs, namely a 1:1 mapping. The deployment illustrated has the following benefits:
• Load balancing of remote sites: 40 sites are divided by 4 and then presented in bundles of 10 to the replication targets. As more remote sites come on line they are also balanced across the four replication target nodes.
• Site B backup devices can be managed and added to easily, and their loading on the node accurately monitored. Similarly, the replication target nodes have a single function (replication targets), which makes their behavior more predictable.
• In a failover situation, the performance impact on either backup or replication is likely to be lower because the backup load at Site B nodes and the replication load at Site B nodes are likely to run in separate windows at separate times.
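The bundling described in the first benefit can be sketched as a simple round-robin assignment. The node and site names below are hypothetical; the point is that round-robin keeps the bundles balanced as new sites come online.

```python
def bundle_sites(sites, target_nodes):
    """Divide remote sites into balanced bundles, one bundle per
    dedicated replication-target node."""
    bundles = {node: [] for node in target_nodes}
    for i, site in enumerate(sites):
        # round-robin keeps bundles balanced as new sites come online
        bundles[target_nodes[i % len(target_nodes)]].append(site)
    return bundles

sites = ["site%02d" % n for n in range(1, 41)]   # 40 hypothetical remote sites
bundles = bundle_sites(sites, ["node1", "node2", "node3", "node4"])
assert all(len(bundle) == 10 for bundle in bundles.values())
```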
Figure 57 Balancing many-to-one replication sources across all available nodes
Replication and load balancing
The specification for a B6200 service set is that it can accept up to a maximum of 48 concurrent replication streams from external sources. If more than 48 streams are replicating into a B6200 node, some streams will be put on hold until a spare "replication slot" becomes available. Being a replication target is more demanding of resources than being a replication source; this is why we recommend allocating dedicated replication targets to specific nodes. The example detailed on the following page shows a full system approach and is a good overview of what is required. Note that:
• Each service set on each node is at relatively the same load (load balancing is a manual process).
• Each node has a single function – VTL backup targets node, NAS backup targets node, or replication target node. This makes management and load assessment easier.
• FC SAN 1, with its larger number of hosts and capacities, is spread over four nodes, all with maximum storage capacity. There are at least eight streams per node to provide good throughput performance.
• All the NAS target backups have been grouped together on nodes 5 and 6 on 10 GbE – these could be VMware backups, which generally require a backup to NAS target. Again, all NAS targets are balanced equally across nodes 5 and 6 and, in the event of failover, performance would be well balanced at around 50% of previous performance for the duration of the failed-over period.
• FC SAN 2 has smaller capacity hosts connected via FC. Nodes 7 and 8 are the least loaded hosts, so this couplet is the obvious candidate for use as the replication target.
• Keep it simple and easy to understand – that's the key.
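The 48-stream admission behaviour described at the start of this section can be modelled in a few lines. This is an illustrative model of the hold-until-slot-free behaviour only, not the actual service set implementation.

```python
MAX_STREAMS = 48  # concurrent inbound replication streams per service set

def admit(active, waiting):
    """Streams beyond the limit are held until a replication slot
    frees up; returns (running, still_held)."""
    free = max(0, MAX_STREAMS - len(active))
    return active + waiting[:free], waiting[free:]

# 45 streams already running, 5 more arrive: 3 start, 2 are held.
running, held = admit(["job%d" % i for i in range(45)],
                      ["a", "b", "c", "d", "e"])
assert len(running) == 48 and held == ["d", "e"]
```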
Figure 58 Fully load-balanced analysis of a typical implementation
14 Housekeeping
Housekeeping is the process whereby space is reclaimed; it is best scheduled to occur in quiet periods when no backup or replication is taking place. If insufficient time is allocated to housekeeping, there is a risk that housekeeping jobs will stack up – effectively "hogging" capacity. So, there are two factors to consider:
• Is housekeeping interfering with performance? In this case you may wish to set blackout windows, when housekeeping will not occur, as described below. Housekeeping blackout windows are configurable (up to two periods in any 24 hours) so, even if the "clean up" scripts run in the backup software, the housekeeping will not trigger until the blackout window is closed.
• Are backup and replication jobs scheduled to allow housekeeping time to complete? Running backup, restore, tape offload and replication operations with no break (i.e. 24 hours a day) will result in housekeeping never being able to complete.
Configuring backup rotation schemes correctly is very important to ensure the maximum efficiency of the product; correct configuration of backup rotation schemes reduces the amount of housekeeping that is required and creates a predictable load. For example, large housekeeping loads are created if large numbers of cartridges are manually erased or re-formatted. In general, all media overwrites should be controlled by the backup rotation scheme so that they are predictable. Housekeeping also applies when data is replicated from a source StoreOnce appliance to a target StoreOnce appliance – the replicated data on the target appliance triggers housekeeping there. Blackout windows are also configurable on the target devices.
Housekeeping Blackout window
This is a period of time (up to 2 separate periods in any 24 hours) that can be configured on the StoreOnce appliance during which the I/O-intensive process of housekeeping WILL NOT run. The main use of a blackout window is to ensure that other activities such as backup and replication can run uninterrupted and therefore give more predictable performance. Blackout windows must be set on BOTH the source StoreOnce appliance and the target StoreOnce appliance. See HP StoreOnce Backup system Best Practices with Sizing Tool and StoreOnce Catalyst worked examples for a fully worked example of configuring a complex StoreOnce environment, including setting housekeeping windows.
NOTE: With the HP B6200 Backup system, housekeeping blackout windows are configured per service set.
Without a housekeeping blackout window set, the housekeeping can interfere with the backup jobs because both are competing for disk I/O. By setting a housekeeping blackout window appropriately, for example from 12:00 to 00:00, we can ensure the backups and replication run at maximum speed because the housekeeping is scheduled to run when the device is idle. There is a worked example in HP StoreOnce Backup system Summary of Best Practices with Sizing Tool and StoreOnce Catalyst worked examples.
Tuning housekeeping using the StoreOnce GUI
Some tuning is required to determine how long to set the housekeeping windows; this is achieved using the StoreOnce Management Interface and its reporting capabilities, which we will now explain.
On the StoreOnce Management Interface go to the Housekeeping page; a series of graphs and a configuration capability are displayed. There are four tabs on the Housekeeping page: Overall, Libraries, Shares and StoreOnce Catalyst. The Overall tab shows the total housekeeping load on the appliance. The other tabs can be used to select the device type and monitor housekeeping load on individual named VTL libraries, NAS shares or Catalyst stores. Note that the Housekeeping blackout window configuration setting is shown below the Housekeeping status. The housekeeping blackout window is set on an appliance basis, not an individual device type basis. NOTE:
With the HP B6200 Backup system, housekeeping statistics are per service set.
Figure 59 Housekeeping jobs received versus housekeeping jobs processed
The Housekeeping load on the target replication devices is generally higher than on the source devices and must be monitored on those devices – you cannot monitor the target housekeeping load from the source device. The key features within this section are:

• Housekeeping Statistics: Status has three options: OK if housekeeping has been idle within the last 24 hours; Warning if housekeeping has been processing nonstop for the last 24 hours; Caution if housekeeping has been processing nonstop for the last 7 days. Last Idle is the date and time when housekeeping processing was last idle. Time Idle (Last 24 Hours) is the percentage of idle time in the last 24 hours. Time Idle (Last 7 Days) is the percentage of idle time in the last 7 days.

• Load graph (top graph): displays the levels of load the StoreOnce appliance is under when housekeeping is being processed. However, this graph is intended for use when housekeeping is affecting the performance of the StoreOnce appliance (e.g. housekeeping has been running nonstop for a couple of hours); if housekeeping is idle most of the time, no information is displayed.

Figure 60 Housekeeping load graph
1. Housekeeping under control
2. Housekeeping out of control, not being reduced over time

The graph above shows two examples: one where the housekeeping load increases and then subsides, which is normal, and another where the housekeeping backlog continues to grow over time. The second condition is a strong indication that housekeeping jobs are not being processed efficiently: the housekeeping activity window may be too short (housekeeping blackout window too large), or the StoreOnce appliance may be overloaded with backup and replication jobs and undersized for the workload. Another indicator is the Time Idle status, which is a measure of the housekeeping empty queue time. If % idle over 24 hours is 0, the appliance is fully occupied, which is not healthy – although this may be acceptable if % idle over 7 days is not also 0. For example, if the appliance is 30% idle over 7 days, then it is probably operating within reasonably safe limits. Signs that the housekeeping load is becoming too high are that backups may start to slow down or backup performance becomes unpredictable. Corrective actions if idle time is low or the load continues to increase are:

• Use a larger StoreOnce appliance or add additional shelves to increase I/O performance.
• Restructure the backup regime to remove appends on tape or keep appends on separate cartridges – the bigger the tapes grow through appends, the more housekeeping they generate when they are overwritten.
• Increase the time allowed for housekeeping to run by reducing the housekeeping blackout windows.
When setting up housekeeping blackout windows (up to two periods per day, 7 days per week), note that a window cannot end at 00:00: to black out from, say, 18:00 to midnight, you must set the end time to 23:59. In addition, there is a Pause Housekeeping button, but use this with caution because it pauses housekeeping indefinitely until you restart it. Finally, note that it is best practice to set housekeeping blackout windows on both the source and target devices. There is a worked example in HP StoreOnce Backup system Summary of Best Practices with Sizing Tool and StoreOnce Catalyst worked examples.
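The configuration rules above (at most two windows per day, and an end time of 23:59 rather than 00:00) can be captured in a small validation sketch. This is a hypothetical helper written in Python for illustration, not part of the StoreOnce CLI or GUI:

```python
from datetime import time

def validate_blackout_windows(windows):
    """Check a list of (start, end) housekeeping blackout windows against
    the rules described above: at most two windows per 24 hours, and a
    window may not end at midnight (00:00) - use 23:59 instead."""
    if len(windows) > 2:
        raise ValueError("at most two blackout windows per day")
    for start, end in windows:
        if end == time(0, 0):
            raise ValueError("window cannot end at 00:00; use 23:59")
        if end <= start:
            raise ValueError("window must end after it starts")
    return True

# A window of 18:00-00:00 is rejected; 18:00-23:59 is accepted.
validate_blackout_windows([(time(18, 0), time(23, 59))])
```

Remember to apply the same (validated) windows on both the source and target appliances.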
15 Tape Offload

Terminology

Direct Tape Offload

This is when a physical tape library device is connected directly to the rear of the StoreOnce Backup system. This offload feature is not currently supported on the HP StoreOnce Backup system.
Backup application Tape Offload/Copy from StoreOnce Backup system

This is the preferred way of moving data from a StoreOnce Backup system to physical tape. The data transfer is managed entirely by the backup software, multiple streams can be copied simultaneously, and StoreOnce NAS shares, Catalyst stores and VTL emulations can all be copied to physical tape. Both the StoreOnce Backup system and the physical tape library must be visible to the backup application media server doing the copy, and some additional licensing costs may be incurred by the presence of the physical tape library. Using this method, entire pieces of media (complete virtual tapes or NAS shares) may be copied, or the user can select to take only certain sessions from the StoreOnce Backup system and copy and merge them onto physical tape. These techniques are known as “media copy” and “object copy” respectively. All copies of the data are tracked by the backup application software using this method, and it is the tape offload method HP recommends. When reading data in this manner from the StoreOnce Backup system, the data to be copied must be read from the StoreOnce appliance and “reconstructed”, then copied to physical tape. Just as with the backup process – where more parallel backup streams to the StoreOnce appliance make the backup proceed faster – the larger the number of parallel reads generated for the tape copy, the faster the copy to tape will take place, even if this means less than optimal usage of physical tape media. Scheduling tape offload to occur at less busy periods, such as weekends, is also highly recommended, so that the read process has maximum I/O available to it.
Backup application Mirrored Backup from Data Source

This again uses the backup application software to write the same backup to two devices simultaneously, creating two copies of the same data. For example, if the monthly backups must be archived to tape, a special policy can be set up for these mirror copy backups. The advantage of this method is that the backup to physical tape is faster and you do not need to allocate specific time slots for copying from the StoreOnce Backup system to physical tape. All copies of the data are tracked by the backup application software.
Tape Offload/Copy from StoreOnce Backup system versus Mirrored Backup from Data Source

A summary of the supported methods is shown below.

Table 9 Tape Offload/Copy comparison

Backup application copy to tape (for easiest integration):
The backup application controls the copy from the StoreOnce appliance to the network-attached tape drive so that:
• It is easier to find the correct backup tape
• The scheduling of copy to tape can be automated within the backup process
Constraints:
• Streaming performance will be slower because data must be reconstructed.

Separate physical tape mirrored backup (for optimum performance):
This is a parallel activity: the host backs up to the StoreOnce appliance and the host backs up to tape. It has the following benefits:
• The backup application still controls the copy location
• It has the highest performance because there are no read operations and reconstruction from the StoreOnce appliance
Constraints:
• It requires the scheduling of specific mirrored backup policies.
• This method is generally only available at the “Source” side of the backup process. Offloading to tape at the target site can only use the backup application copy to tape method.
When is Tape Offload required?

• Compliance reasons or company strategy dictate that weekly, monthly or yearly copies of data be put on tape and archived or sent to a DR site; or a customer wants the peace of mind of being able to physically “hold” the data on a removable piece of media.

• In a StoreOnce Replication model it makes perfect sense for the data at the StoreOnce DR site or central site to be periodically copied to physical tape, and for the physical tape to be stored at the StoreOnce site (avoiding offsite costs) while still providing long-term data retention.

• The same applies in a StoreOnce Catalyst Copy model. However, the StoreOnce Catalyst Copy feature allows the backup application to incorporate tape offload, as well as Catalyst Store copy between StoreOnce appliances, into a single backup job specification. The following examples relate to StoreOnce Replication.
Catalyst device types

The visibility, flexibility and integration of Catalyst stores into the backup software is one of the key advantages of HP StoreOnce Catalyst – especially because the replicated copies are already known to the backup application.
Figure 61 StoreOnce Catalyst Copy offload to tape drive
1. Catalyst Copy command
2. Low bandwidth Catalyst Copy
3. Rehydration and full bandwidth copy to tape
VTL and NAS device types

Figure 62 Backup application tape offload at StoreOnce target site for VTL and NAS device types
1. Backup data written to StoreOnce Source
2. StoreOnce low bandwidth replication
3. All data stored safely at DR site. Data at StoreOnce target (written by StoreOnce source via replication) must be imported to Backup Server B before it can be copied to tape.
NOTE: Target Offload can vary from one backup application to another in terms of import functionality. Please check with your vendor.
Figure 63 Backup application tape offload at StoreOnce source site for VTL and NAS device types
1. Copy StoreOnce device to physical tape: this uses the backup application copy job to copy data from the StoreOnce appliance to physical tape; it is easy to automate and schedule but has slower copy performance.
2. Mirrored backup: a specific backup policy is used to back up to StoreOnce and physical tape simultaneously (mirrored write) at certain times (for example, monthly). This is a faster copy-to-tape method.
As can be seen in the diagrams above, offload to tape at the source site is somewhat easier because the backup server has itself written the data to the StoreOnce Backup system at the source site. In the StoreOnce target site scenario (Figure 62), some of the data on the StoreOnce Backup system may have been written by Backup Server B (local DR site backups, perhaps), but the majority of the data will be on the StoreOnce target via low bandwidth replication from the StoreOnce source. In this case, Backup Server B has to “learn” about the contents of the StoreOnce target before it can copy them; the typical way this is done is by “importing” the replicated data at the StoreOnce target into the catalog at Backup Server B, so that it knows what is on each replicated virtual tape or StoreOnce NAS share. Copy to physical tape can then take place. These limitations do not exist if HP StoreOnce Catalyst device types are used.
HP StoreOnce Optimum Configuration for Tape Offload

Optimizing tape offload from an HP StoreOnce Backup System is achieved by reading multiple data streams simultaneously into a backup application, which combines the data streams into a single data stream (multiplexing). This single data stream is fast enough to keep an HP LTO Ultrium Tape Device streaming, hence ensuring optimal throughput and minimal offload time. To achieve multiple data streams on an HP StoreOnce Backup System, create an offload job in the backup application that reads from multiple objects stored on the HP StoreOnce Backup System simultaneously, then use the backup application's multiplexing option to write to a single HP LTO physical tape.
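The arithmetic implied above – multiplex enough read streams to keep the tape drive at its streaming rate – can be sketched in Python. The rates in the example are illustrative assumptions; measure real per-stream rehydration throughput on your own appliance before sizing an offload job:

```python
import math

def streams_to_keep_drive_streaming(drive_mb_s, per_stream_read_mb_s):
    """Number of simultaneous StoreOnce read streams that must be
    multiplexed into one data stream so the physical tape drive does
    not fall below its streaming rate. Both rates are in MB/s."""
    return math.ceil(drive_mb_s / per_stream_read_mb_s)

# Example: a drive streaming at ~160 MB/s, with each rehydrated read
# stream assumed to deliver ~40 MB/s, needs 4 multiplexed streams.
print(streams_to_keep_drive_streaming(160, 40))  # -> 4
```

This matches the pattern in the table below, where faster LTO generations are paired with more data streams per drive.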
Offload Considerations

VTL Cloning/Media Copy to Physical Tape

This process involves transferring all backup data held on a VTL cartridge directly to a physical tape, making an identical copy of the data. This type of tape offload generally does not support appending, therefore it is best practice to match the VTL cartridge size to the physical cartridge size. (May not apply to all backup applications.)
HP StoreEver Tape Libraries

HP Half-Height Tape Drives for LTO 5 and LTO 6 have data transfer rates equal to the Full-Height versions. This allows an HP StoreEver MSL 2024 to have two Half-Height LTO drives with no decrease in performance, giving maximum “performance density”.
Backup Application

To achieve optimum offload performance, the backup application must be capable of multiplexing offload data to an HP LTO Ultrium tape, and the backup application server must be capable of ingesting multiple streams from the HP StoreOnce Backup System at the same speed as it can write to HP LTO Ultrium tape.
HP StoreOnce Optimum Tape Offload Configuration

The table below represents the optimum configuration of tape drives and data streams for offloading from an HP StoreOnce Backup System. For each StoreOnce model it lists the recommended physical library, the number of physical drives and the number of data streams for HP LTO 6, LTO 5 and LTO 4 tape drives.

| HP StoreOnce model | LTO 6 library | Drives | Streams | LTO 5 library | Drives | Streams | LTO 4 library | Drives | Streams |
| StoreOnce 2700 and 2620 iSCSI | MSL 2024 | 2 | 8 | MSL 2024 | 2 | 8 | MSL 2024 | 2 | 6 |
| StoreOnce 4210 iSCSI & FC | MSL 4048 | 3 | 12 | MSL 4048 | 3 | 12 | MSL 4048 | 3 | 12 |
| StoreOnce 4210 & 1 shelf | MSL 4048 | 3 | 9 | MSL 4048 | 3 | 9 | MSL 4048 | 3 | 9 |
| StoreOnce 4500, 4220 | MSL 4048 | 3 | 12 | MSL 4048 | 4 | 10 | MSL 4048 | 4 | 12 |
| StoreOnce 4500, 4220 & 1 shelf | MSL 4048 | 3 | 9 | MSL 4048 | 4 | 10 | MSL 4048 | 4 | 8 |
| StoreOnce 4420 | MSL 4048 | 4 | 12 | ESL G3 | 5 | 9 | ESL G3 | 6 | 11 |
| StoreOnce 4420 & 1 shelf | MSL 4048 | 4 | 8 | ESL G3 | 5 | 9 | ESL G3 | 6 | 11 |
| StoreOnce 4420 & 2 shelves | MSL 4048 | 4 | 8 | ESL G3 | 5 | 9 | ESL G3 | 6 | 11 |
| StoreOnce 4420 & 3 shelves | MSL 4048 | 4 | 8 | ESL G3 | 5 | 9 | ESL G3 | 6 | 11 |
| StoreOnce 4700 and 4430 | ESL G3 | 6 | 12 | ESL G3 | 7 | 14 | ESL G3 | 8 | 15 |
| StoreOnce 4700 and 4430 & 1 shelf | ESL G3 | 6 | 12 | ESL G3 | 7 | 14 | ESL G3 | 8 | 15 |
| StoreOnce 4700 and 4430 & 2 shelves | ESL G3 | 6 | 12 | ESL G3 | 7 | 14 | ESL G3 | 8 | 8 |
| StoreOnce 4700 and 4430 & 3 shelves | ESL G3 | 6 | 12 | ESL G3 | 7 | 14 | ESL G3 | 8 | 8 |
NOTE: Split the number of data streams evenly between the physical drives. For example, two drives with six streams is configured as three data streams per physical drive. If you have an uneven number of streams, for example five drives with nine streams, four drives are configured with two data streams each and the fifth drive has a single stream.
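The even-split rule in the note above can be expressed as a short Python sketch (a hypothetical helper for planning, not a StoreOnce tool):

```python
def split_streams(num_streams, num_drives):
    """Distribute data streams as evenly as possible across physical
    drives: each drive gets the base share, and any remainder is
    handed out one extra stream per drive to the first drives."""
    base, extra = divmod(num_streams, num_drives)
    return [base + 1 if i < extra else base for i in range(num_drives)]

print(split_streams(6, 2))  # -> [3, 3]
print(split_streams(9, 5))  # -> [2, 2, 2, 2, 1]
```

Both outputs match the examples in the note: two drives with six streams get three streams each, and five drives with nine streams get two streams on four drives and one on the fifth.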
16 Key parameters

StoreOnce B6200 Backup

Table 10 Key parameters for HP StoreOnce B6200 Backup (StoreOnce Backup products running software 3.0.0 and later)

Devices
| Parameter | One couplet | Two couplets | Three couplets | Four couplets |
| Max Addressable Disk Capacity (TB), assuming 2 TB drives | Up to 128 | Up to 256 | Up to 384 | Up to 512 |
| Max Number of Devices (VTL + NAS shares) | 96 | 192 | 288 | 384 |
| Total maximum concurrent streams (backup/restore/inbound replication) | 384 | 768 | 1152 | 1536 |

Replication
| Max VTL Library Rep Fan Out | 2 | 4 | 6 | 8 |
| Max VTL Library Rep Fan In | 16 | 16 | 16 | 16 |
| Max Rep Fan Out | 16 | 32 | 48 | 64 |
| Max Rep Fan In | 96 | 192 | 288 | 384 |
| Max Concurrent Rep Jobs as Source | 32 | 64 | 96 | 128 |
| Max Concurrent Rep Jobs as Target | 96 | 192 | 288 | 384 |

Physical Tape Copy
| Supports direct attach of physical tape device | No | No | No | No |
| Max Concurrent Tape Attach Jobs per Appliance | N/A | N/A | N/A | N/A |

VTL
| Max VTL drives (384) and medium changers (96), combined | 480 | 960 | 1440 | 1920 |
| Max VTL Drives | 384 | 768 | 1152 | 1536 |
| Max Cartridge Size (TB) | 3.2 | 3.2 | 3.2 | 3.2 |
| Max Slots per Library (D2DBS, EML-E, ESL-E Lib Type) | 16384 | 16384 | 16384 | 16384 |
| Max Slots per Library (MSL2024, MSL4048, MSL8096 Lib Type) | 24, 48, 96 | 24, 48, 96 | 24, 48, 96 | 24, 48, 96 |
| Max virtual devices (drives & medium changers) configurable per FC port | 480 | 960 | 1440 | 1920 |
| Recommended Max Concurrent Backup Streams (mix of VTL & NAS) | 256 | 512 | 768 | 1024 |
| Recommended Max Concurrent Backup Streams per Library | 12 | 12 | 12 | 12 |

NAS
| Max files per share | 25000 | 25000 | 25000 | 25000 |
| Max number of streams if only CIFS target shares configured (no VTL) | 384 | 768 | 1152 | 1536 |
| Max number of streams if only NFS target shares configured (no VTL) | 192 | 384 | 586 | 768 |
| Recommended Max Concurrent Backup Streams (mix of VTL & NAS) | 256 | 512 | 768 | 1024 |
| Recommended Max Concurrent Backup Streams per Share | 12 | 12 | 12 | 12 |

StoreOnce Catalyst
| Catalyst Command Sessions (64 per service set) | 128 | 256 | 384 | 512 |
| Maximum Concurrent outbound copy jobs (48 per service set) | 96 | 192 | 288 | 384 |
| Maximum Concurrent inbound data and copy jobs (192 per service set) | 384 | 768 | 1152 | 1536 |
StoreOnce 2700, 4500 and 4700 Backup

Table 11 Key parameters for HP StoreOnce 2700, 4500 and 4700 Backup (StoreOnce Backup products running software 3.8.0 and later)

Devices
| Parameter | StoreOnce 2700 | StoreOnce 4500 | StoreOnce 4700 |
| Usable Disk Capacity (TB), with full expansion | 5.5 | 36 | 160 |
| Max Number of Devices (VTL/NAS/Catalyst) | 8 | 24 | 50 |

Replication
| Max VTL Library Rep Fan Out | 1 | 1 | 1 |
| Max VTL Library Rep Fan In | 1 | 8 | 16 |
| Max Appliance Rep Fan Out | 2 | 4 | 8 |
| Max Appliance Rep Fan In | 8 | 24 | 50 |
| Max Appliance Concurrent Rep Jobs as Source | 12 | 24 | 48 |
| Max Appliance Concurrent Rep Jobs as Target | 24 | 48 | 96 |

Physical Tape Copy
| Supports direct attach of physical tape device | No | No | No |
| Max Concurrent Tape Attach Jobs per Appliance | N/A | N/A | N/A |

VTL
| Max VTL Drives per Library/Appliance | 32 | 96 | 200 |
| Max Cartridge Size (TB) | 3.2 | 3.2 | 3.2 |
| Max Slots per Library (D2DBS, EML-E, ESL-E Lib Type) | 96 | 1024 | 4096 |
| Max Slots per Library (MSL2024, MSL4048, MSL8096 Lib Type) | 24, 48, 96 | 24, 48, 96 | 24, 48, 96 |
| Max active streams per store | 48 | 96 | 128 |
| Recommended Max Concurrent Backup Streams per appliance | 16 | 48 | 64 |
| Recommended Max Concurrent Backup Streams per Library | 4 | 6 | 12 |

NAS
| Max files per share | 25000 | 25000 | 25000 |
| Max NAS Open Files per Share > DDThreshold* | 48 | 64 | 128 |
| Max NAS Open Files per Appliance > DDThreshold* | 48 | 64 | 128 |
| Max NAS Open Files per Appliance, concurrent | 240 | 320 | 640 |
| Recommended Max Concurrent Backup Streams per appliance | 24 | 48 | 64 |
| Recommended Max Concurrent Backup Streams per Share | 4 | 6 | 12 |

StoreOnce Catalyst
| Catalyst Command Sessions | 16 | 32 | 64 |
| Maximum Concurrent outbound copy jobs per appliance | 12 | 24 | 48 |
| Maximum Concurrent inbound data and copy jobs per appliance | 48 | 96 | 192 |
StoreOnce 2610/2620, 4210/4220 and 4420/4430 Backup

Table 12 Key parameters for HP StoreOnce 2610/2620, 4210/4220 and 4420/4430 Backup (StoreOnce Backup products running software 3.4.0 and later)

Devices
| Parameter | 2610 iSCSI | 2620 iSCSI | 4210 iSCSI/FC | 4220 | 4420 | 4430 |
| Usable Disk Capacity (TB), with full expansion | 1 | 2.5 | 9 | 18 | 38 | 76 |
| Max Number of Devices (VTL/NAS/Catalyst) | 4 | 8 | 16 | 24 | 50 | 50 |

Replication
| Max VTL Library Rep Fan Out | 1 | 1 | 1 | 1 | 1 | 1 |
| Max VTL Library Rep Fan In | 1 | 1 | 8 | 8 | 16 | 16 |
| Max Appliance Rep Fan Out | 2 | 2 | 4 | 4 | 8 | 8 |
| Max Appliance Rep Fan In | 4 | 8 | 16 | 24 | 50 | 50 |
| Max Appliance Concurrent Rep Jobs as Source | 12 | 12 | 24 | 24 | 48 | 48 |
| Max Appliance Concurrent Rep Jobs as Target | 24 | 48 | 48 | 48 | 96 | 96 |

Physical Tape Copy
| Supports direct attach of physical tape device | No | No | No | No | No | No |
| Max Concurrent Tape Attach Jobs per Appliance | N/A | N/A | N/A | N/A | N/A | N/A |

VTL
| Max VTL Drives per Library/Appliance | 16 | 32 | 64 | 96 | 200 | 200 |
| Max Cartridge Size (TB) | 3.2 | 3.2 | 3.2 | 3.2 | 3.2 | 3.2 |
| Max Slots per Library (D2DBS, EML-E, ESL-E Lib Type) | 96 | 96 | 1024 | 1024 | 4096 | 4096 |
| Max Slots per Library (MSL2024, MSL4048, MSL8096 Lib Type) | 24, 48, 96 | 24, 48, 96 | 24, 48, 96 | 24, 48, 96 | 24, 48, 96 | 24, 48, 96 |
| Max active streams per store | 32 | 48 | 64 | 96 | 128 | 128 |
| Recommended Max Concurrent Backup Streams per appliance | 16 | 24 | 48 | 48 | 64 | 64 |
| Recommended Max Concurrent Backup Streams per Library | 4 | 4 | 6 | 6 | 12 | 12 |

NAS
| Max files per share | 25000 | 25000 | 25000 | 25000 | 25000 | 25000 |
| Max NAS Open Files per Share > DDThreshold* | 32 | 48 | 64 | 64 | 128 | 128 |
| Max NAS Open Files per Appliance > DDThreshold* | 32 | 48 | 64 | 64 | 128 | 128 |
| Max NAS Open Files per Appliance, concurrent | 96 | 112 | 128 | 128 | 640 | 640 |
| Recommended Max Concurrent Backup Streams per appliance | 16 | 24 | 48 | 48 | 64 | 64 |
| Recommended Max Concurrent Backup Streams per Share | 4 | 4 | 6 | 6 | 12 | 12 |

StoreOnce Catalyst
| Catalyst Command Sessions | 16 | 16 | 32 | 32 | 64 | 64 |
| Maximum Concurrent outbound copy jobs per appliance | 12 | 48 | 24 | 24 | 48 | 48 |
| Maximum Concurrent inbound data and copy jobs per appliance | 12 | 48 | 96 | 96 | 192 | 192 |
About this guide

This guide provides conceptual information about the following HP StoreOnce Backup systems:
• HP StoreOnce B6200 Backup system
• HP StoreOnce 2700 Backup system
• HP StoreOnce 4500 Backup system
• HP StoreOnce 4700 Backup system
• HP StoreOnce 2620 Backup system
• HP StoreOnce 4210/4220 Backup system
• HP StoreOnce 4420/4430 Backup system
Intended audience

This guide is intended for users who install, operate and maintain the HP StoreOnce Backup System.
Related documentation

In addition to this guide, the following documents provide related information:
• ‘Start here' poster for an overview of the installation information in this guide (available in English, French, German and Japanese)
• HP StoreOnce Backup system CLI Reference Guide
• HP StoreOnce Backup system Linux and UNIX Reference Guide
• HP StoreOnce Backup system User Guide
• HP StoreOnce Backup system Maintenance and Service Guide

You can find these documents on the HP Support Center website: http://www.hp.com// Query on your product name and then select the Product Manuals link.
Document conventions and symbols

Table 13 Document conventions
| Convention | Element |
| Blue text: Table 13 (page 120) | Cross-reference links and e-mail addresses |
| Blue, underlined text: http://www.hp.com | Website addresses |
| Bold text | Keys that are pressed; text typed into a GUI element, such as a box; GUI elements that are clicked or selected, such as menu and list items, buttons, tabs, and check boxes |
| Italic text | Text emphasis |
| Monospace text | File and directory names; system output; code; commands, their arguments, and argument values |
| Monospace, italic text | Code variables; command variables |
| Monospace, bold text | Emphasized monospace text |

WARNING! Indicates that failure to follow directions could result in bodily harm or death.
CAUTION: Indicates that failure to follow directions could result in damage to equipment or data.
IMPORTANT: Provides clarifying information or specific instructions.
NOTE: Provides additional information.
HP technical support

For worldwide technical support information, see the HP website: http://www.hp.com/

Before contacting HP, collect the following information:
• Product model names and numbers
• Technical support registration number (if applicable)
• Product serial numbers
• Error messages
• Operating system type and revision level
• Detailed questions
HP websites

For additional information, see the following HP websites:
• http://www.hp.com
• http://www.hp.com/go/ebs
• http://www.hp.com/go/connect
• http://www.hp.com/go/storage
• http://www.hp.com/service_locator
• http://www.hp.com//manuals
• http://www.hp.com//s
Documentation feedback

HP welcomes your feedback.
To make comments and suggestions about product documentation, please send a message to
[email protected]. All submissions become the property of HP.