DATA PROTECTION ~ TYPES OF BACKUP AND CONSIDERATIONS
Businesses and individuals alike should be aware of their options for data protection and data storage. Once they have solid backup and recovery implementations in place, they can feel secure in knowing they will not lose important and irreplaceable information. Time and labor costs will decrease because recovery times are minimized when steps have been taken to protect data in business and personal applications, remote offices, and desktop/mobile systems.
Data protection using external storage refers to various techniques and devices for storing large amounts of data. The earliest storage devices were punched paper cards, which were used as early as 1804 to control silk-weaving looms. Modern mass storage devices include all types of disk drives and tape drives. Mass data storage, or auxiliary storage, is distinct from your computer's memory, which refers to temporary storage areas within the computer. Your computer's main memory, or RAM (random-access memory), is read-and-write memory; that is, you can both write data into RAM and read data from RAM. RAM is physical memory in the form of chips. This is in contrast to ROM (read-only memory), which holds the instructions to start up your computer and can only be read. Most RAM is volatile, which means that it requires a steady flow of electricity to maintain its contents. As soon as the power is turned off, whatever data was in RAM is lost. Unlike main memory, mass data storage devices retain data even when the computer is turned off.
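The main types of optical data storage are:
a) CD-ROM: Like audio CDs, CD-ROMs come with data already encoded onto them. The data is permanent and can be read any number of times, but it cannot be modified.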
b) WORM: Stands for write-once, read-many. With a WORM disk drive, you can write data onto a WORM disk, but only once. After that, the WORM disk behaves just like a CD-ROM.
c) erasable: Optical disks that can be erased and loaded with new data, just like magnetic disks. These are often referred to as EO (erasable optical) disks.
These three technologies are not compatible with one another; each requires a different type of disk drive and disk. Even within one category, there are many competing formats, although CD-ROMs are relatively standardized.
OTHER TYPES OF BACKUP TECHNOLOGY TO CONSIDER:
VIRTUAL TAPE LIBRARY - A VTL is an archival backup solution that combines traditional tape backup methodology (software or appliance based) with low-cost disk technology to create an optimized backup and recovery solution. This provides backup and recovery performance benefits compared to tape-based solutions but lets users continue using technologies and processes designed to work with their tape environments. It is an intelligent disk-based library that acts like a tape library with the performance of modern disk drives: data is deposited onto disk drives just as it would be onto a tape library, only faster. A VTL can be used as a stand-alone tape library solution. A VTL generally consists of a virtual tape appliance or server, and software that emulates traditional tape devices and formats. Vendors include ADIC, Alacritus, Diligent, FalconStor, Neartek, Overland, Quantum, Sepaton, and SpectraLogic.
NEAR-LINE DISK TARGET - A disk array that acts as a target or cache for tape backup. These arrays typically offer faster backup and recovery times than tape and are cost effective because they are increasingly based on low-cost ATA (Advanced Technology Attachment) disk drives. Unlike virtual tape libraries, however, they typically require configuration and process changes to existing backup/recovery operations. A disk array is a linked group of one or more physically independent hard disk drives, generally used to replace larger, single disk drive systems. The most common disk arrays are in a daisy-chain configuration or implement RAID (Redundant Array of Independent Disks) technology. A disk array may contain several disk drive trays, and is structured to improve speed and increase protection against loss of data.
CONTENT-ADDRESSED STORAGE (CAS) - A disk-based storage system that uses the content of the data as a locator for the information, eliminating dependence on file system locators or volume/block/device descriptors to identify and locate specific data. CAS is an object-oriented system for storing data that are not intended to be changed once they are stored (e.g., medical images, sales invoices, archived e-mail). CAS assigns a unique identifying logical address to the data record when it is stored, and that address is neither duplicated nor changed, which ensures that the record always contains exactly the same data as were originally stored. CAS relies on disk storage instead of removable media, such as tape. CAS is often used as a new storage paradigm for archiving reference information. EMC's Centera is an example of CAS.
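The idea of deriving the address from the content can be sketched in a few lines of Python. This is only an illustration: the ContentAddressedStore class is hypothetical, it keeps records in memory rather than on disk, and the use of SHA-256 as the addressing function is an assumption, not a description of how Centera or any other product computes its addresses.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory store keyed by the SHA-256 digest of each record."""

    def __init__(self):
        self._objects = {}  # content address -> immutable bytes

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same address, so it is stored once.
        self._objects.setdefault(address, bytes(data))
        return address

    def get(self, address: str) -> bytes:
        data = self._objects[address]
        # Re-hash on read to confirm the record still matches its address.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored record no longer matches its address")
        return data


store = ContentAddressedStore()
invoice_address = store.put(b"invoice #1001: 3 widgets")
duplicate_address = store.put(b"invoice #1001: 3 widgets")
```

Storing the same invoice twice returns the same address, and any change to the bytes would produce a different address, which is why the scheme suits records that must never change.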
MASSIVE ARRAY OF IDLE DISKS (MAID) - A disk system in which disks spin only when necessary (such as during read/write operations), reducing total power consumption and enabling massive high-capacity disk systems with economics comparable to tape libraries. The many hundreds of disks share a power supply, controller, and cabling infrastructure within a cabinet. An algorithm is used to decide which disks in a cabinet should spin and which should not. Inactive disks are powered down, and then spun up again when needed. Reactivation typically takes under 10 seconds. Disks are spun on a regular basis even when not used to keep them operational. This so-called duty cycle management can reduce the number of stops experienced by a drive by a quarter. For comparison, a typical ATA drive is built for 40,000 stops over its life.
SNAPSHOTS AND INCREMENTAL CAPTURE - A snapshot is a copy of a volume that is essentially empty but has pointers to existing files. The most common method is a copy-on-write technique: when one of the existing files changes, the snap volume creates a copy of the original file just before the new file is written to disk on the original volume. IT administrators then have a second copy of data saved to disk that they can use for instantaneous recovery or as an offline copy for backups. Incremental capture solutions can take snapshots at the block, file, or volume level. This provides users with more granularity when capturing data and offers unique integration capabilities with applications because these products typically write at the block level. A wide variety of vendors offer some type of snapshot capability. Software vendors with volume management capabilities, such as Microsoft and Veritas, also provide snapshot functionality. Vendors such as FilesX have the capability to either replace existing backup technologies or co-exist with them.
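One simple way to decide which disks should spin is an idle timeout: a disk is powered down once it has gone unused for a set period and spun up again on the next access. The Python sketch below assumes that policy; the MaidCabinet class, its field names, and the 60-second timeout are all hypothetical, and real MAID controllers may use quite different algorithms. Time is passed in explicitly so the example is deterministic.

```python
class MaidCabinet:
    """Toy model of a cabinet whose disks spin only while in use."""

    def __init__(self, disk_count, idle_timeout_s):
        self.idle_timeout_s = idle_timeout_s
        self.spinning = [False] * disk_count
        self.last_access = [0.0] * disk_count
        self.spin_ups = [0] * disk_count  # how often each disk was restarted

    def access(self, disk, now):
        # A read or write spins the disk up if it is currently powered down.
        if not self.spinning[disk]:
            self.spinning[disk] = True
            self.spin_ups[disk] += 1
        self.last_access[disk] = now

    def tick(self, now):
        # Power down every spinning disk that has been idle long enough.
        for disk, is_spinning in enumerate(self.spinning):
            idle_for = now - self.last_access[disk]
            if is_spinning and idle_for >= self.idle_timeout_s:
                self.spinning[disk] = False


cabinet = MaidCabinet(disk_count=4, idle_timeout_s=60)
cabinet.access(0, now=0)
cabinet.access(0, now=30)
cabinet.tick(now=100)       # disk 0 has been idle for 70 s: powered down
cabinet.access(0, now=120)  # next access spins it up again
```

Counting spin-ups per disk matters because, as noted above, a drive is built for a limited number of stops; a longer timeout trades power savings for fewer stop/start cycles.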
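The copy-on-write behavior can be illustrated with a small Python sketch. The Volume and Snapshot classes are hypothetical and work on whole files held in memory; file deletion and block-level tracking are left out to keep the example short.

```python
class Volume:
    """Toy volume: a mapping of file names to contents."""

    def __init__(self, files):
        self.files = dict(files)
        self.snapshots = []

    def snapshot(self):
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, name, data):
        # Copy-on-write: save the original in each snapshot before overwriting.
        for snap in self.snapshots:
            snap.preserve(name)
        self.files[name] = data


class Snapshot:
    """Starts empty; holds only the originals of files changed afterwards."""

    def __init__(self, volume):
        self._volume = volume
        self._names = set(volume.files)  # files that existed at snapshot time
        self._saved = {}

    def preserve(self, name):
        if name in self._names and name not in self._saved:
            self._saved[name] = self._volume.files[name]

    def read(self, name):
        if name not in self._names:
            raise KeyError(name)
        # Changed files come from the saved copy; unchanged files are
        # read through the "pointer" to the live volume.
        if name in self._saved:
            return self._saved[name]
        return self._volume.files[name]


vol = Volume({"report.txt": b"v1", "notes.txt": b"n1"})
snap = vol.snapshot()
vol.write("report.txt", b"v2")
```

After the write, the live volume holds the new report while the snapshot still returns the original, and only the one changed file has been copied.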
INCREMENTAL CAPTURE - Vendors in this category can replace existing backup technologies or co-exist with them. Incremental capture solutions can take snapshots at the block, file, or volume level. This gives users more detail when capturing data and offers unique integration capabilities with applications because these products typically write at the block level. FilesX is an example of incremental capture.
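Block-level incremental capture can be approximated by comparing a digest of each block against the previous capture and keeping only the blocks that differ. The following Python sketch assumes fixed-size blocks and SHA-256 digests; the function names and the 4-byte block size are hypothetical and chosen only to keep the example readable, and it does not describe how FilesX or any other product detects changes.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, for illustration only


def block_digests(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(previous, current, block_size=BLOCK_SIZE):
    """Map block index -> new contents for blocks that are new or modified.

    Blocks removed by truncation are not reported in this sketch.
    """
    old = block_digests(previous, block_size)
    new = block_digests(current, block_size)
    changes = {}
    for index, digest in enumerate(new):
        if index >= len(old) or old[index] != digest:
            start = index * block_size
            changes[index] = current[start:start + block_size]
    return changes


previous = b"AAAABBBBCCCC"
current = b"AAAAXXXXCCCCDD"
delta = changed_blocks(previous, current)
```

Here only the modified second block and the newly appended fourth block need to be captured; the two unchanged blocks are skipped.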
CONTINUOUS CAPTURE - This segment of the data-protection market includes software or appliances designed to capture every write made to primary storage and make a time-stamped copy on a secondary device. The main objective is to have the ability to re-create a data set as it existed at any point in time with the goal of being able to rapidly restore applications. Representative vendors include Alacritus, Mendocino Software (via acquired assets from Vyant Software), Revivio, and StorageTek. While it will be a while before these technologies become mainstream, today they are helping end users who need instantaneous recoverability for their applications.
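Capturing every write with a time stamp and replaying the log up to a chosen moment can be sketched as follows. The WriteJournal class is hypothetical, keeps its log in memory, and uses plain integers as time stamps; it is meant only to show the principle, not how any of the vendors named above implement it.

```python
class WriteJournal:
    """Toy journal of time-stamped block writes on a secondary device."""

    def __init__(self):
        self._entries = []  # (timestamp, block_number, data) in arrival order

    def record(self, timestamp, block_number, data):
        if self._entries and timestamp < self._entries[-1][0]:
            raise ValueError("writes must be recorded in time order")
        self._entries.append((timestamp, block_number, data))

    def restore(self, point_in_time):
        """Rebuild the data set as it existed at point_in_time."""
        image = {}
        for timestamp, block_number, data in self._entries:
            if timestamp > point_in_time:
                break  # later writes are ignored
            image[block_number] = data
        return image


journal = WriteJournal()
journal.record(100, 0, b"boot")
journal.record(105, 1, b"good")
journal.record(110, 1, b"bad!")  # e.g. a corrupting write
before_corruption = journal.restore(107)
latest = journal.restore(110)
```

Restoring to time 107 yields the data set as it was before the write at time 110, which is the kind of point-in-time recovery this category aims for.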
ARRAY-BASED REPLICATION - These products have been around for a long time and have traditionally come from large disk-array vendors such as EMC, Hitachi Data Systems, and IBM. These products run on high-end arrays and are very robust (and expensive). They usually come in two types: synchronous or asynchronous. In the past, these replication technologies only worked between homogeneous arrays from the same vendor, requiring two expensive arrays with two expensive software licenses for each replication pair. As host-based replication became more robust, the array-based replication vendors began to add more flexibility in their solutions. For example, the requirement to replicate from one high-end array to another no longer exists, allowing companies to deploy lower-cost arrays at remote sites. Additionally, prices have come down, and new vendors are getting into the game. For example, vendors such as EqualLogic, Exagrid, and Intransa provide replication with their disk arrays at relatively low prices.
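The difference between the two types can be shown with a small Python model: in synchronous mode a write reaches the secondary copy before the write call returns, while in asynchronous mode it is queued and shipped later, so the secondary may briefly lag. The ReplicatedArray class and its method names are hypothetical, and the model ignores networking, acknowledgements, and failure handling.

```python
from collections import deque


class ReplicatedArray:
    """Toy primary/secondary pair with synchronous or asynchronous copying."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.primary = {}
        self.secondary = {}
        self._pending = deque()  # writes not yet applied to the secondary

    def write(self, block, data):
        self.primary[block] = data
        if self.synchronous:
            # The write completes only after both copies are updated.
            self.secondary[block] = data
        else:
            # The write completes at once; the copy is shipped later.
            self._pending.append((block, data))

    def drain(self):
        """Apply queued writes to the secondary, oldest first."""
        while self._pending:
            block, data = self._pending.popleft()
            self.secondary[block] = data


sync_pair = ReplicatedArray(synchronous=True)
sync_pair.write(7, b"payroll")

async_pair = ReplicatedArray(synchronous=False)
async_pair.write(7, b"payroll")
lag_before_drain = dict(async_pair.secondary)  # still empty at this point
async_pair.drain()
```

The synchronous pair never loses an acknowledged write at the secondary site but makes every write wait for the copy; the asynchronous pair returns sooner at the cost of a window in which the secondary is behind.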
HOST-BASED REPLICATION - Host-based replication software runs on servers. As writes are made to one array, they are also written to a second array. Vendors in this category have eliminated many of the complexities in their products, making them easier to deploy and manage. Representative vendors of host-based replication software include EMC-Legato, DataCore Software, NSI, Softek, Sun, Topio, and Veritas.
FABRIC-BASED REPLICATION - The new debate raging in the storage industry revolves around the following question: "Where should storage services, or applications, reside: on hosts, arrays, or in the fabric on switches or appliances?" The hardware that connects workstations and servers to storage devices in a SAN is referred to as a "fabric." The SAN fabric enables any-server-to-any-storage-device connectivity through the use of Fibre Channel switching technology. A storage area network (SAN) is a high-speed subnetwork of shared storage devices. Because stored data does not reside directly on any of a network's servers, server power is utilized for business applications, and network capacity is released to the end user. Fabric-based applications are relatively new, but IT professionals expect a strong trend toward fabric-based intelligence over the next couple of years due to a number of potential advantages. For example, the sooner an I/O is captured, the sooner it can be sent to a secondary device, thus enabling better performance. Examples of vendors with solutions in this space include Brocade, Candera, Cisco, CNT, FalconStor, IBM, Kashya, Maranti Networks, McDATA, and Troika. A variety of traditional switch vendors are putting intelligent blades into their core products, and third-party developers are porting their applications to the blades. A blade is a single circuit board populated with components, such as processors, memory, and network connections, that are usually found on multiple boards. Server blades are designed to slide into existing servers. They are more cost-efficient and smaller than traditional box-based servers, and they consume less power.
MAID: further information at: http://sc-2002.org/paperpdfs/pap.pap312.pdf
Network World, May 16, 2005, Vol. 22, Number 19, Page 41.