An archive repository is where documents designated for archiving are stored. Archiving must be distinguished from backup: a backup is a copy kept so that data can be restored after loss or damage, whereas an archive is long-term storage of documents in accordance with legislative requirements and the company’s internal records disposal (shredding) rules.
Repositories differ in how they connect to the data/document archiving application and in their storage technology, which determines how quickly data can be accessed. Most repositories are built on magnetic hard drives. Optical media are no longer used.
In addition to local data repositories, storage systems are classified as follows:
- CAS (Content Addressed Storage) – documents are accessible exclusively through a software interface, using a special identifier derived from their content. The user does not know the exact physical location of a given file within the repository and cannot access it through a file system. Such repositories are suitable for long-term storage of unalterable data subject to strict legislative requirements. Any change to a document within such a repository automatically changes its identifier, that is, its address. Due to legislative requirements, such repositories are usually operated in a mode that does not allow deletion or modification of data for a defined period of time. They are connected to the archiving system over a computer network.
- NAS (Network Attached Storage) – as the name suggests, these repositories are connected to the archiving system over a network interface. Data is accessed using the NFS protocol (in the Unix world), CIFS (in the Windows world), or iSCSI.
- HSM (Hierarchical Storage Management) – keeps frequently accessed data on hard drives, while infrequently accessed data is stored on slower but considerably cheaper media, such as magnetic tapes. In the case of the IBM DR550 product, HSM repositories constitute a subgroup of CAS repositories, because the data is accessible exclusively through an application software interface using the stored document’s unique identifier.
- SAN (Storage Area Network) – not recommended as archive storage unless it includes a layer that allows a retention period and WORM-media properties to be defined.
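The content-derived addressing described for CAS repositories in the list above can be illustrated with a minimal sketch. It is not the interface of any product named on this page: the class `ContentAddressedStore`, its methods, and the choice of SHA-256 as the identifier function are assumptions made purely for illustration.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory CAS (hypothetical): the address is a digest of the content."""

    def __init__(self):
        self._blobs = {}  # address -> content

    def put(self, content: bytes) -> str:
        # Assumption: SHA-256 stands in for the repository's identifier scheme.
        address = hashlib.sha256(content).hexdigest()
        # Storing identical content again leaves the existing entry untouched.
        self._blobs.setdefault(address, content)
        return address

    def get(self, address: str) -> bytes:
        # Callers know only the address, never a physical location.
        return self._blobs[address]


store = ContentAddressedStore()
original_address = store.put(b"Invoice 2024-001: total 100 EUR")
altered_address = store.put(b"Invoice 2024-001: total 900 EUR")

# A changed document yields a different address; the original stays retrievable.
print(original_address != altered_address)
```

Because the address is computed from the content, a modified document cannot silently replace the original: it is simply stored as a new object under a new address.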
All of the archive repositories mentioned below support defined retention periods at the hardware level. This rules out modification or deletion of stored documents for a defined period of time, known as the retention period (for example, 10 or more years). Documents may be stored either individually or collectively in ISO image containers on the basis of a virtual jukebox.
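The effect of a retention period can be sketched in software as follows. The repositories themselves enforce retention at the hardware level, so this is only a model of the behaviour; `RetentionStore`, `RetentionError`, and the method names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


class RetentionError(Exception):
    """Raised when an operation would violate write-once or retention rules."""


class RetentionStore:
    """Toy write-once store (hypothetical) with a per-document retention period."""

    def __init__(self):
        self._docs = {}  # doc_id -> (content, retain_until)

    def put(self, doc_id, content, retention, now=None):
        now = now or datetime.now(timezone.utc)
        if doc_id in self._docs:
            # Stored documents are never overwritten.
            raise RetentionError(f"{doc_id} already stored; modification refused")
        self._docs[doc_id] = (content, now + retention)

    def delete(self, doc_id, now=None):
        now = now or datetime.now(timezone.utc)
        _, retain_until = self._docs[doc_id]
        if now < retain_until:
            raise RetentionError(f"{doc_id} is retained until {retain_until:%Y-%m-%d}")
        del self._docs[doc_id]


stored_at = datetime(2024, 1, 1, tzinfo=timezone.utc)
ten_years = timedelta(days=3650)  # roughly the 10-year example from the text

store = RetentionStore()
store.put("doc-1", b"contract", retention=ten_years, now=stored_at)

try:
    store.delete("doc-1", now=stored_at + timedelta(days=365))
except RetentionError as err:
    print("refused:", err)

# Once the retention period has elapsed, deletion is allowed.
store.delete("doc-1", now=stored_at + ten_years)
```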
The advantage of container storage lies in its ability to hold a vast number of documents under a single identifier. This storage method is also more economical with regard to the repository’s system resources. Its disadvantage is that individual documents cannot be physically deleted; they can only be deleted logically in the controlling application, and only the entire container can be deleted physically. Data stored in containers is accessed through the archive system’s software layer called the Storage Manager (STORM).
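The difference between logical and physical deletion in container storage can be modelled with a short sketch. This does not reflect the actual Storage Manager (STORM) interface; the `Container` and `ControllingApplication` classes and all of their methods are assumptions introduced for illustration.

```python
class Container:
    """Hypothetical container: many documents stored under one repository identifier."""

    def __init__(self, container_id, documents):
        self.container_id = container_id
        self._documents = dict(documents)  # name -> content, fixed once written

    def read(self, name):
        return self._documents[name]

    def document_count(self):
        return len(self._documents)


class ControllingApplication:
    """Hypothetical application layer that tracks logical deletions."""

    def __init__(self):
        self._containers = {}
        self._logically_deleted = set()  # (container_id, name) pairs

    def store(self, container):
        self._containers[container.container_id] = container

    def read(self, container_id, name):
        if (container_id, name) in self._logically_deleted:
            raise KeyError(f"{name} has been logically deleted")
        return self._containers[container_id].read(name)

    def delete_document(self, container_id, name):
        # Logical deletion only: the container itself is left unchanged.
        self._logically_deleted.add((container_id, name))

    def delete_container(self, container_id):
        # Physical deletion is possible only for the container as a whole.
        del self._containers[container_id]
        self._logically_deleted = {
            key for key in self._logically_deleted if key[0] != container_id
        }


app = ControllingApplication()
box = Container("container-001", {"a.pdf": b"A", "b.pdf": b"B"})
app.store(box)
app.delete_document("container-001", "a.pdf")

# The document is hidden from readers but still occupies space in the container.
print(box.document_count())
```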
IXTENT offers its customers the following archive storage products:
- EMC Centera (storage type: CAS)
More information: http://www.emc.cz/archivace/centera
- EMC Isilon (storage type: NAS)
More information: http://www.emc.cz/datova-uloziste/isilon
- EMC VNX (storage type: NAS)
More information: http://www.emc.cz/datova-uloziste/vnx-family
- NetApp SnapLock (storage type: NAS)
More information: http://www.netapp.com/us/
- IBM N-Series (storage type: NAS; this is NetApp technology sold under the IBM brand)
More information: http://www-03.ibm.com/systems/storage/network/