HSM is a data storage technique that moves data between different stores according to a defined policy.
The most common use of the HSM technique is moving older data from a faster but more expensive storage device to a slower but cheaper one, based on the following premises:
-
Fast storage costs more.
-
Slow storage costs less.
-
Old data will be accessed much less frequently than new data.
The advantages of the HSM technique are clear: it lowers the overall storage cost, since only a small part of your data needs to be on costly storage, and it improves the overall user experience.
Zimbra allows for two different types of stores:
-
Index Store: A store that contains information about your data that is used by Apache Lucene to provide indexing and search functions.
-
Data Store: A store that contains all your Zimbra data organized in a MySQL database.
You can have multiple stores of each type, but only one Index Store, one Primary Data Store and one Secondary Data Store can be set as Current (meaning that it is currently used by Zimbra).
The main feature of the HSM NG module is the ability to apply defined HSM policies.
The move can be triggered in three ways:
-
Click the Apply Policy button in the Administration Zimlet.
-
Start the doMoveBlobs operation through the CLI.
-
Enable Policy Application Scheduling in the Administration Zimlet and wait for it to start automatically.
Once the move is started, the following operations are performed:
-
HSM NG scans through the Primary Store to see which items comply with the defined policy.
-
All the Blobs of the items found in the first step are copied to the Secondary Store.
-
The database entries related to the copied items are updated to reflect the move.
-
If the second and the third steps are completed successfully (and only in this case), the old Blobs are deleted from the Primary Store.
The Move operation is stateful - each step is executed only if the previous step has been completed successfully - so the risk of data loss during a Move operation is nonexistent.
The doMoveBlobs operation is the heart of HSM NG.
It moves items between the Current Primary Store and the Current Secondary Store according to the proper HSM policy.
The move is performed by a transactional algorithm. Should an error occur during one of the steps of the operation, a rollback takes place and no change will be made to the data.
Once HSM NG identifies the items to be moved, the following steps are performed:
-
A copy of the Blob to the Current Secondary Store is created.
-
The Zimbra Database is updated to notify Zimbra of the item’s new position.
-
The original Blob is deleted from the Current Primary Store.
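The copy, update, delete sequence above can be sketched in a few lines of Python. This is only an illustration of the ordering and rollback idea, not HSM NG's actual implementation: the `move_blob` function and the `db` dictionary standing in for the Zimbra database are hypothetical names introduced for this example.

```python
import os
import shutil
import tempfile


def move_blob(primary_path, secondary_dir, db, item_id):
    """Copy a blob, record its new position, then delete the original.

    `db` is a plain dict standing in for the Zimbra database (hypothetical).
    If any step fails, the copy is removed and the old record is restored.
    """
    target = os.path.join(secondary_dir, os.path.basename(primary_path))
    previous = db.get(item_id)
    try:
        shutil.copy2(primary_path, target)  # step 1: copy to the secondary store
        db[item_id] = target                # step 2: record the new position
        os.remove(primary_path)             # step 3: delete the original blob
    except OSError:
        # Roll back: drop the partial copy and restore the previous record.
        if os.path.exists(target):
            os.remove(target)
        if previous is None:
            db.pop(item_id, None)
        else:
            db[item_id] = previous
        raise
    return target


# Demo on temporary directories standing in for the two stores.
root = tempfile.mkdtemp()
primary_dir = os.path.join(root, "primary")
secondary_dir = os.path.join(root, "secondary")
os.makedirs(primary_dir)
os.makedirs(secondary_dir)

blob = os.path.join(primary_dir, "357-104.msg")
with open(blob, "w") as handle:
    handle.write("example message body")

db = {357: blob}
new_path = move_blob(blob, secondary_dir, db, 357)
```

The original blob is removed only after the copy and the record update have both succeeded, which mirrors the ordering described above.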
Every item that complies with the specified HSM policy is moved.
Example:
The following policy
message,document:before:-20day message:before:-10day has:attachment
will move all emails and documents older than 20 days along with all emails older than 10 days that contain an attachment.
All conditions for a policy are executed in the exact order they are specified. HSM NG will loop on all items in the Current Primary Store and apply each separate condition before starting the next one.
This means that the following policies
message,document:before:-20day message:before:-10day has:attachment
message:before:-10day has:attachment message,document:before:-20day
applied daily on a sample server that sends/receives a total of 1000 emails per day, 100 of which contain one or more attachments, will have the same final result. However, the execution time of the second policy will probably be slightly higher (or much higher, depending on the number and size of the emails on the server).
This is because in the first policy, the first condition (message,document:before:-20day) will loop on all items and move many of them to the Current Secondary Store, leaving fewer items for the second condition to loop on.
Likewise, having the message:before:-10day has:attachment as the first condition will leave more items for the second condition to loop on.
This is just an example and does not apply to all cases, but gives an idea of the need to carefully plan your HSM policy.
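To make the effect of condition ordering concrete, here is a small simulation. It is a simplified model rather than HSM NG code: the item dictionaries, the `apply_policy` helper and the sample mailbox (30 days of mail, 10 emails per day, one of them with an attachment) are all assumptions made for this example, and "older than N days" is approximated as `age_days > N`.

```python
def apply_policy(items, conditions):
    """Apply conditions in order; return (moved_ids, total_items_scanned)."""
    remaining = list(items)
    moved = set()
    scanned = 0
    for matches in conditions:
        scanned += len(remaining)  # each condition loops on what is left
        still_here = []
        for item in remaining:
            if matches(item):
                moved.add(item["id"])
            else:
                still_here.append(item)
        remaining = still_here
    return moved, scanned


def older_than_20_days(item):
    # Stands in for: message,document:before:-20day
    return item["type"] in ("message", "document") and item["age_days"] > 20


def older_than_10_days_with_attachment(item):
    # Stands in for: message:before:-10day has:attachment
    return (item["type"] == "message"
            and item["age_days"] > 10
            and item["has_attachment"])


# Hypothetical mailbox: 30 days of mail, 10 emails a day, 1 with an attachment.
items = [
    {"id": day * 10 + n, "type": "message", "age_days": day, "has_attachment": n == 0}
    for day in range(30)
    for n in range(10)
]

moved_a, scanned_a = apply_policy(
    items, [older_than_20_days, older_than_10_days_with_attachment])
moved_b, scanned_b = apply_policy(
    items, [older_than_10_days_with_attachment, older_than_20_days])
```

Both orderings move the same set of items, but the first one scans fewer items overall because its first condition removes more items before the second condition runs.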
Applying a policy means running the doMoveBlobs operation in order to move items between the Primary and Secondary store according to the defined policy.
HSM NG gives you three different options:
-
Via the Administration Zimlet
-
Via the CLI
-
Through Scheduling
To apply the HSM Policy via the Administration Zimlet:
-
Log into the Zimbra Administration Console.
-
Click the HSM NG entry in the Administration Zimlet.
-
Click the Apply Policy button.
To apply the HSM Policy via the CLI, run the following command as the zimbra user:
zxsuite hsm doMoveBlobs
To schedule a daily execution of the doMoveBlobs operation:
-
Log into the Zimbra Administration Console.
-
Click the HSM NG entry in the Administration Zimlet.
-
Enable scheduling by selecting the "Enable HSM Session scheduling:" button.
-
Select the hour to run the operation under "HSM Session scheduled for:".
Both primary and secondary volumes can be created on either local storage or on supported third-party storage solutions.
A volume is a distinct entity (path) on a filesystem, with all its associated properties, that contains Zimbra Blobs.
All Zimbra volumes are defined by the following properties:
-
Name: A unique identifier for the volume.
-
Path: The path where the data is going to be saved.
Important: The zimbra user must have r/w permissions on this path.
-
Compression: Enable or Disable the file compression for the volume.
-
Compression Threshold: The minimum file size that will trigger the compression. Files under this size will never be compressed even if the compression is enabled.
-
Current: A Current volume is a volume where data will be written upon arrival (Primary Current) or HSM policy application (Secondary Current).
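The properties above can be modelled as a small record, which also makes the threshold rule explicit. This is a sketch for illustration only: the `Volume` class and its field names are hypothetical, and the 4096-byte default is taken from the compression_threshold_bytes default shown in the CLI syntax.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """Illustrative model of a volume's properties (not a Zimbra API)."""
    name: str
    path: str
    compression: bool = False
    compression_threshold: int = 4096  # bytes
    current: bool = False

    def should_compress(self, size_bytes: int) -> bool:
        """Compress only when enabled and the file reaches the threshold."""
        return self.compression and size_bytes >= self.compression_threshold


volume = Volume(name="hsm", path="/path/to/store", compression=True)
small_file_compressed = volume.should_compress(1024)    # under the threshold
large_file_compressed = volume.should_compress(65536)   # over the threshold
```

Files under the threshold are never compressed, even though compression is enabled on the volume.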
To create a new volume from the HSM NG tab of the Administration Zimlet:
-
Click the appropriate Add option in the Volumes Management section according to the type of volume you want to create.
-
Select the store type, choosing between local mount point or S3 Bucket.
-
Enter the new volume’s name.
-
Enter a path for the new volume.
-
Check the Enable Compression button if you wish to activate data compression on the new volume.
-
Select the Compression Threshold.
-
If you are using an S3 Bucket, it’s possible to store information for multiple buckets.
-
Press OK to create the new volume. Should the operation fail, a notification containing any related errors will be generated.
To edit a volume from the Administration Zimlet, simply select an existing volume and press the appropriate Edit button.
Important: Beginning with release 8.8.9, all volume creation and update commands have been updated, as the storeType argument is now required.
The storeType argument is mandatory. It is always in the first position and accepts any one value corresponding to the S3-Compatible Services listed previously.
The arguments that follow in the command now depend on the selected storeType.
Updated zxsuite syntax to create a new FileBlob Zimbra volume:
# Add volume, run as zimbra user
zxsuite hsm doCreateVolume FileBlob name secondary /path/to/store
# Delete volume
zxsuite hsm doDeleteVolume name
# Set current
zxsuite hsm doUpdateVolume FileBlob name current_volume true
zxsuite hsm doCreateVolume FileBlob
Syntax:
   zxsuite hsm doCreateVolume FileBlob {volume_name} {primary|secondary|index} {volume_path} [attr1 value1 [attr2 value2...]]

PARAMETER LIST

NAME                              TYPE               EXPECTED VALUES            DEFAULT
volume_name(M)                    String
volume_type(M)                    Multiple choice    primary|secondary|index
volume_path(M)                    Path
volume_compressed(O)              Boolean            true|false                 false
compression_threshold_bytes(O)    Long                                          4096

(M) == mandatory parameter, (O) == optional parameter

Usage example:

zxsuite hsm doCreateVolume FileBlob volumeName secondary /path/to/store volume_compressed true compression_threshold_bytes 4096
zxsuite hsm doUpdateVolume FileBlob
Syntax:
   zxsuite hsm doUpdateVolume FileBlob {current_volume_name} [attr1 value1 [attr2 value2...]]

PARAMETER LIST

NAME                        TYPE       EXPECTED VALUES            DEFAULT
current_volume_name(M)      String
volume_type(O)              String     primary|secondary|index
volume_name(O)              String
volume_path(O)              Path
current_volume(O)           Boolean    true|false                 false
volume_compressed(O)        String
compression_threshold(O)    String

(M) == mandatory parameter, (O) == optional parameter
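If you script volume creation, it can help to assemble the command from its parts so that storeType always lands in the first position. The helper below is a hypothetical sketch, not part of zxsuite: `create_volume_command` only builds the argument list following the syntax shown above and does not run anything.

```python
import shlex


def create_volume_command(store_type, volume_name, volume_type, *positional, **attrs):
    """Build the argument list for `zxsuite hsm doCreateVolume`.

    The order follows the documented syntax: storeType first, then the
    volume name and type, then any further positional values, then
    optional `attr value` pairs.
    """
    if volume_type not in ("primary", "secondary", "index"):
        raise ValueError(f"unexpected volume type: {volume_type}")
    args = ["zxsuite", "hsm", "doCreateVolume", store_type, volume_name, volume_type]
    args.extend(str(value) for value in positional)
    for key, value in attrs.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        args.extend([key, str(value)])
    return args


command = create_volume_command(
    "FileBlob", "volumeName", "secondary", "/path/to/store",
    volume_compressed=True, compression_threshold_bytes=4096,
)
command_line = shlex.join(command)
```

The resulting list can be passed to `subprocess.run` as the zimbra user, or printed with `shlex.join` for review before running it by hand.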
# Add volume, run as zimbra user
zxsuite hsm doCreateVolume S3 name secondary bucket_name bucket access_key accessKey secret secretString region EU_WEST_1
# Delete volume
zxsuite hsm doDeleteVolume name
# Set current
zxsuite hsm doUpdateVolume S3 name current_volume true
zxsuite hsm doCreateVolume S3
Syntax:
   zxsuite hsm doCreateVolume S3 {Name of the zimbra store} {primary|secondary} [attr1 value1 [attr2 value2...]]

PARAMETER LIST

NAME                              TYPE               EXPECTED VALUES
volume_name(M)                    String             Name of the zimbra store
volume_type(M)                    Multiple choice    primary|secondary
bucket_name(O)                    String             Amazon AWS bucket
access_key(O)                     String             Service username
secret(O)                         String             Service password
server_prefix(O)                  String             Prefix to the server id used in all objects keys
bucket_configuration_id(O)        String             UUID for already existing S3 service credentials (zxsuite config global get attribute s3BucketConfigurations)
region(O)                         String             Amazon AWS Region
url(O)                            String             S3 API compatible service url (ex: s3api.service.com)
prefix(O)                         String             Prefix added to blobs keys
use_infrequent_access(O)          Boolean            true|false
infrequent_access_threshold(O)    String

(M) == mandatory parameter, (O) == optional parameter

Usage example:

S3 AWS Bucket:
zxsuite hsm doCreateVolume S3 volumeName primary bucket_name bucket access_key accessKey secret secretKey prefix objectKeysPrefix region EU_WEST_1 use_infrequent_access TRUE infrequent_access_threshold 4096

S3 compatible object storage:
zxsuite hsm doCreateVolume S3 volumeName primary bucket_name bucket access_key accessKey secret secretKey url http://host/service

Using existing bucket configuration:
zxsuite hsm doCreateVolume S3 volumeName primary bucket_configuration_id 316813fb-d3ef-4775-b5c8-f7d236fc629c
zxsuite hsm doUpdateVolume S3
Syntax:
   zxsuite hsm doUpdateVolume S3 {current_volume_name} [attr1 value1 [attr2 value2...]]

PARAMETER LIST

NAME                              TYPE       EXPECTED VALUES                                                                                               DEFAULT
current_volume_name(M)            String
volume_name(O)                    String
volume_type(O)                    String     primary|secondary
server_prefix(O)                  String     Prefix to the server id used in all objects keys
bucket_configuration_id(O)        String     UUID for already existing service credentials (zxsuite config global get attribute s3BucketConfigurations)
use_infrequent_access(O)          Boolean    true|false
infrequent_access_threshold(O)    String
current_volume(O)                 Boolean    true|false                                                                                                    false

(M) == mandatory parameter, (O) == optional parameter
# Add volume, run as zimbra user
zxsuite hsm doCreateVolume ScalityS3 name secondary bucket_name mybucket access_key accessKey1 secret verySecretKey1 url http://{IP_ADDRESS}:{PORT}
# Delete volume
zxsuite hsm doDeleteVolume name
# Set current
zxsuite hsm doUpdateVolume ScalityS3 name current_volume true
zxsuite hsm doCreateVolume ScalityS3
Syntax:
   zxsuite hsm doCreateVolume ScalityS3 {volume_name} {primary|secondary} [attr1 value1 [attr2 value2...]]

PARAMETER LIST

NAME                          TYPE               EXPECTED VALUES
volume_name(M)                String
volume_type(M)                Multiple choice    primary|secondary
bucket_name(O)                String             Bucket name
url(O)                        String             S3 API compatible service url (ex: s3api.service.com)
access_key(O)                 String             Service username
secret(O)                     String             Service password
server_prefix(O)              String             Prefix to the server id used in all objects keys
bucket_configuration_id(O)    String             UUID for already existing service credentials (zxsuite config global get attribute s3BucketConfigurations)
prefix(O)                     String             Prefix added to blobs keys

(M) == mandatory parameter, (O) == optional parameter

Usage example:

zxsuite hsm doCreateVolume ScalityS3 volumeName primary bucket_name bucket url http://host/service access_key accessKey secret secretKey
zxsuite hsm doCreateVolume ScalityS3 volumeName primary bucket_configuration_id uuid
zxsuite hsm doUpdateVolume ScalityS3
Syntax:
   zxsuite hsm doUpdateVolume ScalityS3 {current_volume_name} [attr1 value1 [attr2 value2...]]

PARAMETER LIST

NAME                          TYPE       EXPECTED VALUES                                                                                                  DEFAULT
current_volume_name(M)        String
volume_name(O)                String
volume_type(O)                String     primary|secondary
server_prefix(O)              String     Prefix to the server id used in all objects keys
bucket_configuration_id(O)    String     UUID for already existing S3 service credentials (zxsuite config global get attribute s3BucketConfigurations)
current_volume(O)             Boolean    true|false                                                                                                       false

(M) == mandatory parameter, (O) == optional parameter
# Add volume, run as zimbra user
zxsuite hsm doCreateVolume EMC name secondary bucket_name bucket access_key ACCESSKEY secret SECRET url https://url.of.storage
# Delete volume
zxsuite hsm doDeleteVolume name
# Set current
zxsuite hsm doUpdateVolume EMC name current_volume true
zxsuite hsm doCreateVolume EMC
Syntax:
   zxsuite hsm doCreateVolume EMC {volume_name} {primary|secondary} [attr1 value1 [attr2 value2...]]

PARAMETER LIST

NAME                          TYPE               EXPECTED VALUES
volume_name(M)                String
volume_type(M)                Multiple choice    primary|secondary
bucket_name(O)                String             Bucket name
url(O)                        String             S3 API compatible service url (ex: s3api.service.com)
access_key(O)                 String             Service username
secret(O)                     String             Service password
server_prefix(O)              String             Prefix to the server id used in all objects keys
bucket_configuration_id(O)    String             UUID for already existing service credentials (zxsuite config global get attribute s3BucketConfigurations)
prefix(O)                     String             Prefix added to blobs keys

(M) == mandatory parameter, (O) == optional parameter

Usage example:

zxsuite hsm doCreateVolume EMC volumeName primary bucket_name bucket url http://host/service access_key accessKey secret secretKey
zxsuite hsm doCreateVolume EMC volumeName primary bucket_configuration_id uuid
zxsuite hsm doUpdateVolume EMC
Syntax:
   zxsuite hsm doUpdateVolume EMC {current_volume_name} [attr1 value1 [attr2 value2...]]

PARAMETER LIST

NAME                          TYPE       EXPECTED VALUES                                                                                               DEFAULT
current_volume_name(M)        String
volume_name(O)                String
volume_type(O)                String     primary|secondary
server_prefix(O)              String     Prefix to the server id used in all objects keys
bucket_configuration_id(O)    String     UUID for already existing service credentials (zxsuite config global get attribute s3BucketConfigurations)
current_volume(O)             Boolean    true|false                                                                                                    false

(M) == mandatory parameter, (O) == optional parameter
# Add volume, run as zimbra user
zxsuite hsm doCreateVolume OpenIO name secondary http://{IP_ADDRESS} ZeXtras OPENIO
# Delete volume
zxsuite hsm doDeleteVolume name
# Set current
zxsuite hsm doUpdateVolume OpenIO name current_volume true
zxsuite hsm doCreateVolume OpenIO
Syntax:
   zxsuite hsm doCreateVolume OpenIO {volume_name} {primary|secondary} {url} {account} {namespace} [attr1 value1 [attr2 value2...]]

PARAMETER LIST

NAME               TYPE               EXPECTED VALUES
volume_name(M)     String
volume_type(M)     Multiple choice    primary|secondary
url(M)             String
account(M)         String
namespace(M)       String
proxy_port(O)      Integer
account_port(O)    Integer

(M) == mandatory parameter, (O) == optional parameter

Usage example:

zxsuite hsm doCreateVolume OpenIO volumeName primary http://host/service accountName namespaceString proxy_port 6006 account_port 6009
zxsuite hsm doUpdateVolume OpenIO
Syntax:
   zxsuite hsm doUpdateVolume OpenIO {current_volume_name} [attr1 value1 [attr2 value2...]]

PARAMETER LIST

NAME                      TYPE       EXPECTED VALUES      DEFAULT
current_volume_name(M)    String
volume_name(O)            String
volume_type(O)            String     primary|secondary
url(O)                    String
account(O)                String
namespace(O)              String
proxy_port(O)             Integer
account_port(O)           Integer
current_volume(O)         Boolean    true|false           false

(M) == mandatory parameter, (O) == optional parameter
# Add volume, run as zimbra user
zxsuite hsm doCreateVolume Swift name secondary http://{IP_ADDRESS}:8080/auth/v1.0/ user:username password maxDeleteObjectsCount 100
# Delete volume
zxsuite hsm doDeleteVolume name
# Set current
zxsuite hsm doUpdateVolume Swift name current_volume true
zxsuite hsm doCreateVolume Swift
Syntax:
   zxsuite hsm doCreateVolume Swift {volume_name} {primary|secondary} {url} {username} {password} [attr1 value1 [attr2 value2...]]

PARAMETER LIST

NAME                        TYPE       EXPECTED VALUES                                      DEFAULT
volume_name(M)              String
volume_type(M)              String     primary|secondary
url(M)                      String
username(M)                 String
password(M)                 String
maxDeleteObjectsCount(O)    Integer    Number of objects in a single bulk delete request    500

(M) == mandatory parameter, (O) == optional parameter

Usage example:

zxsuite hsm doCreateVolume Swift volumeName primary http://host/service accountName password maxDeleteObjectsCount 100
zxsuite hsm doUpdateVolume Swift
Syntax:
   zxsuite hsm doUpdateVolume Swift {current_volume_name} [attr1 value1 [attr2 value2...]]

PARAMETER LIST

NAME                        TYPE       EXPECTED VALUES                                      DEFAULT
current_volume_name(M)      String
volume_name(O)              String
volume_type(O)              String     primary|secondary
url(O)                      String
username(O)                 String
password(O)                 String
maxDeleteObjectsCount(O)    Integer    Number of objects in a single bulk delete request    500
current_volume(O)           Boolean    true|false                                           false

(M) == mandatory parameter, (O) == optional parameter
# Add volume, run as zimbra user
zxsuite hsm doCreateVolume Cloudian name secondary bucket_name bucket access_key ACCESSKEY secret SECRET url https://url.of.storage
# Delete volume
zxsuite hsm doDeleteVolume name
# Set current
zxsuite hsm doUpdateVolume Cloudian name current_volume true
zxsuite hsm doCreateVolume Cloudian
Syntax:
   zxsuite hsm doCreateVolume Cloudian {volume_name} {primary|secondary} [attr1 value1 [attr2 value2...]]

PARAMETER LIST

NAME                          TYPE               EXPECTED VALUES
volume_name(M)                String
volume_type(M)                Multiple choice    primary|secondary
bucket_name(O)                String             Bucket name
url(O)                        String             S3 API compatible service url (ex: s3api.service.com)
access_key(O)                 String             Service username
secret(O)                     String             Service password
server_prefix(O)              String             Prefix to the server id used in all objects keys
bucket_configuration_id(O)    String             UUID for already existing service credentials (zxsuite config global get attribute s3BucketConfigurations)
prefix(O)                     String             Prefix added to blobs keys

(M) == mandatory parameter, (O) == optional parameter

Usage example:

zxsuite hsm doCreateVolume Cloudian volumeName primary bucket_name bucket url http://host/service access_key accessKey secret secretKey
zxsuite hsm doCreateVolume Cloudian volumeName primary bucket_configuration_id uuid
zxsuite hsm doUpdateVolume Cloudian
Syntax:
   zxsuite hsm doUpdateVolume Cloudian {current_volume_name} [attr1 value1 [attr2 value2...]]

PARAMETER LIST

NAME                          TYPE       EXPECTED VALUES                                                                                               DEFAULT
current_volume_name(M)        String
volume_name(O)                String
volume_type(O)                String     primary|secondary
server_prefix(O)              String     Prefix to the server id used in all objects keys
bucket_configuration_id(O)    String     UUID for already existing service credentials (zxsuite config global get attribute s3BucketConfigurations)
current_volume(O)             Boolean    true|false                                                                                                    false

(M) == mandatory parameter, (O) == optional parameter
zxsuite hsm doDeleteVolume
Syntax:
   zxsuite hsm doDeleteVolume {volume_name}

PARAMETER LIST

NAME              TYPE
volume_name(M)    String

(M) == mandatory parameter, (O) == optional parameter

Usage example:

zxsuite hsm doDeleteVolume hsm
Deletes volume with name hsm
zxsuite hsm doVolumeToVolumeMove
Syntax:
   zxsuite hsm doVolumeToVolumeMove {source_volume_name} {destination_volume_name}

PARAMETER LIST

NAME                          TYPE
source_volume_name(M)         String
destination_volume_name(M)    String

(M) == mandatory parameter, (O) == optional parameter

Usage example:

zxsuite hsm doVolumeToVolumeMove sourceVolume destVolume
Moves the whole contents of sourceVolume to destVolume
The Centralized Storage feature allows you to use an S3 bucket to host data coming from multiple servers at the same time, sharing the same directory structure, as opposed to "independent" volumes, which are self-contained and whose directory structure is strictly related to the server and volume itself.
This allows for better data management in large multistore environments and greatly improves mailbox move speed.
-
Create the centralized volume on any one of your servers using the zxsuite hsm doCreateVolume command.
  -
  All volume types except for FileBlob are compatible.
  -
  Make sure to add the centralized TRUE flag to set the volume as a Centralized Storage.
  -
  The full syntax for the command depends on the storage type.
-
Once the Centralized Volume has been created, use the zxsuite hsm doCreateVolume Centralized command on all other mailbox servers to copy the Centralized Volume’s configuration from the first server and add it to the volume list.
  -
  The full syntax for the command is zxsuite hsm doCreateVolume Centralized {server_name} {volume_name}
Storage Structure
Data is stored in a Centralized Volume in a flat layout: the main directory of the volume contains a single empty directory for each server connected to the volume and, at the very same level, a directory for each mailbox stored in it.
In the following example, servers 3aa2d376-1c59-4b5a-94f6-101602fa69c6 and 595a4409-6aa1-413f-9f45-3ef0f1e560f5 are both connected to the same Centralized volume, where 3 mailboxes are stored. As you can see, the effective server where the mailboxes are hosted is irrelevant to the storage.
_
|- 3aa2d376-1c59-4b5a-94f6-101602fa69c6/
|- 595a4409-6aa1-413f-9f45-3ef0f1e560f5/
|- ff46e039-28e3-4343-9d66-92adc60e60c9/
\
 |-- 357-104.msg
 |-- 368-115.msg
 |-- 369-116.msg
 |-- 373-120.msg
 |-- 374-121.msg
 |-- 375-122.msg
 |-- 376-123.msg
 |-- 383-130.msg
|- 4c022592-f67d-439c-9ff9-e3d48a8c801b/
\
 |-- 315-63.msg
 |-- 339-87.msg
 |-- 857-607.msg
 |-- 858-608.msg
 |-- 859-609.msg
 |-- 861-611.msg
 |-- 862-612.msg
 |-- 863-613.msg
 |-- 864-614.msg
 |-- 865-615.msg
 |-- 866-616.msg
 |-- 867-617.msg
 |-- 868-618.msg
|- dafd5569-4114-4268-9201-14f4a895a3d5/
\
 |-- 357-104.msg
 |-- 368-115.msg
 |-- 369-116.msg
 |-- 373-120.msg
 |-- 374-121.msg
 |-- 375-122.msg
 |-- 376-123.msg
 |-- 383-130.msg
 |-- 384-131.msg
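As a rough way to read such a listing, the sketch below groups object keys by their top-level directory. It is illustrative only: `group_by_top_level` is a hypothetical helper, and the key list is a shortened copy of the example above.

```python
def group_by_top_level(keys):
    """Map each top-level directory to the blob names stored under it."""
    layout = {}
    for key in keys:
        top, _, rest = key.partition("/")
        blobs = layout.setdefault(top, [])  # empty directories are kept too
        if rest:
            blobs.append(rest)
    return layout


keys = [
    "3aa2d376-1c59-4b5a-94f6-101602fa69c6/",
    "595a4409-6aa1-413f-9f45-3ef0f1e560f5/",
    "ff46e039-28e3-4343-9d66-92adc60e60c9/357-104.msg",
    "ff46e039-28e3-4343-9d66-92adc60e60c9/368-115.msg",
    "4c022592-f67d-439c-9ff9-e3d48a8c801b/315-63.msg",
]

layout = group_by_top_level(keys)
# Server directories hold no blobs; mailbox directories do.
server_dirs = [name for name, blobs in layout.items() if not blobs]
mailbox_dirs = [name for name, blobs in layout.items() if blobs]
```

Nothing in a mailbox's path refers to the server that hosts it, which is why the hosting server is irrelevant to the storage.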
An HSM policy is a set of rules that define what items will be moved from the Primary Store to the Secondary Store when the doMoveBlobs operation of HSM NG is triggered, either manually or by scheduling.
A policy can consist of a single rule that is valid for all item types (Simple policy) or multiple rules valid for one or more item types (Composite policy). Also, an additional sub-rule can be defined using Zimbra’s search syntax.
Here are some policy examples. To see how to create the policies in the HSM NG module, see below.
-
Move all items older than 30 days
-
Move emails older than 15 days and items of all other kinds older than 30 days
-
Move calendar items older than 15 days, briefcase items older than 20 days and all emails in the "Archive" folder
Policies can be defined both from the HSM NG tab of the Administration Zimlet and from the CLI. You can specify a Zimbra Search in both cases.
To define a policy from the Administration Zimlet:
-
Log into the Zimbra Administration Console.
-
Click HSM NG on the Administration Zimlet.
-
Click the Add button in the Storage Management Policy section.
-
Select the Item Types from the "Items to Move:" list.
-
Enter the Item Age in the "Move Items older than:" box.
-
OPTIONAL: Add a Zimbra Search in the Additional Options box.
-
You can add multiple lines to narrow down your policy. Every line will be evaluated and executed after the line before has been applied.
Two policy management commands are available in the CLI:
-
setHsmPolicy
-
+setHsmPolicy
zxsuite hsm setHsmPolicy {policy}
This command resets the current policy and creates a new one as specified by the policy parameter.
The policy parameter must be specified in the following syntax
itemType1[,itemType2,itemtype3,etc]:query
zxsuite hsm +setHsmPolicy {policy}
This command adds the query specified by the policy parameter to the current HSM Policy.
The policy parameter must be specified in the following syntax
itemType1[,itemType2,itemtype3,etc]:query
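The policy syntax above splits naturally at the first colon: everything before it is the comma-separated list of item types, everything after it is the query. The parser below is a minimal sketch under that reading. `parse_condition` is a hypothetical helper for checking a condition before passing it to the CLI, and it does not validate the query itself.

```python
def parse_condition(condition):
    """Split 'itemType1[,itemType2,...]:query' into (item_types, query)."""
    types_part, separator, query = condition.partition(":")
    if not separator or not types_part or not query:
        raise ValueError(f"expected 'itemTypes:query', got {condition!r}")
    item_types = [t.strip() for t in types_part.split(",") if t.strip()]
    return item_types, query


first = parse_condition("message,document:before:-20day")
second = parse_condition("message:before:-10day has:attachment")
```

Only the first colon separates the item types from the query, so colons inside the query (as in before:-20day) are left untouched.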
Primary and Secondary volumes created with HSM NG can be hosted on S3 buckets, effectively moving the largest part of your data to secure and durable cloud storage.
While any storage service compatible with the Amazon S3 API should work out of the box with HSM NG, listed here are the only officially supported platforms:
-
FileBlob (standard local volume)
-
Amazon S3
-
EMC
-
OpenIO
-
Swift
-
Scality S3
-
Cloudian
-
Custom S3 (any unsupported S3-compliant solution)
In order to create a remote Primary Store on a mailbox server, a local "Incoming" directory must exist on that server. The default directory is /opt/zimbra/incoming, but you can check or modify the current value using these commands:
zxsuite config server get $(zmhostname) attribute incomingPath
zxsuite config server set $(zmhostname) attribute incomingPath value /path/to/dir
Storing a volume on third-party remote storage solutions requires a local directory to be used for item caching, which must be readable and writable by the zimbra user.
The local directory must be created manually and its path must be entered in the HSM NG section of the Administration Zimlet in the Zimbra Administration Console.
If the Local Cache directory is not set, you won’t be able to create any secondary volume on an S3-compatible device or service.
Warning: Failing to correctly configure the cache directory will cause items to be unretrievable, meaning that users will get a No such BLOB error when trying to access any item stored on an S3 volume.
Local Volumes (i.e. FileBlob type) can be hosted on any mountpoint on the system regardless of the mountpoint’s destination and are defined by the following properties:
-
Name: A unique identifier for the volume.
-
Path: The path where the data is going to be saved. The zimbra user must have r/w permissions on this path.
-
Compression: Enable or Disable file compression for the volume.
-
Compression Threshold: the minimum file size that will trigger the compression.
Important: Files under this size will never be compressed even if compression is enabled.
A Current Volume is a volume where data will be written upon arrival (Primary Current) or HSM Policy Application (Secondary Current). Volumes not set as Current won’t be written upon except by specific manual operations such as the Volume-to-Volume move.
HSM NG doesn’t need any dedicated setting or configuration on the S3 side, so setting up a bucket for your volumes is easy. Although creating a dedicated user, bucket and access policy is not required, doing so is strongly suggested because it makes management much easier.
All you need to start storing your secondary volumes on S3 is:
-
An S3 bucket. You need to know the bucket’s name and region in order to use it.
-
A user’s Access Key and Secret.
-
A policy that grants the user full rights on your bucket.
A centralized Bucket Management UI is available in the Zimbra Administration Console. This facilitates saving bucket information to be reused when creating a new volume on an S3-compatible storage instead of entering the information each time.
To access the Bucket Management UI:
-
Access the Zimbra Administration Console
-
Select the "Configure" entry on the left menu
-
Select the "Global Settings" entry
-
Select the S3 Buckets entry
Any bucket added to the system will be available when creating a new volume of the following types: Amazon S3, Cloudian, EMC, Scality S3, Custom S3.
Files are stored in a bucket according to a well-defined path, which can be customized at will in order to make your bucket’s contents easier to understand even on multi-server environments with multiple secondary volumes:
/Bucket Name/Destination Path/[Volume Prefix-]serverID/
-
The Bucket Name and Destination Path are not tied to the volume itself, and there can be as many volumes under the same destination path as you wish.
-
The Volume Prefix, on the other hand, is specific to each volume and it’s a quick way to differentiate and recognize different volumes within the bucket.
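The path template above can be expressed as a small function, which makes the optional Volume Prefix explicit. This is a sketch for illustration: `volume_root` and its parameter names are hypothetical, and the bucket, path and prefix values are made up.

```python
def volume_root(bucket_name, destination_path, server_id, volume_prefix=None):
    """Compose /Bucket Name/Destination Path/[Volume Prefix-]serverID/."""
    leaf = f"{volume_prefix}-{server_id}" if volume_prefix else server_id
    parts = [bucket_name, destination_path.strip("/"), leaf]
    return "/" + "/".join(part for part in parts if part) + "/"


server_id = "595a4409-6aa1-413f-9f45-3ef0f1e560f5"
with_prefix = volume_root("mybucket", "zimbra", server_id, volume_prefix="hsm1")
without_prefix = volume_root("mybucket", "zimbra", server_id)
```

Two volumes sharing the same bucket and destination path stay distinguishable as long as each has its own prefix.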
To create a new volume with HSM NG from the Zimbra Administration Console:
-
Enter the HSM Section of the NG Administration Zimlet in the Zimbra Administration Console
-
Click on Add under either the Primary Volumes or Secondary Volumes list
-
Select the Volume Type among the available storage choices
-
Enter the required volume information
Important: Each volume type requires different information to be set up. Please refer to your storage provider’s online resources to obtain those details.
To edit a volume with HSM NG from the Zimbra Administration Console:
-
Enter the HSM Section of the NG Administration Zimlet in the Zimbra Administration Console
-
Select a volume
-
Click on Edit
-
When done, click Save
To delete a volume with HSM NG from the Zimbra Administration Console:
-
Enter the HSM Section of the NG Administration Zimlet in the Zimbra Administration Console
-
Select a volume
-
Click on Delete
Note: Only empty volumes can be deleted.
Storing your secondary Zimbra volumes on Amazon S3 doesn’t have any specific bucket requirements, but we suggest that you create a dedicated bucket and disable Static Website Hosting for easier management.
To obtain an Access Key and the related Secret, a Programmatic Access user is needed. We suggest that you create a dedicated user in Amazon’s IAM Service for easier management.
In Amazon’s IAM, you can set access policies for your users. It’s mandatory that the user of your Access Key and Secret has a set of appropriate rights both on the bucket itself and on its contents. For easier management, we recommend granting full rights as shown in the following example:
{
  "Version": "[LATEST API VERSION]",
  "Statement": [
    {
      "Sid": "[AUTOMATICALLY GENERATED]",
      "Effect": "Allow",
      "Action": [
        "s3:*"
      ],
      "Resource": [
        "[BUCKET ARN]/*",
        "[BUCKET ARN]"
      ]
    }
  ]
}
Warning: This is not a valid configuration policy. Don’t copy and paste it into your user’s settings as it won’t be validated.
If you only wish to grant minimal permissions, change the Action section to:
"Action": [
  "s3:PutObject",
  "s3:GetObject",
  "s3:DeleteObject",
  "s3:AbortMultipartUpload"
],
The bucket’s ARN is expressed according to Amazon’s standard naming format: arn:partition:service:region:account-id:resource. For more information about this topic, please see Amazon’s documentation.
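If you prefer to generate the policy document rather than edit it by hand, the structure described above can be built programmatically. The sketch below makes a few assumptions: `bucket_policy` is a hypothetical helper, the bucket name is made up, the ARN uses the standard `aws` partition, and "2012-10-17" is the current IAM policy language version. Review the output against Amazon's documentation before attaching it to a user.

```python
import json

# Minimal set of actions listed in this section.
MINIMAL_ACTIONS = [
    "s3:PutObject",
    "s3:GetObject",
    "s3:DeleteObject",
    "s3:AbortMultipartUpload",
]


def bucket_policy(bucket_name, full_rights=True):
    """Build a policy granting rights on a bucket and on its contents."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"  # assumes the "aws" partition
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:*"] if full_rights else list(MINIMAL_ACTIONS),
                "Resource": [f"{bucket_arn}/*", bucket_arn],
            }
        ],
    }


policy = bucket_policy("example-zimbra-hsm", full_rights=False)
policy_json = json.dumps(policy, indent=2)
```

Both the bucket ARN and the `/*` resource are listed so that the rights cover the bucket itself as well as its contents.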
Files are stored in a bucket according to a well-defined path, which can be customized at will to make your bucket’s contents easier to understand (even on multi-server environments with multiple secondary volumes):
/Bucket Name/Destination Path/[Volume Prefix-]serverID/
The Bucket Name and Destination Path are not tied to the volume itself, and there can be as many volumes under the same destination path as you wish.
The Volume Prefix, on the other hand, is specific to each volume and it’s a quick way to differentiate and recognize different volumes within the bucket.
HSM NG is compatible with the Amazon S3 Standard - Infrequent Access storage class and will set any file larger than the Infrequent Access Threshold value to this storage class as long as the option has been enabled on the volume.
For more information about Infrequent Access, please refer to the official Amazon S3 Documentation.
HSM NG is compatible with the Amazon S3 - Intelligent Tiering storage class and will set the appropriate Intelligent Tiering flag on all files, as long as the option has been enabled on the volume.
For more information about Intelligent Tiering, please refer to the official Amazon S3 Documentation.
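The two options can be summarized as a per-file decision. The function below is a sketch and not HSM NG's code: `storage_class` and its parameters are hypothetical, the return values are the storage class names used by the S3 API, and giving Intelligent Tiering precedence when both options are enabled is an assumption made only to keep the example simple.

```python
def storage_class(size_bytes, infrequent_access=False, threshold_bytes=0,
                  intelligent_tiering=False):
    """Pick an S3 storage class for a file of the given size."""
    if intelligent_tiering:
        # Assumption: when enabled, every file gets the flag.
        return "INTELLIGENT_TIERING"
    if infrequent_access and size_bytes > threshold_bytes:
        # Only files larger than the threshold use Infrequent Access.
        return "STANDARD_IA"
    return "STANDARD"


large = storage_class(65536, infrequent_access=True, threshold_bytes=4096)
small = storage_class(1024, infrequent_access=True, threshold_bytes=4096)
tiered = storage_class(1024, intelligent_tiering=True)
```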
Item deduplication is a technique that allows you to save disk space by storing a single copy of an item and referencing it multiple times instead of storing multiple copies of the same item and referencing each copy only once.
This might seem like a minor improvement. However, in practical use, it makes a significant difference.
Item deduplication is performed by Zimbra at the moment of storing a new item in the Current Primary Volume.
When a new item is being created, its message ID is compared to a list of cached items. If there is a match, a hard link to the cached message’s BLOB is created instead of a whole new BLOB for the message.
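The hard-link mechanism can be illustrated with a short, self-contained example. It is a sketch of the idea, not Zimbra's implementation: the `DedupeCache` class is hypothetical, the default cache size of 3000 mirrors the zimbraMessageIdDedupeCacheSize default, and the example assumes a filesystem that supports hard links.

```python
import os
import tempfile
from collections import OrderedDict


class DedupeCache:
    """Bounded Message-ID -> blob path cache (oldest entries evicted first)."""

    def __init__(self, max_entries=3000):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def store(self, message_id, content, blob_path):
        """Write a new blob, or hard-link to a cached blob with the same ID.

        Returns True when a new blob was written, False when a link was made.
        """
        cached = self._entries.get(message_id)
        if cached is not None and os.path.exists(cached):
            os.link(cached, blob_path)  # second name, same data on disk
            return False
        with open(blob_path, "wb") as handle:
            handle.write(content)
        self._entries[message_id] = blob_path
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the oldest entry
        return True


workdir = tempfile.mkdtemp()
cache = DedupeCache()
first = os.path.join(workdir, "357-104.msg")
second = os.path.join(workdir, "368-115.msg")
wrote_first = cache.store("<id-1@example.com>", b"message body", first)
wrote_second = cache.store("<id-1@example.com>", b"message body", second)
```

After the second call both paths point to the same data on disk, so the message body is stored only once. With `max_entries` set to 0 the cache never retains an entry, which matches the documented behavior of a size of 0 disabling deduplication.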
The dedupe cache is managed in Zimbra through the following config attributes:
zimbraPrefDedupeMessagesSentToSelf
Used to set the deduplication behavior for sent-to-self messages.

    <attr id="144" name="zimbraPrefDedupeMessagesSentToSelf" type="enum" value="dedupeNone,secondCopyifOnToOrCC,dedupeAll" cardinality="single" optionalIn="account,cos" flags="accountInherited,domainAdminModifiable">
      <defaultCOSValue>dedupeNone</defaultCOSValue>
      <desc>dedupeNone|secondCopyIfOnToOrCC|moveSentMessageToInbox|dedupeAll</desc>
    </attr>

zimbraMessageIdDedupeCacheSize
Number of cached Message IDs.

    <attr id="334" name="zimbraMessageIdDedupeCacheSize" type="integer" cardinality="single" optionalIn="globalConfig" min="0">
      <globalConfigValue>3000</globalConfigValue>
      <desc>
        Number of Message-Id header values to keep in the LMTP dedupe cache.
        Subsequent attempts to deliver a message with a matching Message-Id
        to the same mailbox will be ignored. A value of 0 disables deduping.
      </desc>
    </attr>

zimbraPrefMessageIdDedupingEnabled
Manage deduplication at account or COS-level.

    <attr id="1198" name="zimbraPrefMessageIdDedupingEnabled" type="boolean" cardinality="single" optionalIn="account,cos" flags="accountInherited" since="8.0.0">
      <defaultCOSValue>TRUE</defaultCOSValue>
      <desc>
        Account-level switch that enables message deduping.
        See zimbraMessageIdDedupeCacheSize for more details.
      </desc>
    </attr>

zimbraMessageIdDedupeCacheTimeout
Timeout for each entry in the dedupe cache.

    <attr id="1340" name="zimbraMessageIdDedupeCacheTimeout" type="duration" cardinality="single" optionalIn="globalConfig" since="7.1.4">
      <globalConfigValue>0</globalConfigValue>
      <desc>
        Timeout for a Message-Id entry in the LMTP dedupe cache. A value of 0
        indicates no timeout. zimbraMessageIdDedupeCacheSize limit is ignored
        when this is set to a non-zero value.
      </desc>
    </attr>

(older Zimbra versions might use different attributes or lack some of them)
HSM NG features a doDeduplicate operation that parses a target volume to find and deduplicate any duplicated item.
This saves even more disk space: while Zimbra’s automatic deduplication is bound to a limited cache, HSM NG’s deduplication finds and takes care of multiple copies of the same email regardless of any cache or timing.
Running the doDeduplicate operation is also highly recommended after a migration or a large data import in order to optimize your storage usage.
To run a volume deduplication via the Administration Zimlet, simply click on the HSM NG tab, select the volume you wish to deduplicate and press the Deduplicate button.
To run a volume deduplication through the CLI, use the doDeduplicate command:

    zimbra@mailserver:~$ zxsuite hsm doDeduplicate
    command doDeduplicate requires more parameters
    Syntax:
       zxsuite hsm doDeduplicate {volume_name} [attr1 value1 [attr2 value2...
    PARAMETER LIST
    NAME              TYPE          EXPECTED VALUES    DEFAULT
    volume_name(M)    String[,..]
    dry_run(O)        Boolean       true|false         false
    (M) == mandatory parameter, (O) == optional parameter
    Usage example:
    zxsuite hsm dodeduplicate secondvolume
    Starts a deduplication on volume secondvolume

To list all available volumes, you can use the `zxsuite hsm getAllVolumes` command.
The doDeduplicate operation is a valid target for the monitor command, meaning that you can watch the command’s statistics while it’s running through the zxsuite hsm monitor [operationID] command.
Sample Output

    Current Pass (Digest Prefix): 63/64
    Checked Mailboxes: 148/148
    Deduplicated/duplicated Blobs: 64868/137089
    Already Deduplicated Blobs: 71178
    Skipped Blobs: 0
    Invalid Digests: 0
    Total Space Saved: 21.88 GB

- Current Pass (Digest Prefix): The doDeduplicate command will analyze the BLOBs in groups based on the first character of their digest (name).
- Checked Mailboxes: The number of mailboxes analyzed for the current pass.
- Deduplicated/duplicated Blobs: Number of BLOBs deduplicated by the current operation / Number of total duplicated items on the volume.
- Already Deduplicated Blobs: Number of deduplicated BLOBs on the volume (duplicated BLOBs that have been deduplicated by a previous run).
- Skipped Blobs: BLOBs that have not been analyzed, usually because of a read error or missing file.
- Invalid Digests: BLOBs with a bad digest (name different from the actual digest of the file).
- Total Space Saved: Amount of disk space freed by the doDeduplicate operation.
Looking at the sample output above we can see that:
- The operation is running the second to last pass on the last mailbox.
- 137089 duplicated BLOBs have been found, 71178 of which have already been deduplicated previously.
- The current operation deduplicated 64868 BLOBs, for a total disk space saving of 21.88GB.
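The same reading of the sample output can be done programmatically. The sketch below assumes the monitor prints one `Label: value` pair per line, as in the sample; `parse_monitor` is a hypothetical helper, and the "remaining" figure is simply derived from the field descriptions given above.

```python
import re

SAMPLE = """\
Current Pass (Digest Prefix): 63/64
Checked Mailboxes: 148/148
Deduplicated/duplicated Blobs: 64868/137089
Already Deduplicated Blobs: 71178
Skipped Blobs: 0
Invalid Digests: 0
Total Space Saved: 21.88 GB
"""


def parse_monitor(text: str) -> dict:
    """Turn 'Label: value' lines into a dict, splitting 'a/b' pairs into int tuples."""
    stats = {}
    for line in text.splitlines():
        label, sep, value = line.rpartition(": ")
        if not sep:
            continue  # not a 'Label: value' line
        value = value.strip()
        if re.fullmatch(r"\d+/\d+", value):
            stats[label] = tuple(int(n) for n in value.split("/"))
        elif value.isdigit():
            stats[label] = int(value)
        else:
            stats[label] = value  # e.g. '21.88 GB' is kept as text
    return stats


stats = parse_monitor(SAMPLE)
done, duplicated = stats["Deduplicated/duplicated Blobs"]
# Duplicates not yet handled by this run or by a previous one.
remaining = duplicated - stats["Already Deduplicated Blobs"] - done
print(remaining)
```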
At first sight, HSM NG seems to be strictly dedicated to HSM. However, it also features some highly useful volume-related tools that are not directly related to HSM.
Due to the implicit risks in volume management, these tools are only available through the CLI.
The following volume operations are available:
doCheckBlobs: Perform BLOB coherency checks on one or more volumes.
doDeduplicate: Start Item Deduplication on a volume.
doVolumeToVolumeMove: Move all items from one volume to another.
getVolumeStats: Display information about a volume’s size and the number of items/BLOBs it contains.
Usage

    zimbra@mail:~$ zxsuite hsm doCheckBlobs
    command doCheckBlobs requires more parameters
    Syntax:
       zxsuite hsm doCheckBlobs {start} [attr1 value1 [attr2 value2...
    PARAMETER LIST
    NAME                           TYPE           EXPECTED VALUES    DEFAULT
    action(M)                      String         start
    volume_ids(O)                  Integer[,..]   1,3
    mailbox_ids(O)                 Integer[,..]   2,9,27
    missing_blobs_crosscheck(O)    Boolean        true|false         true
    traced(O)                      Boolean        true|false         false
    (M) == mandatory parameter, (O) == optional parameter
    Usage example:
    Usage examples:
    zxsuite hsm doCheckBlobs start: Perform a BLOB coherency check on all message volumes.
    zxsuite hsm doCheckBlobs start volume_ids 1,3: Perform a BLOB coherency check on volumes 1 and 3.
    zxsuite hsm doCheckBlobs start mailbox_ids 2,9,27: Perform a BLOB coherency check on mailboxes 2,9 and 27.
    zxsuite hsm doCheckBlobs start missing_blobs_crosscheck false: Perform a BLOB coherency check without checking on other volumes.
    zxsuite hsm doCheckBlobs start traced true: Perform a BLOB coherency check, logging even the correct checked items.

Description and Tips
The doCheckBlobs operation runs BLOB coherency checks on volumes and mailboxes. This can be useful when experiencing issues related to broken or unviewable items, which are often caused either by Zimbra being unable to find or access the BLOB file related to an item, or by an issue with the BLOB content itself.
Specifically, the following checks are made:
- DB-to-BLOB coherency: For every Item entry in Zimbra’s DB, check whether the appropriate BLOB file exists.
- BLOB-to-DB coherency: For every BLOB file in a volume/mailbox, check whether the appropriate DB data exists.
- Filename coherency: Checks the coherency of each BLOB’s filename with its content (as BLOBs are named after their file’s SHA hash).
- Size coherency: For every BLOB file in a volume/mailbox, checks whether the BLOB file’s size is coherent with the expected size (stored in the DB).
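The four checks above can be sketched on a toy model. This is not how doCheckBlobs is implemented: the `check_blobs` function is hypothetical, the "DB" is a plain dict mapping BLOB filename to expected size, and naming files after a hex SHA-256 digest is an assumption made for the example (the text only says "SHA hash").

```python
import hashlib
import os
import tempfile


def check_blobs(db: dict, volume_dir: str) -> list:
    """db maps BLOB filename -> expected size in bytes; returns (check, name) issues."""
    issues = []
    on_disk = set(os.listdir(volume_dir))
    for name, expected_size in db.items():
        if name not in on_disk:
            issues.append(("db-to-blob", name))  # DB entry without a file
            continue
        with open(os.path.join(volume_dir, name), "rb") as fh:
            data = fh.read()
        if hashlib.sha256(data).hexdigest() != name:
            issues.append(("filename", name))  # name does not match content
        if len(data) != expected_size:
            issues.append(("size", name))  # size differs from the DB value
    for name in sorted(on_disk - set(db)):
        issues.append(("blob-to-db", name))  # file without a DB entry
    return issues


with tempfile.TemporaryDirectory() as vol:
    good = b"intact message"
    good_name = hashlib.sha256(good).hexdigest()
    with open(os.path.join(vol, good_name), "wb") as fh:
        fh.write(good)
    with open(os.path.join(vol, "orphan"), "wb") as fh:
        fh.write(b"no database row")
    db = {good_name: len(good), "missing": 10}
    issues = check_blobs(db, vol)
    print(issues)
```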
Important: The old zmblobchk command is deprecated and replaced by zxsuite hsm doCheckBlobs on all infrastructures using the HSM NG module.
Usage

    zimbra@mail:~$ zxsuite hsm doDeduplicate
    command doDeduplicate requires more parameters
    Syntax:
       zxsuite hsm doDeduplicate {volume_name} [attr1 value1 [attr2 value2...
    PARAMETER LIST
    NAME              TYPE          EXPECTED VALUES    DEFAULT
    volume_name(M)    String[,..]
    dry_run(O)        Boolean       true|false         false
    (M) == mandatory parameter, (O) == optional parameter
    Usage example:
    zxsuite hsm dodeduplicate secondvolume
    Starts a deduplication on volume secondvolume

Usage

    zimbra@mail:~$ zxsuite hsm doVolumeToVolumeMove
    command doVolumeToVolumeMove requires more parameters
    Syntax:
       zxsuite hsm doVolumeToVolumeMove {source_volume_name} {destination_volume_name}
    PARAMETER LIST
    NAME                          TYPE
    source_volume_name(M)         String
    destination_volume_name(M)    String
    (M) == mandatory parameter, (O) == optional parameter
    Usage example:
    zxsuite hsm doVolumeToVolumeMove sourceVolume destVolume
    Moves the whole sourceVolume to destVolume

Description and Tips
This command can prove highly useful in all situations where you need to stop using a volume, such as:
- Decommissioning old hardware: If you want to get rid of an old disk in a physical server, create new volumes on other/newer disks and move your data there.
- Fixing little mistakes: If you accidentally create a new volume in the wrong place, move the data to another volume.
- Centralize volumes: Centralize and move volumes as you please, for example, if you redesigned your storage infrastructure or you are tidying up your Zimbra volumes.
Usage

    zimbra@mail:~$ zxsuite hsm getVolumeStats
    command getVolumeStats requires more parameters
    Syntax:
       zxsuite hsm getVolumeStats {volume_id} [attr1 value1 [attr2 value2...
    PARAMETER LIST
    NAME                   TYPE       EXPECTED VALUES    DEFAULT
    volume_id(M)           Integer
    show_volume_size(O)    Boolean    true|false         false
    show_blob_num(O)       Boolean    true|false         false
    (M) == mandatory parameter, (O) == optional parameter
    Usage example:
    **BE CAREFUL** show_volume_size and show_blob_num options are IO intensive and thus disabled by default
    zxsuite hsm getVolumeStats 2
    Shows stats for the volume with ID equal to 2

Description and Tips
This command provides the following information about a volume:
name | description |
---|---|
id | The ID of the volume |
name | The Name of the volume |
path | The Path of the volume |
compressed | Compression enabled/disabled |
threshold | Compression threshold (in bytes) |
lastMoveOutcome | Exit status of the latest doMoveBlobs operation |
lastMoveTimestamp | End timestamp of the latest doMoveBlobs operation |
lastMoveDuration | Duration of the last doMoveBlobs operation |
lastItemMovedCount | Number of items moved to the current secondary volume during the latest doMoveBlobs operation |
bytesSaved | Total amount of disk space freed up thanks to deduplication and compression |
bytesSavedLast | Amount of disk space freed up thanks to deduplication and compression during the latest doMoveBlobs operation |
The show_volume_size and show_blob_num options will add the following data to the output:
option | name | description |
---|---|---|
show_volume_size | totSize | Total disk space used up by the volume |
show_blob_num | blobNumber | Number of BLOB files in the volume |
The doMailboxMove command allows you to move a single mailbox, or all accounts from a given domain, from one mailbox server to another within the same Zimbra infrastructure.
Warning: If the HSM NG module is installed and enabled, this command replaces the old zmmboxmove and zmmailboxmove commands. Using any of the legacy commands will return an error and won’t move any data.
Syntax

    Syntax:
       zxsuite hsm doMailboxMove {an account name: john@example.com or a domain name: example.com} {destinationHost} [attr1 value1 [attr2 value2...]]
    PARAMETER LIST
    NAME                       TYPE             EXPECTED VALUES                                        DEFAULT
    destinationHost(M)         String
    accounts(O)                String[,..]      john@example.com,smith@example.com[,...]
    domains(O)                 String[,..]      example.com,test.com[,...]
    input_file(O)              String
    stages(O)                  String[,..]      blobs|backup|data|account data=blobs+backup[,...]      blobs,backup,account
    compress(O)                Boolean          true|false                                             true
    checkDigest(O)             Boolean          if false skip digest calculation and check             true
    overwrite(O)               Boolean          true|false                                             false
    threads(O)                 Integer                                                                 1
    hsm(O)                     Boolean          true|false                                             false
    notifications(O)           Email Address
    ignore_partial(O)          Boolean          true|false                                             false
    drop_network_backup(O)     Boolean          true|false                                             false
    read_error_threshold(O)    Integer
    (M) == mandatory parameter, (O) == optional parameter
    Usage example:
    zxsuite hsm doMailboxMove john@example.com mail2.example.com
    Move mailbox for account john@example.com to mail2.example.com host

name | description |
---|---|
destinationHost(M) | The host where the mailbox must be moved to. |
accounts(O) | Comma separated list of mailbox(es) to move. Can be combined with the "domains" option. |
domains(O) | Comma separated list of domain(s) to move. Can be combined with the "accounts" option. |
input_file(O) | File containing the list of mailboxes to move, one per line. |
stages(O) | The stages of the move to perform among blobs, backup, data, account. The "data" stage will move both blobs and backup, while the "account" stage will effectively move the mailbox information. |
compress(O) | Whether to compress the moved blobs on the destination host or not. |
checkDigest(O) | Whether to check item digests during the move or not. Safer but slower. |
overwrite(O) | Whether to overwrite previously moved items for the same mailbox. |
threads(O) | Number of threads to use for the move. More threads mean faster moves but a higher impact on the system’s performance. |
hsm(O) | Whether to apply the HSM policies on the destination host when moving the blobs. |
notifications(O) | Comma separated list of email addresses to notify about the outcome of the operation. |
ignore_partial(O) | Ignore previous move attempts. |
drop_network_backup(O) | Delete Legacy Backup data during the move. |
read_error_threshold(O) | Maximum amount of read I/O errors to allow before stopping the operation. |
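When moves are scripted, the command line can be assembled from the parameters listed above. The following is a minimal sketch: `build_mailbox_move` is a hypothetical helper, it only builds the string (it does not run zxsuite), and it assumes that list values are comma separated and booleans are written as true/false, as the parameter list suggests.

```python
import shlex


def build_mailbox_move(target: str, destination_host: str, **options) -> str:
    """Assemble a doMailboxMove command line from the documented parameters."""
    allowed = {
        "accounts", "domains", "input_file", "stages", "compress",
        "checkDigest", "overwrite", "threads", "hsm", "notifications",
        "ignore_partial", "drop_network_backup", "read_error_threshold",
    }
    argv = ["zxsuite", "hsm", "doMailboxMove", target, destination_host]
    for name, value in options.items():
        if name not in allowed:
            raise ValueError(f"unknown option: {name}")
        if isinstance(value, bool):
            value = "true" if value else "false"
        elif isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        argv += [name, str(value)]
    # Quote every argument so the result is safe to paste into a shell.
    return " ".join(shlex.quote(a) for a in argv)


command = build_mailbox_move("john@example.com", "mail2.example.com",
                             stages=["blobs", "backup"], threads=2, hsm=True)
print(command)
```

Rejecting unknown option names early avoids sending a mistyped attribute to a long-running operation.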
doMailboxMove Details
- When moving a domain, each account from the current server is enumerated and moved sequentially.
- The mailbox is set to maintenance mode only during the 'account' stage.
- The move will be stopped if 5% or more write errors are encountered on items being moved.
  - When multiple mailboxes are moved within the same operation, the error count is global and not per-mailbox.
- Moves will not start if the destination server does not have enough space available to host the mailbox.
  - When a single operation is used to move multiple mailboxes, the space check will be performed before moving each mailbox.
- All data is moved at a low level and will not be changed except for the mailbox ID.
- The operation is made up of 3 stages: blobs|backup|account. For each mailbox:
  - blobs: All blobs are copied from the source server to the destination server.
  - backup: All backup entries are copied from the source server to the destination server.
  - account: All database entries are moved as-is and LDAP entries are updated, effectively moving the mailbox.
- All of the stages are executed sequentially.
- On the reindex stage’s completion, a new HSM operation is submitted to the destination server, if not specified otherwise.
- All volumes' compression options are taken.
- The MailboxMove operation can be executed if and only if no other operations are running on the source server.
- A move will not start if the destination server does not have enough space available or the user already belongs to the destination host.
- By default, items are placed in the Current Primary volume of the destination server.
  - The hsm true option can be used to apply the HSM policies of the destination server after a mailbox is successfully moved.
- If, for any reason, the move stops before it is completed, the original account will still be active and the appropriate notification will be issued.
- Should mailboxd crash during the move, the "Operation Interrupted" notification is issued as for all operations, warning the users about the interrupted operation.
- Index information is moved during the 'account' stage, so no manual reindexing is needed, nor will one be triggered automatically.
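The stage handling described above, including the data alias for blobs plus backup, can be modeled in a few lines. This is an illustrative sketch only: `expand_stages` is a hypothetical helper, and returning the stages in blobs, backup, account order regardless of how they were requested is an assumption based on the sequential execution described in the list.

```python
STAGE_ORDER = ["blobs", "backup", "account"]


def expand_stages(requested):
    """Resolve the 'data' alias and return stages in execution order."""
    wanted = set()
    for stage in requested:
        if stage == "data":
            wanted.update(["blobs", "backup"])  # data = blobs + backup
        elif stage in STAGE_ORDER:
            wanted.add(stage)
        else:
            raise ValueError(f"unknown stage: {stage}")
    return [s for s in STAGE_ORDER if s in wanted]


print(expand_stages(["account", "data"]))
```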
A new indexing engine has been added to HSM NG to index attachment contents.
The indexing engine works together with Zimbra’s default engine. The main Zimbra indexing process analyzes the contents of an item, splitting it into several parts based on the MIME parts of the object. Next, Zimbra handles the indexing of known contents - plaintext - and passes the datastream on to the HSM NG handlers for all other content.
The indexing engine includes an indexing cache that speeds up the indexing of any content that has already been analyzed. By default, datastreams over 10 KB are cached and the cache holds 10000 entries; smaller datastreams are not cached, as the cache only benefits large datastreams.
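A size-gated cache of this kind can be sketched as follows. The class is hypothetical and not the engine's real code: keying entries by a SHA-256 digest of the content, evicting the least recently used entry, and reading the threshold as 10 * 1024 bytes are all assumptions made for the example.

```python
import hashlib
from collections import OrderedDict


class IndexingCache:
    """Cache extracted text for large datastreams only."""

    def __init__(self, min_size=10 * 1024, max_entries=10000):
        self.min_size = min_size
        self.max_entries = max_entries
        self.entries = OrderedDict()  # content digest -> extracted text

    def get_text(self, data: bytes, parse) -> str:
        if len(data) <= self.min_size:
            return parse(data)  # small stream: parse every time, never cache
        key = hashlib.sha256(data).hexdigest()
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as recently used
            return self.entries[key]
        text = parse(data)
        self.entries[key] = text
        if len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # evict the least recently used
        return text


calls = []


def fake_parse(data: bytes) -> str:
    calls.append(len(data))  # record every real parsing run
    return data.decode("ascii")


cache = IndexingCache()
big = b"x" * 20000
cache.get_text(big, fake_parse)
cache.get_text(big, fake_parse)       # served from the cache
cache.get_text(b"tiny", fake_parse)
cache.get_text(b"tiny", fake_parse)   # parsed again, too small to cache
print(len(calls))
```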
| Extension | Parser | Content-type |
|---|---|---|
| | | application/x-asp |
| | | application/xhtml+xml |
| | | application/xhtml+xml, text/html |
| | | application/xhtml+xml |
| | | application/xhtml+xml |
| Extension | Parser | Content-type |
|---|---|---|
| | | application/rtf |
| | | application/pdf |
| | | application/x-mspublisher |
| | | application/vnd.ms-excel |
| | | application/vnd.ms-excel |
| | | application/vnd.ms-excel |
| | | application/vnd.ms-powerpoint |
| | | application/vnd.ms-powerpoint |
| | | application/vnd.ms-project |
| | | application/msword |
| | | application/msword |
| | | application/vnd.ms-outlook |
| | | application/vnd.visio |
| | | application/vnd.visio |
| | | application/vnd.visio |
| | | application/vnd.visio |
| | | application/vnd.ms-excel.sheet.macroenabled.12 |
| | | application/vnd.ms-powerpoint.presentation.macroenabled.12 |
| | | application/vnd.openxmlformats-officedocument.spreadsheetml.template |
| | | application/vnd.openxmlformats-officedocument.wordprocessingml.document |
| | | application/vnd.openxmlformats-officedocument.presentationml.template |
| | | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet |
| | | application/vnd.openxmlformats-officedocument.presentationml.presentation |
| | | application/vnd.ms-excel.addin.macroenabled.12 |
| | | application/vnd.ms-word.document.macroenabled.12 |
| | | application/vnd.ms-excel.template.macroenabled.12 |
| | | application/vnd.openxmlformats-officedocument.wordprocessingml.template |
| | | application/vnd.ms-powerpoint.slideshow.macroenabled.12 |
| | | application/vnd.ms-powerpoint.addin.macroenabled.12 |
| | | application/vnd.ms-word.template.macroenabled.12 |
| | | application/vnd.openxmlformats-officedocument.presentationml.slideshow |
| | | application/vnd.oasis.opendocument.text |
| | | application/vnd.oasis.opendocument.spreadsheet |
| | | application/vnd.oasis.opendocument.presentation |
| | | application/vnd.oasis.opendocument.graphics |
| | | application/vnd.oasis.opendocument.chart |
| | | application/vnd.oasis.opendocument.formula |
| | | application/vnd.oasis.opendocument.image |
| | | application/vnd.oasis.opendocument.text-master |
| | | application/vnd.oasis.opendocument.text-template |
| | | application/vnd.oasis.opendocument.spreadsheet-template |
| | | application/vnd.oasis.opendocument.presentation-template |
| | | application/vnd.oasis.opendocument.graphics-template |
| | | application/vnd.oasis.opendocument.chart-template |
| | | application/vnd.oasis.opendocument.formula-template |
| | | application/vnd.oasis.opendocument.image-template |
| | | application/vnd.oasis.opendocument.text-web |
| | | application/vnd.sun.xml.writer |
| Extension | Parser | Content-type |
|---|---|---|
| | | application/x-compress |
| | | application/x-bzip |
| | | application/x-bzip2 |
| | | application/x-bzip2 |
| | | application/gzip |
| | | application/x-gzip |
| | | application/x-gzip |
| | | application/x-xz |
| | | application/x-tar |
| | | application/java-archive |
| | | application/x-7z-compressed |
| | | application/x-cpio |
| | | application/zip |
| | | application/x-rar-compressed |
| | | text/plain |
Parsers can be turned on or off by changing the related value to true or false via the zxsuite config CLI command.
Attribute | Parsers |
---|---|
pdfParsingEnabled | PDFParser |
odfParsingEnabled | OpenDocumentParser |
archivesParsingEnabled | CompressorParser, PackageParser, RarParser |
microsoftParsingEnabled | OfficeParser, OOXMLParser, OldExcelParser |
rtfParsingEnabled | RTFParser |
For example, to disable PDF parsing, run:

    zxsuite config server set server.domain.com attribute pdfParsingEnabled value false

By default, all parsers are active.
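Because one attribute can control several parsers, it may help to look up the attribute for a given parser before changing it. The sketch below copies the mapping from the table above; `toggle_command` is a hypothetical helper that only builds the command string in the form shown in the example, and does not run it.

```python
PARSER_ATTRIBUTES = {
    "pdfParsingEnabled": ["PDFParser"],
    "odfParsingEnabled": ["OpenDocumentParser"],
    "archivesParsingEnabled": ["CompressorParser", "PackageParser", "RarParser"],
    "microsoftParsingEnabled": ["OfficeParser", "OOXMLParser", "OldExcelParser"],
    "rtfParsingEnabled": ["RTFParser"],
}


def toggle_command(server: str, parser: str, enabled: bool) -> str:
    """Build the zxsuite config command for the attribute that controls a parser."""
    for attribute, parsers in PARSER_ATTRIBUTES.items():
        if parser in parsers:
            value = "true" if enabled else "false"
            return f"zxsuite config server set {server} attribute {attribute} value {value}"
    raise KeyError(f"no attribute controls parser {parser!r}")


command = toggle_command("server.domain.com", "RarParser", False)
print(command)
```

Note that disabling archivesParsingEnabled for RarParser also turns off CompressorParser and PackageParser, since all three share the same attribute.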