Overview
Most of the replication functionality remains unchanged, and continuous, namespace, and trigger replication functionality is still offered. The replication interface for data will continue to use the bfstv2 API defined for the current blockpool.
Since data is not stored in its native form, and instead is deduplicated immediately, replication will not need the on-demand deduplication function, and namespace replications will no longer have the possibility of being in the “Waiting” state while waiting for deduplication to occur.
What is different between DXi 1.x and DXi 2.x?
- With DXi 2.x software, the blockpool wrapper component (BPW API) drives continuous replication and tag/metadata collection for namespace (NS) and trigger-based replication.
- With DXi 2.x software, NAS trigger replication is automatically initiated from the BPW API component.
- With DXi 2.x software, the BPW API holds and drops replicated tag references.
- DXi 2.x software provides a means to identify its software revision and uses that information to determine how to handle incoming namespace and trigger replications as well as incoming failbacks. The same information is used during failback to another system to determine whether to use the Galaxy 2.0 or pre-Galaxy 2.0 replication data format. This versioning scheme covers the replicationd, re_message, and metatar components. The version information should also be persisted in the namespace and failback metadata.
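The format decision described in the last item can be pictured with a minimal sketch. It assumes the revision is available as a dotted numeric string and that a major version of 2 or higher corresponds to the Galaxy 2.0 format; the names parse_revision, failback_format, and ReplicationFormat are hypothetical and are not part of the DXi software. The sketch does not model a 2.x system running in 1.x mode.

```python
from enum import Enum
from typing import Tuple


class ReplicationFormat(Enum):
    """The two replication data formats named in the text."""
    PRE_GALAXY_2_0 = "pre-Galaxy 2.0"
    GALAXY_2_0 = "Galaxy 2.0"


def parse_revision(revision: str) -> Tuple[int, ...]:
    """Turn a dotted revision string such as '2.0.1' into (2, 0, 1).

    Assumption: revisions are dotted integers; the real format may differ.
    """
    return tuple(int(part) for part in revision.split("."))


def failback_format(target_revision: str) -> ReplicationFormat:
    """Pick the failback data format from the target system's revision.

    Assumption: major version >= 2 means the target understands Galaxy 2.0.
    """
    major = parse_revision(target_revision)[0]
    if major >= 2:
        return ReplicationFormat.GALAXY_2_0
    return ReplicationFormat.PRE_GALAXY_2_0
```

Persisting the revision string alongside the namespace and failback metadata would allow the same check to be repeated whenever that metadata is read back.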
Backwards Compatibility
Tag Size
- Variable size tags
- Max size is 1 GB in 2.0; 256 MB in 1.x.
Caution: The current setting is 256 MB and should not be changed to 1 GB because this could cause the system to run out of memory.
There is no significant difference in ingest or dedupe performance between 1.x and 2.0 tags.
What is “Normalization”?
- Making sure that the tag size and file metadata are compatible with a 1.x system
- Normalizing a large amount of data generated on a 2.0 system in 2.0 format (variable tag size) can take a long time
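As a rough illustration of the tag-size side of normalization, the sketch below flags tags that exceed the 1.x limit. It assumes that "MB" means 2**20 bytes and that tag sizes are available as plain byte counts; the function names are hypothetical, and the file metadata part of normalization is not covered.

```python
from typing import Iterable

MIB = 1024 * 1024               # assumption: "MB" in the text means 2**20 bytes
MAX_TAG_SIZE_1X = 256 * MIB     # 1.x limit stated above
MAX_TAG_SIZE_2X = 1024 * MIB    # 2.0 limit stated above (1 GB)


def needs_normalization(tag_size_bytes: int) -> bool:
    """Return True if a tag is too large for a 1.x system to accept."""
    if tag_size_bytes < 0:
        raise ValueError("tag size cannot be negative")
    return tag_size_bytes > MAX_TAG_SIZE_1X


def count_oversized(tag_sizes: Iterable[int]) -> int:
    """Count how many tags in a set would have to be normalized."""
    return sum(1 for size in tag_sizes if needs_normalization(size))
```

A count of this kind gives a first impression of how much work normalization would involve before replicating 2.0 data to a 1.x system.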
Replication Compatibility
- Yes: DXi 1.x to DXi 1.x
- Yes: DXi 2.x to DXi 2.x
- Yes: DXi 2.0 configured in 1.x mode to 1.x: All replication (When replicating from a 2.0 system to a 1.x DXi system, make sure that all DXi systems are set to 1.x mode.)
- Yes: Galaxy 1.x to 2.0: All replication
Failback Compatibility
- Yes: DXi 1.x to DXi 1.x
- Yes: DXi 2.x to DXi 2.x
- Yes: DXi 2.x in 1.x mode to 1.x
- No: DXi 2.x in 2.x mode to 1.x (this would require recovery on the 2.x system, followed by a lengthy normalized replication to 1.x)
Failback Restrictions
- Failback of 2.0 replicated data to a 1.x system is disallowed because the tags are not “normalized” as needed for 1.x, where tags have a standard size of 256 MB. That is, the 1.x system cannot handle variable-length blobs that are larger than 256 MB.
- Failback of 1.x replicated data to a 2.0 system is allowed. The old 1.x metatar data, which is used in replication, can be properly interpreted by 2.0.
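The compatibility lists above can be captured as lookup tables, which is convenient when a script has to decide whether an operation is worth attempting. This is a sketch only: the labels and the is_supported helper are hypothetical, and any combination that the lists do not mention is reported as None (unknown) rather than guessed.

```python
from typing import Dict, Optional, Tuple

V1 = "1.x"
V2 = "2.x"
V2_IN_1X_MODE = "2.x (1.x mode)"

# Keys are (source system, target system).
REPLICATION: Dict[Tuple[str, str], bool] = {
    (V1, V1): True,
    (V2, V2): True,
    (V2_IN_1X_MODE, V1): True,
    (V1, V2): True,
}

# Keys are (system the replicated data came from, system receiving the failback).
FAILBACK: Dict[Tuple[str, str], bool] = {
    (V1, V1): True,
    (V2, V2): True,
    (V2_IN_1X_MODE, V1): True,
    (V2, V1): False,   # tags are not normalized for 1.x
    (V1, V2): True,    # 1.x metatar data can be interpreted by 2.0
}


def is_supported(table: Dict[Tuple[str, str], bool],
                 source: str, target: str) -> Optional[bool]:
    """Return True/False for listed combinations, None if not listed."""
    return table.get((source, target))
```

For example, is_supported(FAILBACK, V2, V1) returns False, which matches the restriction on failing back 2.0 data to a 1.x system.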
How Replication Works Internally
The key difference between DXi 1.x and DXi 2.x software is the integration of BPW.
Starting and Stopping Replication
To start or stop the replicationd, bpgc, or spaced daemon:
- Move the /var/Dxi/processwatcher file
- /etc/rc.d/init.d/<processname> start | stop
The binaries are in /hurricane. Replication also uses the re_message CGI program.
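When the start and stop step is scripted, it helps to validate the daemon name and action before anything is executed. The sketch below only builds the init-script command; it assumes the three daemons named above are the only valid targets, and the helper name init_command is hypothetical. Moving the processwatcher file is left as a manual step because the text does not say where the file should be moved.

```python
from typing import List

INIT_DIR = "/etc/rc.d/init.d"
DAEMONS = ("replicationd", "bpgc", "spaced")
ACTIONS = ("start", "stop")


def init_command(daemon: str, action: str) -> List[str]:
    """Build the init-script command for one daemon without running it."""
    if daemon not in DAEMONS:
        raise ValueError(f"unknown daemon: {daemon!r}")
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return [f"{INIT_DIR}/{daemon}", action]
```

On an actual system, the returned list could be passed to subprocess.run after the processwatcher file has been moved.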
Location of Replication/Space Reclamation Configuration Files
- /data/hurricane/replication.conf (dynamic)
- /data/hurricane/gc.conf
- Seer files (new values are read when the process starts):
  - ObjectManager.conf (replication streams/workq size)
  - SpaceManagement.conf (number of saved replications)
Logging
Primary Logging Level
- INFO: replicationd, bpgc, healthcheck
- WARN (default): spaced
Changing Logging Levels
- Move /var/DXi/processwatcher
- Edit /opt/DXi/log-common.conf
- /etc/rc.d/init.d/log4cplus-server restart
- /etc/rc.d/init.d/<processname> restart
Log Files
- /var/log/DXI:
  - tsunami.log, tsunami_trace.log, tsunami_lc.log
- /var/log/blockpool_master.log
Troubleshooting Steps
- Verify correct configuration.
- Review the logs for the time period in which the problem occurred:
  - tsunami.log file
  - /var/log/blockpool_master.log
- Review other relevant log files, BPW hold files, replication bundles, and missing tag files.
- Review internal files:
  - /data/hurricane/replication.conf file
  - BPW hold files in the /snfs/common/data directory:
    - AttrBallHoldList and PrepostHoldList
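The log review step can be partly automated with a simple search across the two main log files. The sketch below does a plain substring match because the log line format is not described here; the function name find_in_logs is hypothetical, and the default paths are taken from the Log Files list above.

```python
from pathlib import Path
from typing import Iterable, Iterator, Tuple

LOG_FILES = (
    "/var/log/DXI/tsunami.log",
    "/var/log/blockpool_master.log",
)


def find_in_logs(needle: str,
                 paths: Iterable[str] = LOG_FILES) -> Iterator[Tuple[str, int, str]]:
    """Yield (file name, line number, line) for every line containing needle.

    Files that do not exist are skipped so the search can run on any system.
    """
    for name in paths:
        path = Path(name)
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going if a log has undecodable bytes.
        with path.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line:
                    yield name, number, line.rstrip("\n")
```

Searching for a date string or a tag identifier that appears in the failing replication is one way to narrow the output to the relevant time period.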