Configure FlexSync for NFSv4 (Isilon)
This topic provides information on how to configure FlexSync for NFSv4 (Isilon) so that you can preserve NFSv4 ACLs during the FlexSync replication process.
Note: The configuration also provides a considerable increase in performance when you use an Isilon web server.

FlexSync supports two processes:
-
Isilon to an NFS client to any target.
Note: Isilon is the host using the OneFS operating system and an NFSv4 file system.
-
Any source to an NFS client to any target.
Note: "Any source" and "any target" can be anything mounted on an NFS client; however, it is usually a StorNext file system.
StorNext Storage Manager Managed File Systems
You can use a StorNext Storage Manager managed file system as a source; however, FlexSync cannot use the FSM and TSM functionality of a StorNext managed file system that is mounted on an NFS client as a source. As a result, a truncated file cannot be retrieved from media, and the attempt causes a replication error.
See also Considerations for an NFSv4 ACL Replication and a StorNext Storage Manager Managed File System.

Prerequisites
-
The configuration requires that you mount a source host directory and a destination host directory on an NFS client.
-
The configuration requires that you install FlexSync only on that same NFS client.
Note: The replication is performed locally, from one mounted directory to another; however, a more efficient replication with an Isilon web server is also available (see Configure an Isilon Web Server).
Because the replication is performed locally, no additional configuration is required to execute it (keep in mind the information in StorNext Storage Manager Managed File Systems).
Note: There are some considerations for ACL replication and manipulation (see Considerations for an NFSv4 ACL).
Configure an Isilon Web Server
Do the following to configure an Isilon web server (for example, Isilon to an NFS client to any target):
Note: Quantum recommends you take advantage of the functionality provided by an Isilon web server because the functionality is similar to a StorNext FSM.
-
At the prompt, execute the following command to enable replication by an Isilon host web server and add the authentication for the host:
/opt/quantum/flexsync/bin/flexsyncauth -a -d nfsv4 -H <isilon_host_ip> -U <username> -P <password>
Note: The <username> and the <password> are the credentials of the web server on your Isilon host, which serves as the source in your replication process.
Note: This also adds the web server authentication in your /opt/quantum/flexsync/var/flexsyncd.dat file.
-
(Optional) By default, an Isilon host web server listens on port 8080. At the prompt, execute the following command to designate a different port:
/opt/quantum/flexsync/bin/flexsyncauth -a -d nfsv4 -H <isilon_host_ip>:<port_num> -U <username> -P <password>
-
At the prompt, execute the following command to restart the flexsyncd daemon and allow the configuration to retrieve the necessary data from your database.
systemctl restart flexsyncd
-
At the prompt, execute the following command to verify the authentication data is added to the database:
/opt/quantum/flexsync/bin/flexsyncdump -d flexsyncd -l
-
At the prompt, execute the following command to verify the <username> and the <password> are valid:
curl https://<isilon_host_ip>:8080/platform/1/snapshot/snapshots --insecure --basic --user <username>:<password>
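The credential check in the last step can be wrapped so that only the HTTP status is reported. The following is a minimal sketch, not part of FlexSync: the function names and the diagnosis strings are hypothetical, and the meaning attached to each status code follows general HTTP conventions rather than documented Isilon behavior. The endpoint is the one used in the step above.

```shell
#!/bin/sh
# Map an HTTP status code to a short diagnosis.
# The diagnosis wording is illustrative, not FlexSync or Isilon output.
describe_status() {
    case "$1" in
        200) echo "credentials accepted" ;;
        401) echo "credentials rejected" ;;
        404) echo "endpoint not found" ;;
        000) echo "no response (check host and port)" ;;
        *)   echo "unexpected status $1" ;;
    esac
}

# Hypothetical wrapper: send the same request as the verification step, but
# print only the status code (-s silences progress, -o discards the body,
# -w prints the code), then translate it. curl reports 000 when it gets
# no HTTP response at all.
check_isilon_auth() {
    host=$1; port=$2; user=$3; pass=$4
    code=$(curl -s -o /dev/null -w '%{http_code}' --insecure --basic \
        --user "$user:$pass" \
        "https://$host:$port/platform/1/snapshot/snapshots")
    describe_status "$code"
}

describe_status 200
```

To run the check against a real host, call check_isilon_auth with the host address, the port (8080 unless you designated another), the username, and the password.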

Use the StorNext Unified User Interface (UUI)
When the flexsyncd daemon determines the source directory is NFSv4, the daemon detects the real IP address of the mounted host and verifies that an authentication entry for this IP address is present in the flexsyncd.dat file. If the verification is successful, then the authentication is extracted from the database and the web server communication is allowed using qevhttp.
-
The flexsyncd daemon attempts to detect the previous snapshot for your task using the uuid.
-
If a snapshot is not detected for your task, then the process creates an initial snapshot and switches to a local replication.
-
If a previous snapshot is detected for your task, then the process creates the current snapshot for your task and then creates a change list using the current and previous snapshots. Afterward, the process takes each file entry from the change list and performs a comparison of that file.
-
When the comparison process completes, the change list is deleted. If there are errors, then the current snapshot is deleted; otherwise, the previous snapshot is deleted.
Note: To identify your task snapshot, the task uuid and the time stamp are displayed in the snapshot name. For example, 260e2820-bf02-43db-b1af-260cb3ed5bba_snap_1652790578.
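The snapshot name in the note above can be split into its two parts with plain shell parameter expansion. This is a small sketch that assumes the name always has the form <task uuid>_snap_<time stamp>, as in the example; the function names are hypothetical.

```shell
#!/bin/sh
# Split "<task uuid>_snap_<time stamp>" using POSIX parameter expansion.
# %%_snap_* removes the longest suffix starting at "_snap_";
# ##*_snap_ removes the longest prefix ending at "_snap_".
snapshot_uuid() { printf '%s\n' "${1%%_snap_*}"; }
snapshot_time() { printf '%s\n' "${1##*_snap_}"; }

name=260e2820-bf02-43db-b1af-260cb3ed5bba_snap_1652790578
snapshot_uuid "$name"   # prints 260e2820-bf02-43db-b1af-260cb3ed5bba
snapshot_time "$name"   # prints 1652790578
```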
Use the FlexSync CLI
When you use the FlexSync CLI, you must include the following command line parameters:
For an initial replication task, execute the following command to run a replication task and create an initial snapshot:
Execute the following command to display the snapshot:
After the replication is processed, you can view the name of your snapshot in the last snapshot JSON that is displayed.
For a subsequent replication task, execute the following command(s):
or
Note: The suffix with the time stamp (_snap_1652790578) might be cut off. If a previous snapshot is detected, then the change list is used in the same way as in a replication by a task.

You can access an NFSv4 file system object using the web server, the command line, and curl. There are two objects used by FlexSync in a replication task:
-
A snapshot.
-
A change list.
In the examples below, assume the Isilon web server credentials are myuser:mypassword.
Example: Display your snapshots.
Execute the following command to display your snapshots:
Example: Display your change lists.
Execute the following command to display your change lists:
Example: Display a specific change list.
Execute the following command to display a change list with an identification number 178_188:
Example: Display entries in a specific change list.
Execute the following command to display entries in a change list with an identification number 10_12:
Example: Delete a specific snapshot.
Execute the following command to delete a snapshot with an identification number 350:
Example: Delete a specific change list.
Execute the following command to delete a change list with an identification number 682_684:
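As a hedged illustration of working with these two objects, the sketch below prints (without running it) a snapshot listing request and splits a change list identifier into its snapshot IDs. The request reuses the endpoint from the credential check in Configure an Isilon Web Server; the function names are hypothetical, and the older-snap-id_newer-snap-id reading of a change list identifier is the convention cited in the considerations later in this topic, not something this sketch verifies.

```shell
#!/bin/sh
# Print, but do not execute, a request that lists snapshots.
# $1 = host, $2 = port, $3 = user:password
snapshot_list_cmd() {
    printf 'curl https://%s:%s/platform/1/snapshot/snapshots --insecure --basic --user %s\n' "$1" "$2" "$3"
}

# Split a change list identifier such as 178_188 into its two snapshot IDs.
changelist_older() { printf '%s\n' "${1%%_*}"; }
changelist_newer() { printf '%s\n' "${1#*_}"; }

snapshot_list_cmd '<isilon_host_ip>' 8080 myuser:mypassword
changelist_older 178_188   # prints 178
changelist_newer 178_188   # prints 188
```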

The cause of an NFSv4 ACL replication issue is the presence of an ACE that you cannot map on a destination. As a result, the replication task fails.
Note: By default, FlexSync does not complete a replication task when a metadata error is detected.
To work around this issue, enable the Continue on meta failure option in the Advanced section of your task configuration.
For an NFSv4 ACL, the Continue on meta failure option allows you to:
-
Apply only an ACE that is mapped on your destination.
-
Issue a warning about an ACE that is not mapped.
-
Continue the data replication task.
Note: Quantum recommends you use an Isilon web server to perform a replication task consisting of a large number of files that rely on a change list. Otherwise, Isilon actively sets predefined ACEs and all files, at each run, are compared since the metadata is always different between your source and your destination.
See also Considerations for an NFSv4 ACL Replication and a StorNext Storage Manager Managed File System.

The replication task issues below were discovered during a test on a virtual machine and might not occur on a physical system.
-
Creation of a change list requires a large amount of time. Normally, the creation of a change list is extremely fast and is done by a OneFS job that is controlled by the job engine. The job runs in parallel and multithreaded on all nodes (see Dell PowerScale OneFS Job Engine).
-
During a web replication, you might encounter an authentication issue and a 404 error, likely due to your Isilon web server.
-
An Isilon file system object is accessible by any authorized user, who can delete or change the object using the command line. As a result, your task might be impacted.
See also Considerations for an NFSv4 ACL Replication and a StorNext Storage Manager Managed File System.

Below are considerations for an NFSv4 ACL replication and a StorNext Storage Manager managed file system.
-
An ACE inside an ACL cannot be mapped. For example, A::duke:r is applicable on the source but cannot be mapped on the destination because the user duke does not exist there. The option Continue on meta failure is useful because it allows FlexSync to replicate the ACEs that you can map on your destination.
For example, if you mount a source directory as a Samba share on a Windows OS system, then when you create a new file, the system automatically retrieves the Isilon ACL with three ACEs:
-
A::sysadmin
-
A::nobody
-
A:g:EVERYONE
If you create a directory, then it contains an ACL with six ACEs:
-
A::sysadmin
-
A::nobody
-
A:g:EVERYONE
-
A:fdig:+creator owner
-
A:fdig:+creator group
-
A:fdig:EVERYONE
Again, the option Continue on meta failure is useful in this case.
-
Isilon grants you access to a file system object and allows you to manipulate it. When you use an Isilon web server, this access can create an issue during your replication task. For instance, an authorized user can delete a previous snapshot causing the replication task to switch to a less efficient full scan.
-
During an Isilon web server replication task, you might encounter an issue with creating a change list. The issue does not indicate an error and does not provide information about the job identifier. The Isilon documentation and Isilon API examples mention that the identifier of the change list resembles older-snap-id_newer-snap-id (for example, 34_38).
-
When you create a file on an Isilon system, Isilon sets a predefined ACL with the ACEs A::OWNER@:rwatTnNcCy, A::GROUP@:rtncy, A::EVERYONE@:rtncy. It also updates the ctime time stamp of the file. During testing, the ctime was set to a time several seconds (up to one minute) later than the time of file creation. This causes an issue when a file is replicated immediately after creation: its ctime might change after the initial snapshot of the task is created, so on the subsequent task run the file lands in the change list and is replicated again even though it has not changed.
-
When a replication task is performed from a mounted Isilon system to a non-Isilon system (for example, StorNext) using an NFS client, the predefined Isilon ACEs (A::OWNER@:rwatTnNcCy, A::GROUP@:rtncy, A::EVERYONE@:rtncy) are silently stripped from an ACL during replication when the system call setxattr() is invoked. As a result, FlexSync identifies the metadata of the file as different and replicates the file on each run even if the file has not changed. The exception is when an Isilon web server is used, where file replication is regulated by a snapshot and a change list.
-
Quantum recommends you use an Isilon web server to perform a replication task consisting of a large number of files that rely on a change list. Otherwise, Isilon actively sets predefined ACEs and all files, at each run, are compared since the metadata is always different between your source and your destination.
-
The NFSv4 file system contains its own logic to correct a file mode according to an ACL applied to a file. This logic is regulated by an Isilon configuration (see Approximate Owner Mode Bits When ACL Exists in the Isilon documentation). As a result, there might be metadata differences between a source and a destination in a successfully replicated file when an Isilon and a non-Isilon file system are used.
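When you compare a source ACL and a destination ACL by hand, the predefined Isilon ACEs described in the list above can hide the ACEs that matter. The sketch below filters them out of nfs4_getfacl-style output. It assumes the ACEs appear exactly in the form quoted in this topic (A::OWNER@:, A::GROUP@:, A::EVERYONE@:); the function name is hypothetical.

```shell
#!/bin/sh
# Drop the three predefined Isilon ACEs from ACL text read on stdin.
# Comment lines and all other ACEs pass through unchanged.
strip_predefined_aces() {
    grep -v -E '^A::(OWNER|GROUP|EVERYONE)@:'
}

# Sample input modeled on the ACEs quoted in this topic.
printf '%s\n' \
    '# file: testfile' \
    'A::OWNER@:rwatTnNcCy' \
    'A::username@mycompany.domain.com:rxtncy' \
    'A::GROUP@:rtncy' \
    'A::EVERYONE@:rtncy' | strip_predefined_aces
```

On a live system, the input would come from nfs4_getfacl run against the same file on the source mount and on the destination mount, so that the two filtered results can be compared.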

This section provides information on how to troubleshoot ACL-related replication issues that you might encounter when you use FlexSync with an NFSv4 source and destination, and when you enable the Preserve StorNext ACLs option.
A FlexSync replication task operates on a system where both a source and a target are mounted over NFSv4. For example, when you execute the command df, the output displays the following:
# df
Filesystem                           1K-blocks    Used Available Use% Mounted on
10.65.173.168:/ifs/data/share1       254041381 2859392 251181989   1% /nfs/share1
10.65.186.24:/stornext/snfs1/backups 867041280 4860416 862180864   1% /nfs/backups
If you execute the command mount, then the output displays the following:
# mount
10.65.173.168:/ifs/data/share1 on /nfs/share1 type nfs4 (rw,relatime,vers=4.1,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.65.172.13,local_lock=none,addr=10.65.173.168)
10.65.186.24:/stornext/snfs1/backups on /nfs/backups type nfs4 (rw,relatime,vers=4.1,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.65.172.13,local_lock=none,addr=10.65.186.24)
Note: Both mounts use NFSv4. FlexSync cannot replicate an ACL with NFSv3.
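The mount type can also be checked from a script. The sketch below reads mount output and prints the file system type reported for one mount point; the function name is hypothetical, and the sample lines are shortened versions of the output shown above.

```shell
#!/bin/sh
# Print the file system type that `mount` reports for a mount point.
# Reads `mount` output on stdin; each line has the form
# "<source> on <dir> type <fstype> (<options>)".
mount_fstype() {
    awk -v dir="$1" '$2 == "on" && $3 == dir && $4 == "type" { print $5 }'
}

# Sample input with the option list abbreviated.
sample='10.65.173.168:/ifs/data/share1 on /nfs/share1 type nfs4 (rw,relatime,vers=4.1)
10.65.186.24:/stornext/snfs1/backups on /nfs/backups type nfs4 (rw,relatime,vers=4.1)'

printf '%s\n' "$sample" | mount_fstype /nfs/share1    # prints nfs4
printf '%s\n' "$sample" | mount_fstype /nfs/backups   # prints nfs4
```

On the FlexSync system, pipe the real output instead, for example mount | mount_fstype /nfs/share1. Any result other than nfs4 for the source or the destination indicates that ACLs cannot be replicated for that mount.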
Instead of using a dedicated NFS client, it might be possible to perform an NFSv4 to NFSv4 replication where the source and/or target is a Linux system that contains a mounted StorNext file system over a SAN or LAN (DLC). You can use a "loopback" NFSv4 mount where the system runs nfsd to export the file system and mount it back on itself. FlexSync requires an NFSv4 client mount for both the source and destination.
To replicate an ACL, you must enable the Preserve StorNext ACLs option and your source and destination must be an NFS mount on your FlexSync system:
# flexsyncadmin -U admin -P password edit-task flexsync-rep mytask show-options | egrep 'source|destination|acls'
"acls": true,
"destination": "/nfs/backups",
"source": "/nfs/share1",
In addition, the NFSv4 servers on both sides of the replication must have NFSv4 ACL support enabled. Do the following on an Isilon system:
-
Use the OneFS GUI to confirm that the Access > ACL policy settings are properly configured.
-
Set the NFSv4 Domain to the short form NetBIOS name in the Protocols > Unix sharing (NFS) > Edit NFS zone field. For example, for mycompany.domain.com, use mycompany.
-
Disable the NFSv4 replace domain option.
-
For a StorNext file system, you must run the NFS server on a Quantum appliance that contains a qtm Linux kernel. The Quantum version of the kernel allows an ACL to be passed directly from the NFS server to the cvfs.ko driver.
Note: Quantum appliances that run the latest version of StorNext use the qtm kernel.
Example
# uname -r
3.10.0-1160.36.2.el7.qtm.x86_64
-
Configure the /etc/idmapd.conf file on each NFS server that you use to export a StorNext file system so that its Domain parameter is set to the NFSv4 domain name, if it is different from the DNS domain name.
-
Configure your source and destination to use the same directory service in order to validate names and IDs for users and groups. Typically, both sides use the same Active Directory server.
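The kernel release check from the example in the list above can be scripted. This sketch assumes that the ".qtm." marker seen in the example release string is what distinguishes the Quantum kernel; that assumption rests only on the single example in this topic, and the function name is hypothetical.

```shell
#!/bin/sh
# Succeed when a kernel release string contains the ".qtm." marker.
is_qtm_kernel() {
    case "$1" in
        *.qtm.*) return 0 ;;
        *)       return 1 ;;
    esac
}

is_qtm_kernel "3.10.0-1160.36.2.el7.qtm.x86_64" && echo "qtm kernel"
is_qtm_kernel "3.10.0-1160.36.2.el7.x86_64" || echo "not a qtm kernel"
```

On the NFS server itself, pass the live value: is_qtm_kernel "$(uname -r)".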
Verify an ACL
After you configure your system(s), perform an ACL verification to confirm it functions properly. The ACL verification requires that you use the NFSv4 ACL CLI available on Linux. On a Red Hat system, the commands are included in the nfs4-acl-tools package. In the example below, replace username@mycompany.domain.com with a valid user in the customer domain. You must execute the commands against both the source and the target NFSv4 mounts.
$ touch testfile
$ nfs4_setfacl -a A::username@mycompany.domain.com:rxtncy testfile
$ nfs4_getfacl testfile
# file: testfile
A::username@mycompany.domain.com:rxtncy
If a command fails, then you must resolve any underlying issues to prevent your FlexSync replication task from failing.
Note: FlexSync uses the same mechanism as nfs4_getfacl and nfs4_setfacl to retrieve and update an ACL.
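The verification above can be turned into a pass or fail check by testing whether the ACE that was set comes back in the nfs4_getfacl output. The sketch below assumes the server returns the ACE text exactly as it was set, as in the example; the function name is hypothetical.

```shell
#!/bin/sh
# Succeed when the exact ACE given as $1 appears as a whole line in the
# nfs4_getfacl output read from stdin (-F fixed string, -x whole line).
acl_has_ace() {
    grep -q -F -x -- "$1"
}

# Sample input copied from the verification example.
printf '%s\n' '# file: testfile' 'A::username@mycompany.domain.com:rxtncy' |
    acl_has_ace 'A::username@mycompany.domain.com:rxtncy' && echo "ACE present"
```

On a live system, run nfs4_getfacl testfile in a directory on each of the two NFSv4 mounts and pipe its output to acl_has_ace with the ACE you set.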
When you work with an ACL, the two most common errors you might encounter are EOPNOTSUPP (operation not supported) and EINVAL.
Example of EOPNOTSUPP when running nfs4_setfacl:
# nfs4_setfacl -a A::username@mycompany.domain.com:rxtncy testfile
Operation to request attribute not supported: testfile
Failed to instantiate ACL.
You might encounter a similar error during a replication task if an ACL is found on an NFSv4 source, but the destination does not support ACLs. If the destination is not an NFSv4 mount, then FlexSync detects this and displays a warning:
flexsync_vfs_posix_set_nfsv4_acl: system.nfs4_acl(NFSv4 Acl) detected, but destination filesystem doesn't support NFS Acls. NFS Acls will be ignored. To avoid this message, turn off ACL preservation, or correct the destination to support NFS Acls.
If your FlexSync destination directory does not support an NFSv4 ACL, then it might be due to a configuration issue (see Configure).
You might encounter an EINVAL error when you set an ACL on a file. The EINVAL error occurs when the NFS server considers an ACL invalid and rejects it. The most common cause is that one or more ACEs in the ACL contain a user or group that cannot be mapped to a UID or GID. You can reproduce the issue when you use nfs4_setfacl and specify a username that does not exist. For example:
# nfs4_setfacl -a A::bogus@mycompany.domain.com:rxtncy testfile
Failed setxattr operation: Invalid argument
The FlexSync replication results in an error message such as:
20220616 10:56:02 flexsync: starting enqueue compare, source /nfs/share1, destination /nfs/backups/share1
20220616 10:56:02 Failed to set NFSv4 ACL: check ACE 'bogus@mycompany.domain.com: ALLOW, mask 0x100002, flags 0x0' on destination.
20220616 10:56:02 Replication completed with errors. 0 files/segments copied 0.00 B moved.
20220616 10:56:02 Task completed with errors.
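When a task report contains many such lines, it can help to list each problematic user or group once. The sketch below extracts the name from lines that match the message format shown above; it assumes that format is stable, and the function name is hypothetical.

```shell
#!/bin/sh
# Print the user or group named in each "Failed to set NFSv4 ACL" line
# read from stdin, one per line, without duplicates.
failed_ace_principals() {
    sed -n "s/.*Failed to set NFSv4 ACL: check ACE '\([^:']*\):.*/\1/p" | sort -u
}

# Sample input copied from the task report above.
printf '%s\n' \
    "20220616 10:56:02 flexsync: starting enqueue compare, source /nfs/share1, destination /nfs/backups/share1" \
    "20220616 10:56:02 Failed to set NFSv4 ACL: check ACE 'bogus@mycompany.domain.com: ALLOW, mask 0x100002, flags 0x0' on destination." |
    failed_ace_principals
```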
In general, there are several reasons you might encounter an EINVAL error during a FlexSync replication task including:
-
A user or group in an ACE for a file in the source is "local" (non-domain) and there is no corresponding user/group name on the target.
-
A user or group in an ACE for a file in the source is a Windows "built-in" user or group.
-
The destination has a communication problem with the directory server.
-
The source and target do not use the same directory server.
-
Potential issues with other ACE attributes such as a bad mask or bad flags.
Assuming a stable, properly configured system, the first two reasons in the list above are the most likely causes for EINVAL. As illustrated above, the task report displays information about problematic ACEs but not file names. To view the full paths, enable task debug for flexsyncd, as follows. Include the following line in the file /etc/sysconfig/flexsyncd:
Note: You might need to create the file.
After you configure the file, you must restart the flexsyncd daemon:
If you enable the debug=task-report option, then your task report contains a line. For example:
The log file contains an ACL replication failure with paths. For example:
To further investigate a user/group mapping failure, you can enable debugging for idmapd running on the NFS server if the target is running Linux. This daemon runs in user space and acts as a "helper" for nfsd. By default, idmapd logs to /var/log/messages. However, you can provide the daemon its own log file by creating a file titled /etc/rsyslog.d/idmapd.conf that contains the following content:
# cat /etc/rsyslog.d/idmapd.conf
if $programname == "rpc.idmapd" then {
action(type="omfile" file="/var/log/idmapd.log")
}
For your change to take effect, you must restart syslog:
# systemctl stop rsyslog.service
# systemctl restart systemd-journald.socket
# systemctl start rsyslog.service
You can also increase the verbosity level of idmapd. Do the following to modify the /etc/idmapd.conf file.
Change:
[General]
#Verbosity = 0
to:
[General]
Verbosity = 8
Then restart nfs-idmapd:
Below is an example of a mapping failure logged by idmapd:
Jun 16 21:34:05 mdc2 rpc.idmapd[7985]: nss_getpwnam: name 'bogus@mycompany.domain.com' domain 'mycompany.domain.com': resulting localname 'bogus'
Jun 16 21:34:05 mdc2 rpc.idmapd[7985]: nss_getpwnam: name 'bogus' not found in domain 'mycompany.domain.com'
Jun 16 21:34:05 mdc2 rpc.idmapd[7985]: nfs4_name_to_uid: nsswitch->name_to_uid returned -2
Jun 16 21:34:05 mdc2 rpc.idmapd[7985]: nfs4_name_to_uid: final return value is -2
Jun 16 21:34:05 mdc2 rpc.idmapd[7985]: Server : (user) name "bogus@mycompany.domain.com" -> id "99"
Do the following to confirm the user bogus does not exist on the NFS server:
# getent passwd bogus
(nothing.)
Note: For a group that is not mapped, execute getent group bogus.
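The two steps above, reading the idmapd log and confirming with getent, can be combined. The sketch below pulls the local names that rpc.idmapd logged as not found and reports whether getent resolves them. It matches only the user lookup message shown in the log example; the function names are hypothetical.

```shell
#!/bin/sh
# Print each local name that rpc.idmapd logged as "not found" (log on stdin).
idmapd_unmapped_names() {
    sed -n "s/.*nss_getpwnam: name '\([^']*\)' not found in domain.*/\1/p" | sort -u
}

# Report whether getent resolves a name; $1 is "passwd" or "group".
check_name() {
    if getent "$1" "$2" >/dev/null 2>&1; then
        echo "$2: found"
    else
        echo "$2: not found"
    fi
}

# Sample line copied from the idmapd log example.
log="Jun 16 21:34:05 mdc2 rpc.idmapd[7985]: nss_getpwnam: name 'bogus' not found in domain 'mycompany.domain.com'"
printf '%s\n' "$log" | idmapd_unmapped_names   # prints bogus
```

On the NFS server, feed the real log to idmapd_unmapped_names and pass each resulting name to check_name passwd to confirm the lookup result.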
Workaround for an ACE that Contains a User and Group that is not Mapped
Quantum recommends that you adjust an ACL on the source system so that it does not include an ACE with a user or group that rpc.idmapd on the target NFS server cannot map to a UID/GID. You must remove such an ACE or, if possible, replace it with an equivalent one that uses a domain user or group.
If you are not concerned about a non-essential ACE that is present in the source and is not mapped in the destination, then enable the Continue on meta failure option in the Advanced section of your task configuration. The option allows you to apply only an ACE that you can map on the destination.
Advanced Debugging with a cvdb Trace
The Diagnose section described how to troubleshoot and address common cases for an EOPNOTSUPP and an EINVAL error. However, you might need to perform advanced troubleshooting for obscure issues or errors that require you to debug on the NFS server.
Do the following to collect a cvdb trace on a StorNext system.
-
Execute the following commands on your NFS server to enable a trace:
# /usr/cvfs/bin/cvdbset md_vnops cvsubr md_vfsops fsmvnops vnops cvnc
# /usr/cvfs/bin/cvdb -R 256m
-
Execute the following command to create an output directory:
# mkdir -p /tmp/scratch/cvdb-files
-
Execute the following command to begin the trace and write its output to the directory /tmp/scratch/cvdb-files:
# /usr/cvfs/bin/cvdb -g -C -F -n 1800 -N /tmp/scratch/cvdb-files/cvdbout
-
Press Ctrl+C to stop the trace:
^C
The process generates cvdbout.* files (for example, cvdbout.000000, cvdbout.000001, ...) in the output directory /tmp/scratch/cvdb-files.
Do the following to view information about an ACL operation.
-
Execute the following command on your NFS server:
# grep -R '_acl_TMP' /tmp/scratch/cvdb-files
-
Execute the following command on an Isilon server to enable NFS debugging:
# isi nfs log-level modify debug
Note: This increases the verbosity of log messages in /var/log/nfs.log.
-
Execute the following command to restore the NFS debugging level:
# isi nfs log-level modify warning