Troubleshooting Examples

Overview

The following sections describe some example troubleshooting steps.

Low Space

Source System

  1. Remove data from the share(s). On a source system, this is the only way to free space.
  2. Run space reclamation.

Target System

  1. Remove data from the share(s).
  2. Remove namespace replicated data that is no longer needed.
  3. Remove source systems from the list of allowed sources, or disable replication from those source systems to this target. This removes the preposted replication tag files so that those tags are no longer protected.
  4. Run space reclamation.
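
To confirm that the steps above actually freed space, it can help to record file system usage before and after space reclamation. The following is a minimal shell sketch, not a DXi tool: used_pct is a hypothetical helper, and the mount point in the example is a placeholder because this page does not state where the shares or the blockpool are mounted.

```shell
#!/usr/bin/env bash
# Print the "use %" column that df reports for the file system holding a path.
# -P forces the one-line-per-file-system POSIX format so awk can parse it.
used_pct() {
    df -P "$1" | awk 'NR == 2 { print $5 }'
}

# Example (hypothetical mount point; replace with the real one):
#   before=$(used_pct /path/to/share)
#   ... remove data and run space reclamation ...
#   after=$(used_pct /path/to/share)
#   echo "usage: $before -> $after"
```

If the percentage does not drop after reclamation, the tags may still be referenced elsewhere.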

Deleted Data, But Space Is Not Recovered

The tag for the deleted data may also be referenced from another location. In that case, it is correct that the space is not recovered.

  1. If the replicated data was deleted, remove that source system as an allowed host so that the preposted continuous file is removed. This could free some space.
  2. Customize collect logs to collect bpw hold files.
  3. Customize collect logs to collect bpgc input files.

Note: In DXi 2.0 software, the /opt/DXi/scripts/collect files document how Quantum Service/Support can extend the collect logs to gather more information from a customer system. They contain many examples that are commented out.


Slow Replication

  1. Run netperf to verify network speed.
  2. Run top on source and target.
    • Identify processes with high CPU or memory usage.
    • If bpgc, healthcheck, or another process that is not necessary right now is consuming resources, stop that process (using the GUI or CLI).
    • If replicationd or some other process is the heavy consumer:
      • Use gdb to attach to the process and take a backtrace of each thread.
  3. Examine the /var/log/blockpool_master.log file for clues.
    • Change log levels dynamically if more detail is needed:
      • /opt/DXi/blockpool/bin/blockpool server set +FLogFileSeverity=<new level> +Nlocal@localhost
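
Steps 1 and 2 above can be sketched as small shell helpers. This is a hedged sketch under stated assumptions rather than a supported procedure: the function names are hypothetical, netperf requires netserver to be running on the other system, ps is assumed to be the procps version, and gdb must be installed on the node. The example assumes the process appears under the name replicationd as written on this page.

```shell
#!/usr/bin/env bash
# Measure TCP throughput to another system for 10 seconds.
# Assumes netperf is installed here and netserver is running on that system.
net_throughput() {
    netperf -H "$1" -l 10
}

# Show the five processes using the most CPU, with their memory share.
# --sort is a procps ps option.
top_cpu() {
    ps -eo pid,pcpu,pmem,comm --sort=-pcpu | head -n 6
}

# Print a backtrace of every thread in a process, then detach.
# The process is paused while gdb is attached, so keep this brief.
thread_backtraces() {
    gdb -p "$1" -batch -ex "thread apply all bt"
}

# Example:
#   top_cpu
#   thread_backtraces "$(pgrep -o replicationd)"
```

Running gdb in batch mode captures all threads in one pass, which keeps the pause on a busy process shorter than attaching to each thread by hand.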

Missing Tags

  1. Replication logging of tags from the source when they are actually replicated by the blockpool.
    • This is now logged in the blockpool_master.log file, because the logging level has been changed.
    • This is also logged by the replicationd app in tsunami_lc.log.
  2. Replication logging of tags coming in to the target.
    • This is now logged in the blockpool_master.log file, because the logging level has been changed.
  3. Logging of tag generation in bpw.
    • This is done in the /var/log/DXi/bpwd_tag.log file.
    • bpwd_tag.log is included in collect logs.
  4. Logging of deleted tags.
    • This is done in the tsunami_lc.log file by the bpgc process.
    • This is also logged in the blockpool_master.log file.
  5. Log of bpw hold/drop calls (to preserve tag references).
    • This is done in the tsunami.log file.
  6. Extend collect logs to include any other specified bash command.
    • A 'collect-custom' script in the /opt/DXi/scripts directory allows additional bash commands to be run.
    • There is a collect-custom-replication script with useful samples.
    • NOTE: The collect-custom script is called regardless of whether the node is in a cluster. There is no checking of syntax, time, or space constraints for the commands run in this file. The person who customizes the script is fully responsible for supporting any effects that its commands have on the system. Use it with caution.
  7. Track preposted files on the source and target, and the tag associated with each namespace/trigger bundle file.
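
When chasing a specific tag through the logs listed above, a bounded search keeps the run time and output size predictable, which matters because custom collect commands are not checked for time or space. The following is a minimal sketch: find_tag is a hypothetical helper, and the log files are passed as arguments because this page gives full paths only for blockpool_master.log and bpwd_tag.log.

```shell
#!/usr/bin/env bash
# Search the given log files for a literal tag string.
# -F treats the tag as plain text, -H and -n print the file name and line number.
# timeout caps the run at 60 seconds and head caps the output at 1000 lines.
find_tag() {
    local tag="$1"
    shift
    timeout 60 grep -H -n -F -- "$tag" "$@" 2>/dev/null | head -n 1000
}

# Example (the tag value is a placeholder):
#   find_tag "<tag>" /var/log/blockpool_master.log /var/log/DXi/bpwd_tag.log
```

Add the tsunami.log and tsunami_lc.log files to the argument list using their locations on the system being examined.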

This page was generated by the BrainKeeper Enterprise Wiki, © 2018