Making Sense of it All (DRAFT)

Overview

 

The following graphs summarize the output from each of the six steps. Each graph is accompanied by a description and bullet points of what to look for.

Data Volume Overview for the lifetime of the system:

  • Capacity never flattens out
  • Used disk space is currently 75 TB, which is 7 TB over the truncation start of 68 TB
  • GC is constantly running
  • Reduction ratio is 10.91 : 1
  • Blockpool is 56.996 TB

Timeline determined from lifetime graph starting at the end of the last major capacity decline:

  • Same points as the graph from the lifetime of the system

Space reclamation graph using the timeline

  • Stage 4 hasn’t run since Week 52
  • Week 05 shows a Stage 2 operation running for over a week and then going to Stage 1
  • The system is not able to complete GC

 

Ingest during timeline

  • The system is experiencing a lot of Resets and Aborts
  • Weeks 04 and 05 show some very large ingest and read patterns

 

Daily Ingest during timeline

  • Daily ingest rates are often twice the recommended maximum of 6 TB per day, and reach three times that maximum on a couple of days in weeks 04 and 05.
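
The comparison against the recommended maximum can be sketched in a few lines of Python. Only the 6 TB per day limit comes from this playbook; the function name classify_daily_ingest, the day labels, and the sample values are hypothetical and are not read from the graph.

```python
RECOMMENDED_MAX_TB_PER_DAY = 6.0  # recommended maximum quoted in this playbook


def classify_daily_ingest(daily_ingest_tb, limit_tb=RECOMMENDED_MAX_TB_PER_DAY):
    """Return (day, ingest_tb, multiple_of_limit) for each day over the limit."""
    over_limit = []
    for day, ingest_tb in daily_ingest_tb.items():
        multiple = ingest_tb / limit_tb
        if multiple > 1.0:
            over_limit.append((day, ingest_tb, round(multiple, 2)))
    return over_limit


# Hypothetical sample values for illustration only, not taken from the graph.
sample = {"wk04-mon": 5.5, "wk04-tue": 12.0, "wk05-wed": 18.0}
flagged = classify_daily_ingest(sample)
for day, ingest_tb, multiple in flagged:
    print(f"{day}: {ingest_tb} TB is {multiple}x the recommended maximum")
```

Days at or under the limit are left out, so the result lists only the days that need attention.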

Capacity - Data Volume Overview

  • Capacity never flattens out
  • GC is constantly running
  • Reduction ratio is 10.98 : 1
  • Blockpool is 58.373 TB
  • Blockpool capacity has increased by 5 TB over the previous 9 weeks
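
As a rough illustration, the 5 TB increase over 9 weeks can be turned into an average weekly growth rate. The helper name weekly_growth_tb is an assumption made for this sketch, and the result is a simple average that says nothing about how evenly the growth was spread.

```python
def weekly_growth_tb(growth_tb, weeks):
    """Average blockpool growth per week over the observed period."""
    if weeks <= 0:
        raise ValueError("weeks must be positive")
    return growth_tb / weeks


# 5 TB of growth over the previous 9 weeks, as stated in the bullet above.
rate = weekly_growth_tb(5.0, 9)
print(f"Average blockpool growth: {rate:.2f} TB per week")
```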

 

Capacity - Used Disk Space during timeline

  • Spikes above the Truncation Start line indicate dedup backlog
  • There was a serious capacity problem between weeks 05 and 07
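
The check for spikes above the Truncation Start line can be sketched as follows. The 68 TB truncation start and the current 75 TB figure come from this playbook; the function name backlog_periods and every other sample value are hypothetical.

```python
TRUNCATION_START_TB = 68.0  # truncation start quoted for this system


def backlog_periods(used_tb_by_period, truncation_start_tb=TRUNCATION_START_TB):
    """Return the periods whose used disk space exceeds the truncation start."""
    return [
        period
        for period, used_tb in used_tb_by_period.items()
        if used_tb > truncation_start_tb
    ]


# Hypothetical samples; only the "current" value of 75 TB is from the playbook.
sample = {"wk04": 66.0, "wk05": 71.0, "wk06": 73.5, "wk07": 70.0, "current": 75.0}
over = backlog_periods(sample)
overage_tb = sample["current"] - TRUNCATION_START_TB
print(f"Periods above truncation start: {over}")
print(f"Current overage: {overage_tb} TB")
```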

 

Reduction Ratio during timeline:

  • Reduction ratio is approximately 10 : 1, which is acceptable

Disk I/O during timeline

  • Disk I/O is at or under 70%, which is acceptable
  • Over 70% is not acceptable
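
A minimal sketch of the 70% rule, assuming disk I/O is available as a list of utilization percentages. The function name and the sample readings are hypothetical.

```python
DISK_IO_LIMIT_PCT = 70.0  # acceptable ceiling quoted in this playbook


def disk_io_acceptable(utilization_pct, limit_pct=DISK_IO_LIMIT_PCT):
    """True when utilization is at or under the limit, False when over it."""
    return utilization_pct <= limit_pct


# Hypothetical readings for illustration only.
readings = [55.0, 70.0, 82.5]
results = [disk_io_acceptable(r) for r in readings]
print(results)
```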

 


Here are all of the analysis points without screenshots:

 

Data Volume Overview for the lifetime of the system

  • Capacity never flattens out
  • Used disk space is currently 75 TB, which is 7 TB over the truncation start of 68 TB
  • GC is constantly running
  • Reduction ratio is 10.91 : 1
  • Blockpool is 56.996 TB

 

Timeline determined from lifetime graph starting at the end of the last major capacity decline:

  • Same points as the graph from the lifetime of the system

 

Space reclamation graph using the timeline

  • Stage 4 hasn’t run since Week 52
  • Week 05 shows a Stage 2 operation running for over a week and then going to Stage 1
  • The system is not able to complete GC

 

Ingest during timeline

  • The system is experiencing a lot of Resets and Aborts
  • Weeks 04 and 05 show some very large ingest and read patterns

 

Daily Ingest during timeline

  • Daily ingest rates are often twice the recommended maximum of 6 TB per day, and reach three times that maximum on a couple of days in weeks 04 and 05.

 

Capacity - Data Volume Overview

  • Capacity never flattens out
  • GC is constantly running
  • Reduction ratio is 10.98 : 1
  • Blockpool is 58.373 TB
  • Blockpool capacity has increased by 5 TB over the previous 9 weeks

 

Capacity - Used Disk Space during timeline

  • Spikes above the Truncation Start line indicate dedup backlog
  • There was a serious capacity problem between weeks 05 and 07

 

Reduction Ratio during timeline:

  • Reduction ratio is approximately 10 : 1, which is acceptable

 

Disk I/O during timeline

  • Disk I/O is at or under 70%, which is acceptable
  • Over 70% is not acceptable

 

Conclusion

 

After applying the methods in this playbook, we can say with confidence that the customer is ingesting too much data.

 

Positive performance points:

  • Reduction ratio looks fine
  • Disk I/O is OK
  • Blockpool is at 65% of total capacity

 

Negative performance points:

  • Daily ingest is too high
  • GC can’t complete and won’t catch up
  • Dedup is sometimes backlogged for weeks

 

Resolution options:

  • Implement additional DXi systems
  • Reduce the amount of data that is being ingested

  

 


