1.0 Overview
The rasdumpoutput file included in the i2k snapshot contains information that is displayed as RAS tickets to the end user. When an issue occurs on a drive, Tape Alert data is collected by the library. With 95% of the Tape Alerts, an associated Fault Symptom Code (FSC) is also captured from the drive.
This document describes how to use a parsing program that extracts the Tape Alert and FSC data from this file so that it can be imported into Excel and manipulated in a simple and organized way to understand the issues being seen. This document also discusses how to use the Excel data to create a PowerPoint file for presenting data to customers.
The ideas behind this utility are that snapshot data should be examined as the first stage of any escalation, and that an appropriate plan should be made to address each drive or media issue on the first visit to the site, rather than leaving the customer with a partially working solution that requires follow-up visits.
2.0 Parsing the rasDumpOutput File
There are three parts to the parsing program:
- Rasdump.exe - The parsing program
- Run_ras.bat - A batch file with the correct input parameters for the program
- RasDump_Analysis.xls - A spreadsheet with pivot tables to allow the data to be displayed in a way that can be effectively communicated
To run the program, put Rasdump.exe and Run_ras.bat in the same folder as the rasDumpOutput file and execute (double click on) Run_ras.bat.
This will create two files:
- rasdumpoutput.csv - The parsed data
- rasdumpoutput.txt - A log file that summarizes the program's activity and stores any error information
As a batch file, Run_ras.bat will run in a DOS box, which will automatically close when the program completes. If an error occurs, the DOS box will not close, and an error will be reported. If this occurs, send the original rasdumpoutput file and the associated .txt file to Steve Cooper for investigation.
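As a quick check after the batch file finishes, the two output files can also be inspected programmatically. The following Python sketch is not part of the Rasdump tools. It assumes that parse failures appear in the .txt log as lines containing "failed:" (as in the "Scanf1 failed:" messages quoted in the release notes), and the function names are illustrative only.

```python
from pathlib import Path


def find_failed_lines(log_text):
    """Return log lines that look like parse failures.

    Assumption: failures are logged as lines containing "failed:",
    for example 'Scanf1 failed: ...'. Adjust the marker if the log differs.
    """
    return [line.strip() for line in log_text.splitlines() if "failed:" in line]


def check_outputs(folder):
    """Report whether each expected output file exists in `folder`."""
    folder = Path(folder)
    return {
        name: (folder / name).is_file()
        for name in ("rasdumpoutput.csv", "rasdumpoutput.txt")
    }


# In-memory example (sample text, not real parser output).
sample_log = (
    "first line of log\n"
    'Scanf5 failed: "Data cartridge at [1,1,1,2,1,1] has issued a Tape Alert 33"\n'
    "last line of log\n"
)
for line in find_failed_lines(sample_log):
    print(line)
```

For a real run, read rasdumpoutput.txt from the folder that contains Run_ras.bat and pass its text to find_failed_lines.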
3.0 Preparing the Data in Excel
- Open the rasdumpoutput.csv file in Excel. It will automatically load the data, correctly delimited.
- Open the template data file RasDump_Analysis.xls, which includes the pivot tables.
- In the rasdumpoutput.csv file, click on the box in the top left corner of the worksheet (to the left of A and above 1) to select the contents of the entire sheet.
- Copy the data.
- Switch to the RasDump_Analysis.xls spreadsheet and select the “Raw Data” worksheet.
- Select the entire worksheet by clicking the box in the top left corner again.
- Paste the new data.
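Before pasting, it can help to confirm that the parsed data will fit on a single .xls worksheet, which holds at most 65,536 rows. The following is a minimal Python sketch, assuming the .csv file holds one record per line. The function names are illustrative and not part of the Rasdump tools.

```python
import csv
import io

XLS_MAX_ROWS = 65536  # row limit of a worksheet in the .xls format


def count_csv_rows(stream):
    """Count the records (including any header row) in an open CSV stream."""
    return sum(1 for _ in csv.reader(stream))


def fits_in_xls(row_count):
    """Return True if the data fits on one .xls worksheet."""
    return row_count <= XLS_MAX_ROWS


# In-memory example. For a real file, use something like:
#   with open("rasdumpoutput.csv", newline="") as f:
#       rows = count_csv_rows(f)
sample = io.StringIO("a,b\n1,2\n3,4\n")
rows = count_csv_rows(sample)
print(rows, fits_in_xls(rows))
```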
Note: Do NOT distribute the Excel spreadsheets to customers. Charts derived from these Quantum tools should be copied and pasted into other documents, such as Word or PowerPoint, so that the formulas in the original spreadsheets remain proprietary to Quantum.
3.1 Viewing and Manipulating the Data – Initial Analysis
The procedure shown above populates a series of pivot tables that show specific aspects of the stored data. You can manipulate these as needed, but they should be useful as configured.
After you have added the new data to the “Raw Data” worksheet, you will need to refresh each pivot table and then use the drop-down menus to select the new customer or site name that has been added. When you've done this, the charts associated with each pivot table (on the worksheet next to the pivot table) will refresh, as well.
To refresh a pivot table, right-click in the data (center) area of the table and select Refresh Data.
Note: Please consult your local RTS if you have questions about how to refresh the pivot table/chart data.
The charts can be copied from Excel into PowerPoint or Word for presentation/discussion with Engineering or with customers. The procedure for copying charts into PowerPoint is discussed below.
4.0 Spreadsheet Manipulation and Features
The following sections discuss each of the spreadsheet tabs and their respective purpose.
4.1 FSC Chart
The FSC Chart is a summary of all Fault Symptom Codes (FSCs) seen by all drives in the library. You can use it to set the priority for the first issue to address. Taking care of the #1 issue will have the greatest impact on customer stability, in the shortest time. After the most significant issue is identified, it can be investigated using the other worksheets.
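The ranking that this chart provides can be approximated outside Excel. The Python sketch below counts how often each FSC appears in a CSV stream and lists the most frequent first. The header names in the sample data ("Drive SN", "FSC") and the sample rows are made up for illustration; check the actual headers in rasdumpoutput.csv before using it.

```python
import csv
import io
from collections import Counter


def rank_fscs(stream, fsc_column):
    """Count occurrences of each FSC and return (fsc, count) pairs,
    most frequent first. `fsc_column` is the header of the FSC column."""
    counts = Counter()
    for row in csv.DictReader(stream):
        fsc = (row.get(fsc_column) or "").strip()
        if fsc:
            counts[fsc] += 1
    return counts.most_common()


# Made-up sample rows; the header names are assumptions.
sample = io.StringIO(
    "Drive SN,FSC\n"
    "D1,2c07\n"
    "D2,2c07\n"
    "D1,0000\n"
)
ranking = rank_fscs(sample, "FSC")
print(ranking)
```

The first entry in the result corresponds to the #1 issue to address.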
4.2 FSC Timeline
The FSC Timeline is a tool to review progress made in resolving the issues at a site. It creates a chart of FSCs by month, which can be provided to customers to confirm that the actions taken to date have made a positive impact. It can also be used (when data is available) to look backwards to see how an issue has developed over time, and to look for any external influences that might have introduced the issues seen.
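The same month-by-month grouping can be sketched in Python. This is only an illustration of the idea behind the chart, not how the spreadsheet computes it. The column headers ("Open Date", "FSC"), the date format, and the sample rows are assumptions that must be matched to the real .csv file.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def fscs_by_month(stream, date_column, fsc_column, date_format="%m/%d/%Y"):
    """Return a Counter keyed by (YYYY-MM, FSC)."""
    counts = Counter()
    for row in csv.DictReader(stream):
        opened = datetime.strptime(row[date_column].strip(), date_format)
        counts[(opened.strftime("%Y-%m"), row[fsc_column].strip())] += 1
    return counts


# Made-up sample rows; header names and date format are assumptions.
sample = io.StringIO(
    "Open Date,FSC\n"
    "11/06/2007,2c07\n"
    "11/08/2007,2c07\n"
    "03/17/2008,0000\n"
)
timeline = fscs_by_month(sample, "Open Date", "FSC")
for (month, fsc), n in sorted(timeline.items()):
    print(month, fsc, n)
```

A falling count for a given FSC in the months after a corrective action is the kind of evidence this chart is meant to show.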
4.3 Tape Alert Timeline
This chart is similar to the one for FSCs, but it gives Tape Alert data. Please note that many FSC 0000s are associated with Tape Alert 8. Tape Alert 8 is posted when the drive thinks it has hit a marginal condition, but no hard error is seen. If a true hard error occurs on the same Load/Eject cycle as the Tape Alert 8, the FSC for that issue may be shown with the Tape Alert 8. If there are no other issues, the TA 8 will be associated with an FSC 0000.
Note: In most cases, Tape Alert 8 is an annoyance, seen only with LTO-2. It has been removed in firmware version 8571.
4.4 Drive Over Eject Chart
This feature was introduced in version 1.10 of the rasdump parser, specifically to help identify all drives in a library that suffer from the IBM LTO-2 Over Eject issue. When this is seen, the drive needs to be replaced. It is impossible to identify this issue in drive logs or in the standard repair screen, so take care to ensure that the Over Eject issue is highlighted in the Oracle “Problem Description Summary” that is taken from the first line of the RAS ticket.
Normally, this will be something like:
- Verified-"Drive sled at [1,2,1,8,1,1](Drive brick) is outside Specification"
From the Library Management Console, navigate into the details of the repair ticket to identify a DRIVE OVER EJECT message seen further on in the RAS ticket. Without this extra attention, drives with Over Eject issues will continue to cycle through the field, with little hope of ever being identified in repair. Trends of Over Eject issues should be escalated.
4.5 Volser vs. Drive
This chart is the first tool for isolating whether drives or tapes are causing the most significant issues. Typically, you will see media that causes issues in multiple drives, or a drive that has an issue with multiple pieces of media. The charts that give data on FSCs vs. drives and media Volsers can be used to zero in further on specific problems.
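The reasoning above (media that fails in several drives, or a drive that fails with several tapes) can be expressed as a small cross-reference. The Python sketch below uses made-up (volser, drive) pairs, one per logged issue. The function names and the threshold of two are illustrative choices, not part of the spreadsheet.

```python
from collections import defaultdict


def cross_reference(events):
    """Given (volser, drive) pairs, one per logged issue, return two dicts:
    volser -> set of drives, and drive -> set of volsers."""
    drives_per_volser = defaultdict(set)
    volsers_per_drive = defaultdict(set)
    for volser, drive in events:
        drives_per_volser[volser].add(drive)
        volsers_per_drive[drive].add(volser)
    return drives_per_volser, volsers_per_drive


def suspects(mapping, minimum=2):
    """Keys associated with at least `minimum` distinct partners, sorted."""
    return sorted(k for k, partners in mapping.items() if len(partners) >= minimum)


# Made-up events: tape A00001 fails in three drives,
# and drive D7 fails with two different tapes.
events = [
    ("A00001", "D1"), ("A00001", "D2"), ("A00001", "D3"),
    ("B00002", "D7"), ("C00003", "D7"),
]
by_volser, by_drive = cross_reference(events)
print("Suspect media:", suspects(by_volser))
print("Suspect drives:", suspects(by_drive))
```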
4.6 Volser vs. FSC
Looking at the Volser vs. Drive chart might make you wonder why a particular tape fails in a particular drive. The Volser vs. FSC chart is designed to answer this question. It looks exactly the same as the Volser vs. Drive chart, but it adds the FSC. This is where you can start to identify specific issues. For example, one piece of media that has one FSC associated with it suggests a strong correlation to a media problem.
Note: An apparent media issue may turn out to be a firmware issue that shows up only with a particular piece of media. Even so, the media will typically still require FA and possibly corrective action.
4.7 Drive vs. FSC
This chart shows the correlation of specific drives to specific FSCs. Strong correlation here indicates a potential drive issue.
4.8 Specific FSC Chart
This chart is designed for studying a specific FSC. You should be familiar with pivot tables before manipulating this worksheet. It includes a drop-down menu called FSC_2 that you can use to select a specific FSC. The drives and media associated with that FSC will be displayed, so that the contribution of each one to the issue can be established.
This chart is organized by date, so that long-term intermittent issues can be isolated and the affected drives removed from use.
Note: If this chart is to be presented to anyone, change the title to reflect the selected FSC code.
4.9 Specific Drive Chart
Like the Specific FSC Chart, this chart has a drop-down menu that lets you select a specific drive, ideally the one showing the most issues in the Drive vs. FSC chart discussed in section 4.7.
This chart has also been organized by date, so that long-term intermittent issues can be isolated and the affected drives removed from use.
5.0 Presenting Data to Customers
Typically, when pivot tables are used, very large files are created. Giving a customer a 9 MB - 25 MB Excel spreadsheet and then discussing it is difficult, to say the least. You can make things much easier by reviewing the raw data and then copying key parts of it into a PowerPoint presentation, which you can present to the customer.
This document includes a template that you can use for this purpose. It includes information on:
- Imported graphs from the spreadsheet
- Errors shown in the spreadsheet, so that the customer will understand what is going on at a very high level
Note: The RasDump_Analysis.xls file can be obtained from CSWeb, your local RTS, or the Drive Engineering Team.
5.1 Copying Charts from Excel to PowerPoint
To copy a chart from Excel to PowerPoint:
- Make sure the chart is correct, including the title (right-click the chart and pick Chart Options to modify elements such as titles and fonts).
- When you've finished making all changes, click on the background area of the chart (any place where there's no data) to select the entire chart.
- Select Edit > Copy.
- Open a new or existing PowerPoint presentation (new is best).
- Select Edit > Paste Special.
- Select Picture (Enhanced Metafile). This step is IMPORTANT. If you don't do this, the entire spreadsheet may be pasted, creating a huge file.
- Click OK.
- Move and/or size the resulting picture to fit where you want it.
5.2 Not Sending Huge Files!
One thing about pivot tables is that they generate huge Excel spreadsheets. These are often in the range of 10 MB - 20 MB. However, these often shrink to 100 KB or so when they are compressed. It may be obvious, but in this case, zipping these files will make them a lot easier to transfer.
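Any zip utility will do. For anyone who wants to script the step, the following is a minimal Python sketch using the standard zipfile module. The function name is illustrative, and the actual size reduction depends on the file contents.

```python
import zipfile
from pathlib import Path


def zip_file(source, archive=None):
    """Compress `source` into a .zip file and return the archive path.

    If `archive` is not given, the archive is created next to the source
    with the same base name and a .zip extension.
    """
    source = Path(source)
    archive = Path(archive) if archive else source.with_suffix(".zip")
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(source, arcname=source.name)
    return archive


# Usage, with the spreadsheet name used in this document:
#   zip_file("RasDump_Analysis.xls")   # creates RasDump_Analysis.zip
```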
6.0 Rasdump Program Release Notes
The Release Notes below give useful information about using Rasdump Parser.
6.1 Rev 1.2
- Fixed handling of the optional "(last: Warning)" in both scanf2 and scanf11, e.g.:
- Scanf11 failed: T18: D0, Drives--Failed(last: Warning), RQ:34, Qual--Resolved, SN--"1210171549"
- Corrected the fprintf so that the scanf 11 error now says scanf11 instead of scanf2
- Initialize all rasdump variables to avoid garbage characters if the first RAS ticket does not start with a "Closed" or "Verified" entry (firmware updates are one case).
6.2 Rev 1.3
- Fixed handling of (Cleaning cartridge) in Scanf1
- e.g. Scanf1 failed: Closed-"Data cartridge CLNU06L2(Cleaning cartridge) has issued a Tape Alert 22",Label: ,Alert:0
6.3 Rev 1.4 (10/1/2007)
- Rasdumpoutput from Mark Antoniccio (rasdumpoutputf) does not have a 0xD at the end of line so changes were made to break on a 0xA. This removed decrementing the line # for a zero length line which was a previous feature, hopefully not needed for the rasdump function.
- Found that the change above broke the handling of files with a double 0xA at the end of each line. Added code to detect this condition - now all variations of 0xA and 0xD should be handled.
- Added SDLT Log Parsing support
- Handle the fact that Error Modifiers for SDLT drives do NOT have error codes
- Clean up the situation where the 0x0 Error modifier was being handled as if it had 8 characters.
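To illustrate the line-ending variations described in the first two items of this revision, the Python sketch below splits raw bytes into lines while accepting 0xD 0xA, a lone 0xA, a lone 0xD, and a doubled 0xA as terminators. It is not the parser's actual code, and it simply drops empty lines, so it does not preserve original line numbers.

```python
import re


def split_ras_lines(data):
    """Split raw bytes into text lines.

    Accepts CR LF, LF, and CR as terminators. A doubled LF produces an
    empty line between records, which is discarded.
    """
    text = data.decode("latin-1")  # maps every byte, so decoding cannot fail
    return [line for line in re.split(r"\r\n|\r|\n", text) if line]


# Mixed terminators in one made-up buffer.
lines = split_ras_lines(b"one\r\ntwo\nthree\n\nfour\r")
print(lines)
```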
6.4 Rev 1.5 (11/6/2007)
- Added code to suppress any commas input for the customer name or the site name.
- Added support for Tape Alerts from the Data Cartridge in scanf5
- i.e. fixed the following error:
- Scanf5 failed: "Data cartridge at [1,1,1,2,1,1] has issued a Tape Alert 33"
- Added support for media id's appearing in the middle of a Summary line in scanf9
- i.e. fixed the following error:
- Scanf9 failed: "Desc"[52 - Tape System Write Failure] type=LTO1 rev=5AU1 media=REDAB1002321 Write errors while writing system area on unload"
- Added an "Open Year" to the output data to allow better time management in the final spreadsheet, ready for the New Year
- Note: This feature can only work if an assumption is made to carry the year from one RAS ticket to the next one. This causes an issue where the first entry of the first RAS ticket of the New Year is shown as being from the previous year. This will cause an isolated January entry when the previous year is selected in the pivot tables. This is easy to fix by manually changing the entry on the raw data sheet. A change to the program to deal with this is not simple and cannot be considered at this time, but will hopefully occur in a future release.
6.5 Rev 1.6
- Interim Build - not released
6.6 Rev 1.7 (11/8/2007)
- Removed a left over print statement from 1.5 testing that leaves a "Year: 2007" entry in the .txt result file
- Fixed an issue where the drive serial number was not being picked up in the "interim" data
- Added support for "Open" Ras tickets so that Ticket numbers are correctly updated.
- Note: There is no close time or date for Open tickets in the final spreadsheet.
6.7 Rev 1.9 (3/17/2008)
- Added the ability to deal with Tape Alert 22 entries that refer to the drive location rather than the Cartridge Barcode, e.g.:
- Scanf1 failed: Closed-"Data cartridge at [1,1,1,1,1,1](Cleaning cartridge) has issued a Tape Alert 22
- Also applied to the Scanf14 case
- Improved scan capability to include interim entries starting with "Cleaning cartridge", which will increase the number of TA 22 and TA 23 issues displayed.
- Corrected an issue that might cause FSC codes to appear incorrectly in column FSC_2, e.g., '0xb002c0 instead of '2c07
6.8 Rev 1.10 (3/28/2008)
Some major changes:
- Added support for the LTO-2 DRIVE OVER EJECT Issue. This will allow a quick parse of the RasDumpOutput to show all drives that have had this issue in a single chart.
- Note: the information on DRIVE OVER EJECT issues is added to the Sense Key and ASC/ASCQ columns of the .csv file
- Changed the program so that now, only lines with complete data will appear in the output .csv file. Up until now entries without complete data may have introduced duplicate Tape Alerts making an issue seem bigger than it was.
- Added an extra column in the output file with the Drive Number - Useful if you only know that Drive x was removed, and not its serial number. Initial review suggests that data from the RAS ticket may not be 100% reliable.
- Cleaned up some spurious Cleaning related messages that were appearing in the log file, possibly introduced in Rev. 1.9
- Added a new command line flag - /no_ras_cust_data. When used, the program will use the Customer name "No Data Customer" and the site name "No Data Site". This can be used to automate the program when parsing large numbers of files with a batch file.
- NOTE: to work, this must be on the command line before the /i2krasdump flag.
6.9 Rev 1.12 (4/30/2008) (Rev 1.11 - not released)
- Scanf24 issue: A feature was added in 1.10 to fill the Ticket # and Drive # variables whenever possible. This encountered some unexpected sequences, causing the program to stop. A change has been made so the program will gather whatever data it can, but a failure will not be considered an error
- Decided to use the rasdumpoutput filename as the "Customer" name when /no_ras_cust_data is set on the command line.
- Corrected an issue where the error code (FSC equivalent) for some OVER EJECT issues was carried over from the previous entry
- Re-wrote the check for an OVER EJECT issue that is carried out before parsing this kind of Ticket. The original implementation allowed many Over Eject issues to go undetected.
- Item 2 in Rev 1.10 reduced the amount of duplicate data parsed. Unfortunately, this also eliminated some unique data (files which have nothing but Closed or Verified entries are one example).
- Changes were made to store information from the last information packet and to output the data if the next packet is not either an "Interim" or "Cleaning" entry. This will reduce the number of incomplete/redundant entries, while ensuring that limited data is still reported if that is all that is available. One unwelcome side effect of the lack of "Interim" tickets is that the "Firmware" and "Open Year" information will be missing from the parsed data for tickets that have only "Closed" and "Verified" entries, which disables key features in the pivot tables created in the spreadsheets.
- Added two new features:
- /ras_split_drive: This is for use with the rasdump parser when crunching large amounts of data that exceed the 65536 line limit in Excel. When this option is used, the output data will be split into one output file per drive type.
- Note: When this option is used, four output files (one for each LTO version) will be created even if there is only data for one, two or three drive types available. The other files will have
- /ras_pick_year: This will allow data for a specific year to be displayed. The year has to immediately follow the flag, e.g.:
- Added a function to remove all duplicate entries where the Customer name, Ticket Number and Ticket Open Time are the same. This is primarily for the processing of multiple data files from the same site, but also has benefits where previous versions would report more than one occurrence of an issue if there were many "Interim" updates to the same ticket, even within one rasdumpoutput file.
- Another flag:
- /ras_just_remove_duplicates: This will only call the function to remove duplicates in the input file and output the data as a .csv file. It is used when data from multiple snapshots is concatenated and duplicates then need to be removed. The input file in this situation cannot have a .csv extension. This is used without the /rasdump flag.
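The duplicate-removal rule described in this revision (entries are duplicates when the Customer name, Ticket Number and Ticket Open Time are the same) can be sketched in Python as follows. This is an illustration of the rule, not the parser's implementation. The header names and sample rows are hypothetical, and keeping the first occurrence is an assumption.

```python
import csv
import io


def remove_duplicates(rows, key_columns):
    """Keep the first row for each distinct combination of `key_columns`."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


KEY = ("Customer", "Ticket", "Open Time")  # hypothetical header names

# Made-up rows: the first two share customer, ticket and open time.
sample = io.StringIO(
    "Customer,Ticket,Open Time,FSC\n"
    "Acme,18,10:15:00,2c07\n"
    "Acme,18,10:15:00,2c07\n"
    "Acme,19,11:30:00,0000\n"
)
unique = remove_duplicates(csv.DictReader(sample), KEY)
print(len(unique))
```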
Rev 2.0
The RASDump Parser Training Materials is a zip file that contains a recorded presentation (requires the WebEx Player) on how to use the RASDump Parser. Note: The content of the included video is a little out of date and does not represent the latest version of the tool. The associated Word file has been updated and should be the first place to look for detailed information. The video is still worth viewing, as it gives a good overview of using the data in the tool, specifically:
- Looking at the whole library, not individual issues
- Differences and improvements over 1.2 version
- Using RasDump 2.0
- Command line format
- Batch files
- Using the data
- Bring it into the spreadsheet
- Understanding pivot tables
- Communicating with customers
- Quick review
- Full library analysis
Download the actual parser utility under the Downloads section of the Scalar i2000 and i6000 page on CSWeb.