periph.diag(8) — Stardent Computer Inc. (October 15, 1988)
NAME
periph.diag − tests the Titan peripherals
DESCRIPTION
This document describes the Titan peripheral diagnostics and their use.
BOARDS THAT MUST BE INSTALLED
This test requires a CPU board, at least one memory board, an I/O board, and an Adaptec controller card installed in the I/O board.
DETAILS
These devices are supported:
1. SCSI PRIAM disk.
2. SCSI WANGTEK 1/4" tape.
3. SCSI HP 1/2" tape.
The following notes apply to these tests:
•The peripheral diagnostic tests run stand-alone similar to the existing diagnostics.
•The tests use the standard diagnostic monitor for user interface.
•The tests are single-threaded, i.e. each device is tested separately without concurrency. Only one test, backup simulation, provides some concurrency; it copies data from one device to another using read and write commands.
•The peripheral diagnostics are intended for use by Field Engineering and Manufacturing Process, probably at the initial test, burn-in, and final acceptance test stations.
•SCSI disconnect is always enabled on all commands.
RUNNING THE TESTS
WARNING: There is no hardware for executing a SCSI bus reset from the Ardent machine. This means that if the bus is hung, the user must reset the machine from the PROM by entering "reset" before booting the test.
WARNING: The format test contained here only tests the ability of the system to format a disk. IT DESTROYS THE CURRENT DISK DATA AND PARTITIONING INFORMATION. If you use the format command, you will have to re-format the disk using the standard system format command.
There are two ways to interface with the program: using command line arguments and using menu selections. To run the diagnostic from the PROM prompt, enter:
PROMx> b periph.diag
A menu will be displayed. The user may run tests or change fail limits from the menu interface. The first menu displayed is called the top menu. From the top menu, the user can select second level submenus, such as the disk and tape submenus.
It is also possible to run tests directly without going through the menu. This is done by entering:
PROMx> b periph.diag all
SPECIAL COMMAND LINE PARAMETERS
In addition to the command line arguments supported by the diagnostic monitor, the following are offered:
all This causes the program to find the devices present in the system, then run all the tests on them. All disks and tapes found will be tested.
alld This causes the program to find the devices present in the system, then run all the disk tests on the first disk found.
allt This causes the program to find the devices present in the system, then run all the tape tests on the first tape found.
dm This causes the program to skip the top menu and display the disk menu directly. Normally, the disk menu is a submenu selected from the top menu.
tm This causes the program to skip the top menu and display the tape menu directly. Normally, the tape menu is a submenu selected from the top menu.
dco This allows the user to specify the disk controller number to be used when selecting disk tests. An equal sign, ’=’, then an unsigned integer value must follow the argument (no spaces). The default value of this argument is -1 indicating that the program will determine the disk controller number when it finds the devices present in the system. If the user specifies a value, e.g. dco=1, then the specified controller number will be used.
dtar This allows the user to specify the disk target number to be used when selecting disk tests. An equal sign, ’=’, then an unsigned integer value must follow the argument (no spaces). The default value of this argument is -1 indicating that the program will determine the disk target number when it finds the devices present in the system. If the user specifies a value, e.g. dtar=1, then the specified target number will be used.
tco This allows the user to specify the tape controller number to be used when selecting tape tests. An equal sign, ’=’, then an unsigned integer value must follow the argument (no spaces). The default value of this argument is -1 indicating that the program will determine the tape controller number when it finds the devices present in the system. If the user specifies a value, e.g. tco=1, then the specified controller number will be used.
ttar This allows the user to specify the tape target number to be used when selecting tape tests. An equal sign, ’=’, then an unsigned integer value must follow the argument (no spaces). The default value of this argument is -1 indicating that the program will determine the tape target number when it finds the devices present in the system. If the user specifies a value, e.g. ttar=1, then the specified target number will be used.
rcon This causes the program to prompt the user for confirmation before reassigning a defective disk block detected during defect scanning. When prompted, the user may select ’y’ to reassign the defective block or ’n’ to bypass it.
ss This causes the program to run in single step mode. The program prompts the user before executing any SCSI command. When prompted, the user may press ENTER to execute the command, enter "s" to skip it, or enter "g" to execute the command and disable single step mode. On-line help is also available by entering "h".
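The name=value arguments above (dco, dtar, tco, ttar) could be recognized along the following lines. This is only a minimal sketch, not the code used by periph.diag; the function name parse_unit_arg and its return convention are assumptions made for illustration.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from periph.diag): if arg has the form
 * "<name>=<unsigned integer>" with no spaces, store the value in *out and
 * return 1. Otherwise leave *out untouched and return 0, so the caller's
 * default of -1 ("let the program find the device") stays in effect.
 */
int parse_unit_arg(const char *arg, const char *name, int *out)
{
    size_t n = strlen(name);
    char *end;
    long v;

    if (strncmp(arg, name, n) != 0 || arg[n] != '=')
        return 0;                       /* different argument */
    if (arg[n + 1] < '0' || arg[n + 1] > '9')
        return 0;                       /* empty, signed, or space after '=' */
    v = strtol(arg + n + 1, &end, 10);
    if (*end != '\0' || v > INT_MAX)
        return 0;                       /* trailing characters or out of range */
    *out = (int)v;
    return 1;
}
```

A caller would initialize each variable to -1 and pass every command line word through the helper, e.g. parse_unit_arg(word, "dco", &dco).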
DISK DIAGNOSTICS DESCRIPTION
This section describes the SCSI disk diagnostic tests.
Disk Parameters
The SCSI READ CAPACITY command will be used to determine disk parameters, such as the block size and number of blocks. Fail limits and test parameters are changeable by the user through the disk parameter submenu.
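As an illustration of how the disk parameters follow from READ CAPACITY, the sketch below decodes the 8 bytes of data that the command returns: a big-endian address of the last block followed by a big-endian block length. The struct and function names are assumptions for this example and do not come from the diagnostic itself.

```c
#include <stdint.h>

/* Illustrative result type; the names are not from periph.diag. */
struct disk_capacity {
    uint64_t num_blocks;    /* last block address + 1 */
    uint32_t block_size;    /* bytes per block */
};

/* Read a 32-bit big-endian value, the byte order SCSI uses. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode the 8 bytes of READ CAPACITY data returned by the drive. */
struct disk_capacity parse_read_capacity(const uint8_t data[8])
{
    struct disk_capacity cap;

    cap.num_blocks = (uint64_t)be32(data) + 1;  /* bytes 0-3: last block address */
    cap.block_size = be32(data + 4);            /* bytes 4-7: block length */
    return cap;
}
```

For the data bytes 00 0F 42 3F 00 00 02 00, this yields 1,000,000 blocks of 512 bytes.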
Disk Write Levels
Three levels of disk write testing are available:
1. Non-destructive (no writing). All parts of the test that write to the disk are skipped. The test can be run on a user disk without causing any damage to its data. This is the default level.
2. Writing is allowed on the diagnostic partition only. The test can run on a user disk without causing damage to its data, unless there is a disk seek or servo problem, in which case user data can be destroyed. The diagnostic partition is a small area of the disk allocated by the format program. Old versions of the format program did not support the diagnostic partition. The diagnostic test will detect the absence of the diagnostic partition and will not write to the disk.
3. Writing is allowed on the entire disk (destructive). The test will overwrite whatever data is on the disk; therefore, the user must back up the disk before running this level if the disk data needs to be preserved.
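The three write levels amount to a simple gate in front of every write. The following sketch shows one way such a gate might look; the diag_partition structure and the write_allowed function are hypothetical and are not taken from the diagnostic source.

```c
/* Write levels as numbered in the list above. */
enum write_level {
    WR_NONE = 1,
    WR_DIAG_PARTITION = 2,
    WR_ENTIRE_DISK = 3
};

/* Hypothetical description of the diagnostic partition. present is 0 for
 * disks formatted by an old format program that did not create one. */
struct diag_partition {
    int present;
    unsigned long first_block;
    unsigned long num_blocks;
};

/* Return 1 if writing count blocks starting at block is permitted. */
int write_allowed(enum write_level level, const struct diag_partition *dp,
                  unsigned long block, unsigned long count)
{
    switch (level) {
    case WR_ENTIRE_DISK:
        return 1;
    case WR_DIAG_PARTITION:
        /* The whole transfer must lie inside the diagnostic partition. */
        return dp->present &&
               block >= dp->first_block &&
               count <= dp->num_blocks &&
               block - dp->first_block <= dp->num_blocks - count;
    case WR_NONE:
    default:
        return 0;
    }
}
```

With level 2 and no diagnostic partition on the disk, the gate refuses every write, which matches the behavior described for disks formatted by old versions of the format program.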
Disk Tests
Before any disk test is executed, some initial checks are done. First the I/O board fuse and bus state are checked. The SCSI INQUIRY command is executed and the device type is verified. The SCSI REQUEST SENSE command is executed to clear any pending SCSI UNIT ATTENTION condition. Then the SCSI READ CAPACITY command is executed to determine the disk size and block length. If any of these checks fails, the test ends.
•SCSI check. The objective is to verify that the I/O board can communicate with the target and that the target can execute basic SCSI commands used by the operating system. This excludes the FORMAT and REASSIGN BLOCK commands. The WRITE command may or may not get executed depending on the write test level. The SCSI disconnect/reconnect feature is also verified during the read and write operation.
•Performance check. The objective is to make transfer rate benchmark measurements. Four benchmarks are performed: sequential (sustained) read transfer rate, sequential (sustained) write transfer rate, random access read transfer rate, and close random access read transfer rate. The sustained read transfer rate is measured by reading 1.28 MB, 64 KB at a time, starting from block 0. The sustained write transfer rate is skipped if the write level is 1; otherwise it is measured by writing 1.28 MB, 64 KB at a time, starting from block 0 (write level 3) or within the diagnostic partition (write level 2). The random transfer rate is measured by reading 100 KB, 1 block at a time, from random block addresses. The close random transfer rate is measured by reading 100 KB, 1 block at a time, from random block addresses within the first 2000 blocks.
•Self-test. This is done by executing the SCSI SEND DIAGNOSTIC command to the drive, causing it to execute self-diagnostics and return results.
•Volume header check. This is done by calling the SCSI driver to open the disk volume header partition, reading the volume header, checking the magic number and checksum, then verifying the diagnostic partition.
•Sequential defect scan. The objective is to scan blocks sequentially, looking for defects. This is done by writing, reading, and checking for medium or recovered errors detected by the disk.
•Random defect scan. The objective is to scan blocks randomly, looking for defects. This is done by writing the disk with a sequential pattern, then reading the disk using random addresses and checking for defects.
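The arithmetic behind the performance check can be sketched as follows. The sketch assumes that 1.28 MB means 1280 KB of 1024 bytes each, which works out to exactly 20 transfers of 64 KB; the document does not state this explicitly. The function names are illustrative, and the rates are in bytes/second to match the limits in the disk parameter submenu.

```c
#define KB              1024UL
#define SUSTAINED_CHUNK (64 * KB)     /* size of one sustained transfer */
#define SUSTAINED_TOTAL (1280 * KB)   /* assumption: 1.28 MB = 1280 KB */

/* Number of 64 KB transfers in one sustained read or write benchmark. */
unsigned long sustained_transfers(void)
{
    return SUSTAINED_TOTAL / SUSTAINED_CHUNK;
}

/* Transfer rate in bytes/second; 0 if no time was measured. */
double transfer_rate(unsigned long bytes, double seconds)
{
    return seconds > 0.0 ? (double)bytes / seconds : 0.0;
}

/* The disk fails only if the measured rate is below the minimum. */
int rate_passes(double rate, double minimum)
{
    return rate >= minimum;
}
```

For example, moving the full 1,310,720 bytes in 2 seconds gives 655,360 bytes/second, which would pass a minimum of 600,000.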
DEFECT SCAN STRATEGY
•The drive’s defect detection and error reporting are used.
•Blocks with RECOVERED ERRORS as well as MEDIUM ERRORS are reassigned (mapped out) and logged as soon as detected. The reassigned block is rewritten with the pattern.
•At the end of the test, counts of the RECOVERED and MEDIUM errors are reported and the drive is passed or failed.
•The STOP AFTER X ERRORS capability of the diagnostic monitor cannot be used for this since defects are not necessarily errors.
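The strategy above can be summarized in a small tally routine. This is a sketch under stated assumptions: the sense key values 0x1 (RECOVERED ERROR) and 0x3 (MEDIUM ERROR) are the standard SCSI ones, while the structure, the function names, and the way the limits are passed in are hypothetical. Only the recovered and medium limits are modeled here, not the overall defect limit.

```c
#define SENSE_RECOVERED_ERROR 0x1   /* standard SCSI sense key */
#define SENSE_MEDIUM_ERROR    0x3   /* standard SCSI sense key */

struct defect_counts {
    unsigned recovered;
    unsigned medium;
};

/* Record the sense key reported for a block. Returns 1 if the block is a
 * defect that should be reassigned and rewritten, 0 otherwise. */
int note_sense_key(struct defect_counts *c, unsigned sense_key)
{
    switch (sense_key & 0x0F) {
    case SENSE_RECOVERED_ERROR:
        c->recovered++;
        return 1;
    case SENSE_MEDIUM_ERROR:
        c->medium++;
        return 1;
    default:
        return 0;
    }
}

/* End-of-test verdict: the drive fails if either count exceeds its limit. */
int drive_passes(const struct defect_counts *c,
                 unsigned max_recovered, unsigned max_medium)
{
    return c->recovered <= max_recovered && c->medium <= max_medium;
}
```

Because the verdict is computed once from the totals, the scan can keep going after each defect instead of stopping at the first one.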
TAPE DIAGNOSTICS DESCRIPTION
The tape diagnostic requires a scratch tape to be mounted on the drive. The test writes test patterns to the tape and, therefore, destroys any previous data on the tape.
Tape Tests
Before any tape test is executed, some initial checks are done. First the I/O board fuse and bus state are checked. The SCSI INQUIRY command is executed and the device type is verified. The tape is checked for Write Protect errors. The drive is put in BUFFERED mode using the SCSI MODE SELECT command. Finally, the SCSI REWIND command is issued in case the tape needs to be rewound. If any of these checks fails, the test ends.
•SCSI check. The objective is to verify that all the SCSI commands used by the operating system can be executed, except the ERASE command.
•Self-test. This is done by executing the SCSI SEND DIAGNOSTIC command to the drive, causing it to execute self-diagnostics and return results.
•Erase test. The SCSI ERASE command is executed. This takes about 10 minutes.
•Sequential write/read. The objective is to write all blocks sequentially, then read them back.
ASSUMPTIONS
•The I/O board SCSI channels, DMA hardware, memory, SCSI cables, and other system hardware are tested using other diagnostics.
•The drive is already configured and formatted as the customer will see it using the MODE SELECT and FORMAT commands. This is done at the receiving inspection station and is not part of the diagnostic.
POSSIBLE FUTURE ENHANCEMENTS
The following are ideas that can be incorporated in the future:
•VME and SMD diagnostics.
•Multi-threaded operation. More than one device can be tested at the same time using the SCSI disconnect feature. This would cut down test time and emulate real life usage.
•Multiple disk pattern support: The test can support more than one pattern, say 3: low frequency, high frequency, and worst case. Defect scan would use all the different patterns. The program would determine the patterns to use automatically depending on the drive model. For example, the worst case pattern may differ from one disk model to another, so the program would automatically use the correct worst case pattern for the model. Patterns could also be mixed during the random defect scan. For example, the drive may be written with one pattern, then a number of random blocks are written with a different pattern, then reading is done.
•SCSI command level debug environment: Provide the user with the capability to execute SCSI commands using commands like read and write. Provide variables that the user can set, such as block address, which are used with SCSI commands. Allow the user to modify and display the data buffer. Provide error information.
•A new parameter for stopping if a number of disk defects is reached.
•Tape block size test. The tape is written and read using all possible block sizes.
MENU ITEMS SPECIFIC TO THIS TEST
The menu structure is as follows:
                       top menu
                           |
                           |
           ---------------------------------
           |                               |
     disk submenu                    tape submenu
           |                               |
           |                               |
disk parameter submenu          tape parameter submenu
The top menu allows the user to run all tests, all disk tests, all tape tests, and select submenus. The disk submenu allows the user to run one or all disk tests, and select the disk parameter submenu. The disk parameter submenu allows the user to change disk test parameters and fail limits. The tape submenu allows the user to run one or all tape tests, and select the tape parameter submenu. The tape parameter submenu allows the user to change tape test parameters and fail limits.
TOP MENU
In addition to the menu selections supported by the diagnostic monitor, the following are available:
dm Select the disk submenu.
tm Select the tape submenu.
nd= Change the expected minimum number of disks in the system. The expected minimum number of disks is used in the "all" option only. If the number of disks present in the system is less than the expected minimum number of disks, an error message is reported and the test is considered to have failed.
nt= Change the expected minimum number of tapes in the system. The expected minimum number of tapes is used in the "all" option only. If the number of tapes present in the system is less than the expected minimum number of tapes, an error message is reported and the test is considered to have failed.
find Find bus devices present in the system. The bus devices present in the system are found automatically at the beginning of the program. This option allows the user to find the present devices again. Finding present devices is done by issuing the SCSI INQUIRY command to all possible target IDs on all SCSI bus channels, starting from bus A target 0 and ending with bus B target 7.
alld Run all disk tests on one disk. The disk to test is the first disk found, starting from SCSI bus A, target ID 0. The bus and target ID of the disk may be changed using command line arguments or menu selections.
allt Run all tape tests on one tape. The tape to test is the first tape found, starting from SCSI bus A, target ID 0. The bus and target ID of the tape may be changed using command line arguments or menu selections.
bak Run the backup simulation test. The test is run only if there is a disk and a tape on the system. The disk and tape ID to use are those of the first disk and tape found on the system. The disk and tape addresses can be changed using command line arguments or menu selections. The test copies 640 blocks from the disk to the tape, 32 blocks at a time, using disk read and tape write commands. This test is run on one disk and tape as part of the "all" option also.
all Run all peripheral tests on all present devices.
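The scan order used by the find selection (bus A target 0 through bus B target 7) can be sketched as a pair of nested loops. Everything named here is hypothetical: the inquiry_fn callback stands in for issuing a real SCSI INQUIRY, and demo_inquiry is a stub that pretends devices exist at two addresses.

```c
#define NUM_BUSES   2   /* SCSI bus A and bus B */
#define NUM_TARGETS 8   /* target IDs 0 through 7 */

struct scsi_addr {
    int bus;            /* 0 = bus A, 1 = bus B */
    int target;
};

/* Hypothetical probe: returns nonzero if the target answers INQUIRY. */
typedef int (*inquiry_fn)(int bus, int target);

/* Scan in the documented order and record up to max_found responders.
 * Returns the number of addresses stored in found[]. */
int find_devices(inquiry_fn inquiry, struct scsi_addr *found, int max_found)
{
    int n = 0;

    for (int bus = 0; bus < NUM_BUSES; bus++) {
        for (int target = 0; target < NUM_TARGETS; target++) {
            if (n < max_found && inquiry(bus, target)) {
                found[n].bus = bus;
                found[n].target = target;
                n++;
            }
        }
    }
    return n;
}

/* Stub standing in for real hardware: pretend devices sit at
 * bus A target 0 and bus B target 4. */
int demo_inquiry(int bus, int target)
{
    return (bus == 0 && target == 0) || (bus == 1 && target == 4);
}
```

Since the list is filled in scan order, the lowest-addressed device of a given type is always encountered first, which is how "first disk found" and "first tape found" are defined above.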
In addition to the states supported by the diagnostic monitor, the following are available:
disks Expected minimum number of disks.
tapes Expected minimum number of tapes.
DISK SUBMENU
In addition to the menu selections supported by the diagnostic monitor, the following are available:
pm Select the disk parameter submenu.
chk Run the disk SCSI check test.
bm Run the disk benchmark test.
self Run the disk self test.
vh Run the disk volume header test.
seq Run the disk sequential defect scan.
rnd Run the disk random defect scan.
fmt Format the disk. Note that this is not a test and is not executed as part of the "alld" or "all" options. This utility must be used very carefully, since it issues the SCSI FORMAT command, erasing the entire disk. This option can only be executed if the write level is 3. (See Disk Write Levels).
alld Run all disk tests on one disk, i.e. options:
chk bm self vh seq rnd
In addition to the states supported by the diagnostic monitor, the following are available:
controller Controller number of the disk to test.
target Target ID of the disk to test.
The controller and target numbers are determined by the program when it finds the devices present in the system. However, these values may be overridden using command line arguments or menu selections.
DISK PARAMETER SUBMENU
In addition to the menu selections supported by the diagnostic monitor, the following are available:
co= Change controller number of disk to test.
tar= Change target ID of disk to test.
df= Change maximum number of defects on the disk to be used in the SCSI check and defect scan tests. The disk is failed if the number of defects exceeds this.
md= Change maximum number of medium defects (unrecoverable medium errors) detected during the defect scan tests. The disk is failed if the number of unrecoverable defects exceeds this.
rd= Change maximum number of recovered defects (recoverable medium errors) detected during the defect scan tests. The disk is failed if the number of recoverable defects exceeds this.
sr= Change minimum sustained transfer rate in bytes/second in the disk benchmark test. The disk is failed if the transfer rate is below this.
rr= Change minimum random transfer rate in bytes/second in the disk benchmark test. The disk is failed if the transfer rate is below this.
cr= Change minimum close random transfer rate in bytes/second in the disk benchmark test. The disk is failed if the transfer rate is below this.
lb= Change low block address to start defect scan from.
hb= Change high block address to end defect scan at.
rt= Change random defect scan number of loops.
pat Change disk write pattern by prompting the user to enter the pattern.
wr Change write level by prompting the user to enter the write level. Supported write levels are:
1 No disk writing allowed.
2 Writing can be done on the diagnostic partition.
The user is prompted for confirmation.
3 Writing is done on the entire disk.
The user is prompted for confirmation twice.
ft Change disk SCSI format type by prompting the user to enter the format type. Supported format types are:
1 Format with no defects.
2 Format with factory defects only.
3 Format with all defects (factory and grown).
In addition to the states supported by the diagnostic monitor, the following are available:
wr Write level.
df Maximum number of defects.
md Maximum number of medium defects.
rd Maximum number of recovered defects.
sr Minimum sustained transfer rate.
rr Minimum random transfer rate.
cr Minimum close random rate.
lb Defect scan low block address.
hb Defect scan high block address.
rt Random defect scan number of loops.
ft Format type.
TAPE SUBMENU
In addition to the menu selections supported by the diagnostic monitor, the following are available:
pm Select the tape parameter submenu.
chk Run the tape SCSI check test.
self Run the tape self test.
ers Run the tape erase test.
wrrd Run the tape write/read test.
retn Tape retension. Note that this is not a test and is not executed as part of the "allt" or "all" options.
allt Run all the tape tests, i.e. options:
chk self ers wrrd
In addition to the states supported by the diagnostic monitor, the following are available:
controller Controller number of the tape to test.
target Target ID of the tape to test.
The controller and target numbers are determined by the program when it finds the devices present in the system. However, these values may be overridden using command line arguments or menu selections.
TAPE PARAMETER SUBMENU
In addition to the menu selections supported by the diagnostic monitor, the following are available:
co= Change controller number of tape to test.
tar= Change target ID of tape to test.
wrb= Change number of blocks in the write/read test.
wrl= Change number of loops in the write/read test.
pat Change tape write pattern by prompting the user to enter the pattern.
In addition to the states supported by the diagnostic monitor, the following are available:
wrb Number of blocks in the write/read test.
wrl Number of loops in the write/read test.
INTERPRETING THE ERROR CODES
The following errors may be generated by this test. (The actual error wording may differ; however, the type of error is as summarized here.)
[TO BE EXPLAINED IN DETAIL]
/* general errors */
#define ERR_TOO_FEW_DISKS            1
#define ERR_TOO_FEW_TAPES            2
#define ERR_FUSE                     3
#define ERR_BUS_HUNG                 4
#define ERR_PROGRAM_ERROR            5
/* scsi errors */
#define ERR_TOO_MANY_DEFECTS         20
#define ERR_SCSI_COMMAND_ERROR       21
#define ERR_BAD_DEVICE_TYPE          22
#define ERR_SCSI_READ                23
#define ERR_SCSI_WRITE               24
#define ERR_NO_DISCONNECT            25
#define ERR_SCSI_SENSE               26
#define ERR_MISSING_SENSE            27
/* disk errors */
#define ERR_NO_GOOD_BLOCK            40
#define ERR_SUSTAINED_READ_RATE      41
#define ERR_SUSTAINED_WRITE_RATE     42
#define ERR_RANDOM_TRANSFER_RATE     43
#define ERR_DVH_OPEN                 50
#define ERR_DVH_READ                 51
#define ERR_DVH_MAGIC_NUMBER         52
#define ERR_DVH_CHECKSUM             53
#define ERR_DVH_DIAGNOSTIC_PARTITION 54
/* tape errors */
#define ERR_NO_FILEMARK              60
#define ERR_WRITE_PROTECT            61
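Until the codes are explained in detail, the comment groupings in the list above give a first hint about where a failure lies. The sketch below maps a numeric code to its group; the enum and function names are hypothetical, and the ranges are simply read off the listed values rather than taken from any documented rule.

```c
/* Groups named after the comments in the error code list. */
enum err_group {
    ERRGRP_UNKNOWN,
    ERRGRP_GENERAL,
    ERRGRP_SCSI,
    ERRGRP_DISK,
    ERRGRP_TAPE
};

/* Classify an error code by the values that appear in the list above. */
enum err_group err_group_of(int code)
{
    if (code >= 1 && code <= 5)
        return ERRGRP_GENERAL;          /* ERR_TOO_FEW_DISKS .. ERR_PROGRAM_ERROR */
    if (code >= 20 && code <= 27)
        return ERRGRP_SCSI;             /* ERR_TOO_MANY_DEFECTS .. ERR_MISSING_SENSE */
    if ((code >= 40 && code <= 43) || (code >= 50 && code <= 54))
        return ERRGRP_DISK;             /* benchmark and volume header errors */
    if (code == 60 || code == 61)
        return ERRGRP_TAPE;             /* ERR_NO_FILEMARK, ERR_WRITE_PROTECT */
    return ERRGRP_UNKNOWN;              /* value not in the list */
}
```

Codes that do not appear in the list, such as 45, are reported as unknown rather than guessed at.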
September 29, 2021