Museum

Home

Lab Overview

Retrotechnology Articles

⇒ Online Manual

Media Vault

Software Library

Restoration Projects

Artifacts Sought

Related Articles

intro(1sql)

sql(1sql)

intro(8sql)

statdump(8sql)

optimizedb(8sql)

Name

optimizedb − generate statistics for use by the optimizer. 

Syntax

optimizedb [options] [ULTRIX/SQL options] dbname
[ -rtablename [-acolumnname]... ]...  optimizedb −zf filename

Description

The optimizedb command retrieves values from all tables and columns in the specified database or (if the −r and −a options are used) from specific tables and columns. These values are used to generate statistics, which are stored in system catalog base tables and can be seen through the iistats and iihistograms views of the Standard Catalog Interface. The statistics generated by optimizedb are used by the ULTRIX/SQL query optimizer to select an efficient data retrieval strategy for execution of SQL statements.

You should run optimizedb to generate statistics for all columns that may appear in the search criteria specified in an SQL statement. Statistics for columns named in the select clause of a subselect expression or in the order by clause are not used by the query optimizer at run time.

After running optimizedb, you should run sysmod, especially if this is the first time you are running optimizedb on the database.

More complete and accurate statistics in the system catalogs generally result in more efficient query execution strategies, and therefore faster system performance. The process of generating such complete and accurate statistics may require some time, but a compromise between accurate statistics and the time to generate them can be achieved by specifying the −zx or −zs option.  Another compromise relies on how often you regenerate the statistics. The statistics need infrequent regeneration, usually only when a significant change has occurred in the distribution of a column’s values.  The statistics generated by optimizedb affect only the speed of execution and do not determine whether query execution can occur. A query will execute if there are no statistics for any columns whatsoever or if statistics are incorrect.

The statistics generated by the optimizedb command for any column consist of two basic elements: The number of unique values in the column and a histogram with a variable number of variable-width cells. The accuracy of the histograms can be controlled by the −zu# and −zr# options.  Increasing the number of cells in the histograms increases the amount of disk space required for the iihistograms table and thus increases somewhat the amount of disk space and time required by the query optimizer at run time.  However, the increased accuracy of the statistics generally results in more efficient data retrieval strategies when SQL statements execute. 

Note

While optimizedb is running, ULTRIX/SQL does not lock either the database or individual tables.

Options

−acolumnname
Generate statistics for columnname in the table specified by the −rtablename option.  By default, optimizedb generates statistics for all columns in the table.

−i filename
Direct optimizedb to read statistics from filename rather than generate statistics directly from values in the database.  For filename, specify the name of a file that has been generated by the statdump command using the −o option. The file generated by statdump is in ASCII format and can be edited. However, only two types of edits are acceptable: a) you can modify values, and b) you can add rows describing cells. Do not change the format of the file; that is, do not change the order in which data appears or add an incomplete new row. When the −r and −a options are used in conjunction with the −ifilename option, they act as filters, causing optimizedb to read in from filename only statistics that belong to the specified table or column.  The optimizedb command does not use the row and page count values in filename unless you also specify the −zp option.  Row and page count values are vital for correct operation of the database.  Give careful consideration to any filename edits that change row and page count values. 

−rtablename
Generate statistics for tablename.  By default, optimizedb generates statistics for all tables in dbname. 

−zf filename
Direct optimizedb to read a file specified by filename for all other command line options, database names, and additional command line parameters. This file must contain only one option per line (see Examples).  If the −zf option is specified, no other options or parameters can appear on the command line; rather, they must be included in the specified file. 

−zhPrint the histogram that was generated for each column. This option also implies the −zv option. 

−zkGenerate statistics for columns that are indexed or keys for the table in addition to columns specified on the command line. 

−zpDirect optimizedb to read the row and page count values in the file specified with the −i option and to store those values in the appropriate system catalog.  You can see the values through the iitables view of the Standard Catalog Interface. 

−zr#Set the maximum number of cells that an inexact histogram can contain to #.  In an inexact histogram, each cell represents a range of values. The default maximum number of cells for an inexact histogram is 15. 

−zs[s]#
Create statistics based on sample data. The percentage of table rows sampled is determined by the value of #. This value must be a floating-point number in the range of 0 to 100. If you specify the optional s, you direct ULTRIX/SQL to sort the tuple identifiers (TIDs), which are used to retrieve the sample rows, before the rows are retrieved. Sorting the TIDs decreases retrieval time but increases the amount of memory required by optimizedb.

−zvPrint information about each column as it is being processed. 

−zu#Set the maximum number of cells that an exact histogram can contain to #. In an exact histogram, each cell represents a unique value. The default for maximum number of cells in an exact histogram is 100.  If the number of unique values exceeds either the default or specified number (whichever applies to command execution), optimizedb creates an inexact histogram. (You can use the −zr option to control the number of cells in an inexact histogram.)  Note that optimizedb always attempts to create an exact histogram first, but if this is not possible, then creates an inexact histogram.

−zxDirect optimizedb to determine only the minimum and maximum values for each column rather than full statistics. Because minimum and  maximum values for columns from the same table can be determined by a single scan through the table, the −zx option provides a quick way to generate a minimal set of statistics. Minimal statistics cannot be generated for columns holding only null values. 

ULTRIX/SQL options
Direct optimizedb to pass the specified options to ULTRIX/SQL at SQL statement execution time. ULTRIX/SQL options are those listed for the sql command. See sql(1sql).

Examples

Generate full statistics for all columns in all tables of the empdata database:

optimizedb empdata

Generate statistics for key or indexed columns of the employee and dept tables of the empdata database and also generate statistics for the dno column of the dept table:

optimizedb -zk empdata -remployee -rdept -adno

Create a file named flagfile that contains options and parameters for the optimizedb command line.  For example, to specify the same statistics as the preceding example, flagfile would contain the following lines:

-zk
empdata
-remployee
-rdept
-adno

Direct optimizedb to read options and parameters from flagfile:

optimizedb -zf flagfile

Generate statistics for all key or indexed columns in employee and salhist tables in the empdata database and include in those statistics only the minimum and maximum values from the columns.  Print status information as each column is being processed:

optimizedb -zk -zv -zx empdata -remployee -rsalhist

Generate statistics for all columns in the employee table of the empdata database and allow up to 75 unique values from each column in the table before merging adjacent values into the same histogram cell:

optimizedb -zu75 empdata -remployee

Files

$II_DATABASE/ingres/data/default/dbname
Default directory for database files associated with dbname.  In addition, note that the directory that contains iidbdb, the ULTRIX/SQL system catalogs database, is $II_DATABASE/ingres/data/default/iidbdb.

See Also

intro(1sql), sql(1sql), intro(8sql), statdump(8sql)

Typewritten Software • bear@typewritten.org • Edmonds, WA 98026