
 awk(C)                        06 January 1993                         awk(C)


 Name

    awk: awk, oawk, nawk   - pattern scanning and processing language

 Syntax

    awk [ -Fsep ] [ [-e] 'prog' ] ... [ -f progfile ] ...
    [ [-v] var=value ... ] [ file ... ]

 Description

    awk is an interpreted pattern-matching language with a wide range of
    applications.  See the chapter on awk in the User's Guide for a complete
    discussion of its use.  (nawk and oawk are alternative versions of awk.
    awk should be used in preference to nawk or oawk. See ``Notes'' below for
    more details.)

    You can enter an awk program (prog) directly from the command-line,
    enclosing it in single quotes to prevent interpretation by the shell. The
    -e flag preceding prog is optional.  For longer awk programs, it may be
    more convenient to fetch them from a file (progfile); this is done with
    the -f option.  You can specify multiple -e programs and -f files; they
    are concatenated together (with intervening newlines) to form the program
    that is executed. (This is like the -e and -f options in sed(C).)

    Input files are read in order.  If no files are given on the command
    line, the standard input is used.

    You can change the awk field separator on the command line with the -Fsep
    option, where the regular expression sep is the new delimiter.  You can
    also specify the field separator as a single character; this sets the
    field separator to be that character.  awk -Ft is a special case that
    sets the field separator to a tab.  (The field separator can also be
    changed within an awk program using the variable FS.)

    You can set the value of variables you are going to use in the awk pro-
    gram from the command line using var=value, where var is the variable and
    value is its initial value.  This can be preceded with an optional -v.
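
    As a minimal sketch of the two command-line options just described, the
    following splits a colon-separated record with -F and passes a variable
    with -v.  The sample record and the variable name label are made up for
    illustration.

```shell
# Split on ":" with -F and preset a variable with -v.
# The input record and the name "label" are hypothetical.
out=$(printf 'root:x:0:0\n' | awk -F: -v label=user '{ print label, $1, $3 }')
echo "$out"
```

    With this input, the command prints ``user root 0''.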

    What awk does with your program

    After awk checks the syntax of your program, it reads each record (gen-
    erally, each line) of the input and attempts to match it against the pat-
    terns specified in the program. For each pattern in the program, there
    may be an associated action performed when an input record matches the
    pattern. Actions can be made up of a single action statement, like print,
    or of a combination of statements.

    A pattern-action statement has the form:

       pattern { action }

    Either pattern or action may be omitted. If there is no action with a
    pattern, the matching line is printed. If there is no pattern with an
    action, the action is performed on every input line.
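
    The two shortened forms can be tried as follows; this is only a sketch,
    and the two input lines are sample data.

```shell
# Pattern only: matching lines are printed unchanged.
matched=$(printf 'apple 3\nbanana 7\n' | awk '/banana/')
# Action only: the action runs for every input line.
firsts=$(printf 'apple 3\nbanana 7\n' | awk '{ print $1 }')
echo "$matched"
echo "$firsts"
```

    The first command prints only the banana line; the second prints the
    first field of both lines.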

    Programming conventions

    Pattern-action statements, and individual statements within actions, gen-
    erally begin on a new line.

    The opening brace ({) must be on the same line as the pattern for which
    the actions should be performed.  Multiple action statements may appear
    on a single line if they are separated by semicolons (;).

    A newline can be hidden with a backslash (\), so you can use backslash-
    newline to continue a long line.

    Comments in awk are introduced by a number sign (#) and end with the end
    of the line. Comments can appear anywhere in a line.

    Blank lines and whitespace (blanks and tabs) in an awk program are
    ignored.

    Fields, records, and built-in variables

    awk presumes that each field in a record is separated by whitespace, and
    that each record consists of one line of input.  Both of these defaults
    can be modified.

    You can change the field separator on the command line, as discussed ear-
    lier, using the -Fsep option.  You can also reset the value of the input
    field separator variable FS from within your awk program.  FS can be set
    to any regular expression.  The following action is a special case that
    resets FS to its default behavior:

       BEGIN { FS = " " }

    The BEGIN in this example is a special pattern that matches before the
    first record is read; this is the mechanism awk provides for doing intro-
    ductory processing.

    Setting FS to a single blank is equivalent to:

       BEGIN { FS = "[ \t]+" }

    That is, setting FS to a single blank tells awk to regard any combination
    of blanks and tabs (any whitespace) as a field separator.  Note that once
    you set the input field separator to something other than a single blank
    (that is, to all whitespace), leading whitespace (before the first field)
    is no longer ignored.
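
    The effect on leading whitespace can be seen by counting fields.  This
    sketch assumes a sample line that begins with two blanks.

```shell
# Default FS: leading blanks are ignored, so the line has two fields.
default_nf=$(printf '  a b\n' | awk '{ print NF }')
# Regular-expression FS: the leading blanks delimit an empty first field.
regex_nf=$(printf '  a b\n' | awk 'BEGIN { FS = "[ \t]+" } { print NF }')
echo "$default_nf $regex_nf"
```

    This prints ``2 3'': with the regular expression as separator, the empty
    string before the leading blanks counts as field 1.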

    awk is designed to consider each line of input as a complete record, but
    you can get awk to recognize multiline records by resetting the variable
    RS.

    To get awk to recognize multiline records, set RS to the null string:

       BEGIN { RS = "" }

    Now, awk will presume that records are separated by one or more blank
    lines.  When you reset RS like this to use multiline records, newline is
    always considered a field separator, no matter what the value of FS is.
    To restore the default record separator, reset RS to a newline:

       { RS = "\n" }
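
    A short sketch of multiline records, assuming sample input in which each
    record is a name line followed by a department line, with a blank line
    between records:

```shell
# With RS set to the null string, each blank-line-separated block is one
# record, and newline acts as a field separator inside it.
out=$(printf 'Ann\nSales\n\nBob\nDocs\n' |
    awk 'BEGIN { RS = "" } { print NR ": " $1 " / " $2 }')
echo "$out"
```

    Each two-line block is read as one record with two fields.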

    You can address any field in the input record using the syntax $1, $2,
    etc., where $1 is the first field in a record, $2 is the second field,
    and so on.  The entire record is referred to as $0.

    Fields can also be referred to in relation to the built-in field vari-
    ables, for example, for a five-field record:

       $(NF - 2)

    would refer to the third field.  The NF in this example is a built-in
    variable awk provides that counts the number of fields in a current
    record. (Thus, $NF refers to the last field in the current record.)
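
    For instance, with a sample five-field record:

```shell
# NF is 5 here, so $(NF - 2) is the third field and $NF is the fifth.
out=$(echo 'a b c d e' | awk '{ print NF, $(NF - 2), $NF }')
echo "$out"
```

    This prints ``5 c e''.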

    The following list shows all the built-in variables in awk:

    _________________________________________________________________________
    Variable               Meaning
    _________________________________________________________________________
    ARGC                   number of command-line arguments plus 1
    ARGV                   array of command-line arguments (ARGV[0 ... ARGC-
                           1])
    ENVIRON                array of environment variables, indexed by the
                           name of the variable
    FILENAME               name of current input file
    FNR                    input record number in current file
    FS                     input field separator (default: any whitespace)
    NF                     number of fields in current input record
    NR                     number of records read so far
    OFMT                   output format for numbers (default: "%.6g"; see
                           printf(S))
    OFS                    output field separator (default: blank)
    ORS                    output record separator (default: newline)
    RS                     input record separator (default: newline)
    RSTART                 index of first character matched by match()
    RLENGTH                length of string matched by match()
    SUBSEP                 separates multiple subscripts in array elements
                           (default: ``\034'')


    Patterns

    Patterns can be any of the following:

       BEGIN
       END
       /expr/
       relational expression
       pattern && pattern
       pattern || pattern
       (pattern)
       !pattern
       pattern1,pattern2

    BEGIN and END match before the first line is read, and after the last
    line has been read, respectively.

    All other patterns can contain extended regular expressions, as in
    egrep.  See grep(C) and ed(C) for the pattern-matching syntax of extended
    regular expressions.  (In the following discussion, extended regular
    expressions will be referred to simply as regular expressions.)

    You can create a string matching pattern using a regular expression in
    one of three ways:

    /regexpr/             This will match the current record if regexpr is
                          contained anywhere in the current record.

    expression ~ /regexpr/
                          This will match if regexpr is contained anywhere in
                          the string value of expression.

    expression !~ /regexpr/
                          This will match if regexpr is not contained any-
                          where in the string value of expression.

    A relational expression is made up of two numeric or string expressions
    compared with one of the following operators:

    _________________________________________________________________________
    Operator                         Meaning
    _________________________________________________________________________
    <                                less than
    <=                               less than or equal to
    >                                greater than
    >=                               greater than or equal to
    ==                               equal to
    !=                               not equal to

    When strings are compared using relational operators (<, <=, >, >=), they
    are compared character by character using the sort order provided by the
    machine, which is usually the ASCII sort order.  One string is less than
    another if it would appear before the other in the sort order.

    When one operand in a relational expression is a string, the other
    operand is converted to a string as well and they are compared using the
    method described above.
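
    The difference between numeric and string comparison can be sketched
    with constants; the values are chosen only for illustration.

```shell
# 10 < 9 is false as a numeric comparison, but "10" < "9" is true as a
# string comparison, because "1" sorts before "9".
out=$(awk 'BEGIN {
    a = (10 < 9)
    b = ("10" < "9")
    c = ("abc" < "abd")
    print a, b, c
}')
echo "$out"
```

    This prints ``0 1 1''.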

    Patterns can be joined using the logical operators && (AND) and || (OR).
    When patterns are joined like this, the pattern matches the current
    record if the entire pattern evaluates to true (nonzero or nonnull).  A
    pattern can be negated using the ! logical NOT operator.  Parentheses may
    be used for grouping patterns.

    pattern && pattern matches a record when both the first pattern and the
    second pattern match the record.

    pattern || pattern matches a record when either the first pattern or the
    second pattern matches the record.

    !pattern means ``does not match pattern.''  That is, !pattern matches
    every record that is not matched by pattern.

    pattern1, pattern2 defines a matching range. The accompanying action is
    performed for all records from the first occurrence of pattern1 to the
    following occurrence of pattern2, inclusive.  (The action is performed
    for the lines containing pattern1 and pattern2, as well as all the lines
    in between.)
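
    A sketch of a range pattern and of a compound pattern, run against five
    sample lines (the data is invented for the example):

```shell
data='intro
start here
middle
stop here
outro'
# Range: every line from the first /start/ through the next /stop/.
range=$(printf '%s\n' "$data" | awk '/start/, /stop/')
# Compound: lines that contain "here" but do not contain "stop".
both=$(printf '%s\n' "$data" | awk '/here/ && !/stop/')
echo "$range"
echo "$both"
```

    The range prints the three middle lines; the compound pattern prints
    only ``start here''.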

    Actions

    The actual work your awk program does occurs in the action part of the
    program.

    Action statements can be made up of:

         +  expressions (numeric and string constants, variables, array
            references, and so on)

         +  flow control statements (branches or loops)

         +  built-in arithmetic or string functions or functions you define
            yourself

    Variables in awk are not explicitly declared; they simply spring into
    existence when they are first used.  awk determines from the context
    whether a variable is numeric or string.  Numeric variables are automati-
    cally initialized to 0; string variables are automatically initialized to
    the empty string ("").  (See ``Number or string?'' below, and the chapter
    on awk in the User's Guide for more information about variable types and
    type coercion in awk.)

    Values are assigned to variables in the usual way in awk:

       a = 100

    creates a numeric variable a with the value ``100''.  You can assign
    several variables in a single statement:

       water = oil = "wet"

    This creates two string variables, water and oil, and sets them both to
    contain the string ``wet''.

    Assignment operators are evaluated from right to left.

    The following assignment operators are available; the shorthand assign-
    ment notation is borrowed from the C programming language:

    _________________________________________________________________________
    Operator  Meaning
    _________________________________________________________________________
    a=b       set a equal to b
    a+=b      set a equal to a + b
    a-=b      set a equal to a - b
    a*=b      set a equal to a * b
    a/=b      set a equal to a / b
    a%=b      set a equal to a % b; a becomes the remainder of a divided by b
    a^=b      set a equal to a ^ b; a becomes a raised to the power b

    awk offers the usual arithmetic operators:  ``+'' (add), ``-'' (sub-
    tract), ``*'' (multiply), ``/'' (divide), ``%'' (modulo; divide and give
    remainder),  ``^'' (exponentiation; ``**'' is a synonym).  The unary
    ``+'' (plus) and ``-'' (minus) are also available.

    All arithmetic in awk is done in floating point.
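
    A few of these operators in use; because arithmetic is floating point,
    7 / 2 is not truncated.  The operands are arbitrary sample values.

```shell
# Division, modulo, exponentiation, and unary minus.
out=$(awk 'BEGIN { print 7 / 2, 7 % 3, 2 ^ 10, -3 + 1 }')
echo "$out"
```

    This prints ``3.5 1 1024 -2''.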

    Relational expressions in action statements use the same operators as
    relational expressions in patterns; consult the relational operators
    table in ``Patterns'' above.

    The logical AND and logical OR (&& and ||) are also available, as well as
    the logical NOT (!, as in !expr).

    There is also a conditional operator: ``?'':

       expression1 ? expression2 : expression3

    expression1 is evaluated, and if it is non-empty and non-zero, then the
    whole expression has the value of expression2.  Otherwise, it has the
    value of expression3.
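
    As a sketch, the conditional operator can label each input number; the
    three input values and the labels are sample data.

```shell
# Parentheses keep the ">" from being read as output redirection.
out=$(printf '5\n0\n-2\n' |
    awk '{ print $1, ($1 > 0 ? "positive" : "not positive") }')
echo "$out"
```

    Only the first line is labelled ``positive''.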

    Variables can be incremented using prefix or postfix notation, as in C.
    x++ and ++x are both equivalent to x = x + 1, and x-- and --x both are
    equivalent to x = x-1. The difference between prefix (++x) and postfix
    (x++) is when x assumes its new value.  In prefix notation, x is immedi-
    ately incremented; in postfix notation, the current value of x is used
    and then x is incremented.
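
    The difference shows up when the value of the expression is used, as in
    this sketch:

```shell
# a receives the old value of x (postfix); b receives the new value (prefix).
out=$(awk 'BEGIN { x = 5; a = x++; b = ++x; print a, b, x }')
echo "$out"
```

    This prints ``5 7 7''.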

    Parentheses can be used to alter the order of evaluation in arithmetic
    and relational expressions.

    The following table of precedence shows all the available action state-
    ment operators and the order in which they are evaluated.  The table is
    in decreasing order of precedence; operators higher in the table are
    evaluated before operators lower in the table.

    _________________________________________________________________________
    Operator                        Meaning
    _________________________________________________________________________
    $                               field
    ++  --                          increment, decrement (prefix and postfix)
    ^                               exponentiation (** is a synonym)
    !                               logical negation
    +  -                            unary plus, unary minus
    *  /  %                         multiply, divide, mod
    +  -                            add, subtract
    (no explicit operator)          string concatenation
    <  <=  >  >=  !=  ==            relationals
    ~   !~                          regular expression match, negated match
    in                              array membership
    &&                              logical AND
    ||                              logical OR
    ?:                              conditional expression
    =  +=  -=  *=  /=  %=  ^=       assignment

    All of these operators are evaluated from left to right (they are left
    associative), except for the assignment operators, the conditional
    expression operator, and exponentiation, which are evaluated from right
    to left (they are right associative).
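
    A brief illustration of associativity and precedence, using arbitrary
    constants:

```shell
# 2 ^ 3 ^ 2 groups as 2 ^ (3 ^ 2); 10 - 4 - 3 groups as (10 - 4) - 3;
# multiplication binds more tightly than addition.
out=$(awk 'BEGIN { a = 2 ^ 3 ^ 2; b = 10 - 4 - 3; c = 1 + 2 * 3; print a, b, c }')
echo "$out"
```

    This prints ``512 3 7''.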

    Arrays

    One-dimensional arrays are available in awk.  Like other variables in
    awk, arrays and array elements do not need to be declared; they come into
    existence upon their first use.

    awk allows you to use strings as array subscripts; arrays that do this
    are called associative arrays.  This lets you group together data quite
    simply.

    Say we have a data file listing employee names, department names, and the
    number of sick days the employee has taken:

       Steve           Engineering     2
       Chris           Engineering     1
       Susannah        Documentation   0
       Vipin           Sales           2
       Connie          Marketing       3
       Matt            Documentation   1
       Nancy           Sales           1
       Nigel           Documentation   0

    The first field, $1, contains the employee name; the second field, $2,
    contains the department, and the third field, $3, contains the number of
    sick days for that employee.

    To accumulate the number of sick days in each department:

       { sickness[$2] += $3 }

    This creates the array sickness, which uses the values in the second
    field (``Engineering'', ``Documentation'', ``Sales'', and ``Marketing'')
    as its subscripts. The sick day totals in field three are then collected
    under the appropriate subscript.

    The construct:

       for (i in arr) statement

    does statement for every subscript i in the array arr.  Subscripts are
    looped over in a random order.  If the value of i is changed within
    statement, unpredictable results may occur.
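
    Putting the two pieces together, the following sketch totals the sick
    days in the sample data above and prints one line per department.  The
    output is piped through sort only because the loop order is not
    predictable; the variable name dept is chosen for the example.

```shell
out=$(printf '%s\n' \
    'Steve Engineering 2' 'Chris Engineering 1' 'Susannah Documentation 0' \
    'Vipin Sales 2' 'Connie Marketing 3' 'Matt Documentation 1' \
    'Nancy Sales 1' 'Nigel Documentation 0' |
    awk '{ sickness[$2] += $3 }
         END { for (dept in sickness) print dept, sickness[dept] }' |
    sort)
echo "$out"
```

    The totals are Documentation 1, Engineering 3, Marketing 3, and Sales 3.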

    The split function splits input into subscripts in an array.  It takes
    the form:

       split(string,arr,fs)

    where string is the string you want to split, arr is the array into which
    you want to split it, and fs is the field separator on which you want to
    split.  The first component of string is stored in arr[1], the second in
    arr[2] and so on.  The return value is the number of fields.
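
    For example, splitting a date string on hyphens (the string and the
    array name part are sample choices):

```shell
# split returns the number of pieces and fills part[1], part[2], part[3].
out=$(awk 'BEGIN { n = split("1993-01-06", part, "-"); print n, part[1], part[2], part[3] }')
echo "$out"
```

    This prints ``3 1993 01 06''.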

    Elements can be deleted from an array with the delete statement:

       delete arr [subscript]

    After this is done, arr [subscript] no longer exists.
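
    A small sketch, using the in operator from the precedence table to test
    whether a subscript still exists; the array name seen is hypothetical.

```shell
out=$(awk 'BEGIN {
    seen["a"] = 1
    seen["b"] = 1
    delete seen["a"]
    gone = ("a" in seen)     # 0: the element no longer exists
    kept = ("b" in seen)     # 1: other elements are untouched
    print gone, kept
}')
echo "$out"
```

    This prints ``0 1''.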

    awk does not support multi-dimensional arrays, but this can be simulated
    by using a list of subscripts; see the User's Guide for details.
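
    One way to sketch the simulation: a subscript list such as [2, 3] is
    stored as a single string with SUBSEP between the parts, so it can be
    taken apart again with split.  The array name cell is made up for the
    example.

```shell
out=$(awk 'BEGIN {
    cell[2, 3] = "x"
    for (k in cell) {
        split(k, idx, SUBSEP)    # recover the two subscripts
        print idx[1], idx[2], cell[k]
    }
}')
echo "$out"
```

    This prints ``2 3 x''.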

    Flow of control

    awk uses branching and looping statements borrowed from the C programming
    language.  In all the following constructs, a single statement can be
    replaced by a statement list enclosed in { braces }.

    Each statement in a statement list should begin on a new line or after a
    semicolon.

    The following constructs are available:

       if (expression) statement1 else statement2

    If expression is non-zero and non-empty, do statement1; otherwise, do
    statement2.  The ``else statement2'' is optional.  If there are several
    ifs together with an else, the else belongs with the nearest preceding
    if.

       while (expression) statement

    While expression is non-zero and non-empty, statement is executed.

       for (expression1; expression2; expression3) statement

    This is a generalized form of the while statement.

    The for statement is the same as:

       expression1
       while (expression2) {
               statement
               expression3
       }

    All three expressions are optional.

    This is often used to go through a loop based on the value of a counter,
    where expression1 initializes the counter, expression2 is the test, and
    expression3 increments the counter.  While expression2 is non-empty and
    non-zero, statement is executed.

       do statement while (expression)

    statement is repeatedly executed until expression becomes null or zero.
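
    The three loop forms side by side, as a rough sketch; the variable names
    and bounds are arbitrary.

```shell
out=$(awk 'BEGIN {
    for (i = 1; i <= 3; i++)      # counter loop: adds 1 + 2 + 3
        sum += i
    while (n < 2)                 # test is made before each pass
        n++
    do {                          # body runs once before the test
        m++
    } while (m < 0)
    print sum, n, m
}')
echo "$out"
```

    This prints ``6 2 1''.  The do loop body runs once even though its test
    is false from the start.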

    The break, continue, and next statements can be used to break out of
    loops that would otherwise keep going.  break drops out of the innermost
    while, for, or do loop.  continue causes the next iteration of the loop
    to begin. Execution will go to the test expression in a while or do loop,
    and to expression3 in a for loop.  next reads the next record and starts
    the main input loop again.

    exit will go straight to the END statements, if there are any.  If exit
    occurs in an END statement, the program itself exits.  If a numeric
    expression is given after exit, this expression is taken as the exit
    status for the awk program.
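
    A sketch of next and exit in the main input loop.  The marker words skip
    and stop, and the exit status 3, are invented for the example.

```shell
status=0
out=$(printf 'a\nskip\nb\nstop\nc\n' | awk '
    $1 == "skip" { next }        # go on to the next record
    $1 == "stop" { exit 3 }      # jump to END; exit status becomes 3
    { print }
    END { print "done" }') || status=$?
echo "$out"
echo "exit status: $status"
```

    The lines a and b are printed, the END action still runs after exit,
    and the shell sees an exit status of 3.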

    Output

    The print and printf statements are used to write output in awk.

       print expr1,expr2, ...,exprn

    will print the string value of each expression separated by the output
    field separator, followed by the output record separator.  Without the
    commas, the expressions are concatenated.

    print by itself is an abbreviation for print $0.

    To print an empty line use:

       print ""

    The printf function in awk is like printf(S) in C:

       printf format, expr1, expr2, ... , exprn

    format can be made up of regular characters, which are printed as-is,
    escaped special characters, such as Tab (\t) or Newline (\n), and format
    keyletters that specify how to print the expressions following the for-
    mat.  Format keyletters begin with a ``%'' and can be preceded with a
    width specification, a precision statement, and/or an instruction to
    left-justify an expression in its field.  The first expression replaces
    the first formatting keyletter, and so on.

    If a print or printf statement includes an expression with the greater-
    than operator (>), this expression should be enclosed in parentheses to
    avoid confusion between the greater-than operator and redirection into a
    file.  For example:

       { print $0 $2 > $3 }

    This statement says ``print the record and then field 2 into a file named
    by field 3,'' while:

       { print $0 ($2 > $3) }

    says ``print the record, followed by a 1 if field 2 is greater than field
    3, or a 0 if it is not.''
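
    The parenthesized form can be tried without creating any files; the
    input record here is sample data.

```shell
# The comparison is inside parentheses, so ">" is not a redirection.
out=$(echo '5 3' | awk '{ print $1, $2, ($1 > $2) }')
echo "$out"
```

    This prints ``5 3 1''.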

    printf keyletters are:

    _________________________________________________________________________
    Keyletter        Prints expr as
    _________________________________________________________________________
    %c               the ASCII character referred to by the least significant
                     8 bits of the numeric value of expr; truncates expr to
                     the nearest integer
    %d               a decimal integer; truncates expr to the nearest integer
    %e               scientific notation using the form [-]d.ddddddE[+-]dd
    %f               decimal notation using the form [-]ddd.dddddd
    %g               the shorter of e or f conversion, with nonsignificant
                     zeros suppressed
    %o               an unsigned octal number
    %s               a string
    %x               an unsigned hexadecimal number
    %%               a ``%''; no argument is converted
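
    A sketch that exercises several keyletters together with width,
    precision, and left-justification; the arguments are arbitrary sample
    values.

```shell
# %-4s left-justifies in 4 columns; %4d right-justifies and truncates 42.9;
# %.2f keeps two decimals; %x, %o, %c show 255, 8, and 65 in other forms.
out=$(awk 'BEGIN {
    printf "[%-4s] [%4d] [%.2f] %x %o %c %%\n", "ab", 42.9, 3.14159, 255, 8, 65
}')
echo "$out"
```

    This prints ``[ab  ] [  42] [3.14] ff 10 A %''.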

    The following escape sequences are recognized within regular expressions
    and strings:

    _________________________________________________________________________
    Escape sequence                       Meaning
    _________________________________________________________________________
    \b                                    Backspace
    \f                                    Formfeed
    \n                                    Newline
    \r                                    Carriage return
    \t                                    Tab
    \ddd                                  octal value ddd

    Output can be redirected into files using:

       > filename

    and

       >> filename

    Files are opened only once using the redirection operator.  The first
    form will overwrite whatever is in filename, if filename already exists,
    and will create filename if it does not exist. The second form will
    append output to filename.

    To send output to a pipe, use:

       | command-line

    where command-line is the command line to which you want to send the out-
    put.  Filenames and command lines can be expressions, variables, or
    literal filenames or command lines.  If you want to use a literal
    filename or command line, you must enclose it in double quotes;
    otherwise, awk will treat it as a variable.

    There is a limit to how many files and pipes you can open in an awk pro-
    gram (see ``Limits'' below).  Use the close statement to close files or
    pipes:

       close(filename)
       close(command-line)

    where filename or command-line is the open file or pipe.
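
    The following sketch writes to a file named by a variable and to a
    literal command line, then closes both.  The temporary file, the
    variable name logfile, and the input words are assumptions made for the
    example.

```shell
tmp=$(mktemp)
sorted=$(printf 'pear\napple\n' | awk -v logfile="$tmp" '
    { print "saw " $1 > logfile }   # a variable holds the filename
    { print $1 | "sort" }           # a literal command line, in double quotes
    END { close(logfile); close("sort") }')
logged=$(cat "$tmp")
rm -f "$tmp"
echo "$sorted"
echo "$logged"
```

    The pipe delivers the words in sorted order, while the file receives
    them in input order.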

    Input

    awk provides the getline function to read in successive lines of input
    from a file or a pipe.

    getline                getline by itself takes the next record of input
                           as $0 and sets NF, NR, and FNR.

    getline <file          The next record from file becomes $0; NF is set.

    getline var            The next record of input is placed in var; NR and
                           FNR are set.

    getline var <file      The next record in file is placed in var.

    command | getline      The output of command is piped to getline. $0 and
                           NF are set.

    command | getline var  The output of command is piped to getline and
                           stored in var.

    All forms of getline return 1 for successful input, 0 for end of file,
    and -1 for an error.

    To read input from a file until the file runs out, use:

       while ( ( getline x < file ) > 0) { ... }

    The ``> 0'' is needed so that the test catches a -1 error returned from
    getline. Otherwise, the while loop would read -1 as true, since it is
    non-zero.
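
    A sketch of two getline forms: reading a file line by line into a
    variable, and reading one line from a command.  The temporary file and
    the echo command are stand-ins chosen for the example.

```shell
tmp=$(mktemp)
printf 'x\ny\nz\n' > "$tmp"
out=$(awk -v file="$tmp" 'BEGIN {
    while ((getline line < file) > 0)   # stops on end of file or on error
        n++
    close(file)
    "echo hello" | getline greeting     # first line of the command output
    close("echo hello")
    print n, greeting
}')
rm -f "$tmp"
echo "$out"
```

    This prints ``3 hello''.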

    Functions

    The following arithmetic functions are built into awk:

    __________________________________________________________
    Function    Returns
    __________________________________________________________
    atan2(y,x)  arctangent of y/x in the range -pi to pi
    cos(x)      cosine of x, with x in radians
    exp(x)      exponential function of x, e^x
    int(x)      integer part of x; truncated toward 0 when x > 0
    log(x)      natural (base e) logarithm of x
    rand()      random number r, where 0 <= r < 1
    sin(x)      sine of x, with x in radians
    sqrt(x)     square root of x
    srand()     set the seed for rand() from the time of day
    srand(x)    x is new seed for rand()
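
    A few of these functions in a short sketch.  Computing pi as
    atan2(0, -1) is a common idiom rather than something this manual page
    prescribes, and the seed value 1 is arbitrary.

```shell
out=$(awk 'BEGIN {
    pi = atan2(0, -1)
    srand(1)                       # fixed seed for a repeatable sequence
    r = rand()
    ok = (r >= 0 && r < 1)         # rand() stays within [0, 1)
    printf "%d %d %d %d %d %.4f %d\n",
        int(3.9), int(-3.9), sqrt(16), exp(0), log(1), pi, ok
}')
echo "$out"
```

    This prints ``3 -3 4 1 0 3.1416 1''.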

    The string functions are:

    gsub(r,s,t)
              globally substitutes the string s for the regular expression r
              in the string t. If t is omitted, substitutions are made in the
              current record ($0).  The number of substitutions is returned.

    index(s,t)
              returns the position in string s where string t first occurs,
              or 0 if it does not occur at all.

    length(s) returns the length of its argument taken as a string, or of the
              whole record if there is no argument.

    match(s,re)
              returns the position in string s where the regular expression
              re occurs, or 0 if it does not occur at all.  RSTART is set to
              the starting position (which is the same as the returned
              value), and RLENGTH is set to the length of the matched string.

    split(s,a,fs)
              splits the string s into array elements a[1], a[2], ..., a[n], and
              returns n. The separation is done with the regular expression
              fs or with the field separator FS if fs is not given.

    sprintf(format, expr,expr, ... )
              formats the expressions according to the printf format and
              returns the resulting string.

    sub(r,s,t)
              substitutes the string s in place of the first instance of the
              regular expression r in string t and returns the number of sub-
              stitutions.  If t is omitted, awk substitutes in the current
              record ($0).

    substr(s,p)
              returns the suffix of s starting at position p.

    substr(s,p,n)
              returns the n-character substring of s that begins at position
              p.

    toupper(s)
              returns a copy of the string s with lowercase letters converted
              to uppercase.

    tolower(s)
              returns a copy of the string s with uppercase letters converted
              to lowercase.
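
    Several of the string functions applied to one sample string, as a
    sketch; the string and the patterns are chosen only for illustration.

```shell
out=$(awk 'BEGIN {
    s = "hello world"
    n = gsub(/o/, "0", s)               # s is now "hell0 w0rld", n is 2
    sub(/^hell0/, "HELL0", s)           # replaces the first match only
    pos = match(s, /w[0-9a-z]+/)        # sets RSTART and RLENGTH
    print n, s, index(s, "w"), length(s)
    print pos, RSTART, RLENGTH, substr(s, pos, RLENGTH)
    print toupper(substr(s, 7)), tolower("ABC")
}')
echo "$out"
```

    The three output lines are ``2 HELL0 w0rld 7 11'', ``7 7 5 w0rld'', and
    ``W0RLD abc''.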

    awk provides the system function for running commands:

       system(command-line)

    executes command-line and returns its exit status.
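
    For example, assuming the standard true and false commands are
    available:

```shell
out=$(awk 'BEGIN {
    ok  = system("true")       # exit status 0
    bad = system("false")      # some non-zero exit status
    failed = (bad != 0)
    print ok, failed
}')
echo "$out"
```

    This prints ``0 1''.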

    You can define your own functions in awk.  The syntax for this is:

       function name(parameter-list) {
               statements
       }

    name is the name of the function, parameter-list is a comma-separated
    list of variable names, which, within the function refer to the arguments
    with which the function was called, and statements are action statements
    that make up the body of the function.

    Function definitions can appear anywhere a pattern-action statement can
    appear.  Recursion is permitted within user-defined functions; that is, a
    function may call itself directly or indirectly.

    Variables passed to functions (as arguments) are copied and a copy of the
    variable is manipulated by the function; that is, these variables are
    passed by value.  The exception to this in awk is arrays, which are
    passed by reference, that is, the actual array elements are manipulated
    by the function, so array elements can be permanently altered, created,
    or deleted within a function.

    Missing function arguments are set to null; extra arguments are ignored.

    To define a return value for your function, you must include a statement

       return expression

    where expression is the value you want your function to return.  expres-
    sion here is optional; if you leave it out, control will be returned to
    the caller of the function, but the return value will be undefined.  The
    return statement itself is optional as well.

    The formal parameters of a function (the argument list) are local to that
    function, but any other variables are global.  You can use the argument
    list as a way of creating variables local only to the function; like
    other variables in awk these will be automatically initialized with null
    values.
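
    The points above can be sketched with two small functions; the names
    fact and squares are hypothetical.  The extra parameter i in squares is
    never passed by the caller, so it serves as a local variable.

```shell
out=$(awk '
function fact(n) {                  # recursion is permitted
    return n <= 1 ? 1 : n * fact(n - 1)
}
function squares(arr, n,    i) {    # i is an extra parameter used as a local
    for (i = 1; i <= n; i++)
        arr[i] = i * i              # the array is passed by reference
}
BEGIN {
    i = 99
    squares(sq, 3)
    print fact(5), sq[3], i
}')
echo "$out"
```

    This prints ``120 9 99'': the array filled inside squares is visible to
    the caller, and the global i is left unchanged.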

    Number or string?

    In awk, variables come into being when they are used; there is no
    declaration of a variable, and, therefore, you do not declare the type of
    a variable as a string or a number.  Instead, awk assumes the type of a
    variable from its context.

    In an assignment statement, such as v=e, the type of v becomes the type
    of e. When the context is ambiguous, awk determines the types when the
    program runs.

    In comparisons, if both operands are numeric, they are compared as num-
    bers; otherwise, they are compared as strings.  (A string is greater than
    another string if it comes later in the sort sequence, and less than
    another string if it comes earlier in the sort sequence.)

    All field variables are of type string; in addition, each field can be
    considered to have a numeric value (that is, the numeric value of a
    string).  The numeric value of a string is the value of the longest pre-
    fix of a string that looks numeric.  For example, if a field contains the
    string ``123abc'', the numeric value of this would be 123.

    The value of a variable in awk is initially 0 or the string "".

    You can force a variable of one type to become another type; this is
    known as type coercion.  To force a number to a string:

       number ""

    (Concatenate the null string to number.)

    To force a string to a number:

       string + 0
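
    Both coercions, and their effect on a comparison, in a short sketch with
    sample values:

```shell
out=$(awk 'BEGIN {
    n = 12
    s = n ""              # number forced to a string
    t = "3.5" + 0         # string forced to a number
    a = (n < 9)           # numeric comparison: 12 < 9 is false
    b = (s < 9)           # string comparison: "12" sorts before "9"
    print length(s), t + 1, a, b
}')
echo "$out"
```

    This prints ``2 4.5 0 1''.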

    For more information about variable types, see the chapter on awk in the
    User's Guide.

    Limits

    The following limits exist in this implementation of awk:

       100 fields
       3000 characters per input record
       3000 characters per output record
       3000 characters per field
       3000 characters per printf string
       400 characters per literal string or regular expression
       250 characters per character class
       55 open files or pipes
       double precision floating point

    Numbers are limited to what can be represented on your machine; numbers
    outside this range will have string values only.

 Examples

    The following examples are all individual awk programs; to try them out,
    you will need to put them in a file and call the file with awk -f, or
    enclose them in single quotes on the awk command line.

    Print lines longer than 72 characters:

       length > 72

    Print only the first two fields in opposite order:

       { print $2, $1 }

    Same, with input fields separated by comma and/or blanks and tabs:

       BEGIN   { FS = ",[ \t]*|[ \t]+" }
               { print $2, $1 }

    Add up the first column, print sum and average:

           { s += $1 }
       END {if ( NR > 0 )  print "sum is",  s, " average is", s/NR }

    Print fields in reverse order (on separate lines):

       { for (i = NF; i > 0; --i) print $i }

    Print all lines between start/stop pairs:

       /start/, /stop/

    Print all lines whose first field is different from previous one:

       $1 != prev { print; prev = $1 }

    Simulate echo(C):

       BEGIN   {
               for (i = 1; i < ARGC; i++)
                       printf "%s ", ARGV[i]
               printf "\n"
               exit
       }

    Simple env(C):

       BEGIN   {
           for (e in ENVIRON)
               print e "=" ENVIRON[e]
           }


 See also

    ed(C), grep(C), lex(CP), printf(S) and sed(C).

    ``Simple programming with awk'' in the User's Guide

    Alfred V. Aho, Brian W. Kernighan, and Peter J. Weinberger,
    The AWK Programming Language, Addison-Wesley, 1988.

 Notes

    Input whitespace is not preserved on output if fields are involved.

    func is an obsolete synonym for function.

    This version of awk is the so-called ``new awk'' described in The AWK
    Programming Language (referenced above).  It is mostly compatible with an
    older version of awk still in common use.  On some systems, the ``new
    awk'' is called nawk, the older one is oawk, and awk may be linked to
    either version.  The nawk and oawk names do not exist on all systems, and
    even when they do exist, are not reliable.  Only the name awk should be
    used.

    Known incompatibilities between this version of awk and older awks
    include:

    +  The definition of ``what constitutes a number'' is slightly different.
       In the old awk, a string had a numeric value only if the entire string
       looked numeric.  In the new awk, a string has a numeric value if a
       prefix of the string looks numeric, and the numeric value is the value
       of the longest such prefix.

       For example, the string:

          123foo

       does not have a numeric value in the old awk (and is treated as 0),
       but has the value 123 in the new awk.

    +  Assigning to a nonexistent field in the new awk changes $0 to include
       that field, whereas, in the old awk, $0 did not change.  Thus, the
       program:

          { $2 = $1; print }

       produces different output if the input has only one field.

    +  The new awk allows user-defined functions; these are not recognized in
       the old awk.

    +  There are several new reserved words in the new awk which could be
       used as variable names in the old awk.

    +  In addition, the parsing has changed, which may result in some
       ambiguous-looking expressions that were legal in the old awk failing
       with the new awk.

       For example, in regular expressions, the character class:

          [/]

       is not legal in the new awk, but was in the old.  The equivalent char-
       acter class for the new awk is:

          [\/]

       However, this character class, when used with the old awk, is not
       equivalent to the original expression.
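
    The first two incompatibilities listed above can be checked with a
    sketch like the following; the results noted are those described for
    the new awk, and the one-field input is sample data.

```shell
# Numeric value of a string with a numeric prefix.
num=$(awk 'BEGIN { print "123foo" + 0 }')
# Assigning to a nonexistent field rebuilds $0 to include it.
rec=$(echo 'a' | awk '{ $2 = $1; print }')
echo "$num"
echo "$rec"
```

    On the new awk this prints ``123'' and then ``a a''.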

 Standards conformance

    awk is conformant with:

    AT&T SVID Issue 2;
    and X/Open Portability Guide, Issue 3, 1989.

