tr(1)

NAME

tr - translate characters

SYNOPSIS

tr [-cds] [string1 [string2]]

DESCRIPTION

tr copies the standard input to the standard output with substitution or deletion of selected characters. Input characters found in string1 are mapped into the corresponding characters of string2. Any combination of the options -cds may be used:

-c Complements the set of characters in string1 with respect to all the characters contained in the current character set.

-d Deletes all input characters in string1.

-s Squeezes all strings of repeated output characters that appear in string2 to single characters.
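A minimal sketch of the three options on inline input. The sample strings and variable names are illustrative and do not come from this page; the sets are written out literally so the result does not depend on range syntax or collation.

```shell
# -d: delete every input character that appears in string1.
deleted=$(printf 'a1b2c3\n' | tr -d '0123456789')

# -s: translate, then squeeze runs of output characters found in string2.
squeezed=$(printf 'a   b    c\n' | tr -s ' ' ' ')

# -cd: complement string1, then delete; only the digits survive.
digits=$(printf 'a1b2c3\n' | tr -cd '0123456789')

printf '%s\n' "$deleted" "$squeezed" "$digits"
```

Note that -cd also removes the trailing new-line, because the new-line is part of the complemented set.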

The following abbreviation conventions can be used to introduce ranges of characters or repeated characters into the strings:

[c1-c2] Stands for the ordered string of collating elements c1 to c2, inclusive.

[[:class:]] Stands for the string of characters in type class. class must be one of alpha, upper, lower, digit, xdigit, alnum, space, print, punct, graph or cntrl. Character classes are expanded in collation order.

[[=c=]] Stands for an equivalence class; i.e., all characters defined as having the same primary collation sequence number as c.

[[.cc.]] Stands for the collating element cc. Multi-character collating elements must be represented in this manner to distinguish them from a list of single-character collating elements.

[a*n] Stands for n repetitions of a. If the first digit of n is 0, n is considered octal; otherwise, n is taken to be decimal. A zero or missing n is taken to be huge; this facility is useful for padding string2.
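The conventions above can be sketched on inline input as follows. The sample strings and variable names are illustrative. The bracketed forms are written as this page documents them; on tr implementations that treat the outer brackets as literal characters, the brackets simply map to themselves, and the sample input contains none.

```shell
# Range: map lowercase ASCII letters to uppercase.
upper=$(printf 'hello\n' | tr '[a-z]' '[A-Z]')

# Class: delete every digit.
nodigits=$(printf 'r2d2\n' | tr -d '[[:digit:]]')

# Repetition with n omitted: [x*] pads string2, so a, b and c all map to x.
masked=$(printf 'abcabc\n' | tr 'abc' '[x*]')

printf '%s\n' "$upper" "$nodigits" "$masked"
```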

The escape character \ can be used as in the shell to remove special meaning from any character in a string. In addition, \ followed by 1, 2, or 3 octal digits stands for the character whose ASCII code is given by those digits.

An ASCII NUL character in string1 or string2 can be represented only as an escaped character (that is, as "\000"), but it is treated like any other character and translated correctly if so specified. NUL characters in the input are not stripped out unless the option -d "\000" is given.
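A short sketch of the octal escapes described above. The sample input and variable names are illustrative; single quotes are used so that the shell passes the backslash sequences to tr unchanged.

```shell
# \012 is octal for new-line: put each space-separated word on its own line.
lines=$(printf 'one two three' | tr ' ' '\012')

# \000 names the NUL character: strip NULs from the input.
clean=$(printf 'a\000b\000c' | tr -d '\000')

printf '%s\n' "$lines" "$clean"
```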

EXTERNAL INFLUENCES

Environment Variables

LC_COLLATE determines the order of collating elements, the members of equivalence classes, the order in which character classes are expanded, and the identification of multi-character collating elements.

LC_CTYPE determines the interpretation of text as single- and/or multi-byte characters, the characters matched by character classes, and the current universe of characters when using the -c option.

If LC_COLLATE or LC_CTYPE is not specified in the environment or is set to the empty string, the value of LANG is used as a default for each unspecified or empty variable. If LANG is not specified or is set to the empty string, a default of "C" (see lang(5)) is used instead of LANG. If any internationalization variable contains an invalid setting, tr behaves as if all internationalization variables are set to "C". See environ(5).
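One way to make range expansion predictable is to pin both variables to "C" for a single invocation. This is ordinary shell usage rather than something this page prescribes, and it assumes LC_ALL is unset, since on systems that support LC_ALL it takes precedence over the individual variables. The variable name shout is illustrative.

```shell
# Assumption: LC_ALL is unset; where supported, it would override these.
shout=$(printf 'abc\n' | LC_COLLATE=C LC_CTYPE=C tr '[a-z]' '[A-Z]')

printf '%s\n' "$shout"
```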

International Code Set Support

Single- and multi-byte character code sets are supported.

RETURN VALUE

tr exits with one of the following values:

0 All input was processed successfully.

>0 An error occurred.
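A minimal sketch of acting on the exit value in a script. The input string and the variable name result are illustrative.

```shell
# The exit status of a pipeline is that of its last command, here tr.
if printf 'hello\n' | tr '[a-z]' '[A-Z]' >/dev/null
then
    result=ok
else
    result=failed
fi

printf '%s\n' "$result"
```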

EXAMPLES

1. For the ASCII character set and default collation sequence, the following creates a list of all the words in file1, one per line, in file2, where a word is taken to be a maximal string of alphabetics. The strings are quoted to protect the special characters from interpretation by the shell; \012 is the octal ASCII code for a new-line (line feed) character.

tr -cs "[A-Z][a-z]" "[\012*]" <file1 >file2

2. The following example does the same as the one above but works for all character sets and collation sequences.

tr -cs "[[:alpha:]]" "[\012*]" <file1 >file2

3. This example translates all lowercase characters in file1 to uppercase and writes the result to standard output.

tr "[[:lower:]]" "[[:upper:]]" <file1

Note that character classes specified in either string1 or string2 are expanded in collation order. Consequently, this example does not produce the desired effect in locales in which there is not a one-to-one mapping between lowercase and uppercase letters, or in which lowercase and uppercase letters collate in a different relative order.

4. This example uses an equivalence class to identify accented variants of the base character e in file1, which are stripped of diacritical marks and written to file2.

tr "[[=e=]]" "[e*]" <file1 >file2

5. This example translates instances of the multi-character collating element ch to uppercase. Note that the single characters c and h are not affected by this operation unless they are part of a ch character sequence.

tr "[[.ch.]]" "[[.CH.]]" <file1 >file2
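The word-splitting command from example 2 can be tried without creating file1 by feeding tr an inline string. The sample sentence and the variable name words are illustrative, and the input deliberately contains no bracket characters.

```shell
# Every run of non-alphabetic characters becomes a single new-line.
words=$(printf 'Hello, world! 42 times\n' | tr -cs '[[:alpha:]]' '[\012*]')

printf '%s\n' "$words"
```

The output is the three words Hello, world and times, one per line; the digits are discarded along with the punctuation because they are not alphabetic.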

STANDARDS CONFORMANCE

tr: SVID2, XPG2, XPG3, proposed POSIX.2 FIPS (June 1990)

Hewlett-Packard Company — HP-UX Release 8.05: June 1991
