tr(1)
NAME
tr - translate characters
SYNOPSIS
tr [-cds] [string1 [string2]]
DESCRIPTION
tr copies the standard input to the standard output with substitution or deletion of selected characters. Input characters found in string1 are mapped into the corresponding characters of string2. Any combination of the options -cds can be used:
-c Complements the set of characters in string1 with respect to all the characters contained in the current character set.
-d Deletes all input characters in string1.
-s Squeezes all strings of repeated output characters that appear in string2 to single characters.
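The options can be tried on short inline strings. The following is a minimal sketch, not part of the original page: the sample inputs and variable names are assumptions chosen for illustration, and the results shown assume the ASCII character set.

```shell
# Hypothetical sample inputs; variable names are illustrative only.

# -d: delete every digit from the input.
deleted=$(printf 'a1b22c333' | tr -d '0-9')
echo "$deleted"      # abc

# -c with -d: delete every character that is NOT a lowercase letter.
kept=$(printf 'Hello, World 42' | tr -cd 'a-z')
echo "$kept"         # elloorld

# -s: translate a-d to A-D, then squeeze repeated output characters.
squeezed=$(printf 'aabbccdd' | tr -s 'a-d' 'A-D')
echo "$squeezed"     # ABCD
```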
The following abbreviation conventions can be used to introduce ranges of characters or repeated characters into the strings:
c1-c2 or
[c1-c2] Stands for the ordered string of collating elements c1 through c2, inclusive.
[:class:] or
[[:class:]] Stands for the string of characters in type class, where class must be one of alpha, upper, lower, digit, xdigit, alnum, space, print, punct, graph, or cntrl. Character classes are expanded in collation order.
[=c=] or
[[=c=]] Stands for an equivalence class; i.e., all characters that are defined as having the same primary collation sequence number as c.
[.cc.] Stands for the collating element cc. Multi-character collating elements must be represented in this manner to distinguish them from a list of single-character collating elements.
[a*n] Stands for n repetitions of a. If the first digit of n is 0, n is considered octal; otherwise, n is treated as a decimal value. A zero or missing n is taken to be huge; this facility is useful for padding string2.
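A few of the abbreviation conventions above can be sketched as follows. The inputs and variable names are assumptions made for illustration, and the results assume the ASCII character set; the equivalence-class and collating-element forms are omitted because their results depend on the locale.

```shell
# Range c1-c2: map a, b, c onto x, y, z.
ranged=$(printf 'abcabc' | tr 'a-c' 'x-z')
echo "$ranged"      # xyzxyz

# Character class with [a*]: a missing n pads string2 with '#',
# so every digit in the input maps to '#'.
masked=$(printf 'id 4071' | tr '[:digit:]' '[#*]')
echo "$masked"      # id ####

# [a*n]: two copies of x followed by z, so a->x, b->x, c->z.
repeated=$(printf 'aabbcc' | tr 'abc' '[x*2]z')
echo "$repeated"    # xxxxzz
```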
The escape character \ can be used as in the shell to remove special meaning from any character in a string. In addition, \ followed by 1, 2, or 3 octal digits represents the character whose ASCII code is given by those digits.
An ASCII NUL character in string1 or string2 can be represented only as an escaped character; i.e., as \000, but is treated like other characters and translated correctly if so specified. NUL characters in the input are not stripped out unless the option -d "\000" is given.
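As a brief sketch of the octal escapes described above (the inputs and variable names are assumptions for illustration), the strings are single-quoted so that the shell passes the backslash through to tr unchanged:

```shell
# \012 is the octal escape for new-line; replace each new-line with a space.
joined=$(printf 'one\ntwo\n' | tr '\012' ' ')
echo "[$joined]"    # [one two ]

# NUL can only be written as \000; here it is deleted from the input.
stripped=$(printf 'a\000b' | tr -d '\000')
echo "$stripped"    # ab
```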
WARNINGS
a-z now represents the range of collating elements from a through z, inclusive, and not the three characters a, -, and z. The three characters a, -, and z can be represented as az- or a\-z.
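The difference can be seen by running both forms on the same input. This is an illustrative sketch; the input string and variable names are assumptions, and the results assume the ASCII character set.

```shell
# 'a-z' is a range: every lowercase letter is translated,
# and the '-' in the input is left alone.
as_range=$(printf 'a-z b' | tr 'a-z' 'A-Z')
echo "$as_range"     # A-Z B

# 'az-' is the three characters a, z, and -; b is not in the set.
as_literals=$(printf 'a-z b' | tr 'az-' 'AZ_')
echo "$as_literals"  # A_Z b
```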
EXTERNAL INFLUENCES
Environment Variables
LC_COLLATE determines the order of collating elements, the members of equivalence classes, the order in which character classes are expanded, and the identification of multi-character collating elements.
LC_CTYPE determines the interpretation of text as single- and/or multi-byte characters, the characters matched by character classes, and the current universe of characters when using the -c option.
If LC_COLLATE or LC_CTYPE is not specified in the environment or is set to the empty string, the value of LANG is used as a default for each unspecified or empty variable. If LANG is not specified or is set to the empty string, a default of "C" (see lang(5)) is used instead of LANG. If any internationalization variable contains an invalid setting, tr behaves as if all internationalization variables are set to "C". See environ(5).
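When predictable ASCII ranges are wanted regardless of the caller's environment, the two variables can be set for a single command. This is a sketch under the assumption that the shell supports per-command variable assignment; the input and variable name are illustrative. On systems that also honor other internationalization variables not described on this page, those may take precedence.

```shell
# Set both categories to "C" for this one tr invocation only;
# the surrounding environment is left unchanged.
upper=$(printf 'abc xyz' | LC_COLLATE=C LC_CTYPE=C tr 'a-z' 'A-Z')
echo "$upper"    # ABC XYZ
```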
International Code Set Support
Single- and multi-byte character code sets are supported.
RETURN VALUE
tr exits with one of the following values:
0 All input was processed successfully.
>0 An error occurred.
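A script can branch on the exit value. In the sketch below (input and variable name are assumptions), tr is the last command in the pipeline, so the if statement tests the status of tr:

```shell
# The pipeline's exit status is that of its last command, tr.
if printf 'abc\n' | tr 'a-c' 'A-C' >/dev/null; then
    result=ok
else
    result=failed
fi
echo "$result"   # ok
```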
EXAMPLES
For the ASCII character set and default collation sequence, create a list of all the words in file1, one per line in file2, where a word is taken to be a maximal string of alphabetics. Quote the strings to protect the special characters from interpretation by the shell (octal 012 is the ASCII code for the new-line (line feed) character):
tr -cs "[A-Z][a-z]" "[\012*]" <file1 >file2
Same as above, but for all character sets and collation sequences:
tr -cs "[:alpha:]" "[\012*]" <file1 >file2
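To see the effect without creating files, the same command can be fed an inline string. This is an illustrative sketch; the sample sentence and variable name are assumptions, and the output shown assumes a locale in which only the ASCII letters are alphabetic.

```shell
# Every run of non-alphabetic characters becomes a single new-line,
# leaving one word per line. Command substitution drops the final new-line.
words=$(printf 'Hello, world! 42 cats\n' | tr -cs '[:alpha:]' '[\012*]')
echo "$words"
# Hello
# world
# cats
```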
Translate all lowercase characters in file1 to uppercase and write the result to standard output.
Note that character classes specified in either string1 or string2 are expanded in collation order. Consequently, this example does not produce the desired effect in locales in which there is not a one-to-one mapping between lowercase and uppercase letters, or in which lowercase and uppercase letters collate in a different relative order:
tr "[:lower:]" "[:upper:]" <file1
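With an inline string in place of file1, the result can be checked directly. This is a sketch only; the input and variable name are assumptions, and the output shown assumes a locale such as "C" in which lowercase and uppercase letters map one-to-one.

```shell
# Letters are translated; punctuation and spaces pass through unchanged.
shouted=$(printf 'hello, world' | tr '[:lower:]' '[:upper:]')
echo "$shouted"   # HELLO, WORLD
```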
Use an equivalence class to identify accented variants of the base character e in file1, strip them of diacritical marks and write the result to file2:
tr "[=e=]" "[e*]" <file1 >file2
Translate instances of the multi-character collating element ch to uppercase. This works only in locales that support two-to-one collation. Note that the single characters c and h are not affected by this operation unless they are part of a ch character sequence:
tr "[.ch.]" "[.CH.]" <file1 >file2
SEE ALSO
ed(1), sh(1), ascii(5), environ(5), lang(5), regexp(5).
STANDARDS CONFORMANCE
tr: SVID2, XPG2, XPG3, POSIX.2
Hewlett-Packard Company — HP-UX Release 9.0: August 1992