join(1)  —  Commands

NAME

join − Joins the lines of two files

SYNOPSIS

Current syntax

join [−a file_number|−v file_number] [−e string] [−o number.field,...] [−t character] [−1 field] [−2 field] file1 file2

Obsolescent syntax

join [−a number] [−e string] [−j number|field|number field] [−o number.field,...] [−t character] file1 file2

The join command reads file1 and file2 and joins lines in the files that contain common fields, or otherwise according to the options, and writes the results to standard output. 

STANDARDS

Interfaces documented on this reference page conform to industry standards as follows:

join:  XPG4, XPG4-UNIX

Refer to the standards(5) reference page for more information about industry standards and associated tags. 

OPTIONS

−1 field
Joins on the fieldth field of file1. Fields are decimal integers starting with 1. 

−2 field
Joins on the fieldth field of file2. Fields are decimal integers starting with 1. 

−a number
Produces an output line for each unpairable line found in file1 if number is 1, or file2 if number is 2.  Without −a, join produces output only for lines containing a common field.  If both −a 1 and −a 2 are used, all unpairable lines will be output. 

−e string
Replaces empty output fields with string. 

−j number | field | number field
Joins the two files on field of file number, where number is 1 for file1 or 2 for file2.  If you do not specify number, join uses field in each file.  Without −j, join uses the first field in each file. The default value for both number and field is 1. (Obsolescent)

If you enter only a 1 or a 2 as an argument to −j, join interprets this argument as the file number (number); integers greater than 2 are interpreted as the field number (field).  Therefore, if you want to specify a field number of 2, you must precede this specification with a number argument; otherwise, the join program interprets the 2 as the file number (number). 

−o number.field, ...
Produces output lines consisting of the fields specified in one or more number.field arguments, where number is 1 for file1 or 2 for file2, and field is a field number.  Separate multiple number.field arguments with commas. 

−t character
Uses character (a single character) as the field separator character in the input and the output.  Every appearance of character in a line is significant.  The default separator is a space.  If you do not specify −t, join also recognizes the tab and newline characters as separators. 

With default field separation, the collating sequence is that of sort −b.  If you specify −t, the sequence is that of a plain sort.  To specify a tab character, enclose it in '' (single quotes). 

−v file_number
Produces an output line for each unpairable line in file_number (where file_number is 1 or 2), instead of the default output.  If both −v 1 and −v 2 are specified, produces output lines for all unpairable lines. 
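
As a rough sketch of how the -v, -a, -e, and -o options interact, the following script builds two small sorted files and runs three joins. The file names phones and depts, their contents, and the filler string NONE are made up for illustration, and mktemp -d is assumed to be available for creating a scratch directory.

```shell
# Hypothetical sample data; both files are already sorted on field 1.
LC_ALL=C; export LC_ALL
tmp=$(mktemp -d)
printf '%s\n' 'Eisner 555-1234' 'Green 555-2240' 'Lewis 555-3237' > "$tmp/phones"
printf '%s\n' 'Eisner 389' 'Frost 217' 'Green 311' > "$tmp/depts"

# -v 1: only the lines of the first file that have no partner in the second.
only_phones=$(join -v 1 "$tmp/phones" "$tmp/depts")

# -v 2: only the lines of the second file that have no partner in the first.
only_depts=$(join -v 2 "$tmp/phones" "$tmp/depts")

# -a 1 keeps unpairable lines of the first file alongside the matches;
# -o fixes the output columns and -e fills the ones that would be empty.
padded=$(join -a 1 -e NONE -o 1.1,1.2,2.2 "$tmp/phones" "$tmp/depts")

printf '%s\n' "$only_phones" "$only_depts" "$padded"
rm -rf "$tmp"
```

With this data, Lewis is reported by -v 1, Frost by -v 2, and the last command prints one line per entry in phones, with NONE in the department column for Lewis.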

OPERANDS

file1, file2
The pathnames of files to be used as input.  If - (hyphen) is specified for either file, standard input is read. 

DESCRIPTION

The join field is the field in each input file that join compares to determine what is included in the output.  One output line is produced for each pair of lines, one from file1 and one from file2, whose join fields are identical.  The output line consists of the join field, the rest of the line from file1, then the rest of the line from file2. 

Both input files must be sorted according to the collating sequence specified by the LC_COLLATE environment variable, if set, for the fields where they are to be joined (usually the first field in each line). 
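
One way to meet the sorting requirement is to sort both files under the same locale immediately before joining them. The sketch below assumes unsorted copies of two small files (the names phonedir and names are borrowed from the EXAMPLES section, but the contents here are shortened and hypothetical) and pins the locale with LC_ALL=C so that sort and join agree on the collating sequence.

```shell
# Pin the locale so sort and join use the same collating sequence.
LC_ALL=C; export LC_ALL
tmp=$(mktemp -d)

# Hypothetical, deliberately unsorted input.
printf '%s\n' 'Wozni 555-1234' 'Binst 555-6235' 'Green 555-2240' > "$tmp/phonedir"
printf '%s\n' 'Wozni 520' 'Green 311' > "$tmp/names"

# Sort each file on its join field (the first field) before joining.
sort -k 1,1 "$tmp/phonedir" > "$tmp/phonedir.sorted"
sort -k 1,1 "$tmp/names" > "$tmp/names.sorted"

joined=$(join "$tmp/phonedir.sorted" "$tmp/names.sorted")
printf '%s\n' "$joined"
rm -rf "$tmp"
```

Binst has no entry in names, so only the Green and Wozni lines are joined.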

Fields are normally separated by a space, a tab character, or a newline character.  In this case, join treats consecutive separators as one, and discards leading separators.  Use the −t option to specify another field separator. 
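
The difference between default separation and −t can be sketched with two tiny data sets. All file names and contents below are hypothetical. In the first join, runs of blanks and leading blanks are collapsed; in the second, every colon counts, so an empty field between two colons is preserved.

```shell
LC_ALL=C; export LC_ALL
tmp=$(mktemp -d)

# Default separation: extra and leading blanks do not create extra fields.
printf '%s\n' 'Eisner     555-1234' '  Green 555-2240' > "$tmp/left_ws"
printf '%s\n' 'Eisner 389' 'Green 311' > "$tmp/right_ws"
blank_out=$(join "$tmp/left_ws" "$tmp/right_ws")

# With -t, each separator is significant: "Eisner::555-1234" has an
# empty second field, which is carried through to the output.
printf '%s\n' 'Eisner::555-1234' 'Green:Pat:555-2240' > "$tmp/left_colon"
printf '%s\n' 'Eisner:389' 'Green:311' > "$tmp/right_colon"
colon_out=$(join -t : "$tmp/left_colon" "$tmp/right_colon")

printf '%s\n' "$blank_out" "$colon_out"
rm -rf "$tmp"
```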

EXIT STATUS

The following exit values are returned:

0
Successful completion. 

>0
An error occurred. 
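
A script can test the exit value to tell a successful join from a failed one. The following sketch uses hypothetical file names and treats an unreadable input file as the error case; the exact nonzero value is not specified here, so the script only checks whether the status is zero.

```shell
tmp=$(mktemp -d)
printf 'Eisner 555-1234\n' > "$tmp/phones"
printf 'Eisner 389\n' > "$tmp/depts"

# Both inputs exist and are readable: expect exit value 0.
if join "$tmp/phones" "$tmp/depts" > /dev/null 2>&1; then
    ok_status=0
else
    ok_status=$?
fi

# The second input does not exist: expect an exit value greater than 0.
if join "$tmp/phones" "$tmp/missing" > /dev/null 2>&1; then
    err_status=0
else
    err_status=$?
fi

echo "both files readable: $ok_status"
echo "second file missing: $err_status"
rm -rf "$tmp"
```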

EXAMPLES

Note that the vertical alignment shown in these examples may not be consistent with your output. 

     1.  To perform a simple join operation on two files, phonedir and names, whose first fields are the same, enter:

join  phonedir  names

If phonedir contains the following telephone directory:

Binst           555-6235
Dickerson       555-1842
Eisner          555-1234
Green           555-2240
Hrarii          555-0256
Janatha         555-7358
Lewis           555-3237
Takata          555-5341
Wozni           555-1234

and names is this listing of names and department numbers:

Eisner          Dept. 389
Frost           Dept. 217
Green           Dept. 311
Takata          Dept. 454
Wozni           Dept. 520

then join phonedir names displays:

Eisner          555-1234        Dept. 389
Green           555-2240        Dept. 311
Takata          555-5341        Dept. 454
Wozni           555-1234        Dept. 520

Each line consists of the join field (the last name), followed by the rest of the line found in phonedir and the rest of the line in names. 

     2.  To display unmatched lines as well as matched lines, enter:

join  -a 2  phonedir  names

If phonedir contains:

Binst           555-6235
Dickerson       555-1842
Eisner          555-1234
Green           555-2240
Hrarii          555-0256
Janatha         555-7358
Lewis           555-3237
Takata          555-5341
Wozni           555-1234

and names contains:

Eisner          Dept. 389
Frost           Dept. 217
Green           Dept. 311
Takata          Dept. 454
Wozni           Dept. 520

then join -a 2 phonedir names displays:

Eisner          555-1234        Dept. 389
Frost                           Dept. 217
Green           555-2240        Dept. 311
Takata          555-5341        Dept. 454
Wozni           555-1234        Dept. 520

This performs the same join operation as in the first example, and also lists the lines of names that have no match in phonedir. It includes Frost’s name and department number in the listing, although there is no entry for Frost in phonedir. 

     3.  To display selected fields, enter:

join  -o 2.3,2.1,1.2 phonedir names

This displays the following fields:

Field 3 of names (Department Number)

Field 1 of names (Last Name)

Field 2 of phonedir (Telephone Number)

If phonedir contains:

Binst           555-6235
Dickerson       555-1842
Eisner          555-1234
Green           555-2240
Hrarii          555-0256
Janatha         555-7358
Lewis           555-3237
Takata          555-5341
Wozni           555-1234

and names contains:

Eisner          Dept. 389
Frost           Dept. 217
Green           Dept. 311
Takata          Dept. 454
Wozni           Dept. 520

then join -o 2.3,2.1,1.2 phonedir names displays:

389     Eisner  555-1234
311     Green   555-2240
454     Takata  555-5341
520     Wozni   555-1234

     4.  To perform the join operation on a field other than the first, enter:

sort -b -k 2,3 phonedir | join -1 2 - numbers

This combines the lines in phonedir and numbers, comparing the second field of phonedir to the first field of numbers. 

First, this sorts phonedir by the second field because both files must be sorted by their join fields. The output of sort is then piped to join. The - (dash) by itself causes the join command to use this output as its first file. The −1 2 defines the second field of the sorted phonedir as the join field. This is compared to the first field of numbers because its join field is not specified with a −2 option. 

If phonedir contains:

Binst           555-6235
Dickerson       555-1842
Eisner          555-1234
Green           555-2240
Hrarii          555-0256
Janatha         555-7358
Lewis           555-3237
Takata          555-5341
Wozni           555-1234

and numbers contains:

555-0256
555-1234
555-5555
555-7358

then sort ... | join ... displays:

555-0256        Hrarii
555-1234        Eisner
555-1234        Wozni
555-7358        Janatha

Each number in numbers is listed with the name listed in phonedir for that number.  Note that join lists all the matches for a given field.  In this case, join lists both Eisner and Wozni as having the telephone number 555-1234. The number 555-5555 is not listed because it does not appear in phonedir. 
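
The fourth example can be reproduced end to end with the script sketched below. It recreates phonedir and numbers in a scratch directory (single blanks replace the column padding shown above, and the use of mktemp -d and LC_ALL=C is an assumption of this sketch), runs the same pipeline, and then adds a −o variant that prints the name before the number.

```shell
LC_ALL=C; export LC_ALL
tmp=$(mktemp -d)

printf '%s\n' 'Binst 555-6235' 'Dickerson 555-1842' 'Eisner 555-1234' \
    'Green 555-2240' 'Hrarii 555-0256' 'Janatha 555-7358' \
    'Lewis 555-3237' 'Takata 555-5341' 'Wozni 555-1234' > "$tmp/phonedir"
printf '%s\n' 555-0256 555-1234 555-5555 555-7358 > "$tmp/numbers"

# Same pipeline as in the example: sort phonedir on its second field,
# then join that field against the first field of numbers.
by_number=$(sort -b -k 2,3 "$tmp/phonedir" | join -1 2 - "$tmp/numbers")

# Variant: use -o to print the name (1.1) before the number (1.2).
name_first=$(sort -b -k 2,3 "$tmp/phonedir" |
    join -1 2 -o 1.1,1.2 - "$tmp/numbers")

printf '%s\n' "$by_number" "$name_first"
rm -rf "$tmp"
```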

ENVIRONMENT VARIABLES

The following environment variables affect the execution of join:

LANG
Provides a default value for the internationalization variables that are unset or null. If LANG is unset or null, the corresponding value from the default locale is used.  If any of the internationalization variables contain an invalid setting, the utility behaves as if none of the variables had been defined. 

LC_ALL
If set to a non-empty string value, overrides the values of all the other internationalization variables.

LC_CTYPE
Determines the locale for the interpretation of sequences of bytes of text data as characters (for example, single-byte as opposed to multi-byte characters in arguments and input files).

LC_MESSAGES
Determines the locale for the format and contents of diagnostic messages written to standard error.

NLSPATH
Determines the location of message catalogues for the processing of LC_MESSAGES. 

SEE ALSO

Commands:  awk(1), cmp(1), comm(1), cut(1), diff(1), grep(1), paste(1), sdiff(1), sed(1), sort(1), uniq(1)

Standards:  standards(5)
