A.OUT(5)  —  System Interface Manual — File Formats

NAME

a.out − assembler and link editor output

SYNOPSIS

#include <a.out.h>
#include <stab.h>
#include <nlist.h>

DESCRIPTION

A.out is the output file of the assembler as(1S) and the link editor ld(1). The latter makes a.out executable if there were no errors and no unresolved external references.  Layout information as given in the include file for the Sun system is:

/*
 * Header prepended to each a.out file.
 */
struct exec {
    long     a_magic;    /* magic number */
    unsigned a_text;     /* size of text segment */
    unsigned a_data;     /* size of initialized data */
    unsigned a_bss;      /* size of uninitialized data */
    unsigned a_syms;     /* size of symbol table */
    unsigned a_entry;    /* entry point */
    unsigned a_trsize;   /* size of text relocation */
    unsigned a_drsize;   /* size of data relocation */
};

#define OMAGIC   0407    /* old impure format */
#define NMAGIC   0410    /* read-only text */
#define ZMAGIC   0413    /* demand load format */

#define PAGSIZ   2048
#define SEGSIZ   0x8000
#define TXTRELOC SEGSIZ

/*
 * Macros which take exec structures as arguments and tell whether
 * the file has a reasonable magic number or offsets to text|symbols|strings.
 */
#define N_BADMAG(x) \
    (((x).a_magic)!=OMAGIC && ((x).a_magic)!=NMAGIC && ((x).a_magic)!=ZMAGIC)

#define N_TXTOFF(x) \
    ((x).a_magic==ZMAGIC ? PAGSIZ : sizeof (struct exec))
#define N_SYMOFF(x) \
    (N_TXTOFF(x) + (x).a_text+(x).a_data + (x).a_trsize+(x).a_drsize)
#define N_STROFF(x) \
    (N_SYMOFF(x) + (x).a_syms)

/*
 * Macros which take exec structures as arguments and tell where the
 * various pieces will be loaded.
 */
#define N_TXTADDR(x) TXTRELOC
#define N_DATADDR(x) \
    (((x).a_magic==OMAGIC)? (N_TXTADDR(x)+(x).a_text) \
    : (SEGSIZ+((N_TXTADDR(x)+(x).a_text-1) & ~SEGRND)))
#define N_BSSADDR(x)  (N_DATADDR(x)+(x).a_data)

The a.out file has five sections: a header, the program text and data, relocation information, a symbol table and a string table (in that order).  The last three may be omitted if the program was loaded with the ‘−s’ option of ld or if the symbols and relocation have been removed by strip(1).

In the header the sizes of each section are given in bytes.  The size of the header is not included in any of the other sizes. 

When an a.out file is executed, three logical segments are set up: the text segment, the data segment (with uninitialized data, which starts off as all 0, following initialized data), and a stack.  The header is not loaded with the text segment.  If the magic number in the header is OMAGIC (0407), it means that this is a non-sharable text which is not to be write-protected, so the data segment is immediately contiguous with the text segment.  This is rarely used.  If the magic number is NMAGIC (0410) or ZMAGIC (0413), the data segment begins at the first segment boundary following the text segment, and the text segment is not writable by the program; other processes executing the same file will share the text segment.  For ZMAGIC format, the text segment begins on a page boundary in the a.out file; the remaining bytes after the header in the first block are reserved and should be zero.  In this case the text and data sizes must both be multiples of the page size, and the pages of the file will be brought into the running image as needed, and not pre-loaded as with the other formats.  This is especially suitable for very large programs and is the default format produced by ld(1). The macros N_TXTADDR, N_DATADDR, and N_BSSADDR give the core addresses at which the text, data, and bss segments, respectively, will be loaded.
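The load-address rules above can be worked through in a small sketch. It restates N_DATADDR and N_BSSADDR as functions (data_addr and bss_addr are hypothetical names). The listing uses SEGRND without defining it; the sketch assumes it is the segment rounding mask SEGSIZ-1, which is consistent with "the first segment boundary following the text segment" but is not confirmed by this page.

```c
#include <stdint.h>

#define OMAGIC   0407
#define SEGSIZ   0x8000u
#define TXTRELOC SEGSIZ
#define SEGRND   (SEGSIZ - 1)   /* ASSUMPTION: not defined on this page */

/* Address at which the data segment is loaded (mirrors N_DATADDR). */
uint32_t data_addr(uint32_t magic, uint32_t a_text)
{
    uint32_t text_end = TXTRELOC + a_text;

    if (magic == OMAGIC)
        return text_end;        /* data immediately follows text */
    /* NMAGIC, ZMAGIC: round up to the next segment boundary */
    return SEGSIZ + ((text_end - 1) & ~SEGRND);
}

/* Address at which bss is loaded (mirrors N_BSSADDR). */
uint32_t bss_addr(uint32_t magic, uint32_t a_text, uint32_t a_data)
{
    return data_addr(magic, a_text) + a_data;
}
```

Under that assumption, a text segment of 0x1234 bytes puts the data at 0x9234 for OMAGIC and at 0x10000 for NMAGIC or ZMAGIC.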

The stack starts at the highest possible location in the memory image, and grows downwards.  The stack is automatically extended as required.  The data segment is extended as requested by brk(2) or sbrk(2).

After the header in the file follow the text, data, text relocation, data relocation, symbol table and string table, in that order.  The text begins at byte PAGSIZ in the file for ZMAGIC format or just after the header for the other formats.  The N_TXTOFF macro returns this absolute file position when given the name of an exec structure as argument.  The data segment is contiguous with the text and immediately followed by the text relocation and then the data relocation information.  The symbol table follows all this; its position is computed by the N_SYMOFF macro.  Finally, the string table immediately follows the symbol table at a position given by N_STROFF.  The first 4 bytes of the string table are not used for string storage, but rather contain the size of the string table; this size includes the 4 bytes, so the minimum string table size is 4.
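As an illustration of the file layout just described, the following sketch decodes a header from a byte buffer and computes the same positions as N_TXTOFF, N_SYMOFF and N_STROFF. It does not include <a.out.h>, so that it can be compiled on a machine other than the target. The names exec32, be32, parse_exec, txt_off, sym_off and str_off are hypothetical, and the sketch assumes 32-bit big-endian header fields, giving a 32-byte header.

```c
#include <stdint.h>

#define OMAGIC    0407
#define NMAGIC    0410
#define ZMAGIC    0413
#define PAGSIZ    2048
#define EXEC_SIZE 32    /* assumed sizeof (struct exec): eight 4-byte fields */

/* Host-independent copy of the header fields (hypothetical name). */
struct exec32 {
    uint32_t a_magic, a_text, a_data, a_bss;
    uint32_t a_syms, a_entry, a_trsize, a_drsize;
};

/* Read a 32-bit big-endian value (assumed byte order of the file). */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode the first EXEC_SIZE bytes of a file; return -1 on a bad magic. */
int parse_exec(const unsigned char *buf, struct exec32 *x)
{
    x->a_magic  = be32(buf);
    x->a_text   = be32(buf + 4);
    x->a_data   = be32(buf + 8);
    x->a_bss    = be32(buf + 12);
    x->a_syms   = be32(buf + 16);
    x->a_entry  = be32(buf + 20);
    x->a_trsize = be32(buf + 24);
    x->a_drsize = be32(buf + 28);
    if (x->a_magic != OMAGIC && x->a_magic != NMAGIC && x->a_magic != ZMAGIC)
        return -1;              /* same test as N_BADMAG */
    return 0;
}

/* File offset of the text (mirrors N_TXTOFF). */
uint32_t txt_off(const struct exec32 *x)
{
    return x->a_magic == ZMAGIC ? PAGSIZ : EXEC_SIZE;
}

/* File offset of the symbol table (mirrors N_SYMOFF). */
uint32_t sym_off(const struct exec32 *x)
{
    return txt_off(x) + x->a_text + x->a_data + x->a_trsize + x->a_drsize;
}

/* File offset of the string table (mirrors N_STROFF). */
uint32_t str_off(const struct exec32 *x)
{
    return sym_off(x) + x->a_syms;
}
```

For a ZMAGIC file with 2048 bytes each of text and data, no relocation, and 24 bytes of symbols, these give a text offset of 2048, a symbol table offset of 6144, and a string table offset of 6168.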
 
RELOCATION

The value of a byte in the text or data which is not a portion of a reference to an undefined external symbol is exactly that value which will appear in memory when the file is executed.  If a byte in the text or data involves a reference to an undefined external symbol, as indicated by the relocation information, then the value stored in the file is an offset from the associated external symbol.  When the file is processed by the link editor and the external symbol becomes defined, the value of the symbol is added to the bytes in the file. 

If relocation information is present, it amounts to eight bytes per relocatable datum as in the following structure:

/*
 * Format of a relocation datum.
 */
struct relocation_info {
    int      r_address;        /* address which is relocated */
    unsigned r_symbolnum:24,   /* local symbol ordinal */
             r_pcrel:1,        /* was relocated pc relative already */
             r_length:2,       /* 0=byte, 1=word, 2=long */
             r_extern:1,       /* does not include value of sym referenced */
             :4;               /* nothing, yet */
};

There is no relocation information if a_trsize+a_drsize==0.  If r_extern is 0, then r_symbolnum is actually an n_type for the relocation (for example, N_TEXT, meaning relative to the origin of the text segment).
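A relocation datum can be decoded from its eight raw bytes without relying on the host compiler's bit-field layout. The sketch below is illustrative only. It assumes big-endian byte order and that the bit-fields are allocated from the most significant bit downward, so that r_symbolnum occupies the top 24 bits of the second word. Neither assumption is stated on this page. The names reloc, be32, decode_reloc and reloc_bytes are hypothetical.

```c
#include <stdint.h>

/* Unpacked form of a relocation datum (hypothetical name). */
struct reloc {
    uint32_t r_address;
    uint32_t r_symbolnum;   /* symbol ordinal, or an n_type if !r_extern */
    unsigned r_pcrel;
    unsigned r_length;      /* 0=byte, 1=word, 2=long */
    unsigned r_extern;
};

/* Read a 32-bit big-endian value (assumed byte order of the file). */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Unpack one 8-byte datum; ASSUMES fields are packed MSB first. */
void decode_reloc(const unsigned char *p, struct reloc *r)
{
    uint32_t w = be32(p + 4);

    r->r_address   = be32(p);
    r->r_symbolnum = w >> 8;        /* top 24 bits */
    r->r_pcrel     = (w >> 7) & 1;
    r->r_length    = (w >> 5) & 3;
    r->r_extern    = (w >> 4) & 1;  /* low 4 bits are unused */
}

/* Number of bytes patched by this datum: 1, 2 or 4. */
unsigned reloc_bytes(const struct reloc *r)
{
    return 1u << r->r_length;
}
```

The number of entries in the text relocation is a_trsize/8, and likewise a_drsize/8 for the data relocation, since each datum is eight bytes.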
 
SYMBOL TABLE

The layout of a symbol table entry and the principal flag values that distinguish symbol types are given in the include file as follows:

/*
 * Format of a symbol table entry.
 */
struct nlist {
    union {
        char *n_name;           /* for use when in-memory */
        long  n_strx;           /* index into file string table */
    } n_un;
    unsigned char n_type;       /* type flag, i.e. N_TEXT etc; see below */
    char          n_other;
    short         n_desc;       /* see <stab.h> */
    unsigned      n_value;      /* value of this symbol (or adb offset) */
};
#define n_hash  n_desc          /* used internally by ld */

/*
 * Simple values for n_type.
 */
#define N_UNDF  0x0     /* undefined */
#define N_ABS   0x2     /* absolute */
#define N_TEXT  0x4     /* text */
#define N_DATA  0x6     /* data */
#define N_BSS   0x8     /* bss */
#define N_COMM  0x12    /* common (internal to ld) */
#define N_FN    0x1f    /* file name symbol */

#define N_EXT   01      /* external bit, or'ed in */
#define N_TYPE  0x1e    /* mask for all the type bits */

/*
 * Other permanent symbol table entries have some of the N_STAB bits set.
 * These are given in <stab.h>
 */
#define N_STAB  0xe0    /* if any of these bits set, don't discard */

In the a.out file a symbol’s n_un.n_strx field gives an index into the string table.  An n_strx value of 0 indicates that no name is associated with a particular symbol table entry.  The field n_un.n_name can be used to refer to the symbol name only if the program sets this up using n_strx and appropriate data from the string table.  Because of the union in the nlist declaration, it is impossible in C to statically initialize such a structure.  If this must be done (as when using nlist(3)) the file <nlist.h> should be included, rather than <a.out.h>; this contains the declaration without the union.
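Resolving n_strx to a name might look like the following sketch, which works on a string table already read into memory. The function names sym_name and be32 are hypothetical. The sketch assumes that n_strx counts from the start of the string table, including the 4-byte size word, and that the size word is stored big-endian.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Read a 32-bit big-endian value (assumed byte order of the file). */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Return the name for n_strx, or NULL if the entry has no name or the
 * index is not usable.  strtab points at the first byte of the string
 * table (its size word); avail is the number of bytes actually read.
 */
const char *sym_name(const unsigned char *strtab, size_t avail, uint32_t n_strx)
{
    uint32_t size;

    if (avail < 4)
        return NULL;
    size = be32(strtab);            /* size includes these 4 bytes */
    if (size < 4 || size > avail)
        return NULL;
    if (n_strx < 4 || n_strx >= size)
        return NULL;                /* 0 means "no name" */
    if (memchr(strtab + n_strx, '\0', size - n_strx) == NULL)
        return NULL;                /* unterminated string */
    return (const char *)(strtab + n_strx);
}
```

A program that wants to use n_un.n_name would store the returned pointer there once the whole string table has been loaded.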

If a symbol’s type is undefined external, and the value field is non-zero, the symbol is interpreted by the loader ld as the name of a common region whose size is indicated by the value of the symbol. 
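The n_type tests described above can be sketched as small predicates. The constants are copied from the listing; the function names is_external, is_common and segment_name are hypothetical. segment_name is only meaningful for entries with none of the N_STAB bits set.

```c
#include <stdint.h>

#define N_UNDF 0x0
#define N_ABS  0x2
#define N_TEXT 0x4
#define N_DATA 0x6
#define N_BSS  0x8
#define N_EXT  01
#define N_TYPE 0x1e

/* Non-zero if the external bit is or'ed into n_type. */
int is_external(uint8_t n_type)
{
    return (n_type & N_EXT) != 0;
}

/* Undefined external with a non-zero value: a common region of that size. */
int is_common(uint8_t n_type, uint32_t n_value)
{
    return n_type == (N_UNDF | N_EXT) && n_value != 0;
}

/* Name of the simple type; only for entries with no N_STAB bits set. */
const char *segment_name(uint8_t n_type)
{
    switch (n_type & N_TYPE) {
    case N_UNDF: return "undefined";
    case N_ABS:  return "absolute";
    case N_TEXT: return "text";
    case N_DATA: return "data";
    case N_BSS:  return "bss";
    default:     return "other";
    }
}
```

For example, an entry with n_type 0x01 and n_value 16 would be treated by ld as a 16-byte common region, while the same type with a zero value is an ordinary undefined external.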
 
STAB SYMBOLS

Stab.h defines some values of the n_type field of the symbol table of a.out files.  These are the types for permanent symbols (that is, not local labels, etc.)  used by the debuggers adb(1S) and dbx(1) and the Berkeley Pascal compiler pc(1). Symbol table entries can be produced by the .stabs assembler directive.  This allows one to specify a double-quote delimited name, a symbol type, one char and one short of information about the symbol, and an unsigned long (usually an address).  To avoid having to produce an explicit label for the address field, the .stabd directive can be used to implicitly address the current location.  If no name is needed, symbol table entries can be generated using the .stabn directive.  The loader promises to preserve the order of symbol table entries produced by .stab directives. 

The n_value field of a symbol is relocated by the link editor as an address within the appropriate segment.  N_value fields of symbols not in any segment are unchanged by the linker.  In addition, the linker will discard certain symbols, according to rules of its own, unless the n_type field has one of the bits masked by N_STAB set. 
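The N_STAB test amounts to a single mask operation, sketched below with the hypothetical names is_stab and count_stab_types. The counting function shows how many distinct n_type values the mask leaves for permanent symbols. It assumes the N_EXT bit is not counted as part of the type, which gives 7 non-zero settings of the three N_STAB bits times 16 settings of the four N_TYPE bits.

```c
#include <stdint.h>

#define N_STAB 0xe0
#define N_EXT  01

/* Non-zero if the linker must keep this symbol table entry. */
int is_stab(uint8_t n_type)
{
    return (n_type & N_STAB) != 0;
}

/* Count n_type values with some N_STAB bit set, ignoring N_EXT. */
int count_stab_types(void)
{
    int t, count = 0;

    for (t = 0; t < 256; t++)
        if (is_stab((uint8_t)t) && (t & N_EXT) == 0)
            count++;
    return count;
}
```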

This allows up to 112 (7 * 16) symbol types, split between the various segments.  Some of these have already been claimed.  The debugger, adb(1S), uses the following n_type values:

#define N_GSYM   0x20   /* global symbol: name,,0,type,0 */
#define N_FNAME  0x22   /* procedure name (f77 kludge): name,,0 */
#define N_FUN    0x24   /* procedure: name,,0,linenumber,address */
#define N_STSYM  0x26   /* static symbol: name,,0,type,address */
#define N_LCSYM  0x28   /* .lcomm symbol: name,,0,type,address */
#define N_RSYM   0x40   /* register sym: name,,0,type,register */
#define N_SLINE  0x44   /* src line: 0,,0,linenumber,address */
#define N_SSYM   0x60   /* structure elt: name,,0,type,struct_offset */
#define N_SO     0x64   /* source file name: name,,0,0,address */
#define N_LSYM   0x80   /* local sym: name,,0,type,offset */
#define N_SOL    0x84   /* #included file name: name,,0,0,address */
#define N_PSYM   0xa0   /* parameter: name,,0,type,offset */
#define N_ENTRY  0xa4   /* alternate entry: name,linenumber,address */
#define N_LBRAC  0xc0   /* left bracket: 0,,0,nesting level,address */
#define N_RBRAC  0xe0   /* right bracket: 0,,0,nesting level,address */
#define N_BCOMM  0xe2   /* begin common: name,, */
#define N_ECOMM  0xe4   /* end common: name,, */
#define N_ECOML  0xe8   /* end common (local name): ,,address */
#define N_LENG   0xfe   /* second stab entry with length information */

where the comments give the adb conventional use for .stabs and the n_name, n_other, n_desc, and n_value fields of the given n_type. Adb uses the n_desc field to hold a type specifier in the form used by the Portable C Compiler, cc(1), in which a base type is qualified in the following structure:

struct desc {
    short q6:2,
          q5:2,
          q4:2,
          q3:2,
          q2:2,
          q1:2,
          basic:4;
};

There are four qualifications, with q1 the most significant and q6 the least significant:

0    none
1    pointer
2    function
3    array

The sixteen basic types are assigned as follows:

0    undefined
1    function argument
2    character
3    short
4    int
5    long
6    float
7    double
8    structure
9    union
10   enumeration
11   member of enumeration
12   unsigned character
13   unsigned short
14   unsigned int
15   unsigned long
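Decoding an n_desc type specifier could be sketched as follows. The bit positions are an assumption based on the order of the fields in struct desc: basic in the low 4 bits, q1 in the two bits above it, and so on up to q6 in the top two bits. The function describe_desc and the wording of its output are hypothetical and are not part of adb.

```c
#include <stdio.h>
#include <string.h>
#include <stdint.h>

static const char *const basic_names[16] = {
    "undefined", "function argument", "character", "short",
    "int", "long", "float", "double",
    "structure", "union", "enumeration", "member of enumeration",
    "unsigned character", "unsigned short", "unsigned int", "unsigned long"
};

static const char *const qual_names[4] = {
    "", "pointer to ", "function returning ", "array of "
};

/*
 * Write an English description of desc into out (outsz bytes) and
 * return out.  ASSUMES q1 sits just above the 4-bit basic type.
 */
char *describe_desc(uint16_t desc, char *out, size_t outsz)
{
    size_t used = 0;
    int i, n;

    if (outsz == 0)
        return out;
    out[0] = '\0';
    for (i = 0; i < 6; i++) {               /* q1 first, q6 last */
        unsigned q = (desc >> (4 + 2 * i)) & 3;

        n = snprintf(out + used, outsz - used, "%s", qual_names[q]);
        if (n < 0 || (size_t)n >= outsz - used)
            return out;                     /* output truncated */
        used += (size_t)n;
    }
    snprintf(out + used, outsz - used, "%s", basic_names[desc & 0xf]);
    return out;
}
```

Under this layout the value 0x14 (q1 = 1, basic = 4) describes a pointer to int. Because n_desc is a signed short, cast it to an unsigned 16-bit value before shifting.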

The Berkeley Pascal compiler, pc(1), uses the following n_type value:

#define N_PC     0x30   /* global pascal symbol: name,,0,subtype,line */

and uses the following subtypes to do type checking across separately compiled files:

1    source file name
2    included file name
3    global label
4    global constant
5    global type
6    global variable
7    global function
8    global procedure
9    external function
10   external procedure
11   library variable
12   library routine

The new dbx(1) debugger uses an entirely different interpretation for the stabs symbol-table entries.  Currently, this is understood only by dbx and cc, but its use should supplant the current interpretation as soon as adb and pc can be modified to use it. 

SEE ALSO

adb(1S), as(1S), ld(1), nm(1), dbx(1), strip(1)

BUGS

There are currently two interpretations of the stabs symbol-table information.  This creates great confusion when trying to build a program for debugging.

Due to the amount of symbolic information necessary for high-level debugging, the whole a.out structure has been stretched well beyond its original design, and should be replaced by something with a more sophisticated symbol-table mechanism.  The demands of future languages will only compound the problems.

Sun System Release 1.0  —  15 January 1983
