cord(1) — RISC
Name
cord − rearranges procedures in an executable to facilitate better cache mapping.
Syntax
cord [ −c cachesize ] [ −f ] [ −o outfile ] [ −p maxphases ] [ −v ] obj reorder
Description
The cord command rearranges procedures in an executable object to make better use of the machine’s instruction cache. Ordering the procedures properly reduces the instruction cache miss rate. The cord command does not attempt to determine the correct ordering itself; instead, it is given a reorder file containing the desired procedure order. The reorder file is generated by the ftoc program from a set of profile feedback files (see prof()).
Processed lines in the reorder file are called procedure lines. Each procedure line must be on a separate line of the file and must contain the name of the source file, followed by a blank, followed by a qualified procedure name (a nested procedure must be qualified as x.y, where x is the outer procedure). A newline or a blank can follow the procedure name; in the following example, the text after the blank is ignored:
foo.c bar >>i ignore this stuff<<
Lines beginning with a pound sign (#) are comments. Lines beginning with a dollar sign ($) are cord directive lines. The only directive currently understood is $phase. This directive treats the rest of the file (up to the end of the file or the next $phase) as a new phase of the program and orders the procedures accordingly. A procedure may appear in more than one phase, in which case the final binary contains more than one copy of it. The cord command tries to relocate each reference to a procedure to the copy in the requesting phase’s list of procedures first, and falls back to a random copy if that phase has none.
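The reorder file rules above can be sketched in a few lines of Python. This is only an illustration of the format as described here, not the parser cord itself uses. The names parse_reorder, resolve, and SAMPLE are hypothetical, and several details are assumptions: blank lines are skipped, lines before the first $phase form an implicit first phase, unknown directives are ignored, and the fallback picks the first copy found rather than a random one.

```python
# Hypothetical sample reorder file; file and procedure names are made up.
SAMPLE = """\
# procedures used during start-up
main.c main
util.c parse.helper trailing text is ignored
$phase
main.c main
io.c flush
"""

def parse_reorder(text):
    """Return a list of phases; each phase is a list of (source_file, procedure)."""
    phases = [[]]  # assumption: lines before any $phase form an implicit phase
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # comment line (skipping blank lines is an assumption)
        if line.startswith("$"):
            if line.split()[0] == "$phase":
                phases.append([])  # start a new phase
            continue  # assumption: other directives are ignored
        fields = line.split()
        if len(fields) < 2:
            continue  # assumption: malformed procedure lines are skipped
        # Only the source file and the qualified procedure name are kept.
        phases[-1].append((fields[0], fields[1]))
    return [phase for phase in phases if phase]

def resolve(proc, requesting_phase, phases):
    """Return the index of the phase whose copy of proc a reference should use."""
    if any(name == proc for _, name in phases[requesting_phase]):
        return requesting_phase  # prefer the copy in the requesting phase
    for index, phase in enumerate(phases):
        if any(name == proc for _, name in phase):
            return index  # assumption: first copy found, where cord picks a random one
    return None  # the procedure is not listed in any phase

phases = parse_reorder(SAMPLE)
```

With the sample above, main appears in both phases, so a reference from the second phase resolves to that phase's own copy, while a reference to parse.helper from the second phase falls back to the copy in the first phase.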
You should use the −cord option to a compiler driver such as cc rather than execute cord directly. Options to cord can be passed through the driver with −Wz,cordarg0,cordarg1,.... If you have to run cord manually, first run the driver once with the −v flag on a simple program to see the exact passes and arguments involved in using cord.
The obj argument is an executable object with its relocation information intact. Such an object can be produced by passing the −r −z −d options to the linker, ld. The −r option preserves relocation information in the object, but with −r alone the linker neither makes the object a ZMAGIC file (hence −z) nor allocates common variables (hence −d), as it would without −r.
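For a build script that must run the passes by hand, the command lines described above could be assembled as follows. This is a sketch under stated assumptions: the helper names (ld_argv, cord_argv, driver_cord_option) and all file names are hypothetical, the code writes the option dashes as ASCII hyphens, and the ld line is simplified (a real link also names startup files and libraries, which the driver's −v output shows).

```python
def ld_argv(objects, output):
    """Linker command that keeps relocation information for cord.

    Simplified: a real link line also lists startup files and libraries.
    """
    return ["ld", "-r", "-z", "-d", "-o", output] + list(objects)

def cord_argv(obj, reorder, cachesize=None, flip=False,
              outfile=None, maxphases=None, verbose=False):
    """Build a cord command line following the Syntax section."""
    argv = ["cord"]
    if cachesize is not None:
        argv += ["-c", str(cachesize)]  # cache size in bytes
    if flip:
        argv.append("-f")
    if outfile is not None:
        argv += ["-o", outfile]
    if maxphases is not None:
        argv += ["-p", str(maxphases)]
    if verbose:
        argv.append("-v")
    return argv + [obj, reorder]  # obj and reorder come last

def driver_cord_option(cord_args):
    """Format cord arguments for the compiler driver's -Wz option."""
    return "-Wz," + ",".join(cord_args)

# Hypothetical file names, for illustration only.
link_cmd = ld_argv(["main.o", "util.o"], "prog.reloc")
cord_cmd = cord_argv("prog.reloc", "prog.reorder",
                     cachesize=65536, flip=True, outfile="prog")
```

The resulting lists can be handed to subprocess.run on a system where ld and cord are installed. When going through the driver instead, driver_cord_option(["-c", "32768"]) yields the single argument -Wz,-c,32768.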
Options
−c cachesize  Specifies the cache size, in bytes, of the machine on which the program will run. This value only affects the −f option. If not specified, 65536 is used.
−f  Flips the first cachepage-size worth of procedures. When cord was written, the assumption was that procedures would be ordered by procedure density (cycles/byte). This option ensures that the densest part of each page following the first cachepage conflicts with the least dense part of the first cachepage.
−o outfile  Specifies the output file. If not specified, a.out is used.
−p maxphases  Specifies the maximum number of phases allowed. The default is 20.
−v  Prints verbose information. This includes a list of the procedures that are considered part of other procedures and cannot be rearranged (these are basically assembler procedures that may contain relative branches to other procedures rather than relocatable ones). The listing also shows the procedures in the flipped area (if any) and a mapping of old locations to new ones.
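The effect described for the −f option can be illustrated with a toy model. The sketch below is one possible reading of that description, not cord's actual algorithm. It assumes a direct-mapped cache indexed by address modulo the cache size, a layout that starts at address 0, and procedures listed densest first; the function names and the procedure sizes are made up.

```python
def flip_first_cachepage(procs, cachesize=65536):
    """Reverse the leading procedures that fit within the first cachepage.

    procs is a list of (name, size_in_bytes), assumed ordered densest first.
    """
    total = 0
    count = 0
    for _, size in procs:
        if total + size > cachesize:
            break  # this procedure would spill past the first cachepage
        total += size
        count += 1
    return procs[:count][::-1] + procs[count:]

def cache_offsets(procs, cachesize=65536):
    """Cache offset of each procedure's first byte (direct-mapped model)."""
    offsets = {}
    address = 0  # assumption: the layout starts at address 0
    for name, size in procs:
        offsets[name] = address % cachesize
        address += size
    return offsets

# Toy example: a 1024-byte cache and four 512-byte procedures, densest first.
procs = [("a", 512), ("b", 512), ("c", 512), ("d", 512)]
before = cache_offsets(procs, 1024)
flipped = flip_first_cachepage(procs, 1024)
after = cache_offsets(flipped, 1024)
```

In this model, without flipping, c (the densest procedure of the second page) maps to the same cache offset as a (the densest procedure overall). After flipping, c shares its offset with b, the least dense procedure of the first page, and a no longer collides with it.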
Restrictions
Because cord works from an input list of procedures generated from profile output, the resulting binary is data-dependent. In other words, it may perform well only on the input data that generated the profile information and may perform worse than the original binary on other data. Furthermore, if the hot areas in the cache don’t fit well into one cachepage, performance can degrade.