csplit(1)

NAME

csplit − context split

SYNOPSIS

csplit [-s] [-k] [-f prefix] [-n number] file arg1 [...argn]

DESCRIPTION

csplit reads file, separates it into n+1 sections as defined by the arguments arg1 ... argn, and places the results in separate files. The maximum number of arguments (arg1 through argn) allowed is 99 unless the -n number option is used to allow for more output file names.  If the -f prefix option is specified, the resulting filenames are prefix00 through prefixNN where NN is the two-digit value of n using a leading zero if n is less than 10.  If the -f prefix option is not specified, the default filenames xx00 through xxNN are used.  file is divided as follows:

Default    Prefixed
Filename   Filename   Contents
xx00       prefix00   From start of file up to (but not including) the line referenced by arg1.
xx01       prefix01   From the line referenced by arg1 up to the line referenced by arg2.
 .
 .
 .
xxNN       prefixNN   From the line referenced by argn to end of file.

If the file argument is -, standard input is used. 
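The naming scheme above can be tried on a small input. The following is a minimal sketch, not part of the original manual entry: sample.txt, the scratch directory, and the prefix "in" are hypothetical names chosen for illustration, and the section contents noted in the comments assume a csplit that behaves as described here.

```shell
# Work in a scratch directory so the output files do not clutter anything.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# sample.txt is a hypothetical six-line input file.
printf '%s\n' one two three four five six > sample.txt

# Two arguments (n = 2) yield n+1 = 3 sections: xx00, xx01, and xx02.
csplit -s sample.txt 3 5

cat xx00    # lines 1-2: from start of file up to, but not including, line 3
cat xx01    # lines 3-4: from line 3 up to line 5
cat xx02    # lines 5-6: from line 5 to end of file

# A file argument of - makes csplit read standard input instead.
printf '%s\n' a b c d | csplit -s -f in - 3
cat in00 in01
```

The -f option is used in the second command only so that its output does not overwrite the xx files created by the first.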

csplit supports the Basic Regular Expression syntax (see regexp(5)).

Options

csplit recognizes the following options:

-s Suppress printing of all character counts (csplit normally prints the character counts for each file created). 

-k Leave previously created files intact (csplit normally removes created files if an error occurs). 

-f prefix Name created files prefix00 through prefixNN (default is xx00 through xxNN). 

-n number The output file name suffix will use number digits instead of the default 2.  This allows creation of more than 100 output files. 
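As an illustration of how these options combine, here is a short sketch. The input file sample.txt and the prefix "part" are hypothetical, and the counts mentioned in the comment assume a single-byte character set.

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# sample.txt is a hypothetical four-line input file.
printf '%s\n' alpha beta gamma delta > sample.txt

# Without -s, csplit prints one character count per file created
# (11 and 12 for this input, newlines included).
csplit sample.txt 3

# -s suppresses the counts, -f replaces the xx prefix, and -n 3 asks for
# three-digit suffixes, giving part000 and part001.
csplit -s -f part -n 3 sample.txt 3
ls
```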

Arguments (arg1 through argn) to csplit can be any combination of the following:

/regexp/ Create a file containing the section from the current line up to (but not including) the line matching the regular expression regexp. The new current line becomes the line matching regexp.

/regexp/+n

/regexp/-n Create a file containing the section from the current line up to (but not including) the nth line before (-n) or after (+n) the line matching the regular expression regexp (for example, /Page/-5).  The new current line becomes the line n lines before or after the line matching regexp. 

%regexp% Equivalent to /regexp/, except that no file is created for the section. 

line_number Create a file from the current line up to (but not including) line_number. The new current line becomes line_number.

{num} Repeat argument.  This argument can follow any of the above argument forms.  If it follows a regexp argument, that argument is applied num more times.  If it follows line_number, the file is split every line_number lines for num times from that point until end-of-file is reached or num expires. 

Enclose in appropriate quotes all regexp arguments containing blanks or other characters meaningful to the shell.  Regular expressions must not contain embedded new-lines.  csplit does not alter or remove the original file; it is the user’s responsibility to remove it when appropriate. 
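The argument forms can be combined in one command. The sketch below is illustrative only: report.txt and the prefixes "ch" and "off" are hypothetical, and the results noted in the comments assume that a regexp search begins on the line after the current line, as the repeat form implies.

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# report.txt is a hypothetical input with a preface and three headings.
cat > report.txt <<'EOF'
preface
Chapter 1
text a
Chapter 2
text b
Chapter 3
text c
EOF

# %regexp% skips the preface without creating a file for it.
# /regexp/ then ends a section at the next heading, and {1} applies
# that argument once more.  The remainder goes into the last file.
csplit -s -f ch report.txt '%^Chapter%' '/^Chapter/' '{1}'
cat ch00    # Chapter 1, text a
cat ch01    # Chapter 2, text b
cat ch02    # Chapter 3, text c

# /regexp/-n ends the section n lines before the matching line.
csplit -s -f off report.txt '/^Chapter 2/-1'
cat off00   # preface, Chapter 1
```

All regexp arguments are single-quoted here because %, ^, and { can be meaningful to some shells.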

EXTERNAL INFLUENCES

Environment Variables

LC_COLLATE determines the collating sequence used in evaluating regular expressions. 

LC_CTYPE determines the characters matched by character class expressions in regular expressions. 

If LC_COLLATE or LC_CTYPE is not specified in the environment or is set to the empty string, the value of LANG is used as a default for each unspecified or empty variable.  If LANG is not specified or is set to the empty string, a default of "C" (see lang(5)) is used instead of LANG.  If any internationalization variable contains an invalid setting, csplit behaves as if all internationalization variables are set to "C".  See environ(5).
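To make a character class expression behave the same way regardless of the caller's locale, both variables can be set for a single invocation. This is a minimal sketch with a hypothetical input file doc.txt. It assumes that no higher-priority variable such as LC_ALL (not described in this entry) is set in the environment.

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# doc.txt is a hypothetical input whose second line starts with a capital.
printf '%s\n' 'intro' 'SECTION one' 'body' > doc.txt

# With both variables set to "C", [[:upper:]] matches only A through Z.
LC_COLLATE=C LC_CTYPE=C csplit -s doc.txt '/^[[:upper:]]/'

cat xx00    # intro
cat xx01    # SECTION one, body
```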

International Code Set Support

Single-byte character code sets are supported. 

DIAGNOSTICS

Messages are self-explanatory except for:

arg - out of range

which means that the given argument did not reference a line between the current position and the end of the file.  This warning also occurs if the file is exhausted before the repeat count is. 
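The effect of an out-of-range argument, with and without -k, can be observed as follows. This is a sketch under stated assumptions: short.txt is a hypothetical five-line file, and the exact wording of the message written to standard error may differ between implementations.

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# short.txt is a hypothetical five-line input file.
printf '%s\n' 1 2 3 4 5 > short.txt

# Line 10 lies beyond end of file, so csplit reports an out-of-range
# error, exits with a nonzero status, and removes the files it created.
csplit -s short.txt 3 10 || echo "csplit failed without -k"
ls

# With -k the same error is reported, but the files written before the
# error are left in place.
csplit -s -k short.txt 3 10 || echo "csplit failed with -k"
ls
```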

EXAMPLES

Create four files, cobol00 through cobol03.  After editing the split files, recombine them into the original file, overwriting its previous contents. 

csplit -f cobol file '/procedure division/' /par5./ /par16./

Perform editing operations

cat cobol0[0-3] > file

Split a file at every 100 lines, up to 10,000 lines (100 files).  The -k option causes the created files to be retained if there are fewer than 10,000 lines (an error message is still printed). 

csplit -k file 100 '{99}'

Assuming that prog.c follows the normal C coding convention of terminating routines with a } at the beginning of the line, create a separate file for each C routine (up to 21) in prog.c. 

csplit -k prog.c '%main(%' '/^}/+1' '{20}'

SEE ALSO

sh(1), environ(5), lang(5), regexp(5). 

STANDARDS CONFORMANCE

csplit: SVID2, XPG2, XPG3

Hewlett-Packard Company  —  HP-UX Release 9.0: August 1992
