   regcmp(3G)                (Specialized Libraries)                regcmp(3G)


   NAME
         regcmp, regex - compile and execute regular expression

   SYNOPSIS
         #include <libgen.h>

         cc [flag ...] file ...  -lgen [library ...]

         char *regcmp (const char *string1 [, char *string2, ...],
             (char *)0);

         char *regex (const char *re, const char *subject
             [, char *ret0, ...]);

         extern char *__loc1;

   DESCRIPTION
         regcmp compiles a regular expression (consisting of the concatenated
         arguments) and returns a pointer to the compiled form.  malloc(3C) is
         used to create space for the compiled form.  It is the caller's
         responsibility to release this space with free when it is no longer
         needed.  A NULL return from regcmp indicates an incorrect argument.
         The regcmp(1) utility precompiles regular expressions so that this
         routine is generally not needed at execution time.

         regex executes a compiled pattern against the subject string.
         Additional arguments receive the values returned by ( ... )$n
         subexpressions.  regex returns NULL on failure or a pointer to the
         next unmatched character on success.  A global character pointer
         __loc1 points to where the match began.  regcmp and regex were mostly
         borrowed from the editor, ed(1); however, the syntax and semantics
         have been changed slightly.  The following are the valid symbols and
         associated meanings.

         []*.^     These symbols retain their meaning in ed(1).

         $         Matches the end of the string; \n matches a newline.

         -         Within brackets the minus means through.  For example,
                   [a-z] is equivalent to [abcd...xyz].  The - can appear as
                   itself only if used as the first or last character.  For
                   example, the character class expression []-] matches the
                   characters ] and -.

         +         A regular expression followed by + means one or more times.
                   For example, [0-9]+ is equivalent to [0-9][0-9]*.

         {m} {m,} {m,u}
                   Integer values enclosed in {} indicate the number of times
                   the preceding regular expression is to be applied.  The
                   value m is the minimum number and u is a number, less than
                   256, which is the maximum.  If only m is present (i.e.,
                   {m}), it indicates the exact number of times the regular
                   expression is to be applied.  The value {m,} is analogous
                   to {m,infinity}.  The plus (+) and star (*) operations are
                   equivalent to {1,} and {0,} respectively.

         ( ... )$n The value of the enclosed regular expression is to be
                   returned.  The value will be stored in the (n+1)th argument
                   following the subject argument.  At most, ten enclosed
                   regular expressions are allowed.  regex makes its
                   assignments unconditionally.

         ( ... )   Parentheses are used for grouping.  An operator, e.g., *,
                   +, {}, can work on a single character or a regular
                   expression enclosed in parentheses.  For example,
                   (a*(cb+)*)$0.

         By necessity, all the above defined symbols are special.  They must,
         therefore, be escaped with a \ (backslash) to be used as themselves.

   EXAMPLES
         The following example matches a leading newline in the subject string
         pointed at by cursor.

               char *cursor, *newcursor, *ptr;
                     ...
               newcursor = regex((ptr = regcmp("^\n", (char *)0)), cursor);
               free(ptr);

         The following example matches through the string Testing3 and returns
         the address of the character after the last matched character (the
         "4").  The string Testing3 is copied to the character array ret0.

               char ret0[9];
               char *newcursor, *name;
                     ...
               name = regcmp("([A-Za-z][A-Za-z0-9]{0,7})$0", (char *)0);
               newcursor = regex(name, "012Testing345", ret0);

         The following example applies a precompiled regular expression in
         file.i [see regcmp(1)] against string.

               #include "file.i"
               char *string, *newcursor;
                     ...
               newcursor = regex(name, string);

   SEE ALSO
         regcmp(1), malloc(3C).
         ed(1) in the User's Reference Manual.

   NOTES
         The user program may run out of memory if regcmp is called
         repeatedly without freeing the compiled expressions that are no
         longer required.
   8/91