bpf(4)                         DG/UX R4.11MU05                        bpf(4)


NAME
       bpf - Berkeley Packet Filter

SYNOPSIS
       bpf0

DESCRIPTION
       The Berkeley Packet Filter provides a raw interface to data link
       layers in a protocol-independent fashion.  All packets on the
       network, even those destined for other hosts, are accessible through
       this interface.

       The packet filter appears as a character special clonable device,
       /dev/bpf0.  After opening this file, the file descriptor must be
       bound to a specific interface with the BIOCSETIF or BIOCSETIF2
       ioctls.  An interface can be bound to more than one file
       descriptor, and the filter underlying each descriptor sees an
       identical packet stream.  If /dev/bpf0 does not exist, build a new
       kernel with the bpf() entry in the system file and reboot the
       system.

       A user-settable packet filter is associated with each open instance
       of bpf0.  Whenever an interface receives a packet, all file
       descriptors listening on that interface apply their filter.  Each
       descriptor that accepts the packet receives its own copy.

       Reads from these file descriptors return the next group of packets
       that have matched the filter.  To improve performance, the buffer
       passed to read must be the same size as the buffers used internally
       by bpf.  A user application can get/set the size of this buffer with
       the BIOCGBLEN/BIOCSBLEN ioctl.

       The packet filter supports the following link level protocols:
       Ethernet, SLIP, FDDI, and Token Ring.  The packet filter also
       supports attaching at the bottom and top of IP; that is, ip_bottom
       and ip_top, respectively.

       Since packet data is in network byte order, applications should use
       the byteorder(3N) macros to extract multi-byte values.
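
       As a minimal sketch of that advice, the helpers below read 16-bit
       and 32-bit fields from raw packet bytes with ntohs() and ntohl().
       The names demo_get16 and demo_get32 are hypothetical, and
       <arpa/inet.h> is assumed to declare the byte-order routines; the
       header named in byteorder(3N) on DG/UX may differ.  memcpy() is
       used so that the read also works at unaligned offsets.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 16-bit network-order field at byte offset off. */
uint16_t demo_get16(const unsigned char *pkt, size_t off)
{
    uint16_t v;
    assert(pkt != NULL);
    memcpy(&v, pkt + off, sizeof v);   /* copy first: offset may be unaligned */
    return ntohs(v);
}

/* Hypothetical helper: 32-bit network-order field at byte offset off. */
uint32_t demo_get32(const unsigned char *pkt, size_t off)
{
    uint32_t v;
    assert(pkt != NULL);
    memcpy(&v, pkt + off, sizeof v);
    return ntohl(v);
}
```

       For an Ethernet frame, demo_get16(pkt, 12) would yield the type
       field in host order, ready to compare against a constant such as
       ETHERTYPE_IP.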

   Ioctls
       The ioctl command codes below are defined in <net/bpf.h>.  All
       commands require these includes:

            #include <sys/types.h>
            #include <sys/time.h>
            #include <sys/ioctl.h>
            #include <net/bpf.h>

       Additionally, BIOCGETIF, BIOCSETIF, and BIOCSETIF2 require
       <net/if.h>.

       In addition to FIONREAD and SIOCGIFADDR, the following commands may
       be applied to any open instance of bpf0.  The third argument to the
       ioctl should be a pointer to the type indicated.

       BIOCGBLEN (u_int)
                 Returns the required buffer length for reads.

       BIOCSBLEN (u_int)
                 Sets the buffer length (in bytes) for reads.  If the
                 requested buffer size cannot be accommodated, the closest
                 allowable size will be set and returned in the argument.  A
                 read call will result in EIO if it is passed a buffer that
                 is not this size.  Note that an individual packet larger
                 than this size is necessarily truncated.

       BIOCGDLT (u_int)
                 Returns the type of the data link layer underlying the
                 attached interface.  EINVAL is returned if no interface has
                 been specified.  The device types are defined in
                 <net/bpf.h>.

       BIOCPROMISC
                 Forces the interface into promiscuous mode.  All packets,
                 not just those destined for the local host, are processed.
                 Since more than one file can be listening on a given
                 interface, a listener that opened its interface non-
                 promiscuously may receive packets promiscuously.  This
                 problem can be remedied with an appropriate filter.

                 The interface remains in promiscuous mode until all file
                 instances listening promiscuously are closed.

                 If the interface does not have a promiscuous mode, this
                 ioctl has no effect.

                 You must attach to an interface via the BIOCSETIF or
                 BIOCSETIF2 ioctl before issuing the BIOCPROMISC ioctl.

       BIOCFLUSH Flushes the buffer of incoming packets and resets the
                 statistics that are returned by BIOCGSTATS.

       BIOCGETIFLIST (struct bpf_if_list)
                 Returns a list of the interfaces which can be attached to
                 via the BIOCSETIF or BIOCSETIF2 ioctl.  Upon entry, the
                 bifl_len field equals the size (in bytes) of the buffer
                 pointed to by the bifl_buf field.  Upon return, the
                 bifl_len field equals the size (in bytes) of a buffer
                 required to fully accommodate the interface list; if the
                 interface list is larger than the buffer pointed to by the
                 bifl_buf field, only the number of elements which can fully
                 fit into the buffer are returned.  If the bifl_version
                 field equals BPF_IF_VERSION1, each element of the interface
                 list is defined by struct bpf_if.

       BIOCGETIF (struct ifreq)
                 Returns the name of the interface that was attached to by
                 the BIOCSETIF or BIOCSETIF2 ioctl.  The name is returned in
                 the if_name field of ifreq.  All other fields are
                 undefined.

       BIOCSETIF (struct ifreq)

       BIOCSETIF2 (dev_t)
                 Sets the interface associated with the file descriptor and
                 performs the actions of BIOCFLUSH.  One of these ioctls
                 must be performed before any packets can be read.  With
                 BIOCSETIF, indicate the device name in the if_name field
                 of ifreq; the device name is a simple file name, not the
                 complete path (e.g. cien0).  With BIOCSETIF2, use the
                 device number to indicate the device; the device number
                 is returned by the stat system call in the st_rdev field
                 of struct stat.

       BIOCGRTIMEOUT (struct timeval)

       BIOCSRTIMEOUT (struct timeval)
                 Gets or sets the read timeout parameter.  The value of
                 timeval specifies the maximum length of time the kernel
                 will wait before delivering any buffered packets to a
                 process that is blocked in a read of a bpf file
                 descriptor.  This parameter is initialized to zero by
                 open(2), indicating no timeout.

       BIOCGSTATS (struct bpf_stat)
                 Returns the following structure of packet statistics:

                 struct bpf_stat {
                      u_int bs_recv;
                      u_int bs_drop;
                 };

                 The fields are:

                 bs_recv        the number of packets received by the
                                descriptor since opened or reset (including
                                any buffered since the last read call); this
                                includes packets which are rejected as well
                                as those which are accepted by the filter
                                program.

                 bs_drop        the number of packets accepted by the filter
                                program but dropped by the kernel because of
                                buffer overflows (i.e., the application's
                                reads aren't keeping up with the packet
                                traffic).
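
                 The two counters can be combined into a rough loss
                 figure.  The sketch below is only illustrative: it
                 uses a stand-in structure (demo_bpf_stat) so that it
                 builds without <net/bpf.h>, and the helper names are
                 hypothetical.  In a real program the values would
                 come from the BIOCGSTATS ioctl.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for struct bpf_stat so the sketch builds without <net/bpf.h>. */
struct demo_bpf_stat {
    unsigned int bs_recv;   /* packets seen, accepted or rejected */
    unsigned int bs_drop;   /* accepted packets lost to buffer overflow */
};

/* Percentage of received packets that were dropped; 0 if none received. */
double demo_drop_percent(const struct demo_bpf_stat *st)
{
    assert(st != NULL);
    if (st->bs_recv == 0)
        return 0.0;
    return 100.0 * (double)st->bs_drop / (double)st->bs_recv;
}

/* Print a one-line summary of the counters. */
void demo_report(const struct demo_bpf_stat *st)
{
    printf("%u received, %u dropped (%.1f%%)\n",
           st->bs_recv, st->bs_drop, demo_drop_percent(st));
}
```

                 Because bs_recv also counts packets that the filter
                 rejected, this ratio understates the loss rate among
                 accepted packets.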

       BIOCIMMEDIATE (u_int)
                 Enables or disables "immediate mode," based on the truth
                 value of the argument.  When immediate mode is enabled,
                 reads return immediately upon packet reception. This is
                 useful for programs that must respond to messages in real
                 time.  Initially, an open instance of bpf0 has immediate
                 mode disabled, which means that reads block until either
                 the kernel buffer becomes full or a timeout occurs and data
                 must be read.

       BIOCGMAXMEM (long)

       BIOCSMAXMEM (long)
                 Gets or sets the maximum number of scratch memory locations
                 available for use by the filter program.  Each location is
                 4 bytes.

                 The BIOCSETF ioctl explains how to set the filter program.

       BIOCGHOSTTBL (bpf_host_table_t)
                 Returns a copy of the host table, which is maintained from
                 packets seen on the associated interface.  The host table
                 can currently be maintained only if the interface is
                 Ethernet; the filter program must also contain an
                 instruction that causes the host table statistics to be
                 kept (see BPF_MISC+BPF_ROUTINES).

                 The hdr.num_table_eles field must be set to the number of
                 table elements in the buffer pointed to by the table_ptr
                 field.  Upon return, the hdr.num_table_eles field is set to
                 the number of host table elements.  (The number actually
                 returned is the smaller of the current number of elements
                 and the number of elements in the buffer.)  Also upon
                 return, the hdr.max_table_eles field is set to the maximum
                 number of elements that can be in the host table.  A user
                 application can get/set this value with the
                 BIOCGHOSTTBLSIZE/BIOCSHOSTTBLSIZE ioctls.

       BIOCGMATRIXTBL (bpf_matrix_table_t)
                 Returns a copy of the matrix table, which is maintained
                 from packets seen on the associated interface.  The matrix
                 table can currently be maintained only if the interface is
                 Ethernet; the filter program must also contain an
                 instruction that causes the matrix table statistics to be
                 kept (see BPF_MISC+BPF_ROUTINES).

                 The hdr.num_table_eles field must be set to the number of
                 table elements in the buffer pointed to by the table_ptr
                 field.  Upon return, the hdr.num_table_eles field is set to
                 the number of matrix table elements.  (The number actually
                 returned is the smaller of the current number of elements
                 and the number of elements in the buffer.)  Also upon
                 return, the hdr.max_table_eles field is set to the maximum
                 number of elements that can be in the matrix table.  A user
                 application can get/set this value with the
                 BIOCGMATRIXTBLSIZE/BIOCSMATRIXTBLSIZE ioctls.

       BIOCGHOSTTBLSIZE (unsigned int)

       BIOCSHOSTTBLSIZE (unsigned int)
                 Gets or sets the maximum number of elements that the kernel
                 will store in this host table before it begins to drop
                 elements.

       BIOCGMATRIXTBLSIZE (unsigned int)

       BIOCSMATRIXTBLSIZE (unsigned int)
                 Gets or sets the maximum number of elements that the kernel
                 will store in this matrix table before it begins to drop
                 elements.

       BIOCSETF (struct bpf_program)
                 Sets the filter program and performs the actions of
                 BIOCFLUSH.  An array of instructions and its length is
                 passed in using the following structure:

                 struct bpf_program {
                      int bf_len;
                      struct bpf_insn *bf_insns;
                 };

       The fields are:

       bf_insns       points to the array of filter instructions.

       bf_len         is the length of the filter program, in units of
                      struct bpf_insn.

       The FILTER MACHINE section explains the filter language.

   BPF Header
       The following structure is prepended to each packet returned by
       read(2):

               struct bpf_hdr {
                    struct timeval bh_tstamp;
                    u_long bh_caplen;
                    u_long bh_datalen;
                    u_short bh_hdrlen;
               };

       The fields, whose values are stored in host order, are:

       bh_tstamp      The time the packet was processed by the packet
                      filter.

       bh_caplen      The length of the captured portion of the packet.
                      This is the minimum of the truncation amount specified
                      by the filter and the length of the packet.

       bh_datalen     The length of the packet off the wire.  This value is
                      independent of the truncation amount specified by the
                      filter.

       bh_hdrlen      The length of the BPF header, which may not be equal
                      to sizeof(struct bpf_hdr).

       The bh_hdrlen field accounts for padding between the bpf_hdr
       structure and the lowest level protocol header.  This provides proper
       alignment of the packet data structures, which is required on
       alignment-sensitive architectures and improves performance on many
       other architectures.

       Additionally, individual packets are padded so that each starts on a
       word boundary.  This requires an application to know how to get from
       packet to packet.  The macro BPF_WORDALIGN, defined in <net/bpf.h>,
       rounds up its argument to the nearest word-aligned value (where a
       word is BPF_ALIGNMENT bytes wide).
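
       The rounding itself is simple arithmetic.  The function below is a
       sketch of what such a macro typically computes, not the DG/UX
       definition: demo_wordalign is a hypothetical name, and the 4-byte
       DEMO_ALIGNMENT is an assumption standing in for BPF_ALIGNMENT.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed word size; the real BPF_ALIGNMENT comes from <net/bpf.h>. */
#define DEMO_ALIGNMENT 4u

/* Round x up to the next multiple of DEMO_ALIGNMENT (a power of two). */
size_t demo_wordalign(size_t x)
{
    assert((DEMO_ALIGNMENT & (DEMO_ALIGNMENT - 1)) == 0);
    return (x + (DEMO_ALIGNMENT - 1)) & ~(size_t)(DEMO_ALIGNMENT - 1);
}
```

       Under these assumptions, an 18-byte header followed by 61 captured
       bytes occupies 79 bytes, so the next record would begin 80 bytes
       after the start of the current one.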

       For example, if p is a struct bpf_hdr * that points to the start of
       a packet, this statement advances it to the next packet:

                 p = (struct bpf_hdr *)((char *)p +
                         BPF_WORDALIGN(p->bh_hdrlen + p->bh_caplen));

       For the alignment mechanisms to work properly, the buffer passed to
       read(2) must itself be word aligned.  malloc(3) always returns an
       aligned buffer.
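
       Putting the header fields and the alignment rule together, a
       program might walk the buffer returned by read(2) roughly as
       follows.  This is a sketch under stated assumptions: demo_bpf_hdr
       and DEMO_WORDALIGN are stand-ins for struct bpf_hdr and
       BPF_WORDALIGN so that it builds without <net/bpf.h>, the alignment
       is assumed to be sizeof(long), and demo_count_packets is a
       hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for BPF_ALIGNMENT and BPF_WORDALIGN (assumed values). */
#define DEMO_ALIGNMENT sizeof(long)
#define DEMO_WORDALIGN(x) (((x) + (DEMO_ALIGNMENT - 1)) & ~(DEMO_ALIGNMENT - 1))

/* Stand-in for struct bpf_hdr; the real layout comes from <net/bpf.h>. */
struct demo_bpf_hdr {
    long           tv_sec, tv_usec;   /* in place of struct timeval bh_tstamp */
    unsigned long  bh_caplen;
    unsigned long  bh_datalen;
    unsigned short bh_hdrlen;
};

/* Count the complete packet records in the nread bytes returned by a read. */
size_t demo_count_packets(const unsigned char *buf, size_t nread)
{
    size_t off = 0, count = 0;

    assert(buf != NULL);
    while (off + sizeof(struct demo_bpf_hdr) <= nread) {
        struct demo_bpf_hdr h;
        memcpy(&h, buf + off, sizeof h);
        size_t rec = h.bh_hdrlen + h.bh_caplen;
        /* Stop on a header that is malformed or runs past the data read. */
        if (h.bh_hdrlen < sizeof h || h.bh_caplen > nread - off ||
            rec > nread - off)
            break;
        /* Packet data is at buf + off + h.bh_hdrlen, h.bh_caplen bytes. */
        count++;
        off += DEMO_WORDALIGN(rec);   /* step to the next record */
    }
    return count;
}
```

       The loop uses bh_hdrlen rather than sizeof(struct demo_bpf_hdr) to
       find the packet data, because the header may be followed by
       padding.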

   Filter Machine
       A filter program is an array of instructions, with all branches
       forwardly directed, terminated by a return instruction.  Each
       instruction performs some action on the pseudo-machine state, which
       consists of an accumulator, index register, scratch memory, and
       implicit program counter.

       The following structure defines the instruction format:

              struct bpf_insn {
                   u_short   code;
                   u_char    jt;
                   u_char    jf;
                   long    k;
              };

       The k field is used in different ways by different instructions, and
       the jt and jf fields are used as offsets by the branch instructions.
       The opcodes are encoded in a semi-hierarchical fashion.  There are
       eight classes of instructions: BPF_LD, BPF_LDX, BPF_ST, BPF_STX,
       BPF_ALU, BPF_JMP, BPF_RET, and BPF_MISC.  Various other mode and
       operator bits are or'd into the class to give the actual
       instructions.  The classes and modes are defined in <net/bpf.h>.

       The semantics for each defined BPF instruction are given below.  A is
       the accumulator, X is the index register, P[] packet data, and M[]
       scratch memory.  P[i:n] gives the data at byte offset i in the
       packet, interpreted as a word (n=4), unsigned halfword (n=2), or
       unsigned byte (n=1).  M[i] gives the i'th word in scratch memory,
       which is addressed only in word units.  Scratch memory is indexed
       from 0 to one less than the number of scratch memory locations (see
       BIOCGMAXMEM).  k, jt, and jf are the corresponding fields in the
       instruction definition.  len refers to the length of the packet.


       BPF_LD    These instructions copy a value into the accumulator.  The
                 type of the source operand is specified by an "addressing
                 mode" and can be a constant (BPF_IMM), packet data at a
                 fixed offset (BPF_ABS), packet data at a variable offset
                 (BPF_IND), the packet length (BPF_LEN), or a word in
                 scratch memory (BPF_MEM).  For BPF_IND and BPF_ABS, the
                 data size must be specified as a word (BPF_W), halfword
                 (BPF_H), or byte (BPF_B).  The semantics of all the
                 recognized BPF_LD instructions follow.


                 BPF_LD+BPF_W+BPF_ABS          A <- P[k:4]

                 BPF_LD+BPF_H+BPF_ABS          A <- P[k:2]

                 BPF_LD+BPF_B+BPF_ABS          A <- P[k:1]

                 BPF_LD+BPF_W+BPF_IND          A <- P[X+k:4]

                 BPF_LD+BPF_H+BPF_IND          A <- P[X+k:2]

                 BPF_LD+BPF_B+BPF_IND          A <- P[X+k:1]

                 BPF_LD+BPF_W+BPF_LEN          A <- len

                 BPF_LD+BPF_IMM                A <- k

                 BPF_LD+BPF_MEM                A <- M[k]


       BPF_LDX   These instructions load a value into the index register.
                 The addressing modes are more restricted than those of the
                 accumulator loads, but they include BPF_MSH, which
                 efficiently loads the IP header length.

                 BPF_LDX+BPF_W+BPF_IMM         X <- k

                 BPF_LDX+BPF_W+BPF_MEM         X <- M[k]

                 BPF_LDX+BPF_W+BPF_LEN         X <- len

                 BPF_LDX+BPF_B+BPF_MSH         X <- 4*(P[k:1]&0xf)
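
                 The BPF_MSH mode is easiest to see in plain C.  The
                 first byte of an IP (version 4) header holds the
                 version in its high four bits and the header length,
                 in 4-byte words, in its low four bits.  The function
                 below is an illustrative sketch; demo_msh is a
                 hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What BPF_LDX+BPF_B+BPF_MSH computes for offset k: X <- 4*(P[k:1]&0xf). */
uint32_t demo_msh(const uint8_t *pkt, size_t k)
{
    assert(pkt != NULL);
    return 4u * (pkt[k] & 0x0fu);
}
```

                 On an Ethernet frame the IP header starts at offset
                 14, so a first byte of 0x45 there gives a header
                 length of 20 bytes.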


       BPF_ST    This instruction stores the accumulator into the scratch
                 memory.  No addressing mode is needed since there is only
                 one possible destination.

                 BPF_ST                        M[k] <- A


       BPF_STX   This instruction stores the index register into the scratch
                 memory.

                 BPF_STX                       M[k] <- X


       BPF_ALU   The alu instructions perform operations between the
                 accumulator and index register or constant, and store the
                 result back in the accumulator.  For binary operations, a
                 source mode is required (BPF_K or BPF_X).

                 BPF_ALU+BPF_ADD+BPF_K         A <- A + k

                 BPF_ALU+BPF_SUB+BPF_K         A <- A - k

                 BPF_ALU+BPF_MUL+BPF_K         A <- A * k

                 BPF_ALU+BPF_DIV+BPF_K         A <- A / k

                 BPF_ALU+BPF_AND+BPF_K         A <- A & k

                 BPF_ALU+BPF_OR+BPF_K          A <- A | k

                 BPF_ALU+BPF_LSH+BPF_K         A <- A << k

                 BPF_ALU+BPF_RSH+BPF_K         A <- A >> k

                 BPF_ALU+BPF_ADD+BPF_X         A <- A + X

                 BPF_ALU+BPF_SUB+BPF_X         A <- A - X

                 BPF_ALU+BPF_MUL+BPF_X         A <- A * X

                 BPF_ALU+BPF_DIV+BPF_X         A <- A / X

                 BPF_ALU+BPF_AND+BPF_X         A <- A & X

                 BPF_ALU+BPF_OR+BPF_X          A <- A | X

                 BPF_ALU+BPF_LSH+BPF_X         A <- A << X

                 BPF_ALU+BPF_RSH+BPF_X         A <- A >> X

                 BPF_ALU+BPF_NEG               A <- -A


       BPF_JMP   The jump instructions alter flow of control.  Conditional
                 jumps compare the accumulator against a constant (BPF_K) or
                 the index register (BPF_X).  If the result is non-zero, the
                 true branch is taken, otherwise the false branch is taken.
                 Jump offsets are encoded in 8 bits, so the longest jump is
                 256 instructions.  However, the jump always (BPF_JA) opcode
                 uses the 32-bit k field as the offset, allowing arbitrarily
                 distant destinations.  All conditionals use unsigned
                 comparison conventions.

                 BPF_JMP+BPF_JA                pc += k

                 BPF_JMP+BPF_JGT+BPF_K         pc += (A > k) ? jt : jf

                 BPF_JMP+BPF_JGE+BPF_K         pc += (A >= k) ? jt : jf

                 BPF_JMP+BPF_JEQ+BPF_K         pc += (A == k) ? jt : jf

                 BPF_JMP+BPF_JSET+BPF_K        pc += (A & k) ? jt : jf

                 BPF_JMP+BPF_JGT+BPF_X         pc += (A > X) ? jt : jf

                 BPF_JMP+BPF_JGE+BPF_X         pc += (A >= X) ? jt : jf

                 BPF_JMP+BPF_JEQ+BPF_X         pc += (A == X) ? jt : jf

                 BPF_JMP+BPF_JSET+BPF_X        pc += (A & X) ? jt : jf
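
                 Two points about the jumps are worth illustrating:
                 the comparison is unsigned, and the offset is added
                 to a program counter that already points at the
                 following instruction.  The helpers below are a
                 sketch with hypothetical names, written in plain C
                 rather than taken from any header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset added to pc by BPF_JMP+BPF_JGT+BPF_K; both operands are unsigned. */
unsigned demo_jgt_k(uint32_t A, uint32_t k, uint8_t jt, uint8_t jf)
{
    return (A > k) ? jt : jf;
}

/* Index of the next instruction, given the index of the jump itself. */
size_t demo_next_pc(size_t jump_index, unsigned offset)
{
    assert(offset <= 0xffu);          /* jt and jf are 8-bit fields */
    return jump_index + 1 + offset;   /* counted from the following insn */
}
```

                 With A equal to 0xffffffff and k equal to 1 the true
                 branch is taken, since the accumulator is not
                 treated as -1.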

       BPF_RET   The return instructions terminate the filter program and
                 specify the amount of packet to accept (i.e., they return
                 the truncation amount).  A return value of zero indicates
                 that the packet should be ignored.  The return value is
                 either a constant (BPF_K) or the accumulator (BPF_A).

                 BPF_RET+BPF_A                 accept A bytes

                 BPF_RET+BPF_K                 accept k bytes
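
                 The returned amount is a ceiling, not an exact
                 length: as noted for bh_caplen, the captured portion
                 is the smaller of the truncation amount and the
                 packet length.  A minimal sketch of that rule, with
                 the hypothetical name demo_caplen:

```c
#include <assert.h>
#include <stdint.h>

/* Bytes captured for a packet of pkt_len bytes when the filter returns ret. */
uint32_t demo_caplen(uint32_t ret, uint32_t pkt_len)
{
    return (ret < pkt_len) ? ret : pkt_len;   /* 0: the packet is ignored */
}
```

                 This is why a filter can return (u_int)-1 to accept
                 a whole packet regardless of its size.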

       BPF_MISC  The miscellaneous category includes instructions that don't
                 fit into the above classes and new instructions that need
                 to be added.  Currently, these are the register transfer
                 instructions, which copy the index register to the
                 accumulator and vice versa, and an instruction that
                 contains an index into a table of routines to perform
                 specific kernel processing on each packet.

                 BPF_MISC+BPF_TAX              X <- A

                 BPF_MISC+BPF_TXA              A <- X

                 BPF_MISC+BPF_ROUTINES         Call the k'th routine.  k can
                                               have two values:
                                               BPF_HOST_TABLE_ROUTINE, which
                                               calls the routine to keep
                                               host table statistics;
                                               BPF_MATRIX_TABLE_ROUTINE,
                                               which calls the routine to
                                               keep matrix table statistics.

       The BPF interface provides the following macros to facilitate array
       initializers:
              BPF_STMT(opcode, operand)
              BPF_JUMP(opcode, operand, true_offset, false_offset)
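
       To make the semantics above concrete, here is a small user-space
       interpreter for a subset of the instructions.  It is a sketch, not
       the kernel implementation.  The D_ constants, struct demo_insn,
       D_STMT, and D_JUMP are stand-ins for the definitions in
       <net/bpf.h>; the numeric encodings are the classic BSD values and
       are an assumption here.  Rejecting a packet when a load runs past
       its end is likewise an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encodings (classic BSD values); use <net/bpf.h> in practice. */
enum {
    D_LD = 0x00, D_LDX = 0x01, D_JMP = 0x05, D_RET = 0x06,   /* classes */
    D_W = 0x00, D_H = 0x08, D_B = 0x10,                       /* sizes */
    D_ABS = 0x20, D_IND = 0x40, D_MSH = 0xa0,                 /* modes */
    D_JEQ = 0x10, D_JSET = 0x40,                              /* jump tests */
    D_K = 0x00, D_A = 0x10                                    /* operand source */
};

struct demo_insn { uint16_t code; uint8_t jt, jf; uint32_t k; };

/* Stand-ins for BPF_STMT and BPF_JUMP. */
#define D_STMT(code, k)         { (uint16_t)(code), 0, 0, (k) }
#define D_JUMP(code, k, jt, jf) { (uint16_t)(code), (jt), (jf), (k) }

/* Network-order load of n bytes at off; returns 0 if it runs past the end. */
static int demo_load(const uint8_t *p, size_t len, size_t off, size_t n,
                     uint32_t *out)
{
    uint32_t v = 0;
    if (off > len || n > len - off)
        return 0;
    for (size_t i = 0; i < n; i++)
        v = (v << 8) | p[off + i];
    *out = v;
    return 1;
}

/* Run a program on packet p; returns the bytes to accept (0 = reject). */
uint32_t demo_run(const struct demo_insn *pc, const uint8_t *p, size_t len)
{
    uint32_t A = 0, X = 0, b = 0;

    assert(pc != NULL && p != NULL);
    for (;; pc++) {   /* pc++ makes jump offsets relative to the next insn */
        switch (pc->code) {
        case D_LD | D_W | D_ABS:
            if (!demo_load(p, len, pc->k, 4, &A)) return 0;
            break;
        case D_LD | D_H | D_ABS:
            if (!demo_load(p, len, pc->k, 2, &A)) return 0;
            break;
        case D_LD | D_B | D_ABS:
            if (!demo_load(p, len, pc->k, 1, &A)) return 0;
            break;
        case D_LD | D_H | D_IND:
            if (!demo_load(p, len, (size_t)X + pc->k, 2, &A)) return 0;
            break;
        case D_LDX | D_B | D_MSH:
            if (!demo_load(p, len, pc->k, 1, &b)) return 0;
            X = 4 * (b & 0xf);
            break;
        case D_JMP | D_JEQ | D_K:
            pc += (A == pc->k) ? pc->jt : pc->jf;
            break;
        case D_JMP | D_JSET | D_K:
            pc += (A & pc->k) ? pc->jt : pc->jf;
            break;
        case D_RET | D_K:
            return pc->k;
        case D_RET | D_A:
            return A;
        default:
            return 0;   /* opcode outside this subset: reject */
        }
    }
}

/* Accept the first 64 bytes of IP frames (Ethernet type 0x0800). */
const struct demo_insn demo_prog[] = {
    D_STMT(D_LD | D_H | D_ABS, 12),
    D_JUMP(D_JMP | D_JEQ | D_K, 0x0800, 0, 1),
    D_STMT(D_RET | D_K, 64),
    D_STMT(D_RET | D_K, 0),
};
```

       Calling demo_run(demo_prog, frame, length) on an Ethernet frame
       returns 64 when the type field is 0x0800 and 0 otherwise.  Because
       all branches are forward and every path ends in a return, the loop
       terminates for any well-formed program.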


EXAMPLES
       This filter accepts only Reverse ARP requests.

              struct bpf_insn insns[] = {
                   BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_REVARP, 0, 3),
                   BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, REVARP_REQUEST, 0, 1),
                   BPF_STMT(BPF_RET+BPF_K, sizeof(struct ether_arp) +
                         sizeof(struct ether_header)),
                   BPF_STMT(BPF_RET+BPF_K, 0),
              };

       This filter accepts only IP packets between hosts 128.3.112.15 and
       128.3.112.35.

              struct bpf_insn insns[] = {
                   BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 8),
                   BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 26),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 2),
                   BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 3, 4),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 0, 3),
                   BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 1),
                   BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
                   BPF_STMT(BPF_RET+BPF_K, 0),
              };
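
       The hexadecimal constants in this filter are the two host
       addresses written as 32-bit values, and the offsets 26 and 30 are
       the IP source and destination address fields (12 and 16 bytes into
       the IP header) behind a 14-byte Ethernet header.  The helper below
       is a sketch with a hypothetical name that shows how such a
       constant can be derived from a dotted-quad address.

```c
#include <assert.h>
#include <stdint.h>

/* Pack dotted-quad octets a.b.c.d into the 32-bit value a filter compares. */
uint32_t demo_ipv4_k(unsigned a, unsigned b, unsigned c, unsigned d)
{
    assert(a <= 255 && b <= 255 && c <= 255 && d <= 255);
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}
```

       Here demo_ipv4_k(128, 3, 112, 15) gives 0x8003700f and
       demo_ipv4_k(128, 3, 112, 35) gives 0x80037023.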

       This filter accepts only TCP finger packets.  We must parse the IP
       header to reach the TCP header.  The BPF_JSET instruction checks that
       the IP fragment offset is 0 so we are sure that we have a TCP header.

              struct bpf_insn insns[] = {
                   BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 10),
                   BPF_STMT(BPF_LD+BPF_B+BPF_ABS, 23),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, IPPROTO_TCP, 0, 8),
                   BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
                   BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0),
                   BPF_STMT(BPF_LDX+BPF_B+BPF_MSH, 14),
                   BPF_STMT(BPF_LD+BPF_H+BPF_IND, 14),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 2, 0),
                   BPF_STMT(BPF_LD+BPF_H+BPF_IND, 16),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 0, 1),
                   BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
                   BPF_STMT(BPF_RET+BPF_K, 0),
              };
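
       For comparison, the same test can be written as ordinary C over
       the raw frame.  This is a sketch for reading the filter, not a
       replacement for it: demo_is_tcp_finger is a hypothetical name, the
       numeric values 0x0800 and 6 stand for ETHERTYPE_IP and
       IPPROTO_TCP, and the explicit length checks stand in for the
       filter machine's handling of loads beyond the packet, which is
       assumed here to reject.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 16-bit network-order value at p. */
static uint16_t demo_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns 1 if the Ethernet frame is an unfragmented (or first-fragment)
 * TCP segment whose source or destination port is 79 (finger). */
int demo_is_tcp_finger(const uint8_t *p, size_t len)
{
    size_t x;

    assert(p != NULL);
    if (len < 14 + 20)                    /* Ethernet + minimal IP header */
        return 0;
    if (demo_be16(p + 12) != 0x0800)      /* Ethernet type is not IP */
        return 0;
    if (p[23] != 6)                       /* IP protocol is not TCP */
        return 0;
    if (demo_be16(p + 20) & 0x1fff)       /* nonzero fragment offset */
        return 0;
    x = 4u * (p[14] & 0x0fu);             /* IP header length, as BPF_MSH */
    if (len < 14 + x + 4)                 /* both TCP ports must be present */
        return 0;
    return demo_be16(p + 14 + x) == 79 || demo_be16(p + 14 + x + 2) == 79;
}
```

       The offsets match the filter: 12 is the Ethernet type, 23 the IP
       protocol byte, 20 the flags and fragment offset field, and X+14
       and X+16 the TCP source and destination ports.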

FILES
       /dev/bpf0

SEE ALSO
       tcpdump(1).


Licensed material--property of copyright holder(s)
