audio_hdr(3)

NAME

audio_hdr - audio header structure

SYNOPSIS

#include <multimedia/libaudio.h>

OVERVIEW

Digital audio data represents a quantized approximation of an analog audio signal, which itself is directly proportional to the rapid changes of air pressure that we perceive as sound. Pulse Code Modulation (PCM) refers to the sampling, at discrete intervals, of an audio waveform to produce a (usually linear) quantization. In order to achieve the best approximation of a signal, the highest possible sampling frequency and precision should be used. However, it is often impractical to process and store the huge amounts of data that such a practice yields.

In lowering the overall data rate, a number of compromises must be made. To begin with, the sampling frequency is generally reduced to a level commensurate with the quality of the audio signal. For speech and telephony, 8, 12, and 16 kHz sampling rates are typically used. For high-fidelity audio, 44.1 and 48 kHz sampling rates are predominant.

Sampling precision may also be reduced to yield lower data rates. Various encoding schemes further compress the signal. Some of these, such as μ-law and A-law companding, map the linear PCM domain into a pseudo-logarithmic domain. Like PCM encoding, these schemes maintain a one-to-one relationship between individual samples and waveform quantizations. Other encoding schemes, such as Differential and Adaptive Differential PCM, convolve several samples into a single sound unit that must be treated as an atomic entity.

Multiple channels of audio data may be combined into a single audio stream by interleaving individual sound units from each channel. For each quantized period of time, the set consisting of one sound unit from each audio channel is known as a sample frame.
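The interleaving described above can be sketched as follows. This is only an illustration: example_interleave() is a hypothetical helper, not part of libaudio, and it assumes every channel buffer holds the same number of sound units of identical size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a libaudio routine): build an interleaved
 * stream from `channels` separate mono buffers.  Each output sample
 * frame holds one sound unit from every channel, in channel order.
 */
void example_interleave(unsigned char *out,
                        const unsigned char *const *in,
                        unsigned channels,
                        unsigned bytes_per_unit,
                        size_t nframes)
{
    size_t f;
    unsigned c;

    assert(bytes_per_unit > 0);

    for (f = 0; f < nframes; f++) {
        for (c = 0; c < channels; c++) {
            /* Copy unit f of channel c into the current frame. */
            memcpy(out, in[c] + f * bytes_per_unit, bytes_per_unit);
            out += bytes_per_unit;
        }
    }
}
```

With two one-byte channels holding {1, 2, 3} and {9, 8, 7}, the output stream is 1 9 2 8 3 7: three sample frames of two bytes each.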

AUDIO HEADER

For programming purposes, the encoding of audio data is described by an in-core Audio_hdr structure. This audio header is defined in <multimedia/audio_hdr.h> as follows:

typedef struct {
        unsigned  sample_rate;       /* samples per second */
        unsigned  samples_per_unit;  /* samples per unit */
        unsigned  bytes_per_unit;    /* bytes per sample unit */
        unsigned  channels;          /* # of interleaved channels */
        unsigned  encoding;          /* data encoding format */
        unsigned  data_size;         /* length of data (advisory) */
} Audio_hdr;

The sample_rate field is an integer representing the sampling frequency, in samples per second, of the audio data.

The samples_per_unit and bytes_per_unit fields describe an individual sound unit. For PCM and related encodings, the samples_per_unit field is 1, and bytes_per_unit gives the precision. For encodings that group several samples together, bytes_per_unit reflects the size of an atomic unit of audio data.

The channels field describes the number of interleaved audio channels. The quantity

     channels * bytes_per_unit

gives the size, in bytes, of each sample frame.
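As a minimal sketch, the frame size can be computed directly from the header. The structure is repeated locally only so that the fragment compiles on its own; example_frame_size() is a hypothetical name, not a libaudio function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Local copy of the structure documented above so this sketch is
 * self-contained; real code would include <multimedia/libaudio.h>.
 */
typedef struct {
    unsigned sample_rate;       /* samples per second */
    unsigned samples_per_unit;  /* samples per unit */
    unsigned bytes_per_unit;    /* bytes per sample unit */
    unsigned channels;          /* # of interleaved channels */
    unsigned encoding;          /* data encoding format */
    unsigned data_size;         /* length of data (advisory) */
} Audio_hdr;

/* Hypothetical helper: size, in bytes, of one sample frame. */
unsigned example_frame_size(const Audio_hdr *hp)
{
    assert(hp != NULL);
    return hp->channels * hp->bytes_per_unit;
}
```

For example, a mono stream with one byte per unit has a 1-byte sample frame, while a stereo stream with two bytes per unit has a 4-byte sample frame.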

The sample_rate, samples_per_unit, bytes_per_unit, and channels fields together determine the byte offset that corresponds to a given time, and vice versa. Specifically:

     seconds = offset / C
     offset  = seconds * C

where:

     C = (channels * bytes_per_unit * sample_rate) / samples_per_unit
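The conversion can be sketched in C as shown below. The helper names are hypothetical and the structure is a local copy made so the fragment stands alone. Rounding the offset down to a sample-frame boundary is a choice made for this sketch, not something this page requires.

```c
#include <assert.h>

/* Local copy of the documented structure, for a self-contained sketch. */
typedef struct {
    unsigned sample_rate;
    unsigned samples_per_unit;
    unsigned bytes_per_unit;
    unsigned channels;
    unsigned encoding;
    unsigned data_size;
} Audio_hdr;

/* Hypothetical helper: C, the number of bytes per second of audio. */
double example_bytes_per_second(const Audio_hdr *hp)
{
    assert(hp->samples_per_unit != 0);
    return ((double)hp->channels * hp->bytes_per_unit * hp->sample_rate)
        / hp->samples_per_unit;
}

/* Hypothetical helper: seconds = offset / C */
double example_offset_to_seconds(const Audio_hdr *hp, unsigned offset)
{
    return offset / example_bytes_per_second(hp);
}

/*
 * Hypothetical helper: offset = seconds * C, rounded down so that the
 * result always falls on a sample-frame boundary.  No overflow check
 * is made for very long durations.
 */
unsigned example_seconds_to_offset(const Audio_hdr *hp, double seconds)
{
    unsigned frame = hp->channels * hp->bytes_per_unit;
    unsigned offset;

    assert(frame != 0);
    offset = (unsigned)(seconds * example_bytes_per_second(hp));
    return offset - (offset % frame);
}
```

For a mono stream sampled at 8000 Hz with one byte per unit, C is 8000, so two seconds of audio correspond to a byte offset of 16000.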

The encoding field enumerates the specific data encoding. The following values are defined:

AUDIO_ENCODING_ULAW     ISDN μ-law companding
AUDIO_ENCODING_ALAW     ISDN A-law companding
AUDIO_ENCODING_LINEAR   linear PCM 2's-complement integer (0-center)
AUDIO_ENCODING_FLOAT    linear IEEE floating-point (-1.0 ≤ x ≤ +1.0)

Note that floating-point data is normalized to ±1. Many signal-processing algorithms are simplified by such normalization since the result of the multiplication of two signals is guaranteed to lie within the acceptable range. Thus, bounds-checking need only be performed when converting to or from integer representations.
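The bounds check at the integer boundary might look like the following sketch. Both function names are hypothetical, and the 16-bit width and the scale factor of 32767 are assumptions made for illustration; this page does not specify how libaudio scales its conversions.

```c
#include <assert.h>

/*
 * Hypothetical helper: convert a normalized floating-point sample to
 * 16-bit linear PCM.  Out-of-range input is clipped to +/-1.0 before
 * scaling.  The scale factor 32767 is an assumption of this sketch.
 */
short example_float_to_linear16(double x)
{
    assert(x == x);             /* reject NaN, which has no PCM value */
    if (x > 1.0)
        x = 1.0;
    if (x < -1.0)
        x = -1.0;
    return (short)(x * 32767.0);
}

/*
 * Hypothetical helper: convert 16-bit linear PCM to a normalized
 * floating-point sample.  The most negative integer, -32768, would
 * scale to slightly below -1.0, so the result is clipped as well.
 */
double example_linear16_to_float(short s)
{
    double x = s / 32767.0;

    if (x < -1.0)
        x = -1.0;
    return x;
}
```

Once samples are in the normalized domain, intermediate products need no further checks; clipping is applied only in these two conversion routines.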

The data_size field contains the number of bytes in the data stream. This value is advisory only, as this quantity is not always known in advance. In such cases, the special value AUDIO_UNKNOWN_SIZE should be used.
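Code that relies on data_size should therefore test for the unknown case first. The sketch below assumes a placeholder definition of AUDIO_UNKNOWN_SIZE for builds without the library header (the real value comes from the header), and example_duration() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Local copy of the documented structure, for a self-contained sketch. */
typedef struct {
    unsigned sample_rate;
    unsigned samples_per_unit;
    unsigned bytes_per_unit;
    unsigned channels;
    unsigned encoding;
    unsigned data_size;
} Audio_hdr;

#ifndef AUDIO_UNKNOWN_SIZE
/* Placeholder assumed for this sketch; use the header's definition. */
#define AUDIO_UNKNOWN_SIZE (~0u)
#endif

/*
 * Hypothetical helper: store the stream duration in *seconds and
 * return 1, or return 0 if the length is unknown or the header does
 * not describe a usable data rate.
 */
int example_duration(const Audio_hdr *hp, double *seconds)
{
    double bytes_per_second;

    assert(seconds != NULL);

    if (hp->data_size == AUDIO_UNKNOWN_SIZE || hp->samples_per_unit == 0)
        return 0;

    bytes_per_second =
        ((double)hp->channels * hp->bytes_per_unit * hp->sample_rate)
        / hp->samples_per_unit;
    if (bytes_per_second <= 0.0)
        return 0;

    *seconds = hp->data_size / bytes_per_second;
    return 1;
}
```

Because data_size is advisory, a caller reading from a pipe or device should still be prepared for the stream to end earlier or later than the computed duration suggests.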

WARNINGS

The manual pages, header files, and object library associated with /usr/demo/SOUND/libaudio.a are furnished on an "as is" basis as a preliminary interface to several useful audio data-processing capabilities. A future release of the operating system may redefine the syntax or semantics of some of the functions described herein.

SunOS 5.2 — Last change: 30 Sep 1991
