[Chapter 14] Programming /dev/dsp

/dev/dsp is the digital sampling and digital recording device, and probably the most important for multimedia applications. Writing to the device accesses the D/A converter to produce sound. Reading the device activates the A/D converter for sound recording and analysis.

The name DSP comes from the term digital signal processor, a specialized processor chip optimized for digital signal analysis. Sound cards may use a dedicated DSP chip, or may implement the functions with a number of discrete devices. Other terms that may be used for this device are digitized voice and PCM.

Some sound cards provide more than one digital sampling device; in this case a second device is available as /dev/dsp1. Unless noted otherwise, this device operates in the same manner as /dev/dsp.

The DSP device is really two devices in one. Opening for read-only access allows you to use the A/D converter for sound input. Opening for write only will access the D/A converter for sound output. Generally speaking you should open the device either for read only or for write only. It is possible to perform both read and write on the device, albeit with some restrictions; this will be covered in a later section.

Only one process can have the DSP device open at a time. Attempts by another process to open it will fail with an error code of EBUSY.
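As a minimal sketch of the two points above (open in one direction only, and expect EBUSY when another process holds the device), the helper below wraps the open call. The name open_dsp and its recording flag are assumptions made for illustration; they are not part of the sound driver interface.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/*
 * Open a DSP device in one direction only.
 * recording != 0 selects O_RDONLY (A/D converter, sound input);
 * recording == 0 selects O_WRONLY (D/A converter, sound output).
 * Returns the file descriptor, or -1 with errno preserved.
 * open_dsp is a hypothetical helper name used only in this sketch.
 */
int open_dsp(const char *path, int recording)
{
  int fd = open(path, recording ? O_RDONLY : O_WRONLY);

  if (fd == -1) {
    int saved = errno;  /* fprintf may overwrite errno */

    if (saved == EBUSY)
      fprintf(stderr, "%s: in use by another process\n", path);
    else
      fprintf(stderr, "%s: %s\n", path, strerror(saved));
    errno = saved;
  }
  return fd;
}
```

A caller would typically use open_dsp("/dev/dsp", 1) for recording and open_dsp("/dev/dsp", 0) for playback, and could retry later if errno is EBUSY.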

Reading from the DSP device returns digital sound samples obtained from the A/D converter. Figure 14-2(a) shows a conceptual diagram of this process. Analog data is converted to digital samples by the analog to digital converter under control of the kernel sound driver and stored in a buffer internal to the kernel. When an application program invokes the read system call, the data is transferred to the calling program's data buffer. It is important to understand that the sampling rate is dependent on the kernel driver, and not the speed at which the application program reads it.

Figure 14-2: Accessing /dev/dsp

When reading from /dev/dsp you will never encounter an end-of-file condition. If data is read too slowly (less than the sampling rate), the excess data will be discarded, resulting in gaps in the digitized sound. If you read the device too quickly, the kernel sound driver will block your process until the required amount of data is available.

The input source depends on the mixer setting (which I will look at shortly); the default is the microphone input. The format of the digitized data depends on which ioctl calls have been used to set up the device. Each time the device is opened, its parameters are set to default values. The default is 8-bit unsigned samples, using one channel (mono), and an 8 kHz sampling rate.
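The sample format determines how many bytes the device produces per second, which in turn determines how large a buffer you need. The following sketch works through that arithmetic; the function names are hypothetical and chosen only for this illustration.

```c
#include <stddef.h>

/* Bytes produced (or consumed) per second for a given PCM format. */
size_t pcm_bytes_per_second(int rate, int bits, int channels)
{
  return (size_t)rate * (size_t)channels * (size_t)(bits / 8);
}

/* Buffer size in bytes needed to hold the given number of seconds. */
size_t pcm_buffer_bytes(int seconds, int rate, int bits, int channels)
{
  return (size_t)seconds * pcm_bytes_per_second(rate, bits, channels);
}
```

With the default settings (8 kHz, 8 bits, mono) the device delivers 8000 bytes per second, so three seconds of sound needs a 24000-byte buffer.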

Writing a sequence of digital sample values to the DSP device produces sound output. This process is illustrated in Figure 14-2(b). Again, the format can be defined using ioctl calls, but defaults to the values given above for the read system call (8-bit unsigned data, mono, 8 kHz sampling).

If the data are written too slowly, there will be dropouts or pauses in the sound output. Writing the data faster than the sampling rate will simply cause the kernel sound driver to block the calling process until the sound card hardware is ready to process the new data. Unlike some devices, there is no support for non-blocking I/O.
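To make the output format concrete, here is a small sketch that fills a buffer with a square wave in the default format (8-bit unsigned, mono). For unsigned 8-bit data the resting level is 0x80, so the samples swing above and below that value. The function name and its parameters are assumptions for this example only.

```c
#include <stddef.h>

/*
 * Fill buf with n samples of an 8-bit unsigned mono square wave.
 * Samples alternate between 0x80 + amplitude and 0x80 - amplitude,
 * where amplitude should be in the range 0..127.
 * Does nothing if freq is not positive or exceeds half the rate.
 */
void fill_square_wave(unsigned char *buf, size_t n,
                      int rate, int freq, int amplitude)
{
  size_t period;  /* samples per cycle */
  size_t i;

  if (freq <= 0 || rate < 2 * freq)
    return;
  period = (size_t)(rate / freq);

  for (i = 0; i < n; i++) {
    int high = (i % period) < period / 2;  /* first half of cycle */
    buf[i] = (unsigned char)(0x80 + (high ? amplitude : -amplitude));
  }
}
```

Writing the filled buffer to a device opened for write only, for example with write(fd, buf, n), should produce a tone at roughly the requested frequency when the default 8 kHz rate is in effect.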

If you don't like the defaults, you can change them through ioctl calls. In general you should set the parameters after opening the device, and before any calls to read or write. You should also set the parameters in the order in which they are described below.

All DSP ioctl calls take a third argument that is a pointer to an integer. Don't try to pass a constant; you must use a variable. The call will return -1 if an error occurs, and set the global variable errno.

If the hardware doesn't support the exact value you call for, the sound driver will try to set the parameter to the closest allowable value. For example, with my sound card, selecting a sampling rate of 9000 Hz will result in an actual rate of 9009 Hz being used.

If a parameter is out of range, the driver will set it to the closest value (i.e., the upper or lower limit). For example, attempting to use 16-bit sampling with an 8-bit sound card will result in the driver selecting 8 bits, but no error will be returned. It is up to you, the programmer, to verify that the value returned is acceptable to your application.
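One way to follow this advice is to wrap each ioctl in a helper that hands back the value the driver actually chose, and to compare it with what you asked for. The sketch below is an illustration under that assumption; set_pcm_param and rate_close_enough are hypothetical names, and the tolerance you accept is up to your application.

```c
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/soundcard.h>

/*
 * Issue one of the SOUND_PCM_WRITE_XXX ioctls.
 * On success, stores the value selected by the driver in *actual
 * and returns 0. On failure, returns -1 and leaves *actual alone.
 */
int set_pcm_param(int fd, unsigned long request, int wanted, int *actual)
{
  int arg = wanted;  /* must be a variable: the driver writes to it */

  if (ioctl(fd, request, &arg) == -1)
    return -1;
  *actual = arg;
  return 0;
}

/* True if actual is within the given percentage of wanted. */
int rate_close_enough(int wanted, int actual, int percent)
{
  return labs((long)actual - wanted) * 100 <= (long)wanted * percent;
}
```

For sample size and channel count you would normally insist on an exact match; for the sampling rate, a check such as rate_close_enough(9000, actual, 1) accepts the 9009 Hz rounding described above.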

All of the ioctl calls for the DSP device have names starting with SOUND_PCM. Calls in the form SOUND_PCM_READ_XXX are used to return just the current value of a parameter. To change the values, the ioctl calls are named like SOUND_PCM_WRITE_XXX. As discussed above, these calls also return the selected value, which is not necessarily the same as the value passed to the sound driver.

The ioctl constants are defined in the header file linux/soundcard.h. Let's examine each of them in detail.

SOUND_PCM_WRITE_BITS
SOUND_PCM_READ_BITS
SOUND_PCM_WRITE_CHANNELS
SOUND_PCM_READ_CHANNELS
SOUND_PCM_WRITE_RATE
SOUND_PCM_READ_RATE

Sample Program

I will now illustrate programming of the DSP device with a short example. I call the program in Example 14-2 parrot. It records a few seconds of audio, saving it to an array in memory, then plays it back.

Example 14-2: Reading and Writing the /dev/dsp Device

/*
 * parrot.c
 * Program to illustrate /dev/dsp device
 * Records several seconds of sound, then echoes it back.
 * Runs until Control-C is pressed.
 */

#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <stdlib.h>
#include <stdio.h>
#include <linux/soundcard.h>

#define LENGTH 3    /* how many seconds of speech to store */
#define RATE 8000   /* the sampling rate */
#define SIZE 8      /* sample size: 8 or 16 bits */
#define CHANNELS 1  /* 1 = mono 2 = stereo */

/* this buffer holds the digitized audio */
unsigned char buf[LENGTH*RATE*SIZE*CHANNELS/8];

int main()
{
  int fd;       /* sound device file descriptor */
  int arg;      /* argument for ioctl calls */
  int status;   /* return status of system calls */

  /* open sound device */
  fd = open("/dev/dsp", O_RDWR);
  if (fd < 0) {
    perror("open of /dev/dsp failed");
    exit(1);
  }

  /* set sampling parameters */
  arg = SIZE;      /* sample size */
  status = ioctl(fd, SOUND_PCM_WRITE_BITS, &arg);
  if (status == -1)
    perror("SOUND_PCM_WRITE_BITS ioctl failed");
  if (arg != SIZE)  /* not a system call error, so errno is not set */
    fprintf(stderr, "unable to set sample size\n");

  arg = CHANNELS;  /* mono or stereo */
  status = ioctl(fd, SOUND_PCM_WRITE_CHANNELS, &arg);
  if (status == -1)
    perror("SOUND_PCM_WRITE_CHANNELS ioctl failed");
  if (arg != CHANNELS)  /* not a system call error, so errno is not set */
    fprintf(stderr, "unable to set number of channels\n");

  arg = RATE;      /* sampling rate */
  status = ioctl(fd, SOUND_PCM_WRITE_RATE, &arg);
  if (status == -1)
    perror("SOUND_PCM_WRITE_RATE ioctl failed");

  while (1) { /* loop until Control-C */
    printf("Say something:\n");
    status = read(fd, buf, sizeof(buf)); /* record some sound */
    if (status != sizeof(buf))
      perror("read wrong number of bytes");
    printf("You said:\n");
    status = write(fd, buf, sizeof(buf)); /* play it back */
    if (status != sizeof(buf))
      perror("wrote wrong number of bytes");
    /* wait for playback to complete before recording again */
    status = ioctl(fd, SOUND_PCM_SYNC, 0);
    if (status == -1)
      perror("SOUND_PCM_SYNC ioctl failed");
  }
}

The source file starts by including a number of standard header files, including linux/soundcard.h. Then some constants are defined for the sound card settings used in the program, which makes it easy to change the values used. A static buffer is defined to hold the sound data.

I first open the DSP device for both read and write and check that the open was successful. Next I set the sampling parameters using ioctl calls. Notice that a variable must be used because the driver expects a pointer. In each case I check for an error from the ioctl call (a return value of -1), and that the values actually used are within range. This programming may appear to be overly cautious, but I consider it good coding practice that pays off when trying to debug the code. Note that I do not check that the actual sampling rate returned matches the selected rate because of the sampling rate rounding previously described.

I then run in a loop, first prompting the user to speak, then reading the sound data into the buffer. Once the data is received, I notify the user, then write the same data back to the DSP device, where it should be heard. This repeats until the program is interrupted with Control-C.

The SOUND_PCM_SYNC ioctl has not yet been mentioned. I'll show what this is used for in the section titled "Advanced Sound Programming," later in this chapter.

Try compiling and running this program. Then make some enhancements:

Make the parameters selectable using command-line options (sample rate, size, time). See the effect on sound quality with different sampling rates.
Reverse the sound samples (and listen for hidden messages), or play them back at a different sampling rate from the one at which they were recorded.
Automatically start recording when the voice starts and stop when silence occurs (or a maximum time is reached). Hints: for 8-bit unsigned data the zero value is 0x80, but you will likely see values that vary around this level due to noise. Set a noise threshold (or better yet, measure the background noise level at the start of the program).
Bonus question: modify the program so that it can recognize the words that are spoken.
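As a starting point for the silence-detection exercise, the sketch below measures how far a block of 8-bit unsigned samples strays from the 0x80 zero level and compares that with a noise threshold. The function names and the choice of peak deviation as the measure are assumptions; an average level would work as well.

```c
#include <stddef.h>

/* Largest deviation from the 0x80 zero level in 8-bit unsigned data. */
int peak_level(const unsigned char *buf, size_t n)
{
  int peak = 0;
  size_t i;

  for (i = 0; i < n; i++) {
    int dev = (int)buf[i] - 0x80;

    if (dev < 0)
      dev = -dev;
    if (dev > peak)
      peak = dev;
  }
  return peak;
}

/* True if no sample deviates from 0x80 by more than threshold. */
int is_silent(const unsigned char *buf, size_t n, int threshold)
{
  return peak_level(buf, n) <= threshold;
}
```

Calling peak_level on a short recording made at program start gives a measured background noise level that can serve as the threshold.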

Advanced Sound Programming

This section describes some miscellaneous sound programming issues that require special consideration or are less commonly used.

We saw earlier that /dev/dsp operates using unsigned data, either 8 or 16 bits in size, while /dev/audio uses mu-law encoded data. It is possible to change the data formats a device uses with the SOUND_PCM_SETFMT ioctl call. A number of data formats are defined in the soundcard.h header file, all prefixed with the string AFMT_. For example, to set the coding format to mu-law, you could use:

fmt = AFMT_MU_LAW;
ioctl(fd, SOUND_PCM_SETFMT, &fmt);

The argument will be returned with the coding format that was selected by the kernel (which will be the same as the one requested unless the device does not support it). The special format AFMT_QUERY will return the default format for the device. To find out all of the formats that a given device supports, you can use the SOUND_PCM_GETFMTS ioctl. It returns a bitmask that has bits set for each of the supported formats.

The SNDCTL_DSP_GETBLKSIZE ioctl returns the block size that the sound driver uses for data transfers. The returned value is an integer giving the block size in bytes. This information can be useful in an application program for selecting a buffer size that ensures that the data passed to the driver is transferred in complete blocks.
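Once you know the block size, choosing a buffer that holds a whole number of blocks is a matter of rounding up. A minimal sketch, with a hypothetical function name:

```c
#include <stddef.h>

/*
 * Round wanted up to the next multiple of blocksize, where blocksize
 * is the value reported by SNDCTL_DSP_GETBLKSIZE.
 * Returns wanted unchanged if blocksize is zero.
 */
size_t round_up_to_blocks(size_t wanted, size_t blocksize)
{
  if (blocksize == 0)
    return wanted;
  return ((wanted + blocksize - 1) / blocksize) * blocksize;
}
```

For example, a 24000-byte request with a 4096-byte block size becomes 24576 bytes, or six complete blocks.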

The SNDCTL_DSP_GETCAPS ioctl returns a bitmask identifying various capabilities of a sound card DSP device. They are listed in soundcard.h with labels prefixed by DSP_CAP. A typical capability is DSP_CAP_DUPLEX, a boolean flag indicating whether the device supports full duplex mode (simultaneous record and playback).

Example 14-6 illustrates these system calls, displaying information about a DSP device (/dev/dsp by default).

Example 14-6: Determining DSP Capabilities

/*
 * dsp_info.c
 * Example program to display sound device capabilities
 */

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <linux/soundcard.h>

/* utility function for displaying boolean status */
static char *yes_no(int condition)
{
  if (condition) return "yes"; else return "no";
}

/*
 * Set sound device parameters to given values. Return -1 if
 * values not valid. Sampling rate is returned.
 */
static int set_dsp_params(int fd, int channels, int bits, int *rate) {
  int status, val = channels;

  status = ioctl(fd, SOUND_PCM_WRITE_CHANNELS, &val);
  if (status == -1)
    perror("SOUND_PCM_WRITE_CHANNELS ioctl failed");
  if (val != channels) /* not valid, so return */
    return -1;
  val = bits;
  status = ioctl(fd, SOUND_PCM_WRITE_BITS, &val);
  if (status == -1)
    perror("SOUND_PCM_WRITE_BITS ioctl failed");
  if (val != bits)
    return -1;
  status = ioctl(fd, SOUND_PCM_WRITE_RATE, rate);
  if (status == -1)
    perror("SOUND_PCM_WRITE_RATE ioctl failed");
  return 0;
}

int main(int argc, char *argv[])
{
  int rate;
  int channels;            /* number of channels */
  int bits;                /* sample size */
  int blocksize;           /* block size */
  int formats;             /* data formats */
  int caps;                /* capabilities */
  int deffmt;              /* default format */
  int min_rate, max_rate;  /* min and max sampling rates */
  char *device;            /* name of device to report on */
  int fd;                  /* file descriptor for device */
  int status;              /* return value from ioctl */

  /* get device name from command line or use default */  
  if (argc == 2)
    device = argv[1];
  else
    device = "/dev/dsp";

  /* try to open device */
  fd = open(device, O_RDWR);
  if (fd == -1) {
    fprintf(stderr, "%s: unable to open `%s', ", argv[0], device);
    perror("");
    return 1;
  }
  
  status = ioctl(fd, SOUND_PCM_READ_RATE, &rate);
  if (status ==  -1)
    perror("SOUND_PCM_READ_RATE ioctl failed");
  status = ioctl(fd, SOUND_PCM_READ_CHANNELS, &channels);
  if (status ==  -1)
    perror("SOUND_PCM_READ_CHANNELS ioctl failed");
  status = ioctl(fd, SOUND_PCM_READ_BITS, &bits);
  if (status ==  -1)
    perror("SOUND_PCM_READ_BITS ioctl failed");
  status = ioctl(fd, SNDCTL_DSP_GETBLKSIZE, &blocksize);
  if (status ==  -1)
    perror("SNDCTL_DSP_GETBLKSIZE ioctl failed");
  
  printf(
         "Information on %s:\n\n"
         "Defaults:\n"
         "  sampling rate: %d Hz\n"
         "  channels: %d\n"
         "  sample size: %d bits\n"
         "  block size: %d bytes\n",
         device, rate, channels, bits, blocksize
         );

/* this requires a more recent version of the sound driver */
#if SOUND_VERSION >= 301
  printf("\nSupported Formats:\n");
  deffmt = AFMT_QUERY;
  status = ioctl(fd, SOUND_PCM_SETFMT, &deffmt);
  if (status ==  -1)
    perror("SOUND_PCM_SETFMT ioctl failed");
  status = ioctl(fd, SOUND_PCM_GETFMTS, &formats);
  if (status ==  -1)
    perror("SOUND_PCM_GETFMTS ioctl failed");
  if (formats & AFMT_MU_LAW) {
    printf("  mu-law");
    (deffmt == AFMT_MU_LAW) ? printf(" (default)\n") : printf("\n");
  }
  if (formats & AFMT_A_LAW) {
    printf("  A-law");
    (deffmt == AFMT_A_LAW) ? printf(" (default)\n") : printf("\n");
  }
  if (formats & AFMT_IMA_ADPCM) {
    printf("  IMA ADPCM");
    (deffmt == AFMT_IMA_ADPCM) ? printf(" (default)\n") : printf("\n");
  }
  if (formats & AFMT_U8) {
    printf("  unsigned 8-bit");
    (deffmt == AFMT_U8) ? printf(" (default)\n") : printf("\n");
  }
  if (formats & AFMT_S16_LE) {
    printf("  signed 16-bit little-endian");
    (deffmt == AFMT_S16_LE) ? printf(" (default)\n") : printf("\n");
  }
  if (formats & AFMT_S16_BE) {
    printf("  signed 16-bit big-endian");
    (deffmt == AFMT_S16_BE) ? printf(" (default)\n") : printf("\n");
  }
  if (formats & AFMT_S8) {
    printf("  signed 8-bit");
    (deffmt == AFMT_S8) ? printf(" (default)\n") : printf("\n");
  }
  if (formats & AFMT_U16_LE) {
    printf("  unsigned 16-bit little-endian");
    (deffmt == AFMT_U16_LE) ? printf(" (default)\n") : printf("\n");
  }
  if (formats & AFMT_U16_BE) {
    printf("  unsigned 16-bit big-endian");
    (deffmt == AFMT_U16_BE) ? printf(" (default)\n") : printf("\n");
  }
  if (formats & AFMT_MPEG) {
    printf("  MPEG 2");
    (deffmt == AFMT_MPEG) ? printf(" (default)\n") : printf("\n");
  }
  
  printf("\nCapabilities:\n");
  status = ioctl(fd, SNDCTL_DSP_GETCAPS, &caps);
  if (status ==  -1)
    perror("SNDCTL_DSP_GETCAPS ioctl failed");
  printf(
         "  revision: %d\n"
         "  full duplex: %s\n"
         "  real-time: %s\n"
         "  batch: %s\n"
         "  coprocessor: %s\n" 
         "  trigger: %s\n"
         "  mmap: %s\n",
         caps & DSP_CAP_REVISION,
         yes_no(caps & DSP_CAP_DUPLEX),
         yes_no(caps & DSP_CAP_REALTIME),
         yes_no(caps & DSP_CAP_BATCH),
         yes_no(caps & DSP_CAP_COPROC),
         yes_no(caps & DSP_CAP_TRIGGER),
         yes_no(caps & DSP_CAP_MMAP));

#endif /* SOUND_VERSION >= 301 */
  
  /* display table heading */
  printf(
         "\nModes and Limits:\n"
         "Device    Sample    Minimum   Maximum\n"
         "Channels  Size      Rate      Rate\n"
         "--------  --------  --------  --------\n"
         );
  
  /* do mono and stereo */  
  for (channels = 1; channels <= 2 ; channels++) {
    /* do 8 and 16 bits */
    for (bits = 8; bits <= 16 ; bits += 8) {
      /* To find the minimum and maximum sampling rates we rely on
         the fact that the kernel sound driver will round them to
         the closest legal value. */
      min_rate = 1;
      if (set_dsp_params(fd, channels, bits, &min_rate) == -1)
        continue;
      max_rate = 100000;
      if (set_dsp_params(fd, channels, bits, &max_rate) == -1)
        continue;
      /* display the results */
      printf("%8d  %8d  %8d  %8d\n", channels, bits, min_rate, max_rate);
    }
  }
  close(fd);
  return 0;
}

Typical output from the dsp_info program looks like this:

Information on /dev/dsp:

Defaults:
  sampling rate: 8000 Hz
  channels: 1
  sample size: 8 bits
  block size: 4096 bytes

Supported Formats:
  mu-law
  unsigned 8-bit (default)

Capabilities:
  revision: 1
  full duplex: no
  real-time: no
  batch: no
  coprocessor: no
  trigger: yes
  mmap: yes

Modes and Limits:
Device    Sample    Minimum   Maximum
Channels  Size      Rate      Rate
--------  --------  --------  --------
       1         8      4000     43478
       2         8      4000     21739

I mentioned earlier that you can't record and play back at the same time with one sound device. You can, however, change parameters such as sampling rate and sample size "on the fly." First, you need to open the PCM device for read and write. Then, before changing any parameters, use the ioctl call

ioctl(fd, SOUND_PCM_SYNC, 0);

in order to inform the sound driver that it should complete any data transfers that are in progress. You can now change parameters, or even switch between recording and playback. I used this feature earlier in the parrot example program.
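Put together, changing a parameter on the fly might look like the sketch below, which flushes pending transfers and then requests a new sampling rate. The helper name change_rate is an assumption for illustration; the two ioctl calls are the ones described in this chapter.

```c
#include <sys/ioctl.h>
#include <linux/soundcard.h>

/*
 * Wait for pending transfers to finish, then request a new sampling
 * rate. On success returns 0 and *rate holds the rate the driver
 * actually selected. On failure returns -1.
 */
int change_rate(int fd, int *rate)
{
  if (ioctl(fd, SOUND_PCM_SYNC, 0) == -1)
    return -1;
  if (ioctl(fd, SOUND_PCM_WRITE_RATE, rate) == -1)
    return -1;
  return 0;
}
```

As with the other parameter calls, check the value left in *rate afterward, since the driver may have rounded it.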

You can also stop record or playback immediately using

ioctl(fd, SOUND_PCM_RESET, 0);

Unfortunately, a true bidirectional mode that allows simultaneous recording and playback is not supported (it likely will be in the future, though). This mode would be useful, for example, for implementing a computerized telephone utility that allows users to communicate using a sound card. There is one other alternative: some sound cards, such as the ProAudioSpectrum, have two independent PCM devices--/dev/dsp and /dev/dsp1. You can use one for read and one for write, resulting in simultaneous recording and playback. In order to perform the simultaneous data transfers, it would probably be best to implement the system as two separate processes.

Some applications are time critical. The sound driver transfers data using DMA buffers, a typical buffer size being 64 kilobytes. This can impact real-time applications because of the time needed to fill up buffers for transfer. Transferring 64K of 8-bit mono data at 8 kHz would take about eight seconds. If a multimedia application were performing an animation, for example, it would be unacceptable to have the display stop for eight seconds while the process was waiting for a full buffer of sound data. You can reduce the buffer size using the ioctl call in this form:

ioctl(fd, SOUND_PCM_SUBDIVIDE, &divisor);

The divisor parameter takes the value 1, 2, or 4; it reduces the DMA buffer size by the corresponding factor. Note that the divisor operates on the default buffer size, not the current value, so you cannot call the function repeatedly to keep reducing the buffer size.
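The effect of the divisor on latency is simple arithmetic: the time to fill a buffer is its size divided by the data rate. The sketch below works this out; the function name is hypothetical, and the 64K default buffer size is the typical figure quoted above rather than a value read from the driver.

```c
/*
 * Seconds needed to fill a DMA buffer of buffer_bytes that has been
 * subdivided by divisor (1, 2, or 4), for the given sample format.
 */
double buffer_fill_seconds(long buffer_bytes, int divisor,
                           int rate, int bits, int channels)
{
  long bytes_per_second = (long)rate * channels * (bits / 8);

  return (double)(buffer_bytes / divisor) / (double)bytes_per_second;
}
```

For a 65536-byte buffer at 8 kHz, 8 bits, mono, this gives about 8.2 seconds with a divisor of 1 and about 2 seconds with a divisor of 4.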

For some applications, the smaller DMA buffer size may still not be enough. When the program DOOM was ported to Linux, the performance of the game was impacted by the pauses required to play sound effects. A new real-time ioctl was added to address applications such as this one. The ioctl call is called SNDCTL_DSP_SETFRAGMENT, and is explained in the file experimental.txt included in the kernel sound driver source.