Internet: netlib@nac.no
EARN/BITNET: netlib%nac.no@norunix.bitnet
X.400: s=netlib; o=nac; c=no;
EUNET/uucp: nac!netlib
A similar collection of statistical software is available from statlib@temper.stat.cmu.edu.
The symbolic algebra system REDUCE is supported by reduce-netlib@rand.org.
Naval Surface Warfare Center (E43)
Dahlgren, VA 22448-5000
U.S.A.
[Witold Waldman, Witold.Waldman@dsto.defence.gov.au]
The U.S. DoD's Federal-Standard-1016 based 4800 bps code excited
linear prediction voice coder version 3.2 (CELP 3.2) Fortran and C simulation
source codes are available for worldwide distribution (on DOS diskettes,
but configured to compile on Sun SPARC stations) from NTIS and DTIC. Example
input and processed speech files are included. A Technical Information
Bulletin (TIB), "Details to Assist in Implementation of Federal Standard
1016 CELP," and the official standard, "Federal Standard 1016, Telecommunications:
Analog to Digital Conversion of Radio Voice by 4,800 bit/second Code Excited
Linear Prediction (CELP)," are also available.
FS-1016 CELP 3.2 may also be obtained from file://svr-ftp.eng.cam.ac.uk/pub/comp.speech/coding/celp_3.2a.tar.Z or file://ftp.super.org/pub/speech/celp_3.2a.tar.Z.
NTIS
U.S. Department of Commerce
5285 Port Royal Road
Springfield, VA 22161
USA
(800) 553-6847
LPC is available from ftp://ftp.super.org/pub/speech/lpc10-1.0.tar.gz or file://svr-ftp.eng.cam.ac.uk/pub/speech/coding/lpc10-1.0.tar.gz.
MATLAB software for LPC-10 is available from http://www.eas.asu.edu/~spanias/srtcrs.html. PostScript copies of tutorials on speech coding can also be found at http://www.eas.asu.edu/~spanias/papers.html. [Andreas Spanias, spanias@asu.edu]
Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch, The Federal Standard 1016 4800 bps CELP Voice Coder, Digital Signal Processing, Academic Press, 1991, Vol. 1, No. 3, p. 145-155.
Additional information on CELP can also be found in the comp.speech FAQ.
Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch, The DoD 4.8 kbps Standard (Proposed Federal Standard 1016), in Advances in Speech Coding, ed. Atal, Cuperman and Gersho, Kluwer Academic Publishers, 1991, Chapter 12, p. 121-133.
Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch, The Proposed Federal Standard 1016 4800 bps Voice Coder: CELP, Speech Technology Magazine, April/May 1990, p. 58-64.
The U.S. Federal Standard 1015 (NATO STANAG 4198) is described
in: Thomas E. Tremain, The Government Standard Linear Predictive Coding
Algorithm: LPC-10, Speech Technology Magazine, April 1982, pp. 40-49.
    adpcm_coder(short inbuf[], char outbuf[], int nsample, struct adpcm_state *state);
    adpcm_decoder(char inbuf[], short outbuf[], int nsample, struct adpcm_state *state);
Note that this is NOT a G.722 coder. The ADPCM standard is much more complicated, probably resulting in better quality sound but also in much more computational overhead.
This is also available as:
file://svr-ftp.eng.cam.ac.uk/pub/comp.speech/coding/G711_G722_G723.tar.gz
[From Dan Frankowski, dfrankow@winternet.com; Jack Jansen, Jack.Jansen@cwi.nl]
The Communications and Operating Systems Research Group (KBS) at
the Technische Universitaet Berlin is currently working on a set of UNIX-based
tools for computer-mediated telecooperation that will be made freely available.
As part of this effort we are publishing an implementation of the European GSM 06.10 provisional standard for full-rate speech transcoding, prI-ETS 300 036, which uses RPE/LTP (regular pulse excitation/long term prediction) coding at 13 kbit/s.
GSM 06.10 compresses frames of 160 13-bit samples (8 kHz sampling rate, i.e. a frame rate of 50 Hz) into 260 bits; for compatibility with typical UNIX applications, our implementation turns frames of 160 16-bit linear samples into 33-byte frames (1650 Bytes/s). The quality of the algorithm is good enough for reliable speaker recognition; even music often survives transcoding in recognizable form (given the bandwidth limitations of 8 kHz sampling rate).
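The frame arithmetic quoted above can be checked with a few lines of Python. This is only a sanity-check sketch; the constant and function names are made up for illustration and are not part of the GSM library's API.

```python
# Figures taken from the paragraph above; names are illustrative only.
SAMPLE_RATE_HZ = 8000          # 8 kHz sampling rate
SAMPLES_PER_FRAME = 160        # samples per GSM 06.10 frame
BITS_PER_FRAME = 260           # encoded bits per frame (the standard)
PACKED_BYTES_PER_FRAME = 33    # frame size used by this implementation
INPUT_BYTES_PER_SAMPLE = 2     # 16-bit linear input samples


def gsm_rates():
    """Derive frame rate, bit rate and byte rates from the frame layout."""
    frames_per_second = SAMPLE_RATE_HZ // SAMPLES_PER_FRAME
    return {
        "frames_per_second": frames_per_second,
        "frame_duration_ms": 1000 * SAMPLES_PER_FRAME / SAMPLE_RATE_HZ,
        "codec_bits_per_second": BITS_PER_FRAME * frames_per_second,
        "packed_bytes_per_second": PACKED_BYTES_PER_FRAME * frames_per_second,
        "input_bytes_per_second": SAMPLE_RATE_HZ * INPUT_BYTES_PER_SAMPLE,
        # 33 bytes hold 264 bits, so 4 bits per frame are not codec payload.
        "spare_bits_per_frame": PACKED_BYTES_PER_FRAME * 8 - BITS_PER_FRAME,
    }


if __name__ == "__main__":
    for name, value in gsm_rates().items():
        print(f"{name}: {value}")
```

The 260 bits per 20 ms frame give the 13 kbit/s figure, and the 33-byte frames give 1650 bytes/s against 16000 bytes/s of 16-bit linear input, i.e. a bit under 10:1.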
The interfaces offered are a front end modeled after compress(1), and a library API. Compression and decompression run faster than realtime on most SPARCstations. The implementation has been verified against the ETSI standard test patterns.
Jutta Degener (jutta@cs.tu-berlin.de), Carsten Bormann (cabo@cs.tu-berlin.de)
Communications and Operating Systems Research Group, TU Berlin
Fax: +49.30.31425156, Phone: +49.30.31424315
B.C.J. Moore, An Introduction to the Psychology of Hearing, Academic Press, London, 1997.
This book is available in paperback and makes a good desk reference.
An algorithm implementation that matches a large body of psychoacoustical work, but which is computationally very intensive, is presented in the paper:
Malcolm Slaney and Richard Lyon, "A Perceptual Pitch Detector," Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, 1990, Albuquerque, New Mexico. Available for ftp at ftp://worldserver.com/pub/malcolm/ICASSP90.psc.Z
The definitive papers describing the use of such a perceptual pitch detector as applied to the classical pitch literature are:
Ray Meddis and M. J. Hewitt, "Virtual pitch and phase sensitivity of a computer model of the auditory periphery," Journal of the Acoustical Society of America 89 (6), 1991: 2866-2882 and 2883-2894.
The current work that argues for a pure spectral method starts with the work of Goldstein:
J. Goldstein, "An optimum processor theory for the central formation of the pitch of complex tones," Journal of the Acoustical Society of America 54, 1496-1516, 1973.
Two approaches are worth considering if something approximating pitch is appropriate. The people at IRCAM have proposed a harmonic analysis approach that can be implemented on a DSP:
Boris Doval and Xavier Rodet, "Estimation of Fundamental Frequency of Musical Sound Signals," Proceedings of the 1991 International Conference on Acoustics, Speech, and Signal Processing, Toronto, Volume 5, pp. 3657-3660.
The classic paper for time domain (peak picking) pitch algorithms is:
B. Gold and L. Rabiner, "Parallel processing techniques for estimating pitch periods of speech in the time domain," Journal of the Acoustical Society of America, 46, pp 441-448, 1969.
[The above from Malcolm Slaney, Interval Research,
and John Lazzaro, U.C. Berkeley.]
AES/EBU is a bit-serial communications protocol for transmitting
digital audio data through a single transmission line. It provides two
channels of audio data (up to 24 bits per sample), a method for communicating
control and status information ("channel status bits"), and some error
detection capabilities. Clocking information (i.e., sample rate) is derived
from the AES/EBU bit stream, and is thus controlled by the transmitter.
The standard mandates use of 32 kHz, 44.1 kHz, or 48 kHz sample rates,
but some interfaces can be made to work at other sample rates.
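To get a feel for the line rates involved, the sketch below computes the serial bit rate at the three mandated sample rates. It assumes the AES3 framing of two subframes per frame with 32 time slots each (a detail not stated above); the function names are invented for illustration.

```python
# Assumption (from AES3 framing, not from the text above): each frame
# carries two subframes, one per channel, of 32 time slots each.
SLOTS_PER_SUBFRAME = 32
SUBFRAMES_PER_FRAME = 2
MAX_AUDIO_BITS = 24  # "up to 24 bits per sample", per the text


def line_bit_rate(sample_rate_hz):
    """Total serial bit rate: one frame is sent per sample period."""
    return sample_rate_hz * SUBFRAMES_PER_FRAME * SLOTS_PER_SUBFRAME


def audio_payload_rate(sample_rate_hz, bits=MAX_AUDIO_BITS):
    """Portion of the bit rate that carries audio sample bits."""
    return sample_rate_hz * SUBFRAMES_PER_FRAME * bits


if __name__ == "__main__":
    for fs in (32000, 44100, 48000):
        print(fs, line_bit_rate(fs), audio_payload_rate(fs))
```

Under that assumption a 48 kHz link runs at 3.072 Mbit/s, of which 2.304 Mbit/s is audio; the remainder is preamble, status, user, validity and parity information.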
AES/EBU provides both "professional" and "consumer" modes. The big difference is in the format of the channel status bits mentioned above. The professional mode bits include alphanumeric channel origin and destination data, time of day codes, sample number codes, word length, and other goodies. The consumer mode bits have much less information, but do include information on copy protection (naturally). Additionally, the standard provides for "user data", which is a bit stream containing user-defined (i.e., manufacturer-defined) data. According to Tim Channon, "CD user data is almost raw CD subcode; DAT is StartID and SkipID. In professional mode, there is an SDLC protocol or, if DAT, it may be the same as consumer mode."
Three physical connection media are commonly used with AES/EBU: balanced (differential), using two wires and shield in three-wire microphone cable with XLR connectors; unbalanced (single-ended), using audio coax cable with RCA jacks; and optical (via fiber optics).
[The above from Phil Lapsley and Tim Channon, tchannon@black.demon.co.uk]
Painter, E. M., and Spanias, A. S. (1997 and revised 1999). A Review of Algorithms for Perceptual Coding of Digital Audio Signals. (PostScript, 3MB) http://www.eas.asu.edu/~spanias/papers.html
[Andreas Spanias, spanias@asu.edu]
Desktop Sparc machines come with routines to convert between linear
and mu-law samples. On a desktop Sparc, see the man page for audio_ulaw2linear
in /usr/demo/SOUND/man.
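For readers without access to those routines, here is a minimal sketch of 8-bit mu-law conversion in the style of the widely circulated G.711 reference code. It is not the Sun audio_ulaw2linear routine; the function names and constants below are illustrative.

```python
BIAS = 0x84    # 132, added before the segment search
CLIP = 32635   # largest magnitude that survives the bias without overflow


def linear2ulaw(sample):
    """Convert a 16-bit signed linear sample to an 8-bit mu-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # Find the segment (exponent): position of the highest set bit, 7..14.
    exponent = 7
    mask = 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # mu-law bytes are transmitted with all bits inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def ulaw2linear(byte):
    """Convert an 8-bit mu-law byte back to a 16-bit signed linear sample."""
    byte = ~byte & 0xFF
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


if __name__ == "__main__":
    for s in (0, 100, -100, 1000, 32767, -32768):
        u = linear2ulaw(s)
        print(s, hex(u), ulaw2linear(u))
```

Decoding then re-encoding returns the same byte for every code except 0x7F ("negative zero"), which decodes to 0 and re-encodes as 0xFF.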
Michael Villeret, et al., A New Digital Technique for Implementation
of Any Continuous PCM Companding Law, IEEE Int. Conf. on Communications,
1973, vol. 1, pp. 11.12-11.17.
MIL-STD-188-113, Interoperability and Performance Standards for Analog-to-Digital Conversion Techniques, 17 February 1987.
TI Digital Signal Processing Applications with the TMS320 Family (TI literature number SPRA012A), pp. 169-198.
[From Ed Hall, edhall@rand.org:]
For a start, look at Multirate Digital Signal Processing by Crochiere and Rabiner (see FAQ section 1.1).
Almost any technique for producing good digital low-pass filters will be adaptable to sample-rate conversion. 44.1:48 and vice-versa is pretty hairy, though, because the lowest whole-number ratio is 147:160. To do all that in one go would require a FIR with thousands of coefficients, of which only 1/147th or 1/160th are used for each sample--the real problem is memory, not CPU for most DSP chips. You could chain several interpolators and decimators, as suggested by factoring the ratio into 3*7*7:2*2*2*2*2*5. This adds complexity, but reduces the number of coefficients required by a considerable amount.
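The 147:160 ratio and its factorization are easy to verify. The snippet below is a small sketch using only the Python standard library; prime_factors and conversion_ratio are helpers written for this example.

```python
from fractions import Fraction


def prime_factors(n):
    """Return the prime factors of n in ascending order (trial division)."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        factors.append(n)
    return factors


def conversion_ratio(rate_in, rate_out):
    """Reduce rate_in:rate_out to lowest terms."""
    ratio = Fraction(rate_in, rate_out)
    return ratio.numerator, ratio.denominator


if __name__ == "__main__":
    num, den = conversion_ratio(44100, 48000)
    print(num, den)                              # lowest-terms ratio
    print(prime_factors(num), prime_factors(den))
```

Each prime factor corresponds to one possible interpolation or decimation stage when the conversion is chained rather than done in one step.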
[From Lou Scheffer:]
Theory of operation: 44.1 and 48 are in the ratio 147/160. To convert
from 44.1 to 48, for example, we (conceptually):
A paper available as
file://ccrma-ftp.stanford.edu/pub/DSP/Tutorials/BandlimitedInterpolation.eps.Z
explains the algorithm. Free source code, as well as an HTML discussion
of the algorithm, is available at http://ccrma-www.stanford.edu/~jos/resample/.
It all works quite well.
[From Kevin Bradley, kb+@andrew.cmu.edu:]
There is an implementation of polyphase resampling for various rates as a part of the Sox audio toolkit at http://home.sprynet.com/~cbagwell/sox.html. See file polyphas.c for details.
Sox also contains an implementation of bandlimited interpolation and linear interpolation, and serves as a ready vehicle for module experimentation.
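As a rough illustration of the polyphase idea (and not a copy of Sox's polyphas.c), the sketch below resamples by a rational factor up/down. It conceptually zero-stuffs, low-pass filters and decimates, but only ever multiplies by the filter taps that line up with real input samples. The filter design (Hamming-windowed sinc), the function names and the taps_per_phase parameter are all choices made for this example.

```python
import math


def lowpass_taps(cutoff, num_taps):
    """Hamming-windowed sinc low-pass; cutoff in cycles/sample (0 to 0.5)."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps


def resample(signal, up, down, taps_per_phase=16):
    """Resample by up/down using one FIR evaluated polyphase-style."""
    num_taps = 2 * taps_per_phase * up + 1
    cutoff = 0.5 / max(up, down)          # relative to the upsampled rate
    # Gain of `up` compensates for the energy lost to zero-stuffing.
    h = [up * t for t in lowpass_taps(cutoff, num_taps)]
    delay = (num_taps - 1) // 2           # compensate the filter's group delay
    out = []
    for m in range(len(signal) * up // down):
        t = m * down + delay              # position in the upsampled stream
        acc = 0.0
        k = t % up                        # first tap aligned with a real sample
        while k < num_taps:
            i = (t - k) // up             # index of that input sample
            if 0 <= i < len(signal):
                acc += h[k] * signal[i]
            k += up                       # skip taps that would hit stuffed zeros
        out.append(acc)
    return out


if __name__ == "__main__":
    tone = [math.sin(2 * math.pi * 1000 * n / 44100) for n in range(441)]
    converted = resample(tone, 160, 147)  # 44.1 kHz -> 48 kHz
    print(len(tone), len(converted))
```

With up=160 the filter has 5121 coefficients, yet each output sample touches only about 32 of them, which is the memory-versus-CPU trade-off described earlier in this section.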
[From Fritz M. Rothacher, f.rothacher@ieee.org:]
You can add my Ph.D. thesis on sample-rate conversion to the FAQ:
Fritz M. Rothacher, Sample-Rate Conversion: Algorithms and VLSI Implementation, Ph.D. thesis, Integrated Systems Lab, Swiss Federal Institute of Technology, ETH Zuerich, 1995, ISBN 3-89191-873-9
It can also be downloaded from my homepage at http://www.guest.iis.ee.ethz.ch/~rota.
The mathematical theory behind wavelets (and other related transforms)
is given in the appendix of the XWPL reference manual. The XWPL manual
can be found at
http://venus.javeriana.edu.co/WAVELETS/.
Other sources of information on wavelets are:
Olivier Rioul and Martin Vetterli, Wavelets and Signal Processing, IEEE Signal Processing Magazine, Oct. 1991, pp. 14-38.
A good introductory book on wavelets:
Randy K. Young, Wavelet Theory and Its Applications, Kluwer Academic Publishers, ISBN 0-7923-9271-X, 1993.
A more thorough book:
Ali N. Akansu and Richard A. Haddad, Multiresolution Signal Decomposition: Transforms, Subbands, Wavelets, Academic Press, Inc., ISBN 0-12-047140-X.
A couple more interesting papers:
Wavelets and Filter Banks: Theory and Design, IEEE Transactions on Signal Processing, Vol. 40, No. 9, Sept. 1992, pp. 2207-2232.
Mac Cody's articles in Dr. Dobb's Journal, April 1992 and April 1993.
A paper by Ingrid Daubechies in IEEE Trans. on Info. Theory, vol. 36, no. 5, Sept. 1990, and her book "Ten Lectures on Wavelets" deal with the mathematical aspects of the WT.
There is also a sample data directory containing interesting signals.
[From Fazal Majid majid@math.yale.edu]:
The programs have been tested on Sparcstations running SunOS 4.1.n
with MATLAB 4.1. However, the "mex" code is generic and should run on other
platforms (you may have to tinker with the Makefiles a little bit to make this
work). There are several utility routines, all of them callable from MATLAB.
All the C files (leading to the mex files) can also be directly accessed
from other C or Fortran code.
A collection of papers and tech. reports from the DSP group is also available. You can obtain this distribution of software and papers by anonymous ftp from cml.rice.edu.
To report problems/bugs, or for installation info on non-Sun/non-UNIX platforms,
send mail to wlet-tools@rice.edu (or ramesh@dsp.rice.edu).
For comp.dsp, the gist is:
Andrew Reilly [Reilly@zeta.org.au]