Phil Karn shared
some innovative ideas he had for an innovative HF digital
communications scheme in on the TAPR HFSIG reflector
(archive, November 1994).
Reproduced here with acknowledgement to Phil Karn, KA9Q.

After the last Dayton Hamvention *(1994)*, W3IWI and I drove to Baltimore and
we discussed at length some ideas I had for a "moonbounce modem". We
eventually came to the conclusion that if my scheme worked via the
moon, it should work even better over HF.

First, some background.

The basic problem is that you're operating over a dispersive, fading channel. Received phase is essentially random, so coherence is hard to achieve. But it turns out that you can buy back much of the loss of noncoherent demodulation -- at the cost of increased bandwidth -- with m-ary orthogonal signalling.

"Orthogonal" simply means that the signal states are independent of each other. You can build a detector that responds to one signal state while remaining completely unaffected (in theory) by the other states. Plain mark/space FSK is 2-ary orthogonal signalling, assuming the shift is large enough for the signalling rate. You can detect FSK with two properly tuned filters, and (again in theory) the mark filter does not respond at all to a space signal, and vice versa.

Ordinary BPSK is not orthogonal, it's antipodal. One state is the exact opposite of the other, so instead of being independent, they "fight" each other much like a tug-of-war contest. You use a single detector and compare the result to a threshold. Antipodal modulation is 3dB better than orthogonal modulation but it requires phase coherence, which we don't have on HF.

So we're stuck with orthgonal signalling. But instead of being limited to only two antipodal signal points, you can have as many orthogonal signal points as you like. In m-ary orthogonal signalling, you have m possible channel states, where m is usually a power of 2. E.g., 4-tone FSK is a 4-ary orthogonal modulation scheme. Each channel symbol carries 2 bits of information in the 2^2 = 4 possible states, assuming only one of the four tones is on at any time (otherwise you'd just have several 2-ary systems in parallel.)

It turns out that as m approaches infinity, overall performance asymptotically reaches the theoretical Shannon limit of -1.6dB Eb/N0, even with noncoherent demodulation! This has been known for a long time, but unfortunately the improvement with m is too slow to be impractical.

Given that, here's my idea. Take a reasonable channel bandwidth, say 2 Khz since that will fit through most SSB filters. Divide this 2 Khz into 256 frequency bins, each 7.8125 Hz wide. In order to keep the bins independent and orthogonal, our signalling rate will be limited to no more than 7.8125 symbols per second. But we want a slow symbol rate anyway to minimize intersymbol interference caused by ionospheric multipath. Also, synchronization is easier at a low symbol rate; it may even be possible to use GPS with approximate propagation delay figures to synchronize "open loop". Anyway, with that many bins we could send log2(256) = 8 bits per symbol, or 62.5 bits/sec.

Although this scheme by itself is orthogonal, it isn't particularly resistant to selective fading or narrowband interference. To improve this I propose another layer of orthogonal coding using Walsh functions (rows of a Hadamard matrix).

A Hadamard matrix is easily built up recursively. You start with a lxl matrix containing the single entry "1" Then you double each dimension of the matrix by copying it to the right and below unchanged, and diagonally to the lower right by inverting each bit. Here's the recurrence relation defined using the limited graphics of ASCII:

M (i+l) = M(i) M(i) M(i) ~M(i)

You can keep repeating this process until you have a matrix of the desired size. Here are the first few Hadamard matrices:

1 11 10 1111 1010 1100 1001 11111111 10101010 11001100 10011001 11110000 10100101 11000011 10010110

and so forth.

Now notice that if you take any two rows at random from a Hadamard matrix and XOR them bitwise, you'll always get an equal number of l's and O's. It's easy to see how this follows from the recursire definition given above.

This property means that the rows (and the columns, for that matter) of a Hademard matrix form a set of mutually orthogonal functions. These are called Walsh functions, and have been around for a long time. (As an aside, one clever use in the early days of telephony was to set up wire-crossing patterns for parallel open-wire transmission lines in order to minimize inductive crosstalk.)

Now let's say that instead of just turning on one tone (out of 256) at a time for each transmitted symbol, we take the 8-bit symbol and index the corresponding row from a 256x256 Hadamard matrix. The 256 bits in that row then control whether or not the tone is turned in each of 256 frequency bins. (The amplitude of each tone is reduced to keep the transmitter power the same.)

Each symbol still carries only 8 bits, so the energy per bit is the same in this scheme as with the simple 1-tone-out-of-256 scheme. Moreover, for every Walsh codeword except the all-ls case, exactly 50% of the bins are on and 50% are off, so the transmitter power is constant. (The all l's case could be avoided with some work if necessary.)

So performance with and without Walsh coding should be exactly the same over a nonfading gaussian noise channel. But HF has both non-gaussian (e.g., narrowband) interference and fading, often of the frequency-selective kind. Walsh encoding should help by spreading each symbol's energy out more evenly over the frequency channel.

You could demodulate this signal by first taking a FFT of the raw input samples to determine the total received energy in each frequency bin. Then you take the FHT (Fast Hademard Transform, analogous to the FFT) to determine which of the 256 Walsh functions was mostly likely to have been the one that was sent.

Note that the first column in each of the Hademard matrices is a 1, which is rather redundant. You could omit it and save a frequency bin (which is pretty minor). This is called a "simplex" code, as opposed to a straight orthogonal code. But you could do much better by augmenting the matrix with each bit inverted, doubling the number of code words and adding a bit to each symbol. For example, if we take the 4x4 Hademard matrix

1111 1010 1100 1001

and append to it its bit-inverse, we get

1111 1010 1100 1001 0000 0101 0011 0110

The 8 rows are now either mutually orthogonal (equal number of bit agreements and disagreements) or antipodal (all bits disagree).

This is called a "biorthogonal" code, and in fact it's one of the earliest error correcting codes used in spacecraft. Mariner 9 used one (64x32, I think) at Mars in the late 1960s, before the big switch to convolutional coding.

The only reason to prefer a simplex or orthogonal code set over a biorthogonal code is when your channel has a polarity ambiguity (e.g., BPSK). You could easily determine polarity with an orthogonal or simplex code, but not with a biorthogonal code. But we don't have that problem here, since each "bin" is basically an on-off keyed channel without any polarity ambiguity.

If we built a 128x128 Hademard matrix and turned it into a biorthogonal code we could send 8-bit symbols with only 128 frequency bins. This would let us double our symbol rate to 15.625/sec, or our raw bit rate to 125 bits/sec. Or we could halve the transmitted bandwidth and keep the earlier data rate (more on bandwidth issues later).

Although this scheme should be very robust, there will be bursts of QRM and deep flat fades too great for it to handle. We could deal with this by adding another layer of FEC. A natural choice would be a Reed-Solomon block code over GF(2^8). This code operates on 8-bit symbols that matche the basic channel symbol. Paul, N9FZX has already implemented a Reed-Solomon decoder for 8-bit symbols and has made it freely available over the net. Although the natural block size for a RS coder over GF(256) is a 255-byte block, you can use a smaller blocksize if you like by just agreeing between sender and receiver that the unused symbols are Os and not sending them. On the other hand, larger blocksizes are more robust for the same amount of redundancy.

A more complicated scheme that might work better is convolutional encoding with soft decision decoding, either Fano or Viterbi. The convolutional decoding itself is not a big problem (I've already implemented and tested both decoders) but what makes it complicated is the need to modify the Walsh function demodulator to provide soft samples (instead of "hard" is and Os) for each bit of each demodulated symbol. There are ways to do this, but they involve a fair amount of effort. I need to look into this.

I also need to look into the optimality of the biorthogonal scheme. For maximum efficiency you like to have a channel code where every possible error event is of equal probability, but the biorthogonal code doesn't do this. (The probability of one codeword being turned into its antipodal complement is obviously much less than the probability of it being turned into another codeword that is merely orthogonal to it, since the hamming distance between antipodal codewords is twice that of the distance between two orthogonal codewords.) If you can't keep the codewords at equal distance, it might help to take them into account in your upper layer FEC design. Or it might not make a big difference, I don't know.

I'm aware that mixing a large number of discrete tones would increase the variance of the instantaneous transmitter power. My intuition and the central limit theorem tells me that the instantaneous envelope power will end up looking very Gaussian. But Shannon says that efficient modulation schemes *have* to look Gaussian, and it may turn out that the increased coding gain of this scheme more than makes up for any increased amplifier losses over a constant-envelope signal.

So in summary, my scheme would take a block of 8-bit data bytes, add an appropriate number of Reed-Solomon FEC symbols, encode each symbol into a (128,8) biorthogonal channel code and then use each of the 128 channel bits to gate a set of 128 discrete channel tones.

Some remarks on bandwidth. The bandwidth efficiency of this scheme may at first glance appear to be unacceptably low, especially for HF. 125 bits/sec in 2 Khz is only 0.0625 bits/sec/Hz. But the more I learn about information theory both from the textbooks and from my work here at Qualcomm, the more I really understand that it's counterproductive to limit signal bandwidth. In the real world of a shared band, power efficiency and interference resistance are FAR more important, and these two properties are possible ONLY with extra bandwidth.

This particular modulation scheme should be incredibly robust against both noise and QRM. It ought to be easily decodable well below the operator's ability to even hear it. Because of its spread spectrum-like properties, strong inband QRM can be notched out fairly easily. Since the enerqy distribution of the signal is fairly flat, after the FFT you just look for any frequency bin energies way above averaqe and suppress them before the FHT step. The only limit on the ability to suppress narrow band QRM (a CW carrier, say) then becomes the dynamic range of the receiver and A/D converter.

So, what do people think? This stuff represents a radical departure from the usual approach of trying to minimize congestion by cramming as much as possible into a narrow HF channel. Because of its wide bandwidth, we'd need STAs to test my scheme on the HF bands, and a rule change for general use. But the FCC's bandwidth rules are based on intuition that happens to be completely wrong. John Costas said as much in his classic paper "Poisson, Shannon and the Radio Amateur", published 35 years ago, but now we actually have the technology to pursue his suggestions!

Phil

*Compiled by Johan Forrer, 1998 <forrerj@peak.org>*