Another set of additions to the paper, both text and figures.

git-svn-id: svn+ssh://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx@6352 ab8295b8-cf94-4d9e-aec4-7959e3be5d79
Joe Taylor 2016-01-05 20:13:13 +00:00
parent 3576e0b868
commit 0f62680559


@ -133,13 +133,10 @@ key "kv2001"
\end_inset \end_inset
, as licensed to K1JT and implemented in a closed-source program for use , licensed to and implemented by K1JT in a closed-source executable for
only in amateur radio applications. use only in amateur radio applications.
Since 2001 the KV decoder has been considered the best available soft-decision Since 2001 the KV decoder has been considered the best available soft-decision
decoder for Reed-Solomon codes. decoder for Reed-Solomon codes.
\end_layout
\begin_layout Standard
We describe here a new open-source alternative called the Franke-Taylor We describe here a new open-source alternative called the Franke-Taylor
(FT, or K9AN-K1JT) algorithm. (FT, or K9AN-K1JT) algorithm.
It is conceptually simple, built around the well-known Berlekamp-Massey It is conceptually simple, built around the well-known Berlekamp-Massey
@ -149,8 +146,8 @@ We describe here a new open-source alternative called the Franke-Taylor
\emph on \emph on
WSJT-X WSJT-X
\emph default \emph default
, widely used for amateur weak-signal communication with JT65 and several , widely used for amateur weak-signal communication with JT65 and other
other specialized digital modes. specialized digital modes.
The program is freely available and licensed under the GNU General Public The program is freely available and licensed under the GNU General Public
License. License.
\end_layout \end_layout
@ -160,22 +157,22 @@ The JT65 protocol specifies transmissions that normally start one second
into a UTC minute and last for 46.8 seconds. into a UTC minute and last for 46.8 seconds.
Receiving software therefore has up to several seconds to decode a message, Receiving software therefore has up to several seconds to decode a message,
before the operator sends a reply at the start of the next minute. before the operator sends a reply at the start of the next minute.
With today's personal computers, this relatively long time for decoding With today's personal computers, this relatively long time available for
a short message encourages experimentation with decoders of high computational decoding a short message encourages experimentation with decoders of high
complexity. computational complexity.
As a result, on a typical fading channel the FT algorithm extends the decoding As a result, on a typical fading channel the FT algorithm can extend the
threshold by many dB over the hard-decision Berlekamp-Massey decoder, and decoding threshold by many dB over the hard-decision Berlekamp-Massey decoder,
by a meaningful amount over the KV decoder. and by a meaningful amount over the KV decoder.
In addition to its excellent performance, the new algorithm has other desirable In addition to its excellent performance, the new algorithm has other desirable
properties---not the least of which is its conceptual simplicity. properties, not least of which is its conceptual simplicity.
Decoding performance and complexity scale in a convenient way, providing Decoding performance and complexity scale in a convenient way, providing
steadily increasing soft-decision decoding gain as a tunable computational steadily increasing soft-decision decoding gain as a tunable computational
complexity parameter is increased over more than 5 orders of magnitude. complexity parameter is increased over more than 5 orders of magnitude.
This means that appreciable gain is available from our decoder even on Appreciable gain is available from our decoder even on very simple (and
very simple (and relatively slow) computers. relatively slow) computers.
On the other hand, because the algorithm benefits from a large number of On the other hand, because the algorithm benefits from a large number of
independent decoding trials, it should be possible to obtain further performance independent decoding trials, further performance gains should be achievable
gains through parallelization on high-performance computers. through parallelization on high-performance computers.
\end_layout \end_layout
\begin_layout Section \begin_layout Section
@ -943,7 +940,7 @@ Here
\end_inset \end_inset
if the received symbol and codeword symbol are different, and if the received symbol and codeword symbol are different, and
\begin_inset Formula $p_{1\,j}$ \begin_inset Formula $p_{1,\,j}$
\end_inset \end_inset
is the fractional power associated with received symbol is the fractional power associated with received symbol
@ -965,12 +962,7 @@ In practice we find that
\end_inset \end_inset
can reliably identify the correct codeword if the signal-to-noise ratio can reliably identify the correct codeword if the signal-to-noise ratio
for individual symbols is greater than about 4 in linear power units, or for individual symbols is greater than about 4 in linear power units.
\begin_inset Formula $E_{s}/N_{0}\apprge6$
\end_inset
dB (*** check these numbers ***).
We also find that significantly weaker signals can be decoded by using We also find that significantly weaker signals can be decoded by using
soft-symbol information beyond that contained in soft-symbol information beyond that contained in
\begin_inset Formula $p_{1}$ \begin_inset Formula $p_{1}$
@ -1117,7 +1109,7 @@ est metrics
will likely be close to 1. will likely be close to 1.
We therefore apply a ratio threshold test, say We therefore apply a ratio threshold test, say
\begin_inset Formula $r<r_{0}$ \begin_inset Formula $r<r_{1}$
\end_inset \end_inset
, to identify codewords with high probability of being correct. , to identify codewords with high probability of being correct.
@ -1128,7 +1120,7 @@ reference "sec:Theory,-Simulation,-and"
\end_inset \end_inset
, we have used simulations to set an empirical acceptance threshold , we use simulations to set an empirical acceptance threshold
\begin_inset Formula $r_{0}$ \begin_inset Formula $r_{0}$
\end_inset \end_inset
@ -1145,21 +1137,32 @@ Technically the FT algorithm is a list decoder.
is retained. is retained.
As with all such algorithms, a stopping criterion is necessary. As with all such algorithms, a stopping criterion is necessary.
FT accepts a codeword unconditionally if the Hamming distance and soft FT accepts a codeword unconditionally if the Hamming distance
distance \begin_inset Formula $X$
\end_inset
and soft distance
\begin_inset Formula $d_{s}$ \begin_inset Formula $d_{s}$
\end_inset \end_inset
are less than some conservatively specified limits. are less than conservatively specified limits
Secondary acceptance criteria \begin_inset Formula $X_{0}$
\begin_inset Formula $d_{s}<d_{0}$
\end_inset \end_inset
and and
\begin_inset Formula $r<r_{0}$ \begin_inset Formula $d_{0}$
\end_inset \end_inset
are used to validate additional decodes. .
Secondary acceptance criteria
\begin_inset Formula $d_{s}<d_{1}$
\end_inset
and
\begin_inset Formula $r<r_{1}$
\end_inset
are used to validate additional decodes that did not pass the first test.
A timeout is used to limit the algorithm's execution time if no acceptable A timeout is used to limit the algorithm's execution time if no acceptable
codeword is found in a reasonable number of trials, codeword is found in a reasonable number of trials,
\begin_inset Formula $T$ \begin_inset Formula $T$
@ -1227,7 +1230,7 @@ If BM decoding was not successful, go to step 2.
\begin_layout Enumerate \begin_layout Enumerate
Calculate the hard-decision Hamming distance Calculate the hard-decision Hamming distance
\begin_inset Formula $h$ \begin_inset Formula $X$
\end_inset \end_inset
between the candidate codeword and the received symbols, the corresponding between the candidate codeword and the received symbols, the corresponding
@ -1244,7 +1247,7 @@ Calculate the hard-decision Hamming distance
\begin_inset Formula $u$ \begin_inset Formula $u$
\end_inset \end_inset
is the largest one encountered so far, preserve the previous value of is the largest one encountered so far, preserve any previous value of
\begin_inset Formula $u_{1}$ \begin_inset Formula $u_{1}$
\end_inset \end_inset
@ -1261,7 +1264,7 @@ Calculate the hard-decision Hamming distance
\begin_layout Enumerate \begin_layout Enumerate
If If
\begin_inset Formula $h<h_{0}$ \begin_inset Formula $X<X_{0}$
\end_inset \end_inset
and and
@ -1290,7 +1293,7 @@ If
\end_inset \end_inset
and and
\begin_inset Formula $r<r_{1}$ \begin_inset Formula $r<r_{1},$
\end_inset \end_inset
go to step 10. go to step 10.
@ -1301,11 +1304,7 @@ Otherwise, declare decoding failure and exit.
\end_layout \end_layout
\begin_layout Enumerate \begin_layout Enumerate
An acceptable codeword with An acceptable codeword has been found.
\begin_inset Formula $u_{max}>u_{0}$
\end_inset
has been found.
Declare a successful decode and return this codeword. Declare a successful decode and return this codeword.
\end_layout \end_layout
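Reviewer's sketch of the two-tier acceptance logic that the hunks above describe: a candidate is accepted unconditionally when both the Hamming distance X and the soft distance d_s fall below conservative limits X_0 and d_0, and otherwise it can still be validated by the secondary test d_s < d_1 and r < r_1. The names `Limits` and `classify_candidate` are hypothetical, and the numeric defaults are placeholders, not the values the authors derive from simulation. The exact branching of the full step list is only partly visible in this diff, so this is an illustration of the criteria, not the program's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Limits:
    # Placeholder numbers for illustration only; the paper sets the real
    # defaults empirically from simulations.
    X0: int = 40      # limit on hard-decision Hamming distance X
    d0: float = 0.5   # limit on soft distance d_s for unconditional acceptance
    d1: float = 0.7   # looser soft-distance limit for the secondary test
    r1: float = 0.8   # limit on the ratio r for the secondary test


def classify_candidate(X: int, d_s: float, r: float,
                       lim: Limits = Limits()) -> Optional[str]:
    """Return how a candidate codeword would be accepted, or None if rejected."""
    # First test: both distances are below the conservative limits.
    if X < lim.X0 and d_s < lim.d0:
        return "unconditional"
    # Secondary test: validates decodes that did not pass the first test.
    if d_s < lim.d1 and r < lim.r1:
        return "secondary"
    return None


if __name__ == "__main__":
    print(classify_candidate(X=30, d_s=0.3, r=0.9))
    print(classify_candidate(X=45, d_s=0.6, r=0.5))
    print(classify_candidate(X=45, d_s=0.9, r=0.5))
```

Keeping the limits in one immutable record makes it easy to compare threshold sets when tuning against simulated data.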
@ -1365,8 +1364,8 @@ key "ls2009"
is applied to higher-rate Reed-Solomon codes on a binary-input channel is applied to higher-rate Reed-Solomon codes on a binary-input channel
with BPSK-modulated symbols. with BPSK-modulated symbols.
Our 64-ary input channel with 64-FSK modulation required us to develop Our 64-ary input channel with 64-FSK modulation required us to develop
unique methods for assigning erasure probabilities and for defining an unique methods for assigning erasure probabilities and for defining acceptance
acceptance criteria to select the best codeword from the list of candidates. criteria to select the best codeword from the list of candidates.
\end_layout \end_layout
@ -1381,21 +1380,24 @@ Hinted Decoding
\end_layout \end_layout
\begin_layout Standard \begin_layout Standard
The FT algorithm is completely general: it recovers with equal sensitivity The FT algorithm is completely general: with equal sensitivity it recovers
any one of the any one of the
\begin_inset Formula $2^{72}\approx4.7\times10^{21}$ \begin_inset Formula $2^{72}\approx4.7\times10^{21}$
\end_inset \end_inset
different messages that can be transmitted using the JT65 protocol. different messages that can be transmitted with the JT65 protocol.
In many circumstances it's easy to imagine a much smaller list of messages In some circumstances it's easy to imagine a
(say, a few thousand or less) that may be among the most likely ones to \emph on
be received. much
For example, one such situation exists when making short ham-radio contacts \emph default
exchanging minimal amounts of information such as callsigns, signal reports, smaller list of messages (say, a few thousand messages or less) that may
perhaps a Maidenhead locator, and acknowledgments. be among the most likely ones to be received.
Similarly, on the EME path or on a VHF or UHF band with limited geographical One such situation exists when making short ham-radio contacts that exchange
coverage, the most likely received messages will often originate from callsigns minimal information including callsigns, signal reports, perhaps Maidenhead
that have been decoded before. locators, and acknowledgments.
On the EME path or on a VHF or UHF band with limited geographical coverage,
the most likely received messages often originate from callsigns that have
been decoded before.
Saving a list of previously decoded callsigns makes it easy to generate Saving a list of previously decoded callsigns makes it easy to generate
lists of hypothetical messages and their corresponding codewords, at very lists of hypothetical messages and their corresponding codewords, at very
little computational expense. little computational expense.
@ -1420,13 +1422,14 @@ hinted decoding;
\begin_inset Quotes eld \begin_inset Quotes eld
\end_inset \end_inset
Deep Search deep search
\begin_inset Quotes erd \begin_inset Quotes erd
\end_inset \end_inset
algorithm. algorithm.
In certain limited situations it can provide enhanced sensitivity for the In certain limited situations it can provide enhanced sensitivity for the
principal task of any decoder, namely to determine what message was sent. principal task of any decoder, namely to determine precisely what message
was sent.
\end_layout \end_layout
\begin_layout Standard \begin_layout Standard
@ -1459,7 +1462,8 @@ small enough
\begin_inset Quotes erd \begin_inset Quotes erd
\end_inset \end_inset
for adequate confidence, while still ensuring that false decodes are rare. to establish adequate confidence, while still ensuring that false decodes
are rare.
Because tested candidate codewords are drawn from a list typically no longer Because tested candidate codewords are drawn from a list typically no longer
than a few thousand, rather than than a few thousand, rather than
\begin_inset Formula $2^{72},$ \begin_inset Formula $2^{72},$
@ -1469,22 +1473,26 @@ small enough
\begin_inset Formula $r_{2}$ \begin_inset Formula $r_{2}$
\end_inset \end_inset
can be a more relaxed limit than the ones can set a more relaxed limit than
\begin_inset Formula $r_{0}$ \begin_inset Formula $r_{1},$
\end_inset \end_inset
and as used in the FT algorithm.
\begin_inset Formula $r_{1}$ For the limited subset of messages established by operator experience as
\begin_inset Quotes eld
\end_inset \end_inset
used in the FT algorithm. likely,
For the limited subset of messages considered as likely, hinted decodes \begin_inset Quotes erd
can be obtained at lower signal levels than would be required for decodes \end_inset
selected from the full universe of
hinted decodes can be obtained at lower signal levels than required for
decodes obtained from the full universe of
\begin_inset Formula $2^{72}$ \begin_inset Formula $2^{72}$
\end_inset \end_inset
distinct messages. possible messages.
\end_layout \end_layout
\begin_layout Section \begin_layout Section
@ -1497,10 +1505,6 @@ name "sec:Theory,-Simulation,-and"
Decoder Performance Evaluation Decoder Performance Evaluation
\end_layout \end_layout
\begin_layout Subsection
Simulated results on the AWGN channel
\end_layout
\begin_layout Standard \begin_layout Standard
Comparisons of decoding performance are usually presented in the professional Comparisons of decoding performance are usually presented in the professional
literature as plots of word error rate versus literature as plots of word error rate versus
@ -1514,8 +1518,8 @@ Comparisons of decoding performance are usually presented in the professional
. .
For weak-signal amateur radio work, performance is more conveniently presented For weak-signal amateur radio work, performance is more conveniently presented
as the probability of successfully decoding a received word versus signal-to-noise as the probability of successfully decoding a received word plotted against
ratio in a 2500 Hz reference bandwidth, signal-to-noise ratio in a 2500 Hz reference bandwidth,
\begin_inset Formula $\mathrm{SNR}{}_{2500}$ \begin_inset Formula $\mathrm{SNR}{}_{2500}$
\end_inset \end_inset
@ -1536,12 +1540,36 @@ reference "sec:Appendix:SNR"
\end_inset \end_inset
. .
Examples of both types of plot are included in the following discussion,
where we describe a number of simulations carried out to compare performance
of the FT algorithm with others, and with theoretical expectations.
We have also used simulations to establish suitable default values for
the acceptance parameters
\begin_inset Formula $X_{0},$
\end_inset
\begin_inset Formula $d_{0},$
\end_inset
\begin_inset Formula $d_{1},$
\end_inset
and
\begin_inset Formula $r_{1}.$
\end_inset
\end_layout
\begin_layout Subsection
Simulated results on the AWGN channel
\end_layout \end_layout
\begin_layout Standard \begin_layout Standard
Results of simulations using the BM, FT, and KV decoding algorithms on the Results of simulations using the BM, FT, and KV decoding algorithms on the
JT65 (63,12) code are presented in terms of word error-rate vs JT65 code are presented in terms of word error rate versus
\begin_inset Formula $E_{b}/N_{o}$ \begin_inset Formula $E_{b}/N_{o}$
\end_inset \end_inset
@ -1556,9 +1584,9 @@ reference "fig:bodide"
For these tests we generated at least 1000 signals at each signal-to-noise For these tests we generated at least 1000 signals at each signal-to-noise
ratio, assuming the additive white Gaussian noise (AWGN) channel, and processed ratio, assuming the additive white Gaussian noise (AWGN) channel, and processed
the data using each algorithm. the data using each algorithm.
For word error-rates less than 0.1 it was necessary to process 10,000 or For word error rates less than 0.1 it was necessary to process 10,000 or
even 100,000 simulated signals in order to capture enough errors to make even 100,000 simulated signals in order to capture enough errors to make
the estimates of word-error-rate statistically meaningful. the measurements statistically meaningful.
As a test of the fidelity of our numerical simulations, Figure As a test of the fidelity of our numerical simulations, Figure
\begin_inset CommandInset ref \begin_inset CommandInset ref
LatexCommand ref LatexCommand ref
@ -1566,8 +1594,7 @@ reference "fig:bodide"
\end_inset \end_inset
also shows theoretical results (filled squares) for comparison with the also shows theoretical results for comparison with the BM results.
BM results.
The simulated BM results agree with theory to within about 0.1 dB. The simulated BM results agree with theory to within about 0.1 dB.
This difference between simulated BM results and theory is caused by small This difference between simulated BM results and theory is caused by small
errors in the estimates of time- and frequency-offset of the received signal errors in the estimates of time- and frequency-offset of the received signal
@ -1628,29 +1655,23 @@ Word error rates as a function of
\begin_inset Formula $E_{b}/N_{0},$ \begin_inset Formula $E_{b}/N_{0},$
\end_inset \end_inset
the signal-to-noise ratio per bit. the signal-to-noise ratio per information bit.
The single curve marked with filled squares shows a theoretical prediction Theory: theoretical prediction for the hard-decision BM decoder.
for the BM decoder. The remaining curves represent simulation results on an AWGN channel for
Open squares illustrate simulation results for an AWGN channel with the the BM, KV, and FT decoders.
BM, FT ( The KV algorithm was executed with complexity coefficient
\begin_inset Formula $T=10^{5}$
\end_inset
) and KV (
\begin_inset Formula $\lambda=15$ \begin_inset Formula $\lambda=15$
\end_inset \end_inset
) decoders used in program , the most aggressive setting historically used in the
\emph on \emph on
WSJT-X WSJT
\emph default \emph default
. programs.
The KV results are for decoding complexity coefficient The FT algorithm was run with timeout setting
\begin_inset Formula $\lambda=15$ \begin_inset Formula $T=10^{5}.$
\end_inset \end_inset
, the most aggressive setting that has historically been used in earlier
versions of the WSJT programs.
\end_layout \end_layout
@ -1702,15 +1723,15 @@ reference "fig:bodide"
\end_inset \end_inset
in this format along with additional FT results for in this format along with additional FT results for
\begin_inset Formula $T=10^{4},10^{3},10^{2}$ \begin_inset Formula $T=10^{4},\:10^{3},\:10^{2}$
\end_inset \end_inset
and and
\begin_inset Formula $10^{1}$ \begin_inset Formula $10$
\end_inset \end_inset
. .
The KV results are plotted with open triangles. The KV results are plotted with open squares.
It is apparent that the FT decoder produces more decodes than KV when It is apparent that the FT decoder produces more decodes than KV when
\begin_inset Formula $T=10^{4}$ \begin_inset Formula $T=10^{4}$
\end_inset \end_inset
@ -1747,24 +1768,19 @@ name "fig:WER2"
\end_inset \end_inset
Percent of JT65 messages copied as a function of SNR in 2.5 kHz bandwidth. Percent of JT65 messages copied as a function of SNR in 2500 Hz bandwidth.
Solid lines with filled round circles are results from the FT decoder with Solid lines with filled circles are results from the FT decoder; numbers
adjacent to the curves specify values of the timeout parameter
\begin_inset Formula $T=10^{5},10^{4},10^{3},10^{2}$ \begin_inset Formula $T.$
\end_inset \end_inset
and The dotted line with open squares is the KV decoder with complexity coefficient
\begin_inset Formula $10$
\end_inset
, respectively, from left to right.
The dashed line with open triangles is the KV decoder with complexity coefficient
\begin_inset Formula $\lambda=15$ \begin_inset Formula $\lambda=15$
\end_inset \end_inset
. .
Results from the BM algorithm are also shown with filled triangles. Results from the BM algorithm are shown with a dashed line and crosses.
\end_layout \end_layout
\end_inset \end_inset
@ -1809,7 +1825,7 @@ reference "fig:N_vs_X"
\begin_inset Formula $X\le25$ \begin_inset Formula $X\le25$
\end_inset \end_inset
because all such words were successfully decoded by the BM algorithm. because all such words are successfully decoded by the BM algorithm.
Figure Figure
\begin_inset CommandInset ref \begin_inset CommandInset ref
LatexCommand ref LatexCommand ref
@ -1826,8 +1842,8 @@ reference "fig:N_vs_X"
with the number of errors in the received word. with the number of errors in the received word.
The variability of the decoding time also increases dramatically with the The variability of the decoding time also increases dramatically with the
number of errors in the received word. number of errors in the received word.
These results also provide insight into the mean and variance of the execution These results provide insight into the mean and variance of the execution
time for the FT algorithm, as execution time will be roughly proportional time for the FT algorithm, since execution time will be roughly proportional
to the number of required trials. to the number of required trials.
\end_layout \end_layout
@ -1859,13 +1875,21 @@ name "fig:N_vs_X"
\end_inset \end_inset
Number of trials needed to decode a received word versus Hamming distance Number of trials needed to decode a received word versus Hamming distance
\begin_inset Formula $X$
\end_inset
between the received word and the decoded codeword, for 1000 simulated between the received word and the decoded codeword, for 1000 simulated
frames on an AWGN channel with no fading. frames on an AWGN channel with no fading.
The SNR in 2500 Hz bandwidth is -24 dB ( The SNR in 2500 Hz bandwidth is
\begin_inset Formula $-24$
\end_inset
dB, which corresponds to
\begin_inset Formula $E_{b}/N_{o}=5.1$ \begin_inset Formula $E_{b}/N_{o}=5.1$
\end_inset \end_inset
dB). dB.
\end_layout \end_layout
@ -1880,7 +1904,7 @@ Number of trials needed to decode a received word versus Hamming distance
\end_layout \end_layout
\begin_layout Subsection \begin_layout Subsection
Simulated results for hinted decoding and Rayleigh fading Simulated results for Rayleigh fading and hinted decoding
\end_layout \end_layout
\begin_layout Standard \begin_layout Standard
@ -1904,9 +1928,11 @@ reference "fig:Psuccess"
We include three curves for each decoding algorithm: one for the AWGN channel We include three curves for each decoding algorithm: one for the AWGN channel
and no fading, and two more for simulated Doppler spreads of 0.2 and 1.0 and no fading, and two more for simulated Doppler spreads of 0.2 and 1.0
Hz. Hz.
For reference, we note that the JT65 symbol rate is about 2.69 Hz.
The simulated Doppler spreads are comparable to those encountered on HF The simulated Doppler spreads are comparable to those encountered on HF
ionospheric paths and for EME at VHF and lower UHF bands. ionospheric paths and for EME at VHF and lower UHF bands.
For reference, we note that the JT65 symbol rate is about 2.69 Hz.
(*** A little more description of hinted decoding is needed here, and new
data for the DS curves.***)
\end_layout \end_layout
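Reviewer's sketch: a quick consistency check of the numbers quoted in the text (symbol rate of about 2.69 Hz, 46.8 s transmissions, 2^72 possible messages, and the pairing of E_b/N_0 = 5.1 dB with SNR = -24 dB in 2500 Hz). The constants below are assumptions not stated in this diff: 4096 samples per symbol at 11025 samples/s, 126 channel symbols per transmission, and 63 information-carrying symbols holding 72 message bits. The conversion also assumes a constant-power signal, so that SNR in the reference bandwidth equals E_s/N_0 scaled by symbol rate over bandwidth. The paper's own derivation is in its SNR appendix, which is not shown here.

```python
import math

# Assumed JT65 parameters (not given in the diff itself).
SAMPLE_RATE_HZ = 11025
SAMPLES_PER_SYMBOL = 4096
N_CHANNEL_SYMBOLS = 126    # assumed: data plus synchronizing symbols
N_DATA_SYMBOLS = 63        # symbols of the (63,12) code
N_INFO_BITS = 72           # 12 information symbols x 6 bits each
REF_BANDWIDTH_HZ = 2500.0


def symbol_rate_hz() -> float:
    """Keying rate in symbols per second."""
    return SAMPLE_RATE_HZ / SAMPLES_PER_SYMBOL


def transmission_seconds() -> float:
    """Duration of one transmission."""
    return N_CHANNEL_SYMBOLS / symbol_rate_hz()


def ebn0_to_snr2500_db(ebn0_db: float) -> float:
    """Convert E_b/N_0 (dB) to SNR in the 2500 Hz reference bandwidth (dB)."""
    # Energy per symbol exceeds energy per information bit by 72/63.
    esn0_db = ebn0_db + 10.0 * math.log10(N_INFO_BITS / N_DATA_SYMBOLS)
    # Signal power is E_s times the symbol rate; noise power is N_0 times 2500 Hz.
    return esn0_db + 10.0 * math.log10(symbol_rate_hz() / REF_BANDWIDTH_HZ)


if __name__ == "__main__":
    print(f"symbol rate  : {symbol_rate_hz():.3f} Hz")
    print(f"duration     : {transmission_seconds():.2f} s")
    print(f"messages     : {float(2 ** N_INFO_BITS):.2e}")
    print(f"SNR at 5.1 dB: {ebn0_to_snr2500_db(5.1):.2f} dB")
```

Under these assumptions the figures in the text are mutually consistent to within rounding.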
\begin_layout Standard \begin_layout Standard
@ -1948,7 +1974,14 @@ Deep Search
\begin_inset Quotes erd \begin_inset Quotes erd
\end_inset \end_inset
) matched-filter algorithm. ) algorithm.
Numbers adjacent to the curves are the simulated Doppler spreads in Hz.
The curve labeled Sync illustrates the dependence of proper time and frequency
synchronization in the decoder presently implemented in
\emph on
WSJT-X
\emph default
.
\end_layout \end_layout
\end_inset \end_inset