Convolutional Codes
David Jiménez
Georgia Institute of Technology
This project was realized during the MSRI Mathematical Graphics Summer School 2005 at Reed College. Many many thanks for the computer help, suggestions, patience and good conversation to David Austin, Jim Fix and Bill Casselman, as well as most of the participants on the Summer School, at leat for the patience to listen to me when I needed to explain myself something.

In any case, if there is somebody to blame for the quality of the presentation, it should be me and nobody else (Sorry!).
General Coding Paradigm

When transmiting a digital signal trough a noisy channel, usually noise is added to the signal and it is necesary to recover the original signal to process. In an ideal world, the situation is the following,

where m is the original message, c is the codeword transmited, r is the word received, that is basically c plus some error introduced by the channel, and ideally the decoder will be able to recover the original message from the received word.

A code, for most of our pourpuses, is a function T from the set of finite strings of zeros and ones to itself such that T is injective (there is a more general setting on which this definition can be setted). The code is intended to add redundancy to the message to cancel (with a high probability) the efect of the noise in the channel.

Block Codes and Convolutional Codes.

In order to do an efficient and simple model of a code that can be used in practice, the idea is to reduce the space of messages to the space of strings of length exactly k, and the space of codewords of to the space of strings of length exactly n, k greater than n, and T most likely will be linear. The original message will be cutted into pieces of length k, after potentially a pad completition, and each of the blocks is usually coded independently. In this case the code is said to be a memoryless block code.

Now, the content itself of a file may give extra redundancy to the code, and therefore robustness. The basic idea of a convolutional code is that the encoding of each block of information is not independent from all the others. The encoding is going to depend on the block to be encoded and on the state of the encoder in the moment the encoding is performed. The special feature is that this state is updated each time than the encoder encodes a block of information. This, of course introduces an extra degree of complexity to the process.

Applet not currently working. Want to try to debug? code available!

The basic idea of the encoding process is the following: The open node pointing to the left is the input (one bit input), the open nodes pointing to the right are the output (two bits output), the little dots are splitters, the squares are memory cells and the adders perform the addition of the arguments (binary addition). The lines can be traversed on only one direction. The function of the memory cells is basically to storage a Memory Value that is originally zero, and each time it received a signal, it send the signal storaged as Memory Value and storage the received signal as the memory value.

State diagram

A way to analyze the situation of a convolutional code, a useful technique is to treat it as a directed graph where each possible state of the code (possible configuration of the memory cells) and the directed edges go from one node to another if it is possible to go after a run from one of the state to the other. The figure below shows the State diagram of the same coder than above.

This is a version a lot simplified of the model. Each of the edges has an input and an output value. The basic idea is that the coder starts in the null state (all the memory cells equal to zero) and, in order to encode the messages, you take the edge of the corresponding input given, and take the output of the same edge. On the next run, you start form the node where you finished the last run.

Trellis's Diagram and Vitterbi Algorithm Decoding.

Given this encoding method, the question should be: Is there a way to decode a received word in an efficient way? Andrew Viterbi answered this question in an heuristic, very efficient way.

Viterbi's Algorithm basically intents to maximize p(r|c), this means, assuming that the probability to make a mistake is less than 1/2, Viterbi's idea is to track all the possible paths in the Trellis's Diagram,

and choose the one with smallest Hamming distance to the received word. Of course, this is absolutely inpractical, given that the number of possible path grows exponentially, but it can be used as an starting point. As a solution, it is possible to fix a maximun number (preferiblely small) of paths that are going to be analyzed. In this case, all the paths are analized until exceding for the first time the maximun number of paths to analyze, and then simply erase the "surplus", deleting the ones whose Hamming distance to the wanted section of the received word. In case of a tie, choose randomly.

This algorithm is a lot more economical and gives with a very high probability a unique path corresponding to the most likely code word for the received word. The probability of error decreases as the size of the stack of paths and the size of the codeword increases.


If any, all the opinions expressed here are my own, or else satirical, or honest errors. They are not necessarily those of MSRI or Reed College.
More Formal Disclaimer
Any opinions expressed in these web pages are my own, and have not been reviewed nor approved by anyone else. In particular, they are not to be construed as policies of the Mathematical Science Research Institute nor of the Reed College.