The Hamming code explained
The Hamming code is a FEC, which stands for: Forward Error Correction. A FEC is a system in which the data gets coded in a way that enables the detection and correction of errors in the data flow. That way the receiver’s errors can be solved. This makes the transmittance of quite reliable information in a simplex system possible.
In this part I am going to elaborate on the use of a Hamming Code as a FEC. You will notice that we cannot apply the Hamming code directly; there happens to be a catch.
Earlier we have discussed the fact that a wireless channel is basically everything but an optimal transmission medium. We have to be aware of several errors that could block or distort the data that is coming from the transmitter. In most cases, the disruption either lasts for a couple of seconds or does not have an influence on the data flow. If the disruption takes longer or if its impact turns out to be bigger, an error could get created during the reconstruction of the data. We call this a bit error; a ‘0’ is seen as a ‘1’ or a ‘1’ is seen as a ‘0’. The chance, during which the bit error occurs, in relation to the amount of bits that gets correctly reconstructed is something we call the connection’s BER or Bit Error Rate.
In a simplex connection there is no way for the receiver to notify the transmitter that a bit error has been received. This means that it is no use to include a parity or CRC for such a simplex connection. When using a parity or CRC there has to be some mechanism that is capable of ‘telling’ the transmitter that an error has been detected. In that case, the transmitter can send the frame once more. When a parity or CRC is included in the system, the receiver can only detect an error yet not its location. The error could even be the parity itself. The Hamming code can partially solve this problem.
The Hamming code’s operation
One of the special qualities the Hamming code contains is that it provides data bits with a great amount of extra bits. These extra bits have been deducted from the data bits in a specific way. Due to the increase of bits, the payload will also get bigger and this is something we have to keep in mind. We will be showing how the code works by using an example. A Hamming (7,4) code is used in the example. We add 3 coding bits to 4 data bits. This code is rather inefficient in comparison to either the Hamming (11,7) or Hamming (20,15). The Hamming (7,4) has a whopping 43% redundancy, whereas the Hamming (20,15) only has a 25% redundancy.
Another quality of the Hamming code is that we can detect and correct 1 bit error, but when there are more errors there is no way of correcting them. In our case that is 1 bit error out of 7 bits: 4 data bits and 3 protection bits. This makes our protection ratio 1:7. The chance of the existence of a bit error is far smaller than when a Hamming (20,15) is used, because the protection ratio of that code is a mere 1:20. That is nearly 3 times smaller! The advantage of using a bigger Hamming code with a smaller redundancy is the reduced protection of the data bits. Another aspect of a bigger Hamming code is the size of the coder and decoder. Especially the decoder is a hard nut to crack. The Hamming Code includes a crafty mechanism, but we can assume the following; the bigger the Hamming code the more coding needs to be done.
The Hamming (7,4) code exists out of 4 data bits and 3 redundant bits. That is very convenient, because with two of these codes we can protect 8 bits. In order to protect 8 bits we will need 6 redundant bits.
The 3 redundant bits of the Hamming code are called parity; P1,P2 and P3.
P1 is the parity of data bits D0,D1 and D3.
P2 is the parity of data bits D0,D2 and D3.
P3 is the parity of data bits D1,D2 and D3.
The total series is shown in Table 1.
Two problems immediately occur. The Hamming code for the values 0 and F show a remarkable result.
It is up to the receiver to reconstruct the sent bits yet it can only do this when the data contains information about the timing speed. After all, there is no system, other than the data itself, to send a timing signal to the receiver. Sometimes it is possible for us to even the signal of the receiver out a bit, but we cannot possibly make it flawless. The receiver needs a timing frequency for the bits’ synchronization.
The receiver reconstructs the timing frequency, so we have to make sure the data contains a lot of different bits. For example, we prefer a maximum of 3 ‘1’s or 3 ‘0’s to appear after one another in the code. The results for 0 and F in combination with the results for 3 and C do not qualify to this preference.
The Hamming code’s operation is based upon the principal of derived redundancy. The code is this spectacular, because we did not just add a couple of random bits; the bits we added are deducted from the original. Every result we gain is unique and only a handful of codes, from the entire database of potential codes, are used.
For every 6 data bits 3 Hamming code bits get added. This makes the entire database of codes a total of 128 (2^7) However, the amount of Hamming codes is only 16. Due to the amount of Hamming codes, the Hamming distance remains as big as the amount of added bits: 3. In order to make the code similar to another code, at least 3 bits need to be reconstructed wrongly. We call this the Hamming distance.
The Hamming code for 8 [P1,P2,D0,P3,D1,D2,D3] = 1101001
The Hamming code for 9 [P1,P2,D0,P3,D1,D2,D3] = 0011001
Difference: 1110000 = 3 bits
The arrangement of the columns does not matter for this calculation, as long as they remain the same for each number.
The following graph displays the functioning of the Hamming code a lot better.
The histogram (Figure 1) shows the data of an image (jpg). It clearly indicates that the data behaves in a stochastic (random) manner. The graph looks quite a lot like noise. If a data stream with this content gets transmitted to a receiver it is impossible to detect, and even less possible to correct, the error.
Figure 2 shows the same amount of decoded data with the Hamming code. However, this time I added the same amount of rough data, as the amount of noise, to it. The image shows that the Hamming codes easily protrude the noise. The histogram teaches us that the decoder can be much simpler and more selective. With this we can assume that there some kind of amplification takes place as well.
Unfortunately, this is at the expense of the amount of data that can be transmitted per timing unit. After all, with a similar data rate we can only send about half of the data. It is worth noticing that the horizontal distance (in Figure 2) between the lines in no way relates to the selectivity or amplification.
The improved Hamming code.
One way of taking care of the Hamming code’s somewhat disappointing bit synchronization is a procedure called whitening. Whitening is done by manipulating the code in a way that causes the character (whether inverted or not) and the order in which the sent bits appear to more closely resemble the data’s ideal form. This ideal form exists out of a ‘0’ and a ‘1’ appearing alternately. If we transmit this code through a serial channel it will create a square wave. The existence of this square wave makes it easier for the receiver to reconstruct the timing signal. Transmitting a ‘0’ and a ‘1’ alternately, unfortunately keeps us from adding more information to the data. A middle ground has to be found between the most possible bit exchanges and the information in the form of the sent bits’ order.
Another form of whitening can be found in the Manchester coding. The Manchester coding uses a double data rate which fortunately results in the presence of an inherent timing signal in the data. However, the Manchester coding lacks the Hamming code’s qualities.
The original idea behind the Hamming code cannot be completed until the bits are placed in the right order. The order gives the code mathematical qualities, which in return enables us to detect and correct a bit error more efficiently. Please take my word for it, as I will not elaborate on this later on. The code happens to be too short to profit tremendously from this. In conclusion, the newly improved Hamming code only partially uses the original Hamming code. It only uses the parity.
In order to apply whitening we have to make a choice: either we add a couple of extra steps to the transmitter’s coder or we apply a different process of coding and decoding.
One of the possibilities of coding the data is to write an (en)coder in the microcontroller’s firmware. The coder will have to calculate each parity in order to be able to put the bits in the right order. The whitening process becomes applicable when we edit the complete byte with a XOR function. The decoder in the receiver will, firstly, have to edit the data with the exact same XOR function, recognize the errors and replace the erroneous bit. After that, the code can get decoded.
Another way of coding is with the use of a LUT, a Look Up Table. We can create the LUT by creating a list of every code for every single possibility. As a matter of fact, this LUT already exists. Table 1 contains the LUT for the transmitter. We only need to edit the code a bit in order to make the improved bit synchronization possible. The 4 data bits create the LUT’s entrance. The exit is 7 bits; the data bits and the Hamming parity bits.
The LUT in the receiver looks slightly bigger. The amount of possibilities happens to have gone from 16 (2^4) to 128 (2^7), by adding the bits. When transmitting 8 bits at once this number increases to 256 (2^8)! That makes for a considerable table!
It has taken us some time but, by putting the columns in the right order and inverting some of them (the XOR function), we have been able to take plenty of steps into the right direction. Unfortunately, we cannot prevent 4 ‘0’s or 4 ‘1’s from appearing after another. This creates a bigger chance for 8 ‘0s’ or 8 ‘1’s, in a data stream in which all the bits are placed after on another, to be transmitted in a row. For example, a byte with 4 ‘0’s at the end followed by a byte with 4 ‘0’s at the beginning. This increases the chance of the receiver losing the bit synchronization. If we could edit the table in a way that only a maximum of 3 ‘0’s or 3 ‘1’s could appear in a row, this chance would decrease dramatically.
We are using the Hamming code yet we are transmitting sets of 8 bits. This means that there is 1 bit left which does not have a proper function yet. We can use this bit to diminish the bit synchronization issue. We call this bit ‘S’ in Table 2.
There is not a whole lot left of the Hamming code’s mathematical qualities. I have shuffled the columns and inverted a couple of them. We can still use the Hamming code’s protecting qualities by using a LUT, which in turn provides us with a comparatively straightforward decoder.
The LUT for the transmitter is small in size. Table 2 displays the corresponding results in the right column. The table increases tremendously in size in the receiver. It increases by 16 times! In order to create this table we first have to calculate all the codes that contain 1 bit error in comparison with the proper code. The receiver LUT transmits for those values the proper value. After we have done all the calculations we enter the values in a 16×16 matrix.
The light gray cells in Table 3 stand for the locations in which the receiver has made no errors. The remaining cells with numbers and letters represent the locations in which 1 bit error has been made. The two dark gray cells represent the bytes that are used for the data connection layer to indicate a break (0xAA) or a command byte is following (0x55). Both numbers are excluded from the Hamming code or the reconstructed values that contain 1 bit error.
The transmitter has to transmit the decimal value 10. From the transmitter’s LUT we can see that for the value 10 (0x0A) a byte with value 0xCE will be send. This is the result of the LUT. The value (in this example) gets wrongly reconstructed, in the receiver, and turns into 0x8E. We can assume from this that bit 6 has not been received correctly and has been turned from a ‘1’ into a ‘0’. Table 3 reveals that the entrance that belongs to 0x8E (8 on the horizontal and E on the vertical line) contains the value A. The result of the decoder’s LUT is now A and that is the value 10. In the end, the receiver’s LUT has not only decoded the received byte but also corrected the bit error. It is that simple.
We are still capable of extending the receiver’s LUT. After all, we are only using half of the table. The decoder’s exist is a byte with a value between 0 and 15; the lowest 4 bits. De highest 4 bits are always ‘0’. We can add another quality to the decoder by defining one of these bits. By adding an extra bit from the exit value to every corrected value with 1 bit error, we will be able to count the amount of bit errors. The bit stays, as long as the exit value is correct, a ‘0’. We can achieve the right exit from the decoder by ignoring the highest bits.
When we count a message’s correctly received bytes and the amount of corrected bytes, the BER can get calculated. For example, if in a 1000 bytes message 3 values with 1 bit error get received and corrected, the BER will be 3×10-3.
The LUT also contains a lot of empty cells: cells that do not contain a value. These are the table values of the places in which 2 bit errors have been detected during the reconstruction in the receiver. A value has to be placed in these cells in order to indicate, in the decoder, that these exceptions have taken place. A good example value would be 0xFF. However, in this point in time, we cannot do anything besides registering the exceptions.
The final LUT belonging to the receiver is displayed in Table 4.