16.1 Error Checking in the Processor

Detecting Data Transmission Errors


The following procedure detects data transmission errors.

1. System A transmits a 64-bit doubleword together with 8 bits of SECDED ECC (see Figure 16-2).



Figure 16-2 Detecting ECC Errors: Transmitting Data and ECC

2. System B receives the data doubleword, together with the byte of ECC check code.

3. To verify proper transmission of the 64-bit doubleword and 8-bit ECC check code, System B generates its own 8-bit ECC check code from the 64-bit doubleword received from System A, as shown in Figure 16-3.

4. System B executes an Exclusive-OR (XOR) of the check bits from System A with its own newly generated ECC check bits (see Figure 16-3). The output of this XOR is called the syndrome.



Figure 16-3 Detecting ECC Errors: Deriving the Syndrome

5. If the syndrome is 0000 0000₂, the data System B received, together with the newly generated ECC check bits from System B, are the same as the data and check bits from System A. If the syndrome has any value other than 0000 0000₂, either the received doubleword or the received check bits are assumed to be in error.

6. Using the data in Figure 16-1, it may be possible to correct the data bit or check bit in error. Determine whether the syndrome appears in Figure 16-1 by counting the number of 1s in the syndrome.

If the syndrome is identical to any of the syndromes in Figure 16-1, the column number of that data or check bit indicates the location of the bit in error. The bit in error is corrected by inverting its state (a 1 is changed to 0; a 0 is changed to 1).
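Steps 2 through 5 can be sketched in a few lines of Python. The names `derive_syndrome` and `toy_ecc` are hypothetical, and `toy_ecc` (a byte-wise XOR) is only a stand-in for the check-bit generator that Figure 16-1 defines; it is not the processor's SECDED code.

```python
from typing import Callable


def toy_ecc(data: int) -> int:
    """Stand-in check-bit generator: XOR of the eight data bytes.

    ASSUMPTION: this is NOT the processor's SECDED code. It only gives
    the sketch something to regenerate check bits with.
    """
    ecc = 0
    for shift in range(0, 64, 8):
        ecc ^= (data >> shift) & 0xFF
    return ecc


def derive_syndrome(data: int, check: int,
                    generate_ecc: Callable[[int], int]) -> int:
    """Steps 3 and 4: regenerate the check bits from the received data
    and XOR them with the received check bits."""
    return (check ^ generate_ecc(data)) & 0xFF


# System A sends a doubleword together with its check bits.
sent_data = 0x0123_4567_89AB_CDEF
sent_check = toy_ecc(sent_data)

# Clean transfer: the syndrome is zero (step 5).
clean = derive_syndrome(sent_data, sent_check, toy_ecc)

# One data bit flipped in transit: the syndrome is nonzero.
corrupt = derive_syndrome(sent_data ^ (1 << 5), sent_check, toy_ecc)
```

Passing the generator in as a parameter keeps the XOR step independent of the particular check matrix, so the same function would work with a generator built from Figure 16-1.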

The following sections show how to use the check matrices in Figure 16-1 to detect single data bit, single check bit, double data bit, three data bit, and four data bit ECC errors.

Single Data Bit ECC Error

The following procedure detects and corrects a single data bit ECC error.

1. System A transmits:

Data(63:0) = 0x0000 0000 0000 0000

and

ECC(7:0) check code = 0000 0000₂

2. System B receives the following incorrect data:

Data(63:0) = 0x0000 0000 0000 0001

and

ECC(7:0) check code = 0000 0000₂

3. System B regenerates ECC for the received data. The correct ECC check code for:

Data(63:0) = 0x0000 0000 0000 0001

is

ECC(7:0) = 0001 0011₂

4. A syndrome is generated by the XOR of the System A check bits, 0000 0000₂, and the System B regenerated check bits, 0001 0011₂. The resulting syndrome is 0001 0011₂. Since the syndrome has three 1s, look for the matching column with three 1s in the parity check matrix table.

5. Searching the matrix (Figure 16-1) shows that the syndrome, 0001 0011₂, corresponds to data bit 0. This means the state of received data bit 0 is incorrect.

6. To correct the error, the system inverts the state of the received data bit 0 from a value of 1 to 0.
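The six steps above can be traced with the values from this example. In the sketch below, the lookup table `DATA_BIT_SYNDROMES` is a hypothetical name and holds only the one column quoted in step 5; the remaining columns come from Figure 16-1 and are not reproduced here.

```python
# Column of the check matrix for data bit 0, as quoted in step 5.
# ASSUMPTION: a real table would hold one entry per data bit, taken
# from Figure 16-1.
DATA_BIT_SYNDROMES = {0b0001_0011: 0}

received_data = 0x0000_0000_0000_0001
received_check = 0b0000_0000      # check bits sent by System A
regenerated_check = 0b0001_0011   # System B's ECC for the received data

# Step 4: XOR the two sets of check bits and count the 1s.
syndrome = received_check ^ regenerated_check
ones = bin(syndrome).count("1")

# Steps 5 and 6: find the matching column and invert that data bit.
bit = DATA_BIT_SYNDROMES.get(syndrome)
corrected_data = received_data ^ (1 << bit) if bit is not None else None
```

With these inputs the syndrome has three 1s, maps to data bit 0, and inverting that bit restores the all-zero doubleword that System A sent.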

Single Check Bit ECC Error

The following procedure detects and corrects a single check bit ECC error.

1. System A transmits:

Data(63:0) = 0x0000 0000 0000 0000

and

ECC(7:0) check code = 0000 0000₂

2. System B receives the following incorrect check code:

Data(63:0) = 0x0000 0000 0000 0000

and

ECC(7:0) check code = 0000 0001₂

3. System B regenerates the ECC for the received data. The correct ECC check code for:

Data(63:0) = 0x0000 0000 0000 0000

is

ECC(7:0) = 0000 0000₂

4. A syndrome is generated by the XOR of the check bits received from System A, 0000 0001₂, and the System B regenerated check bits, 0000 0000₂. The resulting syndrome is 0000 0001₂.

Since the syndrome has a single 1, it is contained in the check matrix. Figure 16-1 shows that the syndrome, 0000 0001₂, corresponds to check bit 0. This indicates that the state of the received check bit 0 is incorrect. To correct the error, the system inverts the state of the received check bit 0 from a value of 1 to 0.
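The check bit case can be traced the same way. The sketch assumes that a syndrome with only bit n set identifies check bit n; the text confirms this for check bit 0, and the other check bit columns should be verified against Figure 16-1.

```python
received_check = 0b0000_0001      # check bits as received by System B
regenerated_check = 0b0000_0000   # System B's ECC for the all-zero data

syndrome = received_check ^ regenerated_check

check_bit = None
corrected_check = None
# Exactly one bit set: nonzero, and clearing the lowest 1 leaves zero.
if syndrome != 0 and syndrome & (syndrome - 1) == 0:
    # ASSUMPTION: the position of the single 1 is the check bit number.
    check_bit = syndrome.bit_length() - 1
    corrected_check = received_check ^ (1 << check_bit)
```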

Double Data Bit ECC Errors

The following procedure detects double data bit ECC errors.

1. System A transmits:

Data(63:0) = 0x0000 0000 0000 0000

and

ECC(7:0) check code = 0000 0000₂

2. System B receives the following incorrect data:

Data(63:0) = 0x0000 0000 0000 0011

and

ECC(7:0) check code = 0000 0000₂

3. System B regenerates the ECC for the received data. The correct ECC check code for:

Data(63:0) = 0x0000 0000 0000 0011

is

ECC(7:0) = 0011 0000₂

4. A syndrome is generated by the XOR of the System A check bits, 0000 0000₂, and the System B regenerated check bits, 0011 0000₂. The resulting syndrome is 0011 0000₂.

A syndrome with two 1s (or an even number of 1s) indicates that a double-bit error has been detected. Double-bit errors cannot be corrected.
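The even-count rule reduces to a parity test on the syndrome. A minimal sketch, with `has_even_ones` as a hypothetical helper name; a zero syndrome is excluded because it means no error was detected.

```python
def has_even_ones(syndrome: int) -> bool:
    """True for a nonzero syndrome containing an even number of 1s."""
    return syndrome != 0 and bin(syndrome).count("1") % 2 == 0


# Values from the example above.
syndrome = 0b0000_0000 ^ 0b0011_0000

# Detected but not correctable.
uncorrectable = has_even_ones(syndrome)
```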

Three Data Bit ECC Errors

The following procedure detects three data bit errors that occur within a nibble.

1. System A transmits:

Data(63:0) = 0x0000 0000 0000 0000

and

ECC(7:0) check code = 0000 0000₂

2. System B receives the following incorrect data:

Data(63:0) = 0x0000 0000 0000 0111

and

ECC(7:0) check code = 0000 0000₂

3. System B regenerates the ECC for the received data. The ECC check code for:

Data(63:0) = 0x0000 0000 0000 0111

is

ECC(7:0) = 0111 0011₂

4. A syndrome is generated by the XOR of the System A check bits, 0000 0000₂, and the System B regenerated check bits, 0111 0011₂. The resulting syndrome is 0111 0011₂.

The resulting syndrome has five 1s. Since no four of the 1s are contained in check bits (7:4) or check bits (3:0), three errors have occurred within a nibble. Triple-bit errors within a nibble cannot be corrected.
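The five-1s test can be sketched as follows. The code reads "no four of the 1s are contained in check bits (7:4) or check bits (3:0)" as "neither half of the syndrome is all 1s"; that reading is an assumption and should be checked against Figure 16-1.

```python
# Values from the example above.
syndrome = 0b0000_0000 ^ 0b0111_0011

ones = bin(syndrome).count("1")
upper = (syndrome >> 4) & 0xF   # syndrome bits (7:4)
lower = syndrome & 0xF          # syndrome bits (3:0)

# ASSUMPTION: "four of the 1s in one group" means that group is 1111.
four_in_one_group = upper == 0xF or lower == 0xF

# Five 1s with neither group full: triple error within a nibble.
triple_error = ones == 5 and not four_in_one_group
```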

Four Data Bit ECC Errors

The following procedure detects four data bit errors that occur within a nibble.

1. System A transmits:

Data(63:0) = 0x0000 0000 0000 0000

and

ECC(7:0) check code = 0000 0000₂

2. System B receives the following incorrect data:

Data(63:0) = 0x0000 0000 0000 1111

and

ECC(7:0) check code = 0000 0000₂

3. System B regenerates the ECC for the received data. The ECC check code for:

Data(63:0) = 0x0000 0000 0000 1111

is

ECC(7:0) = 1111 0000₂

4. A syndrome is generated by the XOR of the System A check bits, 0000 0000₂, and the System B regenerated check bits, 1111 0000₂. The resulting syndrome is 1111 0000₂.

Since the resulting syndrome has four 1s (or an even number of 1s), this error is recognized as some variation of a double-bit error. A 4-bit error within a nibble cannot be corrected.



Copyright 1996, MIPS Technologies, Inc. -- 21 MAR 96
