6.7 FPU Instruction Pipeline Overview

FPU Pipeline Overlapping


Each of the three op units is controlled by an FPU resource scheduler, which issues instructions under constraints described in the following section. Table 6-15 lists the pipe stages used in each of the op units (although not all stages are used by each unit).

Table 6-15 FPU Operational Unit Pipe Stages

Instruction Scheduling Constraints

The micro-architectures of the FPU op units (adder, multiplier, and divider) limit when the FPU resource scheduler can issue instructions to them. If any of the following constraints is violated, the op unit treats the outstanding instruction in its pipe as discarded and continues operation on the most recently issued instruction.

FPU Divider Constraints

The FPU divider can handle only one non-overlapped division instruction in its pipe at any one time.

FPU Multiplier Constraints

The FPU multiplier allows up to two pipelined MUL.[S,D] instructions to be processed as long as the following constraints are met:

These figures are not meant to imply that back-to-back multiplications are allowed. Rather, as shown in Figure 6-11, instructions I2 and I3 are illegal and I5, I6, I7, and I8 are successive stages of I4, referenced to I1.

Figure 6-12 is similar, in that I6, I7, and I8 are successive stages of I5.



Figure 6-11 MUL.S Instruction Scheduling in the FPU Multiplier



Figure 6-12 MUL.D Instruction Scheduling in the FPU Multiplier

FPU Adder Constraints

The following constraints must be met in the FPU adder op unit.

Cycle Overlap. The adder op unit must allow a clock cycle overlap between each newly issued instruction and the instruction being completed, as shown in Figure 6-13.



Figure 6-13 Instruction Cycle Overlap in FPU Adder

Resource Conflict. The adder must allow the cleanup stages (A, R) of a multiplication instruction to be pipelined with the execution of an ADD.[S,D], SUB.[S,D], or C.COND.[S,D] instruction, as long as no two instructions simultaneously attempt to use the same A and R pipe stages. For instance, Figure 6-14 shows a resource conflict between the mantissa add (A, stage 7) of instructions 1, 5, and 6. This figure also shows the resource conflict between result round (R), stage 8, of instructions 1, 5, and 6. The multiplication cleanup cycles (A, R) can neither overlap nor pipeline with any other instruction currently in the adder pipe.

Figures 6-14 through 6-17 show these constraints.
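
The Resource Conflict rule can be pictured as a reservation-table check: record which clock cycle each instruction occupies the A and R stages in, and flag any cycle where two instructions want the same stage. The following is a minimal sketch of that idea, not the scheduler's actual logic. The function name find_conflicts and the two stage sequences are assumptions made for illustration. The MUL.D placeholder is arranged only so that A falls in stage 7 and R in stage 8, as in the description of Figure 6-14, and the real sequences are those listed in Table 6-16.

```python
from collections import defaultdict

# Stages the text says must not be used by two instructions at once.
SHARED_STAGES = {"A", "R"}

# Placeholder stage sequences (assumptions, not copied from Table 6-16).
MUL_D_STAGES = ["U", "M", "M", "M", "M", "M", "A", "R"]  # A is stage 7, R is stage 8
ADD_STAGES = ["U", "A", "R"]


def find_conflicts(instructions):
    """Return {(cycle, stage): [names]} for every shared stage claimed twice.

    `instructions` is a list of (name, issue_cycle, stage_sequence) tuples;
    each stage is assumed to occupy exactly one clock cycle.
    """
    usage = defaultdict(list)
    for name, issue_cycle, stages in instructions:
        for offset, stage in enumerate(stages):
            if stage in SHARED_STAGES:
                usage[(issue_cycle + offset, stage)].append(name)
    return {slot: names for slot, names in usage.items() if len(names) > 1}


# A MUL.D issued in cycle 1 and an ADD.D issued in cycle 6 both reach
# A in cycle 7 and R in cycle 8 under the placeholder sequences.
conflicts = find_conflicts([
    ("MUL.D", 1, MUL_D_STAGES),
    ("ADD.D", 6, ADD_STAGES),
])
print(conflicts)
```

This check covers only same-stage, same-cycle collisions. As the notes on Figures 6-16 and 6-17 point out, the hardware also disallows some issues that have no resource conflict, so a clean result here does not by itself mean an issue is permitted.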



Figure 6-14 MUL.D and ADD.[S,D] Cycle Conflict in FPU Adder



Figure 6-15 MUL.S and ADD.[S,D] Cycle Conflict in FPU Adder



While there is no resource conflict in issuing this CMP.[S,D] instruction, the hardware does not allow it.

Figure 6-16 MUL.D and CMP.[S,D] Cleanup Cycle Conflict in FPU Adder



While there is no resource conflict in issuing this CMP.[S,D] instruction, the hardware does not allow it.

Figure 6-17 MUL.S and CMP.[S,D] Cleanup Cycle Conflict in FPU Adder

Prep and Cleanup Cycle Overlap. The adder does not allow the preparation (U stage) and cleanup cycles (N, A, R) of a division instruction to be pipelined with any other instruction; however, the adder does allow the last cycle of preparation or cleanup to be overlapped one clock by the following instruction's U stage (the CPU EX cycle). Figure 6-18 shows this process.



Figure 6-18 Adder Prep and Cleanup Cycle Overlap

Instruction Latency, Repeat Rate, and Pipeline Stage Sequences

Table 6-16 lists the latency and repeat rate between instructions, together with the sequence of pipeline stages for each instruction. For example, the latency of the ADD.[S,D] is 4, which means it takes four processor cycles to complete. The Repeat Rate column indicates how soon an instruction can be repeated; for example, an ADD.[S,D] can be repeated after the conclusion of the third pipeline stage.
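
The relationship between latency and repeat rate can be worked through numerically. The sketch below assumes a repeat rate of 3 for ADD.[S,D], which is a reading of "after the conclusion of the third pipeline stage" rather than a value taken from Table 6-16. The function name completion_cycles is hypothetical, and the first instruction is taken to issue in cycle 1.

```python
def completion_cycles(count, latency, repeat_rate):
    """Cycle in which each of `count` back-to-back instructions completes.

    Instruction i (0-based) is issued `repeat_rate` cycles after the one
    before it and completes `latency` cycles after its own issue, with the
    first instruction issued in cycle 1.
    """
    return [i * repeat_rate + latency for i in range(count)]


# Three back-to-back ADD.[S,D] instructions: latency 4 (from the text),
# repeat rate 3 (assumed, see the paragraph above).
add_cycles = completion_cycles(3, latency=4, repeat_rate=3)
print(add_cycles)
```

Under these assumptions each new ADD starts in the cycle in which the previous one completes, which matches the one-clock overlap described for the adder under Cycle Overlap.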

Table 6-16 Latency, Repeat Rate, and Pipe Stages of FPU Instructions

Resource Scheduling Rules

The FPU Resource Scheduler issues instructions while adhering to the rules described below. These scheduling rules optimize op unit executions; if the rules are not followed, the hardware interlocks to guarantee correct operation.

DIV.[S,D] can start only when all of the following conditions are met in the RF stage:

Idle means an operation unit--adder, multiplier or divider--is either not processing any instruction, or is currently in its last execution cycle completing an instruction.
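
The definition of idle can be written as a small predicate. This is a sketch only. The OpUnit class and its last_exec_cycle field are hypothetical names introduced for illustration and do not correspond to any hardware register or signal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OpUnit:
    """Hypothetical model of one FPU op unit (adder, multiplier, or divider)."""
    name: str
    # Final execution cycle of the instruction in the unit,
    # or None when the unit is not processing anything.
    last_exec_cycle: Optional[int] = None

    def is_idle(self, cycle: int) -> bool:
        # Idle when empty, or when the current instruction is in
        # (or past) its last execution cycle.
        if self.last_exec_cycle is None:
            return True
        return cycle >= self.last_exec_cycle


divider = OpUnit("divider", last_exec_cycle=10)
print(divider.is_idle(9), divider.is_idle(10))
```

Treating the last execution cycle as idle is what lets a following instruction begin while the previous one is completing.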

MUL.[S,D] can start only when all of the following conditions are met in the RF stage:

SQRT.[S,D] can start only when all of the following conditions are met in the RF stage:

CVT.fmt, NEG.[S,D], or ABS.[S,D] can start only when all of the following conditions are met in the RF stage:

ADD.[S,D], SUB.[S,D], or C.COND.[S,D] can start only when all of the following conditions are met in the RF stage:



Copyright 1996, MIPS Technologies, Inc. -- 21 MAR 96
