pipe hazards
 memory, cache

Compiler optimizations: avoid load-use stalls, i.e., lw \$1, offset(\$base) immediately before add \$3, \$1, \$2.

- Move an instruction in between (e.g., and \$4, \$4, 0): "fill the load-delay slot". OK if the compiler can find an instruction w/o dependencies.
- Let hardware insert a NOP.
- Compiler fills w/ a nop, if HW doesn't have load-use detection.
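A minimal sketch of the "compiler fills with nop" option, assuming a classic 5-stage pipeline with forwarding in which only a load immediately followed by a consumer needs a slot. The `Instr` tuple, `NOP` constant, and both function names are hypothetical and exist only for this illustration; reordering independent instructions into the slot is not attempted here.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str                 # mnemonic, e.g. "lw", "add"
    dest: Optional[str]     # destination register, or None
    srcs: Tuple[str, ...]   # source registers


NOP = Instr("nop", None, ())


def has_load_use_hazard(first: Instr, second: Instr) -> bool:
    """True if `second` reads the register that the load `first` writes."""
    return first.op == "lw" and first.dest is not None and first.dest in second.srcs


def fill_load_delay_slots(program):
    """Insert a nop after every load whose result is used by the next instruction."""
    out = []
    for instr in program:
        if out and has_load_use_hazard(out[-1], instr):
            out.append(NOP)
        out.append(instr)
    return out


program = [
    Instr("lw", "$1", ("$base",)),
    Instr("add", "$3", ("$1", "$2")),
]
scheduled = fill_load_delay_slots(program)
```

If an independent instruction already sits between the load and its consumer, the sequence is returned unchanged, which corresponds to a filled load-delay slot.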

## Code Scheduling to Avoid Stalls

$$20\% \text{ BR} \implies CPI = (0.2)(1+3) + (0.8)(1) = 0.8 + 0.8 = 1.6$$

$\implies$ slowdown of 1.6 (nearly 2) relative to CPI = 1!

Costly! missed prediction (predicted not-taken)
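The same calculation can be written in general form. The symbols $f_{BR}$ (fraction of branch instructions) and $p$ (stall cycles paid per branch) are introduced here for illustration only:

$$CPI = f_{BR}(1 + p) + (1 - f_{BR})(1) = 1 + f_{BR} \cdot p$$

With $f_{BR} = 0.2$ and $p = 3$ this gives $CPI = 1 + (0.2)(3) = 1.6$.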



1. Early BR + hazard detection (eh)

$$CPI_{BR} = \begin{cases} 2, & \text{taken} \\ 1, & \text{not taken} \end{cases} \xrightarrow{P} \overline{CPI}_{BR} = (0.5)2 + (0.5)1 = 1.5$$

2. EQUALS test after register fetch.

3. Late BR + hazard detection (lh)

$$CPI_{BR} = \begin{cases} 4, & \text{taken} \\ 1, & \text{not taken} \end{cases} \implies \overline{CPI}_{BR} = (0.5)4 + (0.5)1 = 2.5$$

4. Late BR, compiler NOPs (la)

LC4: $CPI_{BR} = 3$; MIPS: $CPI_{BR} = 4$. Cheap HW!

Which is the best balance? Power? Area? Yield?
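To weigh the options numerically, the per-branch CPI figures can be tabulated and combined with a branch fraction. This is a rough sketch: the taken/not-taken costs for eh, lh, and la come from the notes, while the "ec" entry (early BR with compiler NOPs, always 2 cycles) is an assumption inferred from the 5/4 ratio quoted in these notes. The function names and the example branch fraction of 0.25 are illustrative.

```python
# Cycles per branch for each scheme: (taken, not taken).
BRANCH_CPI = {
    "eh": (2, 1),  # early BR + hazard detection
    "ec": (2, 2),  # early BR + compiler NOPs (assumed, see lead-in)
    "lh": (4, 1),  # late BR + hazard detection
    "la": (4, 4),  # late BR + compiler NOPs (MIPS figure)
}


def avg_branch_cpi(scheme: str, p_taken: float = 0.5) -> float:
    """Average CPI of a branch instruction under the given scheme."""
    taken, not_taken = BRANCH_CPI[scheme]
    return p_taken * taken + (1.0 - p_taken) * not_taken


def overall_cpi(scheme: str, branch_fraction: float, p_taken: float = 0.5) -> float:
    """Average CPI of a trace, assuming every non-branch instruction takes 1 cycle."""
    return (1.0 - branch_fraction) * 1.0 + branch_fraction * avg_branch_cpi(scheme, p_taken)


def branch_speedup(fast: str, slow: str, p_taken: float = 0.5) -> float:
    """Ratio of average branch CPIs: how much faster `fast` handles branches."""
    return avg_branch_cpi(slow, p_taken) / avg_branch_cpi(fast, p_taken)


for name in BRANCH_CPI:
    print(name, avg_branch_cpi(name), overall_cpi(name, 0.25))
```

Note that the branch-only ratio and the whole-trace ratio differ: the fewer branches a trace contains, the less the choice of scheme affects overall CPI.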

Suppose trace has 25% BR instructions.

$S_{eh-lh} = 5/3 \Rightarrow \sim 67\%$ speedup, $S_{ec-lh} = 5/4 \Rightarrow 25\%$ speedup

## Data Hazards for Branches

If a comparison register is the destination of the 2<sup>nd</sup> or 3<sup>rd</sup> preceding ALU instruction, the hazard can be resolved by forwarding.



LW immediately before BR $\Rightarrow$ 2 stalls
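The branch data-hazard cases can be summarized as a small lookup. This is a sketch assuming the branch is resolved early (in the decode stage) with forwarding available; the function name and the `producer_op` / `distance` parameters are hypothetical.

```python
def branch_stalls(producer_op: str, distance: int) -> int:
    """Stall cycles before a branch can compare a register.

    producer_op: "alu" or "lw", the instruction that writes the register.
    distance:    1 if the producer immediately precedes the branch, 2 for
                 the second preceding instruction, and so on.
    """
    if distance < 1:
        raise ValueError("distance must be at least 1")
    if producer_op == "alu":
        # ALU result is ready one stage earlier than a loaded value.
        return 1 if distance == 1 else 0
    if producer_op == "lw":
        # Loaded data arrives only after the memory stage.
        return max(0, 3 - distance)
    raise ValueError("producer_op must be 'alu' or 'lw'")


print(branch_stalls("lw", 1))
print(branch_stalls("alu", 2))
```

Under these assumptions a load directly before the branch costs two stalls, a load two instructions back or an ALU instruction directly before costs one, and anything further back is covered by forwarding.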



## Control Hazards: Exceptions, Traps, Interrupts

- Something happens
  - I/O device sends signal: INTERRUPT
  - CPU detects execution error: EXCEPTION
  - Execution of a sys call: TRAP
- Do something about it
  - Talk to device, get data, send data -> jump back to prog.
  - Send error message, terminate program
  - Jump to OS routine, do service -> jump back to prog.

(Figure: memory map showing the Vector Table, OS, and Prog.)

OPTIONS FOR Control Transfer

- Hardwired: always go to 80000180 (dispatch), then figure out what routine to jump to (use Cause reg.)
- Maybe a few targets hardwired: 80000080 INT, 80000280 TRAP
- Hardwired jump via jump table (Vector Table)
- Combination of these (dispatcher per vector)

Jump ≈ BR hazard
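The control-transfer options can be sketched as follows. The three addresses are the ones quoted in the notes; the function names, the event strings, and the shape of the `routines` table are assumptions made for illustration and do not reflect real MIPS cause encodings.

```python
HARDWIRED_ENTRY = 0x80000180          # single dispatch entry point
HARDWIRED_TARGETS = {                 # "a few targets hardwired"
    "INT": 0x80000080,
    "TRAP": 0x80000280,
}


def entry_address(event: str, use_separate_targets: bool) -> int:
    """Address the hardware jumps to when `event` occurs."""
    if use_separate_targets:
        return HARDWIRED_TARGETS.get(event, HARDWIRED_ENTRY)
    return HARDWIRED_ENTRY


def dispatch(cause: str, routines: dict) -> str:
    """Software dispatcher at the entry point: pick a routine from the cause."""
    routine = routines.get(cause)
    if routine is None:
        raise KeyError(f"no service routine for cause {cause!r}")
    return routine()


routines = {
    "INT": lambda: "talk to device, return to program",
    "EXC": lambda: "send error message, terminate program",
    "TRAP": lambda: "do OS service, return to program",
}
print(hex(entry_address("INT", use_separate_targets=True)))
print(dispatch("TRAP", routines))
```

With a single hardwired entry, all the work of distinguishing events falls on the dispatcher; with separate targets, the hardware does part of that selection.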

## **Precise Exceptions**

(Figure: LC-3 instruction cycle within the execution stream. State 18 is instruction fetch (MAR←PC, PC++), followed by DECODE and execute (ADD, AND, SUB, BR, TRAP), then the next instruction. Completed instructions have updated reg + mem. On an exception, or when INT=1 is seen at instruction fetch, the interrupted instruction has no effects and is restartable: save and jump (START SERVICE), with a possible context switch.)

- Definition: an exception is precise if the machine state is as if
  - All previous instructions had completed
  - The faulting instruction was not started
  - None of the next instructions were started
    - No changes to the architectural state (registers, memory)
- Why are precise exceptions desirable to OS developers?
- With a single-cycle machine, precise exceptions are easy
   Why?

We (OS) need to know

- What happened (INT, EXC, TRAP)
- How to restart (PC, ...)
- Which instruction caused the problem
- What data caused the problem
- Which device needs help




CAUSE REG: which (stage, exception) for all stages

-> Bit vector = OR of stage-causes
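One way to picture the bit-vector idea is a flag per (stage, exception) pair, with the register formed by OR-ing whatever each stage reports in a cycle. The flag names and bit positions below are hypothetical and are not the real MIPS Cause register layout.

```python
from enum import IntFlag


class Cause(IntFlag):
    NONE = 0
    IF_FETCH_FAULT = 1    # instruction fetch stage
    ID_ILLEGAL_OP = 2     # decode stage
    EX_OVERFLOW = 4       # execute stage
    MEM_DATA_FAULT = 8    # memory stage


def cause_register(stage_causes):
    """OR together the cause reported by each pipeline stage this cycle."""
    reg = Cause.NONE
    for cause in stage_causes:
        reg |= cause
    return reg


reg = cause_register([Cause.NONE, Cause.ID_ILLEGAL_OP, Cause.EX_OVERFLOW, Cause.NONE])
print(int(reg))
```

Because each pair has its own bit, several exceptions raised in the same cycle stay distinguishable in the combined value.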

## EPC:

PCs calculated from EPC.



- 1. Previous instructions complete, NULL following instructions
- 2. Save PC into EPC
- 3. Save exception code into Cause register
- 4. NULL offending instruction (preserves state = MEM+REG)
- 5. Jump to OS at 80000180 (or, freeze and start co-processor)
- ??? Multiple exceptions?
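The sequence above can be sketched on a toy pipeline model. Everything here is illustrative: the `Pipeline` class, its fields, and `take_exception` are hypothetical names, in-flight instructions are kept oldest first, and the exception code is stored in a separate `cause` field, in line with the CAUSE REG described earlier. Only one exception at a time is handled.

```python
from dataclasses import dataclass, field

OS_ENTRY = 0x80000180


@dataclass
class Pipeline:
    in_flight: list                      # (pc, name) tuples, oldest first; None = bubble
    pc: int = 0
    epc: int = 0
    cause: int = 0
    completed: list = field(default_factory=list)


def take_exception(pipe: Pipeline, offending_index: int, code: int) -> Pipeline:
    """Apply the precise-exception steps for the instruction at offending_index."""
    offending_pc, _ = pipe.in_flight[offending_index]
    # 1. Previous (older) instructions complete.
    pipe.completed.extend(pipe.in_flight[:offending_index])
    # 2. Save the offending PC into EPC.
    pipe.epc = offending_pc
    # 3. Save the exception code into the cause register.
    pipe.cause = code
    # 1 and 4. NULL the offending and all following instructions (bubbles).
    pipe.in_flight = [None] * len(pipe.in_flight)
    # 5. Jump to the OS entry point.
    pipe.pc = OS_ENTRY
    return pipe


pipe = Pipeline(in_flight=[(0x100, "add"), (0x104, "lw"), (0x108, "sub")], pc=0x10C)
take_exception(pipe, offending_index=1, code=12)
print(hex(pipe.epc), pipe.cause, hex(pipe.pc))
```

After the call, only the instruction older than the faulting one has updated state, which is the property that lets the OS restart from the saved EPC.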



Imprecise exceptions: stop the pipe, save state, and let software figure it out as far as possible.



Shorten feedback through register file using neg. edge triggered FFs.



Data hazard detection can forward data without bubbles for operate instructions.



Load-use delay causes a bubble (unless compiler fills slot), then forwarding used.



Branch data hazard from an operate instruction causes a stall and one bubble, then uses forwarding. Almost the same as load-use delay. Inserts NOP if branch taken.



Branch data hazard from LW instruction in DMEM causes stall and one bubble, then uses forwarding. Same as load-use data hazard.



Branch data dependency with LW in EX causes two bubbles, then forwarding.



Exceptions, traps, and interrupts can cause many bubbles.



## Questions

- 1. What to do w/ multiple exceptions during same clock cycle?
- 2. What to do w/ exceptions for completing instructions?
- 3. How to know what happened?
- 4. What about nested exceptions, i.e., exceptions occurring during exception handling?