3 RISC-V Single-Cycle Control and Pipelining
Basics


Instruction Table
| Inst[31:0] | BrEq | BrLT | PCSel | ImmSel | BrUn | ASel | BSel | ALUSel | MemRW | RegWEn | WBSel |
|---|---|---|---|---|---|---|---|---|---|---|---|
| add | * | * | +4 | * | * | Reg | Reg | Add | Read | 1 (Y) | ALU |
| sub | * | * | +4 | * | * | Reg | Reg | Sub | Read | 1 | ALU |
| (R-R Op) | * | * | +4 | * | * | Reg | Reg | (Op) | Read | 1 | ALU |
| addi | * | * | +4 | I | * | Reg | Imm | Add | Read | 1 | ALU |
| lw | * | * | +4 | I | * | Reg | Imm | Add | Read | 1 | Mem |
| sw | * | * | +4 | S | * | Reg | Imm | Add | Write | 0 (N) | * |
| beq | 0 | * | +4 | B | * | PC | Imm | Add | Read | 0 | * |
| beq | 1 | * | ALU | B | * | PC | Imm | Add | Read | 0 | * |
| bne | 0 | * | ALU | B | * | PC | Imm | Add | Read | 0 | * |
| bne | 1 | * | +4 | B | * | PC | Imm | Add | Read | 0 | * |
| blt | * | 1 | ALU | B | 0 | PC | Imm | Add | Read | 0 | * |
| bltu | * | 1 | ALU | B | 1 | PC | Imm | Add | Read | 0 | * |
| jalr | * | * | ALU | I | * | Reg | Imm | Add | Read | 1 | PC+4 |
| jal | * | * | ALU | J | * | PC | Imm | Add | Read | 1 | PC+4 |
| auipc | * | * | +4 | U | * | PC | Imm | Add | Read | 1 | ALU |
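
The table above can be read as a lookup keyed by instruction, with PCSel as the only signal that also depends on the comparator outputs. The following is a minimal sketch of that reading, not an actual controller implementation; the names `FIELDS`, `CONTROL`, `pc_sel`, and `control` are made up for illustration, and the not-taken case for `blt`/`bltu` (BrLT = 0 selects +4) is assumed since the table lists only the taken rows.

```python
# Control-signal lookup sketched from the instruction table.
# "*" marks a don't-care; all names here are illustrative.
FIELDS = ("ImmSel", "BrUn", "ASel", "BSel", "ALUSel", "MemRW", "RegWEn", "WBSel")

CONTROL = {
    "add":   ("*", "*", "Reg", "Reg", "Add", "Read",  1, "ALU"),
    "sub":   ("*", "*", "Reg", "Reg", "Sub", "Read",  1, "ALU"),
    "addi":  ("I", "*", "Reg", "Imm", "Add", "Read",  1, "ALU"),
    "lw":    ("I", "*", "Reg", "Imm", "Add", "Read",  1, "Mem"),
    "sw":    ("S", "*", "Reg", "Imm", "Add", "Write", 0, "*"),
    "beq":   ("B", "*", "PC",  "Imm", "Add", "Read",  0, "*"),
    "bne":   ("B", "*", "PC",  "Imm", "Add", "Read",  0, "*"),
    "blt":   ("B", 0,   "PC",  "Imm", "Add", "Read",  0, "*"),
    "bltu":  ("B", 1,   "PC",  "Imm", "Add", "Read",  0, "*"),
    "jalr":  ("I", "*", "Reg", "Imm", "Add", "Read",  1, "PC+4"),
    "jal":   ("J", "*", "PC",  "Imm", "Add", "Read",  1, "PC+4"),
    "auipc": ("U", "*", "PC",  "Imm", "Add", "Read",  1, "ALU"),
}

def pc_sel(inst, br_eq=False, br_lt=False):
    """Return "ALU" when the next PC comes from the ALU, else "+4"."""
    if inst in ("jal", "jalr"):
        return "ALU"  # jumps always redirect the PC
    taken = {"beq": br_eq, "bne": not br_eq, "blt": br_lt, "bltu": br_lt}
    return "ALU" if taken.get(inst, False) else "+4"

def control(inst, br_eq=False, br_lt=False):
    """Combine the static signals with the comparator-dependent PCSel."""
    signals = dict(zip(FIELDS, CONTROL[inst]))
    signals["PCSel"] = pc_sel(inst, br_eq, br_lt)
    return signals
```

For example, `control("beq", br_eq=True)["PCSel"]` gives `"ALU"`, matching the taken `beq` row.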
Single Cycle

Critical Path:
- R-Type, arithmetic I-Type, SB-Type: No DMem
- Load I-Type: DMem Read
- S-Type: DMem Write, no second Mux (WB)
- U-Type: No DMem
Note: the comparator is omitted because the branch comparison runs in parallel with the RegFile/ALU path, which takes much longer.
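To see why the load tends to set the clock period in a single-cycle design, the sketch below sums stage delays per instruction class. The delay values are hypothetical numbers chosen for illustration (they are not from these notes), and the names `DELAY_PS`, `STAGES`, `path_delay`, and `clock_period` are made up.

```python
# Hypothetical stage delays in picoseconds; real values depend on the design.
DELAY_PS = {"IMem": 200, "RegRead": 100, "ALU": 200, "DMem": 200, "RegWrite": 100}

# Stages each instruction class exercises, following the list above.
STAGES = {
    "R-type":            ("IMem", "RegRead", "ALU", "RegWrite"),
    "I-type arithmetic": ("IMem", "RegRead", "ALU", "RegWrite"),
    "SB-type":           ("IMem", "RegRead", "ALU"),
    "load":              ("IMem", "RegRead", "ALU", "DMem", "RegWrite"),
    "S-type":            ("IMem", "RegRead", "ALU", "DMem"),
}

def path_delay(kind):
    """Total delay of the stages used by one instruction class."""
    return sum(DELAY_PS[stage] for stage in STAGES[kind])

def clock_period():
    """A single-cycle clock must cover the slowest instruction class."""
    return max(path_delay(kind) for kind in STAGES)
```

With these assumed delays the load path is the longest, so it determines the period for every instruction.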
Control Logic
ImmSel:
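As a rough sketch of what ImmSel chooses between, the function below rebuilds the immediate for each RISC-V instruction format from a 32-bit instruction word. The helper names (`bits`, `sign_extend`, `imm_gen`) and the single-letter selector values are illustrative assumptions; the bit positions follow the standard RV32I encodings.

```python
def bits(inst, hi, lo):
    """Extract inst[hi:lo] as an unsigned integer."""
    return (inst >> lo) & ((1 << (hi - lo + 1)) - 1)

def sign_extend(value, width):
    """Interpret the low `width` bits of value as a two's-complement number."""
    sign = 1 << (width - 1)
    return (value & (sign - 1)) - (value & sign)

def imm_gen(inst, imm_sel):
    """Return the signed immediate selected by imm_sel ("I", "S", "B", "U", "J")."""
    if imm_sel == "I":
        return sign_extend(bits(inst, 31, 20), 12)
    if imm_sel == "S":
        return sign_extend(bits(inst, 31, 25) << 5 | bits(inst, 11, 7), 12)
    if imm_sel == "B":
        raw = (bits(inst, 31, 31) << 12 | bits(inst, 7, 7) << 11
               | bits(inst, 30, 25) << 5 | bits(inst, 11, 8) << 1)
        return sign_extend(raw, 13)
    if imm_sel == "U":
        return sign_extend(bits(inst, 31, 12) << 12, 32)
    if imm_sel == "J":
        raw = (bits(inst, 31, 31) << 20 | bits(inst, 19, 12) << 12
               | bits(inst, 20, 20) << 11 | bits(inst, 30, 21) << 1)
        return sign_extend(raw, 21)
    raise ValueError(f"unknown ImmSel: {imm_sel}")
```

For instance, `imm_gen(0xFFF00093, "I")` decodes the immediate of `addi x1, x0, -1`.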


BrUn, BrEq, BrLT:
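A minimal sketch of the branch comparator, assuming 32-bit register values: BrUn is an input that selects unsigned comparison, and BrEq and BrLT are the outputs fed back to the control logic. The function names are hypothetical.

```python
def to_signed32(value):
    """Reinterpret a 32-bit pattern as a signed two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

def branch_comp(a, b, br_un):
    """Return (BrEq, BrLT) for register values a and b."""
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    br_eq = a == b
    if br_un:
        br_lt = a < b                            # unsigned compare (bltu)
    else:
        br_lt = to_signed32(a) < to_signed32(b)  # signed compare (blt)
    return br_eq, br_lt
```

The same bit pattern `0xFFFFFFFF` is less than 1 when treated as signed (-1) but not when treated as unsigned, which is why `blt` and `bltu` need different BrUn values.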

ALUSel:

Pipelined
Overview

IF :
ID :
EX :
MEM :
WB :
Forwarding Path
Compare destination of older instructions in pipeline with sources of new instruction in decode stage
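The comparison described above might be sketched as follows. This is an illustrative model only: the tuple layout `(stage, rd, reg_wen)` and the function name `forward_sources` are assumptions, and older instructions are listed youngest first so that the most recent producer wins.

```python
def forward_sources(decode, older):
    """Pick a forwarding source for each operand of the instruction in decode.

    decode: dict with register numbers under "rs1" and "rs2".
    older:  list of (stage, rd, reg_wen) tuples, youngest instruction first.
    Returns {"rs1": stage or None, "rs2": stage or None}.
    """
    result = {}
    for src in ("rs1", "rs2"):
        reg = decode[src]
        result[src] = None
        if reg == 0:
            continue  # x0 is hardwired to zero, so it is never forwarded
        for stage, rd, reg_wen in older:
            if reg_wen and rd == reg:
                result[src] = stage  # youngest matching producer
                break
    return result
```

For example, if both the EX and MEM stages hold instructions writing x5, a decode-stage read of x5 takes the EX value.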

Hazard
Structural Hazard
Problem
Two or more instructions in the pipeline compete for access to a single physical resource
Solution
(1) Instructions take turns using the resource; some instructions have to stall (wait)
(2) Add more hardware to the machine
Example:
RegFile Hazard

Solution:
Double Pumping
Prepare to write during 1st half, write on falling edge, read during 2nd half of each clock cycle
Build RegFile with independent read and write ports
Memory

Solution:
Instruction and Data Caches

Data Hazard
Data dependency between instructions
R Type Instructions

Solution
(1) Stalling
Bubble: NOP (add x0, x0, x0)

(2) Forwarding (Bypass)

Load
Forwarding Problem

Solution
(1) Hardware Stall

(2) Stall

(3) Code Scheduling
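
To illustrate code scheduling, the sketch below counts load-use stalls in a short instruction sequence before and after reordering. It assumes a pipeline with forwarding, where only an instruction that immediately follows a load and reads its destination costs one stall cycle. The tuple layout `(op, rd, rs1, rs2)`, the function name, and the example program are all made up for illustration.

```python
def load_use_stalls(program):
    """Count one-cycle stalls caused by using a loaded value in the next instruction.

    program: list of (op, rd, rs1, rs2) tuples; unused fields are None.
    """
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        op, rd = prev[0], prev[1]
        if op == "lw" and rd not in (None, "x0") and rd in cur[2:]:
            stalls += 1
    return stalls

# Hypothetical sequence: two sums built from three loaded words.
original = [
    ("lw",  "t1", "t0", None),
    ("lw",  "t2", "t0", None),
    ("add", "t3", "t1", "t2"),   # uses t2 right after its load
    ("sw",  None, "t0", "t3"),
    ("lw",  "t4", "t0", None),
    ("add", "t5", "t1", "t4"),   # uses t4 right after its load
    ("sw",  None, "t0", "t5"),
]

# Same work with the third load hoisted above the first add.
scheduled = [
    ("lw",  "t1", "t0", None),
    ("lw",  "t2", "t0", None),
    ("lw",  "t4", "t0", None),
    ("add", "t3", "t1", "t2"),
    ("sw",  None, "t0", "t3"),
    ("add", "t5", "t1", "t4"),
    ("sw",  None, "t0", "t5"),
]
```

Under these assumptions, `load_use_stalls(original)` counts two stalls and `load_use_stalls(scheduled)` counts none, because an independent instruction now sits between each load and its first use.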

Branch Prediction
Static branch prediction
Flush: penalty 2

Fast branch: have branch instructions that can resolve in D, not X; penalty 1
- On a taken branch, must flush one instruction and "bypass" from the decode stage
- Must now have additional comparison instructions (e.g., cmplt, slt) to support complex tests
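
The effect of the flush penalty on throughput can be estimated with a simple CPI model: each flushed branch adds `penalty` cycles to a base CPI of 1. The branch fraction and flush rate used in the usage note are hypothetical numbers, and the function name is made up.

```python
def effective_cpi(branch_fraction, flush_rate, penalty, base_cpi=1.0):
    """Average cycles per instruction when a share of branches must flush.

    branch_fraction: fraction of instructions that are branches
    flush_rate:      fraction of branches that cause a flush
    penalty:         cycles lost per flush (2 if resolved in X, 1 if in D)
    """
    return base_cpi + branch_fraction * flush_rate * penalty
```

Assuming 20% branches with half of them flushing, a penalty of 2 gives a CPI of about 1.2, while the fast-branch penalty of 1 gives about 1.1.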
Dynamic branch prediction
- Temporal correlation: The way a branch resolves may be a good predictor of the way it will resolve at the next execution
- Spatial correlation: Several branches may resolve in a highly correlated manner (a preferred path of execution)
Temporal Correlation
One-bit branch history table (BHT)
If the last prediction was wrong, invert the bit
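
A minimal sketch of a one-bit BHT, assuming the table is indexed by low-order bits of the word-aligned PC and starts out predicting not taken. The class name, table size, and indexing scheme are illustrative choices, not taken from these notes.

```python
class OneBitPredictor:
    """One prediction bit per entry: True means predict taken."""

    def __init__(self, entries=16):
        self.entries = entries
        self.table = [False] * entries  # assumed initial state: not taken

    def _index(self, pc):
        return (pc >> 2) % self.entries  # drop the byte offset, then wrap

    def predict(self, pc):
        return self.table[self._index(pc)]

    def update(self, pc, taken):
        i = self._index(pc)
        if self.table[i] != taken:       # last prediction failed
            self.table[i] = not self.table[i]

def count_mispredictions(predictor, pc, outcomes):
    """Run one branch through the predictor and count wrong guesses."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses
```

For a loop branch that is taken three times and then falls through, repeated twice, this predictor misses on every change of direction, including the first iteration of the second pass.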

Two-bit branch predictor
Change the prediction after two consecutive mis-predictions
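
One common way to get this behavior is a two-bit saturating counter, sketched below for a single branch. This is an assumption about the scheme: the state diagram intended here may differ in its weak-state transitions. The class and helper names are made up.

```python
class TwoBitCounter:
    """Saturating counter: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self, initial=3):
        self.state = initial  # 3 = strongly taken, 0 = strongly not taken

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        if taken:
            self.state = min(self.state + 1, 3)
        else:
            self.state = max(self.state - 1, 0)

def count_mispredictions(counter, outcomes):
    """Feed branch outcomes to the counter and count wrong guesses."""
    misses = 0
    for taken in outcomes:
        if counter.predict() != taken:
            misses += 1
        counter.update(taken)
    return misses
```

Starting from the strongly-taken state, a loop branch that is taken three times and then falls through only mispredicts on the exit, because a single not-taken outcome moves the counter to weakly taken without flipping the prediction.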

Branch History Table

Spatial Correlation
Branch history register (BHR): a history register that records the direction of the last N branches executed by the processor
1-bit BHT+3-bit BHR
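
The 1-bit BHT with a 3-bit BHR could be modeled as below. This sketch assumes the BHT is indexed by the BHR alone (8 one-bit entries) and ignores the branch PC; actual designs often combine both. All names are illustrative.

```python
class CorrelatingPredictor:
    """1-bit BHT entries selected by a global branch history register."""

    def __init__(self, history_bits=3):
        self.history_bits = history_bits
        self.bhr = 0                               # last N outcomes, newest in bit 0
        self.bht = [False] * (1 << history_bits)   # one prediction bit per history

    def predict(self):
        return self.bht[self.bhr]

    def update(self, taken):
        self.bht[self.bhr] = taken                 # 1-bit entry: remember last outcome
        mask = (1 << self.history_bits) - 1
        self.bhr = ((self.bhr << 1) | int(taken)) & mask

def run(predictor, outcomes):
    """Return a list of booleans, True where the prediction was wrong."""
    misses = []
    for taken in outcomes:
        misses.append(predictor.predict() != taken)
        predictor.update(taken)
    return misses
```

On a repeating taken, taken, not-taken pattern, each 3-bit history is followed by a unique outcome, so after a short warm-up the predictor stops mispredicting.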

BTB
Limitations of BHT

Branch Target Buffer
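
A BHT predicts only the direction of a branch, so the fetch stage still needs the target address to redirect early. A minimal sketch of a direct-mapped branch target buffer follows; the class name, entry count, full-PC tag, and the `next_fetch_pc` helper are assumptions for illustration.

```python
class BranchTargetBuffer:
    """Direct-mapped cache from branch PC to its last known target."""

    def __init__(self, entries=8):
        self.entries = entries
        self.slots = [None] * entries   # each slot holds (tag_pc, target) or None

    def _index(self, pc):
        return (pc >> 2) % self.entries

    def lookup(self, pc):
        slot = self.slots[self._index(pc)]
        if slot is not None and slot[0] == pc:
            return slot[1]
        return None                      # miss: pc not known to be a branch

    def insert(self, pc, target):
        self.slots[self._index(pc)] = (pc, target)  # evicts any conflicting entry

def next_fetch_pc(btb, pc, predict_taken):
    """Choose the next fetch address from the BTB and a direction prediction."""
    target = btb.lookup(pc)
    if predict_taken and target is not None:
        return target
    return pc + 4
```

On a BTB hit with a taken prediction the fetch stage can jump to the stored target in the next cycle; otherwise it falls through to `pc + 4`.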
