
2 RISC-V Assembly

RISC-V Architecture

  • Register Size: 1 word = 32 bits = 4 bytes

  • 32 registers:

    x0-x31 (x0 always holds the value zero)


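  • Byte Ordering

    Little Endian (as on x86 and ARM): the least-significant byte is stored at the lowest address

    E.g., the word 0x01234567 stored at address A occupies bytes A..A+3 as 0x67, 0x45, 0x23, 0x01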


  • Sign Extend: Take the most-significant bit and copy it to the new bits
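
To make byte ordering and sign extension concrete, here is a minimal sketch in Python (used purely for illustration; the helper name `sign_extend` is hypothetical and not part of any RISC-V tool). It lists the bytes of 0x01234567 in little-endian order and sign-extends an 8-bit value to 32 bits.

```python
def sign_extend(value, bits, width=32):
    """Sign-extend the low `bits` bits of `value` to `width` bits.

    The result is returned as an unsigned integer holding the two's-complement
    bit pattern, which is how a register would store it.
    """
    low_mask = (1 << bits) - 1
    value &= low_mask                      # keep only the low `bits` bits
    if value & (1 << (bits - 1)):          # most-significant bit set?
        value |= ((1 << width) - 1) & ~low_mask   # copy it into the new upper bits
    return value


# Little-endian: the least-significant byte comes first (lowest address).
little = (0x01234567).to_bytes(4, "little")
print([hex(b) for b in little])

print(hex(sign_extend(0x80, 8)))   # top bit is 1, so upper bits become 1s
print(hex(sign_extend(0x7F, 8)))   # top bit is 0, so upper bits stay 0
```

The same helper works for 12-bit immediates and 16-bit halfwords by changing `bits`.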

RISC-V Instructions

Assembly Syntax


no-op: an instruction that does nothing

add x0,x0,x0  # Writes to x0 are always ignored

Shifting Instructions

  • When the shift amount is an immediate, only values 0-31 are practical
  • When the shift amount comes from a register, only its lowest 5 bits are used (read as unsigned)
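
The 5-bit rule can be sketched in Python as follows. This is a simplified model of 32-bit shifts, not an emulator; the function names mirror the RISC-V mnemonics sll, srl, and sra, and `MASK32` is a helper constant introduced only for this example.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits


def sll(value, shamt):
    """Shift left logical: only the low 5 bits of shamt are used."""
    return (value << (shamt & 0x1F)) & MASK32


def srl(value, shamt):
    """Shift right logical: vacated upper bits are filled with zeros."""
    return (value & MASK32) >> (shamt & 0x1F)


def sra(value, shamt):
    """Shift right arithmetic: vacated upper bits copy the sign bit."""
    value &= MASK32
    if value & 0x80000000:
        value -= 1 << 32            # reinterpret the bit pattern as negative
    return (value >> (shamt & 0x1F)) & MASK32


print(sll(1, 33))                   # shift amount 33 behaves like 33 & 31 = 1
print(hex(sra(0x80000000, 31)))
```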

Data Transfer Instructions

  • sw/lw

    Store word: M[R[rs1]+imm](31:0) = R[rs2](31:0) // rs1 holds the base address that receives the transferred value
    Load word: R[rd] = M[R[rs1]+imm](31:0)
  • sb/lb

    On sb, the upper 24 bits of the source register are ignored

    On lb, the upper 24 bits of the destination register are filled by sign-extension

    Store Byte: M[R[rs1]+imm](7:0) = R[rs2](7:0)
    Load Byte: R[rd] = {24{M[R[rs1]+imm](7)}, M[R[rs1]+imm](7:0)}
  • others

    On sh, upper 16 bits are ignored

    On lh, upper 16 bits are filled by sign-extension

    On lbu/lhu, upper bits are filled by zero-extension

E.g., suppose the word in memory at address s0 is *(s0) = 0x00000180 (all 32 bits)

lb s1,1(s0) # s1 = 0x00000001
lb s2,0(s0) # s2 = 0xFFFFFF80
sb s2,2(s0) # *(s0) = 0x00800180
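
The three instructions above can be traced with a small Python model. This is only a sketch: memory is a hypothetical 8-byte `bytearray`, address 0 plays the role of s0, and the functions `lb`, `lbu`, `sb`, and `lw` are simplified stand-ins for the instructions of the same name.

```python
mem = bytearray(8)                                   # tiny little-endian memory
s0 = 0                                               # assumed base address
mem[s0:s0 + 4] = (0x00000180).to_bytes(4, "little")  # *(s0) = 0x00000180


def lb(addr):
    """Load byte, sign-extended to 32 bits."""
    byte = mem[addr]
    return (byte | 0xFFFFFF00) if (byte & 0x80) else byte


def lbu(addr):
    """Load byte, zero-extended to 32 bits."""
    return mem[addr]


def sb(value, addr):
    """Store byte: only the low 8 bits of the register are written."""
    mem[addr] = value & 0xFF


def lw(addr):
    """Load a 32-bit little-endian word."""
    return int.from_bytes(mem[addr:addr + 4], "little")


s1 = lb(s0 + 1)      # byte at offset 1 is 0x01
s2 = lb(s0 + 0)      # byte at offset 0 is 0x80, so the sign bit is copied upward
sb(s2, s0 + 2)       # writes 0x80 into byte 2 of the word
print(hex(s1), hex(s2), hex(lw(s0)))
```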


add x10,x11,4(x12) # Invalid: add cannot take a memory operand; you must lw first, then add


  • conditional branch
    • branch if equal (beq) or branch if not equal (bne)
    • branch if less than (blt) and branch if greater than or equal (bge)
  • unconditional branch
    j label # pseudo-instruction
    jal dst, label # Writes PC+4 to dst, sets PC = label
    jalr dst, src, imm # Writes PC+4 to dst, sets PC = src+imm
    jr ra # pseudo-instruction: ret = jr ra = jalr x0, ra, 0


C Loop Mapped to RISC-V Assembly

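int A[20];
int sum = 0;
for (int i=0; i<20; i++)
    sum += A[i];

# Assume x8 holds pointer to A
# Assign x10=sum
add x10, x0, x0 # sum=0
add x11, x0, x8 # x11 = copy of A (pointer to the current element)
addi x12, x11, 80 # x12 = A + 80 (end of array: 20 words * 4 bytes)
loop:
lw x13, 0(x11) # x13 = current element
add x10, x10, x13 # sum += current element
addi x11, x11, 4 # advance to the next word
blt x11, x12, loop # repeat while x11 < end of array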

Calling a function


Callee Saved (these registers are expected to hold the same values before and after a function call)

  • s0-s11 (saved registers)
  • sp (stack pointer)

Caller Saved (the callee may freely change these registers, so the caller must save any values it still needs before making a procedure call)

  • t0-t6 (temporary registers)
  • a0-a7 (function arguments)
  • ra (return address)
    • because ra changes whenever the callee itself calls another function (the call writes a new return address to ra)

Choosing Your Registers

  • Function does NOT call another function
    • just use t0-t6 and there is nothing to save
  • Function calls other functions
    • Values you need throughout go in s0-s11, others go in t0-t6
# Prologue
addi sp, sp, -framesize
sw ra, <framesize-4>(sp) # highest slot in the frame; later saves go at successively lower offsets
# store other callee-saved registers
# save other regs if need be

# Body

# restore other regs if need be
# restore other callee-saved registers
lw ra, <framesize-4>(sp)
addi sp, sp, framesize
jr ra


int sumSquare(int x, int y) {
    return mult(x,x) + y;
}

# Prologue
addi sp,sp,-8 # make space on stack
sw ra, 4(sp) # save ret addr
sw a1, 0(sp) # save y
add a1,a0,x0 # set 2nd mult arg

# Body
jal mult # call mult

# Epilogue
lw a1, 0(sp) # restore y
add a0,a0,a1 # ret val = mult(x,x)+y
lw ra, 4(sp) # get ret addr
addi sp,sp,8 # restore stack
jr ra


Instruction Format

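Fields are listed from bit 31 down to bit 0, followed by examples or notes in parentheses.

  • I: imm[11:0] | rs1 | funct3 | rd | opcode (lw, jalr, slli)
  • SB (Branch): imm[12,10:5] | rs2 | rs1 | funct3 | imm[4:1,11] | opcode (imm: lowest bit of offset is always zero)
  • UJ: imm[20,10:1,11,19:12] | rd | opcode (jal)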

I format

12-bit immediate must be sign-extended to 32 bits
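
As an illustration of the I-format layout, the following Python sketch packs the fields into a 32-bit word and recovers the sign-extended immediate. The helper names `encode_i` and `decode_i_imm` are hypothetical; the opcode and funct3 values used for addi and lw are the standard RV32I ones.

```python
def encode_i(imm, rs1, funct3, rd, opcode):
    """Pack an I-format instruction: imm[11:0] | rs1 | funct3 | rd | opcode."""
    if not -2048 <= imm <= 2047:
        raise ValueError("immediate does not fit in 12 signed bits")
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


def decode_i_imm(word):
    """Extract imm[11:0] from bits 31:20 and sign-extend it."""
    imm = (word >> 20) & 0xFFF
    return imm - 0x1000 if imm & 0x800 else imm


addi_word = encode_i(-273, 10, 0b000, 10, 0b0010011)   # addi x10, x10, -273
lw_word = encode_i(0, 11, 0b010, 13, 0b0000011)        # lw x13, 0(x11)
print(hex(addi_word), hex(lw_word), decode_i_imm(addi_word))
```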


SB format

Offset range (in bytes): $-2^{12}$ to $2^{12}-2$
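
The scrambled immediate layout and the offset range can be checked with a short Python sketch. The helpers `encode_sb` and `decode_sb_offset` are hypothetical names; the default opcode 0b1100011 is the standard RV32I branch opcode, and funct3 = 0 selects beq.

```python
def encode_sb(offset, rs2, rs1, funct3, opcode=0b1100011):
    """Pack an SB-format branch; offset is a byte offset relative to the PC."""
    if offset % 2 != 0 or not -(1 << 12) <= offset <= (1 << 12) - 2:
        raise ValueError("offset must be even and within -4096..4094")
    imm = offset & 0x1FFF                     # 13-bit two's-complement pattern
    return (
        ((imm >> 12) & 0x1) << 31             # imm[12]
        | ((imm >> 5) & 0x3F) << 25           # imm[10:5]
        | rs2 << 20
        | rs1 << 15
        | funct3 << 12
        | ((imm >> 1) & 0xF) << 8             # imm[4:1]
        | ((imm >> 11) & 0x1) << 7            # imm[11]
        | opcode
    )


def decode_sb_offset(word):
    """Reassemble the byte offset; bit 0 is implicit and always zero."""
    imm = (
        ((word >> 31) & 0x1) << 12
        | ((word >> 7) & 0x1) << 11
        | ((word >> 25) & 0x3F) << 5
        | ((word >> 8) & 0xF) << 1
    )
    return imm - (1 << 13) if imm & (1 << 12) else imm


print(hex(encode_sb(8, 2, 1, 0b000)))         # beq x1, x2, +8 bytes
```

Because bit 0 is never stored, the 12 encoded bits cover a 13-bit signed range, which is where the bounds above come from.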

U and UJ format

# Load Upper Immediate: writes the 20-bit immediate to rd[31:12] and clears the lower 12 bits
lui rd, immediate # rd = immediate[19:0] << 12
# Add Upper Immediate to PC
auipc rd, immediate # rd = pc + (immediate[19:0] << 12)

E.g. How to set 0xDEADBEEF?

# Wrong answer
lui x10, 0xDEADB # x10 = 0xDEADB000
addi x10, x10,0xEEF # the 12-bit immediate is sign-extended; its top bit is 1, so this adds -0x111 and effectively subtracts 1 from the upper 20 bits, thus x10 = 0xDEADAEEF

# Right
lui x10, 0xDEADC # x10 = 0xDEADC000
addi x10, x10,0xEEF # x10 = 0xDEADBEEF
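
The adjustment of the upper 20 bits can be computed mechanically. Below is a minimal Python sketch, assuming a 32-bit target value; `split_hi_lo` and `lui_addi` are hypothetical helper names that model the lui/addi pair, not real assembler features.

```python
def split_hi_lo(value):
    """Split a 32-bit value into (hi20, lo12) for a lui + addi pair.

    lo12 is returned as the signed number addi will actually add, and hi20 is
    bumped by one whenever lo12 is negative to compensate.
    """
    value &= 0xFFFFFFFF
    lo = value & 0xFFF
    if lo & 0x800:                 # top bit of the 12-bit immediate is 1
        lo -= 0x1000               # addi sign-extends it to a negative number
    hi = ((value - lo) >> 12) & 0xFFFFF
    return hi, lo


def lui_addi(hi, lo):
    """Model `lui rd, hi` followed by `addi rd, rd, lo` on 32-bit registers."""
    return ((hi << 12) + lo) & 0xFFFFFFFF


hi, lo = split_hi_lo(0xDEADBEEF)
print(hex(hi), lo, hex(lui_addi(hi, lo)))
print(hex(lui_addi(0xDEADB, lo)))  # the naive split lands 0x1000 too low
```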

# Call function at any 32-bit absolute address
lui x1, <hi 20 bits>
jalr ra, x1, <lo 12 bits>

# Jump PC-relative with 32-bit offset
auipc x1, <hi 20 bits>
jalr x0, x1, <lo 12 bits>