5  RISC-V Assembly Programming

Note

Please refer to [1] and [2] for more information.

5.1 Revision of RV32I Instructions

In order to write assembly programs, we must get familiar with basic RV32I instructions.

5.2 Common Pseudo-Instructions

  • Basic pseudo-instructions (li, mv, nop)
  • Control flow pseudo-instructions (call, ret, beqz)
  • How the assembler expands pseudo-instructions

To make RISC-V assembly more convenient to write, some instructions have aliases, called pseudo-instructions. For example, to load the immediate 0xFF into register x1, we would normally write addi x1, x0, 0xFF. With pseudo-instructions, we can simply write li x1, 0xFF to do the same thing.

The assembler converts each pseudo-instruction into one or more base instructions before translating them into machine code.
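The 0xFF example fits in the 12-bit signed immediate of addi. For a wider 32-bit constant, one common expansion on RV32 is a lui followed by an addi. The sketch below models that split in C. The names li_parts, split_li, and rebuild are hypothetical helpers for illustration, not part of any assembler, and a real assembler may choose a different sequence.

```c
#include <stdint.h>

/* Hypothetical result type: the two operands of a lui/addi pair. */
typedef struct {
    uint32_t hi20;  /* operand of lui: becomes bits 31..12 of rd */
    int32_t  lo12;  /* operand of addi: a signed value in [-2048, 2047] */
} li_parts;

/* Split imm so that (hi20 << 12) + lo12 == imm (mod 2^32). */
li_parts split_li(uint32_t imm)
{
    li_parts p;
    /* addi sign-extends its immediate, so read the low 12 bits as signed */
    p.lo12 = (int32_t)(imm & 0xFFFu);
    if (p.lo12 >= 0x800)
        p.lo12 -= 0x1000;
    /* the upper part absorbs the borrow caused by a negative lo12 */
    p.hi20 = ((imm - (uint32_t)p.lo12) >> 12) & 0xFFFFFu;
    return p;
}

/* Value left in rd after `lui rd, hi20` followed by `addi rd, rd, lo12`. */
uint32_t rebuild(li_parts p)
{
    return (p.hi20 << 12) + (uint32_t)p.lo12;
}
```

For 0xFF the upper part is zero, so the lui is unnecessary and a single addi is enough, which matches the example above.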

The table below lists common pseudo-instructions that are useful when writing assembly programs.

| Pseudo-instruction | Base instruction(s) | Meaning |
|---|---|---|
| nop | addi x0, x0, 0 | No operation |
| li rd, immediate | Myriad sequences | Load immediate |
| mv rd, rs | addi rd, rs, 0 | Copy register |
| not rd, rs | xori rd, rs, -1 | One’s complement |
| neg rd, rs | sub rd, x0, rs | Two’s complement |
| seqz rd, rs | sltiu rd, rs, 1 | Set if == zero |
| snez rd, rs | sltu rd, x0, rs | Set if != zero |
| sltz rd, rs | slt rd, rs, x0 | Set if < zero |
| sgtz rd, rs | slt rd, x0, rs | Set if > zero |
| beqz rs, offset | beq rs, x0, offset | Branch if == zero |
| bnez rs, offset | bne rs, x0, offset | Branch if != zero |
| blez rs, offset | bge x0, rs, offset | Branch if <= zero |
| bgez rs, offset | bge rs, x0, offset | Branch if >= zero |
| bltz rs, offset | blt rs, x0, offset | Branch if < zero |
| bgtz rs, offset | blt x0, rs, offset | Branch if > zero |
| bgt rs, rt, offset | blt rt, rs, offset | Branch if > |
| ble rs, rt, offset | bge rt, rs, offset | Branch if <= |
| bgtu rs, rt, offset | bltu rt, rs, offset | Branch if >, unsigned |
| bleu rs, rt, offset | bgeu rt, rs, offset | Branch if <=, unsigned |
| j offset | jal x0, offset | Jump |
| jal offset | jal ra, offset | Jump and link |
| jr rs | jalr x0, rs, 0 | Jump register |
| jalr rs | jalr ra, rs, 0 | Jump and link register |
| ret | jalr x0, ra, 0 | Return from subroutine |
| call offset | auipc ra, offset[31:12]; jalr ra, ra, offset[11:0] | Call far-away subroutine |
| tail offset | auipc x6, offset[31:12]; jalr x0, x6, offset[11:0] | Tail call far-away subroutine |
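To see why some of these expansions have the listed meaning, the following sketch models a few base instructions as C functions on 32-bit values and then writes each pseudo-instruction exactly as its expansion. The rv_ and pseudo_ function names are hypothetical and exist only for this illustration; x0 is modeled as the constant 0.

```c
#include <stdint.h>

/* Minimal models of the RV32I base instructions used below. */
static uint32_t rv_xori(uint32_t rs, int32_t imm)  { return rs ^ (uint32_t)imm; }
static uint32_t rv_sub(uint32_t a, uint32_t b)     { return a - b; }
/* sltiu sign-extends the immediate, then compares as unsigned */
static uint32_t rv_sltiu(uint32_t rs, int32_t imm) { return rs < (uint32_t)imm; }
static uint32_t rv_sltu(uint32_t a, uint32_t b)    { return a < b; }
static uint32_t rv_slt(uint32_t a, uint32_t b)     { return (int32_t)a < (int32_t)b; }

/* Pseudo-instructions, written as their expansions. */
uint32_t pseudo_not(uint32_t rs)  { return rv_xori(rs, -1); }  /* xori rd, rs, -1 */
uint32_t pseudo_neg(uint32_t rs)  { return rv_sub(0, rs); }    /* sub  rd, x0, rs */
uint32_t pseudo_seqz(uint32_t rs) { return rv_sltiu(rs, 1); }  /* only 0 is < 1 unsigned */
uint32_t pseudo_snez(uint32_t rs) { return rv_sltu(0, rs); }   /* 0 < rs unless rs == 0 */
uint32_t pseudo_sltz(uint32_t rs) { return rv_slt(rs, 0); }    /* slt  rd, rs, x0 */
uint32_t pseudo_sgtz(uint32_t rs) { return rv_slt(0, rs); }    /* slt  rd, x0, rs */
```

For instance, pseudo_seqz works because 0 is the only unsigned value smaller than 1.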

5.3 Assembly Programming Basics

5.3.1 Symbols and Labels

Tip

For more information about symbols and labels, please refer to [3]. Furthermore, chapter 7 in the book [4] is also a good reference.

According to [3]:

Symbols are a central concept: the programmer uses symbols to name things, the linker uses symbols to link, and the debugger uses symbols to debug.

TBD

From the perspective of the binary encoding, jump and branch targets are just memory addresses. In an assembly program, however, we can name those targets with labels. There are two types of labels: text labels and numeric labels.

We often use text labels when writing if-else statements and loops. According to [1]:

Text labels are used as branch, unconditional jump targets and symbol offsets. Text labels are added to the symbol table of the compiled module.

loop:
    j loop

The symbol loop above is exactly the text label we just mentioned.

The other type of label is the numeric label. Again according to [1]:

Numeric labels are used for local references. References to local labels are suffixed with ‘f’ for a forward reference or ‘b’ for a backwards reference.

1:
    j 1b

5.3.2 Addressing for Wide Immediates and Addresses

5.3.3 If-Then-Else Statement

Before we look at how to implement if-then-else statements in assembly language, let us first recall some basic logical properties of number comparison, along with De Morgan’s Laws.

For comparisons between two numbers, we have the following properties:

\[ \neg (A > B) \equiv A \leq B \]

\[ \neg (A < B) \equiv A \geq B \]

\[ \neg (A \geq B) \equiv A < B \]

\[ \neg (A \leq B) \equiv A > B \]

\[ \neg (A = B) \equiv A \ne B \]

\[ \neg (A \ne B) \equiv A = B \]

In addition, De Morgan’s Laws state that, for any two conditions \(P\) and \(Q\):

\[ \neg (P \wedge Q) \equiv \neg P \vee \neg Q, \quad \text{and} \]

\[ \neg (P \vee Q) \equiv \neg P \wedge \neg Q \]

These logical properties can help us write more concise and straightforward assembly code if we use them wisely.
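As a sanity check, the identities can be verified exhaustively over a small range of integers. The following is only a sketch: identities_hold is a hypothetical helper, and p and q stand for two arbitrary conditions, here chosen as a < b and a < c.

```c
#include <stdbool.h>

/* Returns true if every identity holds for all a, b, c in [-limit, limit]. */
bool identities_hold(int limit)
{
    for (int a = -limit; a <= limit; a++) {
        for (int b = -limit; b <= limit; b++) {
            /* negated comparisons */
            if (!(a > b)  != (a <= b)) return false;
            if (!(a < b)  != (a >= b)) return false;
            if (!(a >= b) != (a < b))  return false;
            if (!(a <= b) != (a > b))  return false;
            if (!(a == b) != (a != b)) return false;
            if (!(a != b) != (a == b)) return false;

            for (int c = -limit; c <= limit; c++) {
                bool p = a < b;
                bool q = a < c;
                /* De Morgan's Laws */
                if (!(p && q) != (!p || !q)) return false;
                if (!(p || q) != (!p && !q)) return false;
            }
        }
    }
    return true;
}
```

Calling identities_hold(8) checks every combination of a, b, and c between -8 and 8.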

RV32I provides six conditional branch instructions (beq, bne, blt, bge, bltu, and bgeu) that we can use to implement C-style if-else statements. Which one to use depends mainly on the condition(s) in the if-else statement.

For example, consider the following if-then-else statement:

// suppose that a is stored in x1, and b is stored in x2
if (a > b) {
    // do job 1
} else {
    // do job 2
}
// exit

Intuitively, we would like to use a branch-if-greater-than (bgt) instruction to implement this if-else statement. RV32I, however, has no base instruction called bgt; it exists only as a pseudo-instruction (see the table in Section 5.2). Instead, we can simply rewrite the condition by swapping the operands.

if (b < a) {
    // do job 1
} else {
    // do job 2
}
// exit

Hence, we can implement the if-then-else statement with the blt or bltu instruction in RV32I, depending on whether the comparison is signed or unsigned.

    blt x2, x1, job_1
job_2:
    ...
    j exit
job_1:
    ...
exit:
    ...
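The choice between blt and bltu matters because the same register bits can compare differently as signed and unsigned numbers. The sketch below models the two branch conditions in C; blt_taken and bltu_taken are hypothetical names, and the signed cast assumes the usual two's complement behavior of C compilers.

```c
#include <stdbool.h>
#include <stdint.h>

/* blt rs1, rs2: compares the register bits as signed 32-bit integers. */
bool blt_taken(uint32_t rs1, uint32_t rs2)
{
    return (int32_t)rs1 < (int32_t)rs2;
}

/* bltu rs1, rs2: compares the same bits as unsigned 32-bit integers. */
bool bltu_taken(uint32_t rs1, uint32_t rs2)
{
    return rs1 < rs2;
}
```

With a = 0xFFFFFFFF in x1 and b = 1 in x2, blt x2, x1 is not taken because a is -1 when read as signed, while bltu x2, x1 is taken because a is the largest unsigned value.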

Let’s consider a more complicated example: an if-then-else-if-then-else statement.

if (a > b) {
    // do job 1
} else if (c > b) {
    // do job 2 
} else {
    // do job 3
}
// exit

Assume that a is stored in register x1, b is stored in x2, and c is stored in x3. We can implement this if-then-else statement as follows:

    blt x2, x1, job_1
    blt x2, x3, job_2
job_3:
    ...
    j exit
job_2:
    ...
    j exit
job_1:
    ...
exit:
    ...

Finally, let’s consider another example, which has a complex condition guard:

if (!(a < b && a < c)) {
    // do job 1
} else {
    // do job 2
}

We can apply De Morgan’s Laws to get \(\neg ((a < b) \land (a < c)) \equiv (a \geq b) \lor (a \geq c)\).

if (a >= b || a >= c) {
    // do job 1
} else {
    // do job 2
}

We can view this version of the C code in the following general form:

if (cond_1 || cond_2) {
    // do job 1
} else {
    // do job 2
}

For an if-else statement whose conditions are joined by ||, we can use one branch instruction per condition: as soon as any condition holds, we branch to job 1. Assume again that a is stored in x1, b in x2, and c in x3.

    bge x1, x2, job_1
    bge x1, x3, job_1
job_2:
    ...
    j exit
job_1:
    ...
exit:
    ...
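One way to convince ourselves that a branch sequence matches the original condition !(a < b && a < c) is to mirror it in C with goto, where each `if (...) goto` stands for one conditional branch. This is a sketch only: job_direct, job_branches, and forms_agree are hypothetical names, and the instructions in the comments assume a, b, and c live in x1, x2, and x3.

```c
/* Direct form of the original condition: returns 1 for job 1, 2 for job 2. */
int job_direct(int a, int b, int c)
{
    return !(a < b && a < c) ? 1 : 2;
}

/* Branch-by-branch form after applying De Morgan's Laws. */
int job_branches(int a, int b, int c)
{
    int job;
    if (a >= b) goto job_1;   /* bge x1, x2, job_1 */
    if (a >= c) goto job_1;   /* bge x1, x3, job_1 */
    job = 2;                  /* fall through: job 2 */
    goto done;                /* j exit */
job_1:
    job = 1;
done:
    return job;
}

/* Returns 1 if both forms agree for all a, b, c in [-limit, limit]. */
int forms_agree(int limit)
{
    for (int a = -limit; a <= limit; a++)
        for (int b = -limit; b <= limit; b++)
            for (int c = -limit; c <= limit; c++)
                if (job_direct(a, b, c) != job_branches(a, b, c))
                    return 0;
    return 1;
}
```

Job 2 runs only when both a < b and a < c hold, which is exactly the fall-through path after the two branches.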

5.3.4 For-Loops

To implement for-loops, we can use one register as a counter with conditional branch instructions.

For example, we would like to implement the for-loop below in RISC-V Assembly.

int a = 0;
for (int i = 0; i < 100; i++) {
    a++;
}

The corresponding RISC-V assembly can be written as the code segment below, under two assumptions:

  1. The value of variable a is stored in register s0
  2. The value of the loop variable i is stored in register s1

    li s0, 0 # initialize variable a
    li s1, 0 # initialize loop variable i in register s1
    li t0, 100
for_label:
    bge s1, t0, exit
    addi s0, s0, 1
    addi s1, s1, 1
    j for_label
exit:
    ...
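The same lowering can be written in C with goto, which makes the correspondence between the loop and its branches explicit. This is a sketch for illustration; count_to_100 is a hypothetical function name, and each comment names the instruction that the line stands for.

```c
int count_to_100(void)
{
    int a = 0;                    /* li s0, 0 */
    int i = 0;                    /* li s1, 0 */
    const int limit = 100;        /* li t0, 100 */
for_label:
    if (i >= limit) goto done;    /* bge s1, t0, exit */
    a = a + 1;                    /* addi s0, s0, 1 */
    i = i + 1;                    /* addi s1, s1, 1 */
    goto for_label;               /* j for_label */
done:
    return a;
}
```

Note that the branch tests the negated loop condition: the C loop continues while i < 100, so the assembly leaves the loop when i >= 100.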

5.3.5 While-Loops

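A while-loop can be converted into a for-loop seamlessly: it is a for-loop with empty initialization and update clauses, so it maps to the same compare, branch, and jump pattern.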
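As a small illustration of this conversion, the two functions below count the decimal digits of a number, once with a while-loop and once with the equivalent for-loop. The function names digits_while and digits_for are hypothetical and chosen only for this sketch.

```c
/* Counts the decimal digits of n with a while-loop. */
int digits_while(unsigned n)
{
    int count = 1;
    while (n >= 10) {
        n /= 10;
        count++;
    }
    return count;
}

/* The same loop as a for-loop: only the condition clause is used. */
int digits_for(unsigned n)
{
    int count = 1;
    for (; n >= 10; ) {
        n /= 10;
        count++;
    }
    return count;
}
```

Both functions have the same loop condition and body, so they can be lowered to assembly in the same way as the for-loop in Section 5.3.4.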

5.3.6 Switch-Case Statement

5.4 Supporting Procedure in Hardware

5.4.1 What is ABI, and Why does ABI matter?

The Application Binary Interface, or ABI for short, is a core concept in operating system design, compiler design, and assembly programming. ABIs were developed mainly for interoperability and portability across computers.

5.4.2 Brief Introduction to RISC-V ABI

Note

Please refer to [5] for more information about RISC-V ABI.

5.4.3 Supporting Procedure Calling

introduce prologue and epilogue as well

5.4.4 RISC-V Calling Convention (Part of ABI)

Integer Register Convention

| Register | ABI Mnemonic | Use by convention | Preserved? |
|---|---|---|---|
| x0 | zero | hardwired to 0, ignores writes | n/a |
| x1 | ra | return address for jumps | no |
| x2 | sp | stack pointer | yes |
| x3 | gp | global pointer | n/a |
| x4 | tp | thread pointer | n/a |
| x5 | t0 | temporary register 0 | no |
| x6 | t1 | temporary register 1 | no |
| x7 | t2 | temporary register 2 | no |
| x8 | s0 or fp | saved register 0 or frame pointer | yes |
| x9 | s1 | saved register 1 | yes |
| x10 | a0 | return value or function argument 0 | no |
| x11 | a1 | return value or function argument 1 | no |
| x12 | a2 | function argument 2 | no |
| x13 | a3 | function argument 3 | no |
| x14 | a4 | function argument 4 | no |
| x15 | a5 | function argument 5 | no |
| x16 | a6 | function argument 6 | no |
| x17 | a7 | function argument 7 | no |
| x18 | s2 | saved register 2 | yes |
| x19 | s3 | saved register 3 | yes |
| x20 | s4 | saved register 4 | yes |
| x21 | s5 | saved register 5 | yes |
| x22 | s6 | saved register 6 | yes |
| x23 | s7 | saved register 7 | yes |
| x24 | s8 | saved register 8 | yes |
| x25 | s9 | saved register 9 | yes |
| x26 | s10 | saved register 10 | yes |
| x27 | s11 | saved register 11 | yes |
| x28 | t3 | temporary register 3 | no |
| x29 | t4 | temporary register 4 | no |
| x30 | t5 | temporary register 5 | no |
| x31 | t6 | temporary register 6 | no |
| pc | (none) | program counter | n/a |
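When reading disassembly or writing small tools, it can help to have this table in code form. The sketch below encodes the mnemonics and the "Preserved?" column in C. The names abi_names, abi_name, reg_number, and is_callee_saved are hypothetical; the fp alias for s0 is not handled, and the n/a rows (zero, gp, tp) are simply reported as not preserved.

```c
#include <stddef.h>
#include <string.h>

/* ABI mnemonics indexed by register number x0..x31, in table order. */
static const char *const abi_names[32] = {
    "zero", "ra", "sp", "gp", "tp", "t0", "t1", "t2",
    "s0", "s1", "a0", "a1", "a2", "a3", "a4", "a5",
    "a6", "a7", "s2", "s3", "s4", "s5", "s6", "s7",
    "s8", "s9", "s10", "s11", "t3", "t4", "t5", "t6"
};

/* Returns the ABI mnemonic of register x, or NULL if x is out of range. */
const char *abi_name(int x)
{
    return (x >= 0 && x < 32) ? abi_names[x] : NULL;
}

/* Returns the register number for an ABI mnemonic, or -1 if unknown. */
int reg_number(const char *name)
{
    for (int x = 0; x < 32; x++)
        if (strcmp(abi_names[x], name) == 0)
            return x;
    return -1;
}

/* Returns 1 for rows marked "yes" in the table: sp, s0, s1, and s2 to s11. */
int is_callee_saved(int x)
{
    return x == 2 || x == 8 || x == 9 || (x >= 18 && x <= 27);
}
```

For example, reg_number("a0") yields 10, and is_callee_saved(10) yields 0 because argument registers are not preserved across calls.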

Procedure Calling Convention for Integer

C/C++ Type Details

There are two conventions for C/C++ type sizes and alignments.

ILP32, ILP32F, ILP32D, and ILP32E

| Type | Size (Bytes) | Alignment (Bytes) |
|---|---|---|
| bool/_Bool | 1 | 1 |
| char | 1 | 1 |
| short | 2 | 2 |
| int | 4 | 4 |
| long | 4 | 4 |
| long long | 8 | 8 |
| void * | 4 | 4 |
| __bf16 | 2 | 2 |
| _Float16 | 2 | 2 |
| float | 4 | 4 |
| double | 8 | 8 |
| long double | 16 | 16 |
| float _Complex | 8 | 4 |
| double _Complex | 16 | 8 |
| long double _Complex | 32 | 16 |

LP64, LP64F, LP64D, and LP64Q

| Type | Size (Bytes) | Alignment (Bytes) |
|---|---|---|
| bool/_Bool | 1 | 1 |
| char | 1 | 1 |
| short | 2 | 2 |
| int | 4 | 4 |
| long | 8 | 8 |
| long long | 8 | 8 |
| __int128 | 16 | 16 |
| void * | 8 | 8 |
| __bf16 | 2 | 2 |
| _Float16 | 2 | 2 |
| float | 4 | 4 |
| double | 8 | 8 |
| long double | 16 | 16 |
| float _Complex | 8 | 4 |
| double _Complex | 16 | 8 |
| long double _Complex | 32 | 16 |
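The main difference between the two conventions is the size of long and of pointers. A small C sketch can report which convention a given compiler target follows. The helpers data_model and common_rows_match are hypothetical names; the result depends on the machine the code is compiled for, which need not be a RISC-V target.

```c
#include <stddef.h>

/* Names the convention of the compilation target based on int, long, and pointer sizes. */
const char *data_model(void)
{
    if (sizeof(int) == 4 && sizeof(long) == 4 && sizeof(void *) == 4)
        return "ILP32";
    if (sizeof(int) == 4 && sizeof(long) == 8 && sizeof(void *) == 8)
        return "LP64";
    return "other";
}

/* Checks a few rows that are identical in both tables. */
int common_rows_match(void)
{
    return sizeof(char) == 1
        && sizeof(short) == 2 && _Alignof(short) == 2
        && sizeof(int) == 4 && _Alignof(int) == 4
        && sizeof(long long) == 8
        && sizeof(float) == 4
        && sizeof(double) == 8;
}
```

Compiled for an RV32 target with the ILP32 ABI, data_model would be expected to return "ILP32"; for an RV64 target with the LP64 ABI, "LP64".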

5.5 Advanced Examples

5.5.1 Bubble Sort

5.5.2 Factorial

5.5.3 Fibonacci Sequence

5.6 C-Assembly Hybrid Programming

In my opinion, C-assembly hybrid programming is the best showcase of why the ABI matters and how it is applied.

  • Calling C from assembly
  • Calling assembly from C