In the assignment part for Lab 2, you must write two assembly programs by yourself. The goal of this assignment is to examine whether you possess the comprehensive understanding of basic RISC-V assembly programming concepts, which includes usages of instructions and the rule of RISC-V calling convention.
In order to complete the two assembly programs, you must understand the algorithms to solve the problems at the first, then try to implement them in C language. After completing C codes, you can finally translate them into actual RISC-V assembly.
7.2 RISC-V Assembly Programs Assignment (60 pts)
7.2.1 Problem 0 - Compilation Test (10 pts)
Notice: You must pass the complation phase for your programs to get 10 pts.
7.2.2 Problem 1 - Integer Array Sorter with Merge Sort Algorithm (25 pts)
In problem 1, our mission is to implement the Merge Sort algorithm in pure RV32I instructions. The core idea of merge sort follows the concepts about divide-and-conqure.
The merge sort algorithm closely follows the divide-and-conquer method. In each step, it sorts a subarray A[p:r], starting with the entire array A[1:n] and recursing down to smaller and smaller subarrays. Here is how merge sort operates:
Divide the subarray A[p:r] to be sorted into two adjacent subarrays, each of half the size. To do so, compute the midpoint q of A[p:r] (taking the average of p and r), and divide A[p:r] into subarrays A[p:q] and A[q + 1:r]. Conqure by sorting each of the two subarrays A[p:q] and A[q + 1:r] recursively using merge sort. Combine by merging the two sorted subarrays A[p:q] and A[q + 1:r] back into A[p:r], producing the sorted answer.
The key operation of the merge sort algorithm occurs in the “combine” step, which merges two adjacent, sorted subarrays. The merge operation is performed by the auxiliary procedure MERGE(A, p, q, r) on the following page, where A is an array and p, q, and r are indices into the array such that \(p \le q < r\). The procedure assumes that the adjacent subarrays A[p:q] and A[q + 1:r] were already recursively sorted. It merges the two sorted subarrays to form a single sorted subarray that replaces the current subarray A[p:r].
The operation of the while loop in lines 8-18 in the call MERGE(A, 9, 12, 16), when the subarray A[9:16] contains the values <2, 4, 6, 7, 1, 2, 3, 5>.
The operation of merge sort on the array A with length 8 that initially contains the sequence <12, 3, 7, 9, 14, 6, 11, 2>.
In order to implement merge sort, we have firstly introduce a auxiliary function called MERGE. The function MERGE is used to combine two sorted array into one sorted array.
MERGE(A, p, q, r) nL = q − p + 1 // length of A[p : q] nR = r − q // length of A[q + 1 : r] let L[0 : nL − 1] and R[0 : nR − 1] be new arrays for i = 0 to nL − 1 // copy A[p : q] into L[0 : nL − 1] L[i] = A[p + i] for j = 0 to nR − 1 // copy A[q + 1 : r] into R[0 : nR − 1] R[j] = A[q + j + 1] i = 0 // i indexes the smallest remaining element in L j = 0 // j indexes the smallest remaining element in R k = p // k indexes the location in A to fill // As long as each of the arrays L and R contains an unmerged element, // copy the smallest unmerged element back into A[p : r]. while i < nL and j < nR if L[i] ≤ R[j] A[k] = L[i] i = i + 1 else A[k] = R[j] j = j + 1 k = k + 1 // Having gone through one of L and R entirely, copy the // remainder of the other to the end of A[p : r]. while i < nL A[k] = L[i] i = i + 1 k = k + 1 while j < nR A[k] = R[j] j = j + 1 k = k + 1
With theauxiliary function MERGE, we can now implement MERGE-SORT as following:
MERGE-SORT(A, p, r) if p ≥ r // zero or one element? return q = ⌊(p + r)/2⌋ // midpoint of A[p : r] MERGE-SORT(A, p, q) // recursively sort A[p : q] MERGE-SORT(A, q + 1, r) // recursively sort A[q + 1 : r] // Merge A[p : q] and A[q + 1 : r] into A[p : r]. MERGE(A, p, q, r)
In problem 2, you need to write a function called sudoku_solver() to solve a 2-by-2 Sudoku puzzle by filling the empty cells in the given 1D array.
A Sudoku solution must satisfy all of the following rules:
Each of the digits 1-4 must occur exactly once in each row.
Each of the digits 1-4 must occur exactly once in each column.
Each of the digits 1-4 must occur exactly once in each of the 4 2x2 sub-boxes of the grid.
The integer -1 in the array indicates empty cells.
An example Sudoku puzzle with empty cells
An example Sudoku puzzle with its solution
Basically, to solve the sudoku problem in the perspective of algorithm, you can implement a backtracking algorithm. The core idea is about trial and error. The pesudocode below is the general version of backtracking algorithm.
function backtrack(partial_solution): // 1. Check if 'partial_solution' is a complete solution (Base Case) if is_complete_solution(partial_solution): process_solution(partial_solution) // e.g., add to results list, print it return // Backtrack from this complete solution (or continue if finding all solutions) // 2. Iterate over all possible choices for the next step for each choice in possible_choices(partial_solution): // 3. Check if the current 'choice' is valid (Pruning) if is_valid(choice, partial_solution): // 4. Make the choice (Apply the choice to the partial solution) make_choice(choice, partial_solution) // 5. Recurse: explore further backtrack(partial_solution) // 6. Undo the choice (The "Backtrack" step) // This is necessary so the loop can try the next 'choice' // when returning from the recursive call. unmake_choice(choice, partial_solution)// Initial call to start the processinitial_solution = create_empty_solution()backtrack(initial_solution)
7.3 Assignment Report (40 pts)
The assignment report is divided into two parts. The first part is a question set which contains five questions, and the second part contains two syntesis questions which will ask you to exaplin how you implement the two assembly programming problems.
7.3.1 Question Set (10 pts)
The question set contains five questions, and it is 2 pts for each question.
Why does RV32I base integer instruction set only has beq, bne, blt(u), and bge(u) instructions for integer signed/unsigned comparison? Is it sufficient for expressing all possible conditions?
Why does calling convention matter when writing assembly programs? Please give a concrete example to illustrate what might go wrong if it is not followed.
When implementing a function in RISC-V assembly, how can you allocate and access a function local array?
When writing a function in pure assembly code, it is important to distinguish between caller and callee, because they have different responsibilities. (e.g., caller-saved/callee-saved registers) In the case of recursive function, however, it is hard to distinguish between caller and callee. How do you solve the problem? Please explain it briefly.
Refer to this video, please state the difference between stack and heap memory.
7.3.2 Synthesis Questions (30 pts)
In the second part, it contains two synthesis questions, and it is 15 pts for each question.
How do you implement merge sort in RISC-V assembly instructions? Please explain it in detail.
How do you implement a 2x2 sudoku solver in RISC-V assembly instructions? Please explain it in detail.
7.4 How to Compile and Run
7.4.1 Prepare Your Local Repository
First of all, you have to clone your own remote repository for assignment 2.
$ git clone <ssh_link_to_your_own_repo>
Different from assignment 1, you now must add your own ISA Simulator as a git submodle which is called iss in assignment 2. For example, the directory name for my assignment 2 is called lab-2-Haouo and the ssh link to my asignment 1 repository is git@github.com:Computer-Organization-at-NCKU-EE/lab-1-Haouo.git, then I need to type the commands below.
Please noet that you must change the commands above to use your own folder name and ssh link for your own remote repostory, instead of using the commands directly without any modification.
7.4.2 Compile
In order yo keep the directory for our project clean and readable, we would like to create a directory called build at the first and perform the whole building process inside build directory.
$ mkdir build
After creating build, we can now enter the directory and prepare for generating the Makefiles we need by using cmake.
$ cd build$ cmake ..
When the generating of Makefiles is done, we can then compile the programs with make.
$ make -j
The only thing you must do is to make it again when modifying source files, and cmake .. again is redundant and not necessary. When to cmake again depends on whether the CMakeLists.txt(s) are modified.
7.4.3 Run CTest
To run unit tests, you have to enter the build directory at the first, and then type ctest to run all test cases. There are two test cases, corresponding to problem 1 and 2 respectively.
Furthermore, you can also run a single test only at a time by using the following commands:
ctest -R "MergeSort"
or
ctest -R "Sudoku"
7.5 Tips for Debugging Your Assembly Programs
Recall what we mention in Section 5.6 and Section 6.2, the TA has implemented a simpler version of printf function which is called my_printf for you. Hence, you can call my_printf directly in your assembly programs with the power of C-Assembly Hybrid Programming.
[1]
T. H. Cormen and C. E. Leiserson, Introduction to algorithms, fourth edition. London, England: MIT Press, 2022.