Assignment 4

Due dates:

Milestone 1: due Friday, Nov 1st by 11pm

Milestone 2: due Friday, Nov 15th by 11pm

Note that this assignment is a “double” assignment, and is worth twice as much as Assignments 1, 2, or 3.

Overview

In this assignment, you will implement code generation, turning the semantic analyzer you implemented in Assignment 3 into an actual C compiler capable of compiling programs into x86-64 assembly language.

Note that the expectation for this assignment is generation of functioning code, but not generation of efficient code. In assignment 5 (the final assignment), you will implement code optimizations to improve the efficiency of the “baseline” code your compiler generates in this assignment.

Getting started

This assignment is a continuation of Assignment 3, so that code will be your starting point.

Grading breakdown

Your assignment grade will be determined as follows.

Milestone 1: high-level code generation

Your first task for this assignment is to translate input C source code into a “high-level” code-like intermediate representation, quite similar to the Iloc intermediate representation described in the textbook.

The goal of high-level code generation is to translate the input program (in the annotated-AST form produced by your semantic analyzer) into a form that precisely encodes

As an example: consider the following C code:

int main(void) {
  int n, i, sum;

  n = 11;
  i = 1;
  sum = 0;

  while (i <= n) {
    sum = sum + i;
    i = i + 1;
  }

  return sum;
}

This code could be translated into the following high-level IR code:

	.section .text

	.globl main
main:
	enter    $0
	mov_l    vr13, $11
	mov_l    vr10, vr13
	mov_l    vr13, $1
	mov_l    vr11, vr13
	mov_l    vr13, $0
	mov_l    vr12, vr13
	jmp      .L1
.L0:
	add_l    vr13, vr12, vr11
	mov_l    vr12, vr13
	mov_l    vr13, $1
	add_l    vr14, vr11, vr13
	mov_l    vr11, vr14
.L1:
	cmplte_l vr15, vr11, vr10
	cjmp_t   vr15, .L0
	mov_l    vr0, vr12
	jmp      .Lmain_return
.Lmain_return:
	leave    $0
	ret

A few things to note, to help explain the meaning of the high-level code:

In the translation above, the virtual registers vr10, vr11. and vr12 are allocated as storage for the int local variables n, i, and sum, respectively. In general, if a variable’s type is either integral or a pointer, and its address is not taken, its storage can be a virtual register. Arrays and structs will require storage in memory.

Important

Virtual registers vr10 and above are completely local to the function that uses them. For example, if a caller moves a value into vr10, and then calls a function, the called function would not be able to access the caller’s value by using the value in vr10.

Ok, how do I get started on this?

To generate high-level IR code, you will need to do two things:

  1. Implement the LocalStorageAllocation visitor class so that it can determine a storage location (memory or virtual register) for each parameter and local variable in a function
  2. Implement the HighLevelCodeGen visitor class so that it can construct an InstructionSequence containing high-level Instructions which completely describe the storage requirements and dynamic operations of a function

The idea with LocalStorageAllocation is fairly simple:

You will notice that in driver/main.cpp, in the codegen function, there is code which creates a LocalStorageAllocation object and calls its allocate_storage member function, passing a shared pointer to the Function object representing a function for which code needs to be generated. The goal of LocalStorageAllocation::allocate_storage is to assign either a virtual register or a storage offset to every local variable in the function to which it is applied. It will make sense to store information about storage allocation decisions in the symbol table entries (i.e., Symbol objects) for each variable. The idea here is that prior to high-level code generation, your compiler will know precisely where the value of each parameter and local variable is stored, so that it can generate instructions which use these values, and also handle assignments which store values to these variables.

Once you have local storage allocation working, you can move on to HighLevelCodeGen (which is in the include/highlevel_codegen.h and hl_codegen/highlevel_codegen.cpp header and source files.) HighLevelCodeGen’s goal is to append Instruction objects to an InstructionSequence representing the high-level instructions of a compiled function.

You can test high-level code generation using the -h command line option. For example, if $ASSIGN04_DIR is the directory containing your nearly_cc executable, then the command

$ASSIGN04_DIR/nearly_cc -h input/example02.c

would print a textual representation of the high-level code generated for the input program input/example02.c.

Operand, Instruction, and InstructionSequence

The Operand, Instruction, and InstructionSequence classes are used to for both high-level and low-level “code-like” (a.k.a. “linear”) intermediate representations of functions.

I recommend reading through the code of these classes, since that is likely the best way to learn how they work. As a very brief summary:

The starter code for the HighLevelCodeGen class also uses these classes, and that code should serve as a useful guide.

The high-level opcodes are defined in the files build/highlevel.h and build/highlevel.cpp, which are generated by a Ruby script called scripts/gen_highlevel_ir.rb. If you wanted to, you could modify this script to define additional high level instructions, or even change the set of high-level opcodes available. However, the opcodes provided are meant to be sufficient for the subset of C we will be targeting.

High-level opcodes

The following table summarizes the intended meaning of the various high-level opcodes, as generated by the scripts/gen_highlevel_ir.rb script provided in the starter code. Note that _sz is an operand size suffix, which will be one of the following:

Also note that for the uconv and sconv instructions, there are two operand size suffixes, describing a promotion from a less precise integer type to a more precise integer type. For example, the opcode sconv_bl would convert an 8-bit signed value to a 32-bit signed value.

Because the target architecture (x86-64) uses 64 bit pointers, you should consider all pointers to be 8 bytes (64 bits.)

Opcode Operands Meaning
HINS_nop   does nothing
HINS_add_sz dst, srcl, srcr add two integer values, store result in dst
HINS_sub_sz dst, srcl, srcr subtract srcr from srcl, store result in dst
HINS_mul_sz dst, srcl, srcr multiply two integer values, store result in dst
HINS_div_sz dst, srcl, srcr divide srcl by srcr, store quotient in dest
HINS_mod_sz dst, srcl, srcr divide srcl by srcr, store remainder in dest
HINS_cmplt_sz dst, srcl, srcr determine if srcl < srcr, set dst to 0 or 1
HINS_cmplte_sz dst, srcl, srcr determine if srcl ≤ srcr, set dst to 0 or 1
HINS_cmpgt_sz dst, srcl, srcr determine if srcl > srcr, set dst to 0 or 1
HINS_cmpgte_sz dst, srcl, srcr determine if srcl ≥ srcr, set dst to 0 or 1
HINS_cmpeq_sz dst, srcl, srcr determine if srcl = srcr, set dst to 0 or 1
HINS_cmpneq_sz dst, srcl, srcr determine if srcl ≠ srcr, set dst to 0 or 1
HINS_neg_sz dst, src store the negation of src in dst
HINS_not_sz dst, src store the logical negation of src in dst
HINS_mov_sz dst, src store src in dst
HINS_sconv_bw dst, src promote 8 bit src to 16 bit dest (signed)
HINS_sconv_bl dst, src promote 8 bit src to 32 bit dest (signed)
HINS_sconv_bq dst, src promote 8 bit src to 64 bit dest (signed)
HINS_sconv_wl dst, src promote 16 bit src to 32 bit dest (signed)
HINS_sconv_wq dst, src promote 16 bit src to 64 bit dest (signed)
HINS_sconv_lq dst, src promote 32 bit src to 64 bit dest (signed)
HINS_uconv_bw dst, src promote 8 bit src to 16 bit dest (unsigned)
HINS_uconv_bl dst, src promote 8 bit src to 32 bit dest (unsigned)
HINS_uconv_bq dst, src promote 8 bit src to 64 bit dest (unsigned)
HINS_uconv_wl dst, src promote 16 bit src to 32 bit dest (unsigned)
HINS_uconv_wq dst, src promote 16 bit src to 64 bit dest (unsigned)
HINS_uconv_lq dst, src promote 32 bit src to 64 bit dest (unsigned)
HINS_ret none return from function
HINS_jmp label unconditional jump to label
HINS_call label call to function named by label
HINS_enter immediate reserve specified number of bytes for local variables
HINS_leave immediate deallocate specified number of bytes for local variables
HINS_localaddr dst, immediate store pointer to local variable at given offset in dst
HINS_cjmp_t dst, label conditional jump to label if dst contains true val
HINS_cjmp_f dst, label conditional jump to label if dst contains false val

A few things to note:

Also note that because each two-source-operand arithmetic instruction has 4 variants for the various operand sizes, the following function is provided:

HighLevelOpcode get_opcode(HighLevelOpcode base_opcode,
                           const std::shared_ptr<Type> &type) {
  if (type->is_basic())
    return static_cast<HighLevelOpcode>(
      int(base_opcode) + int(type->get_basic_type_kind())
    );
  else if (type->is_pointer())
    return static_cast<HighLevelOpcode>(
      int(base_opcode) + int(BasicTypeKind::LONG)
    );
  else
    RuntimeError::raise(
      "attempt to use type '%s' as data in opcode selection",
      type->as_str().c_str()
    );
}

The intent of this function is that you can pass it the “_b” variant of an opcode (the “base” opcode), and the data type upon which you want the instruction to operate, and it will return the correct variant of the opcode for that type.

Important

The get_opcode function will only work for “normal” arithmetic instructions with two source operands.

Example high-level translations

Here are some example translations of test programs in the public test repository to high-level code:

Example program Example translation
example01.c example01.txt
example02.c example02.txt
example03.c example03.txt
example04.c example04.txt
example05.c example05.txt
example06.c example06.txt
example07.c example07.txt
example08.c example08.txt
example09.c example09.txt
example10.c example10.txt
example11.c example11.txt
example12.c example12.txt
example13.c example13.txt

These example translations are by no means the only reasonable translations possible for these programs, but they should serve as a source of ideas for your high-level code generator.

Hints and tips

Print debug output as code comments: When you are trying to understand the code generated by your high-level code generators, it can be very helpful to be able to see information about storage allocation decisions and similar details. A good way to do this is to print this information as comments starting with /* and ending with */. You will notice that the high-level (and low-level) code generation examples use this technique. Note that for the code generation examples, this output represents arbitrary (but reasonable) decisions made by the reference implementation, and it’s very likely that your compiler will make different decisions.

Allocating storage: As mentioned previously, the StorageCalculator class can be used to lay out the fields of a struct data type, but it can also be used to lay out local variables in a block of memory in the stack frame.

The reference implementation does use StorageCalculator, and also implements an optimization: specifically, variables which have non-overlapping lifetimes can share storage. For example, in the code

if (a) {
  int b1[1];
  b1[0] = 1;
  c = a + b1[0];
} else {
  int b2[1];
  b2[0] = 2;
  c = a + b2[0];
}

the arrays b1 and b2 cannot exist at the same time, and can use the same storage. You are not required to implement this optimization, but if you are interested, you can make use of the property that a StorageCalculator object has value semantics. So, if m_storage_calc is a StorageCalculator member variable representing the memory storage for variables that are currently in scope, then the following code could be used to handle variable definitions in a nested scope:

// enter nested scope
StorageCalculator save = m_storage_calc;

// ...handle variable declarations in nested scope, allocate their
//    storage using m_storage_calc...

// leave nested scope
m_storage_calc = save;

The idea is that when control leaves a nested scope, memory storage allocated for any variables in that scope will no longer be needed.

In general, you will need to make sure that the local memory area in the stack frame (as specified in the enter and leave instructions) is large enough to accommodate all of the local variables requiring memory storage.

Storing operands in expression nodes. One of the most important things the high-level code generator will need to know is, for variables, array elements, struct fields, and computeed values, the location serving as storage for that variable or value. A storage location is one of the following possibilities:

As the code generator recursively generates high-level code for expressions, it should store an Operand in the node indicating the storage location associated with that node.

For computed values (for example, the result of an addition), the code generator should allocate a temporary virtual register, and use that virtual register as the storage location. So, the Operand could be constructed like this:

int temp_vreg = /* allocate the next unused temporary vreg number */;
Operand operand(Operand::VREG, temp_vreg);

Any expression node that is an lvalue should be annotated with an operand that exactly represents the storage location if the variable, array element, or struct field that the lvalue represents. In the case of local variables whose storage is a virtual register, this is easy: the operand should just name that virtual register. For values located in memory, the operand should be a memory reference via an address stored in a virtual register. It is very possible that you will need to generate a sequence of instructions, possibly involving temporary virtual registers, to compute the exact address of the referenced lvalue. You can see this happening in example09.txt, because each array element reference requires an address computation.

Function calls. The model for function calls is as follows:

To call a function, generate code for each argument expression, and move its value into the appropriate argument register. Then emit a call instruction with the name of the function as the label operand. (We will not be supporting function calls by any mechanism other than directly naming the called function.) When the called function returns, the return value will be in vr0.

One important detail is how a function should manage its own arguments. It’s fairly straightforward.

  1. Parameters should have storage allocated just like any other local variable. Their storage is not the argument register containing the parameter value.
  2. At the beginning of the generated high-level code for a function, the value of each argument register should be copied into the storage location of the corresponding parameter variable.
  3. When the code generator needs to emit a function call, it can simply place the argument values in the appropriate argument registers.

Milestone 2: x86-64 code generation

Low-level code generation

Your task in Milestone 2 is to translate each high-level instruction produced by your high-level code generator into a sequence of low-level (x86-64) instructions.

The LowLevelCodeGen class, defined in the include/lowlevel_codegen.h and ll_codegen/lowlevel_codegen.cpp header and source files, implements this transformation. Specifically, the translate_instruction member function will be called for each high-level Instruction object, and should append one or more low-level Instruction objects to the ll_iseq (low-level instruction sequence) object which carry out the operation represented by the high-level instruction.

Low-level opcodes and register numbers are defined in include/lowlevel.h.

Note that there are four different Operand::Kind values for machine registers. Specifically, Operand::MREG8, Operand::MREG16, Operand::MREG32, and Operand::MREG64. The select_mreg_kind helper function is intended to make it easy to select the machine operand kind for a particular operand size. Also note that build/highlevel.h (automatically generated by the scripts/gen_highlevel_ir.rb script) declares two functions useful for determining the size of the source and destination operands of a high-level instruction. These functions are highlevel_opcode_get_source_operand_size and highlevel_opcode_get_dest_operand_size.

Many of the low-level instructions, like the high-level instructions, have 4 variations for the different operand sizes. The select_ll_opcode helper function is intended to select the appropriate variant of a low-level opcode.

You will see that translations of the HINS_enter, HINS_leave, and HINS_ret high-level opcodes are already provided (although you may need to fill in some details.)

To give you a sense of what code to implement a translation might look like, here is a possible implementation of a transformation of the HINS_mov_b/w/l/q opcodes:

  if (match_hl(HINS_mov_b, hl_opcode)) {
    int size = highlevel_opcode_get_source_operand_size(hl_opcode);

    LowLevelOpcode mov_opcode = select_ll_opcode(MINS_MOVB, size);

    Operand src_operand =
      get_ll_operand(hl_ins->get_operand(1), size, ll_iseq);
    Operand dest_operand =
      get_ll_operand(hl_ins->get_operand(0), size, ll_iseq);

    if (src_operand.is_memref() && dest_operand.is_memref()) {
      // move source operand into a temporary register
      Operand::Kind mreg_kind = select_mreg_kind(size);
      Operand r10(mreg_kind, MREG_R10);
      ll_iseq->append(
        new Instruction(mov_opcode, src_operand, r10)
      );
      src_operand = r10;
    }

    ll_iseq->append(
      new Instruction(mov_opcode, src_operand, dest_operand)
    );
    return;
  }

Note that the get_ll_operand (“get low-level operand”) function returns an low-level Operand naming the storage location for a given high-level operand. It is passed a std::shared_ptr to the low-level InstructionSequence because it might be necessary to emit one or more low-level instructions in order to create a meaningful low-level operand.

Also note that the generated low-level code uses the %r10 register (or one of its sub-registers) to store a temporary value, to avoid the possibility of emitting a mov instruction with two memory operands. The %r10 and %r11 registers are ideal for being used as short-term temporaries in the low-level code, since they are caller-saved, and are not used for arguments or return value in function calls.

Example low-level translations of test programs

The assign04 directory in the test repository, in addition to having example high-level code translations of the test programs, has example low-level translations in the example_lowlevel_code subdirectory.

Here are some example translations:

Example program Example HL translation Example LL translation
example01.c example01.txt example01.S
example02.c example02.txt example02.S
example05.c example05.txt example05.S
example09.c example09.txt example09.S
example10.c example10.txt example10.S
example12.c example12.txt example12.S
example16.c example16.txt example16.S
example23.c example23.txt example23.S
example25.c example25.txt example25.S
example27.c example27.txt example27.S

These low-level translations may be useful as a reference. Note that the test repository has more test programs and example translations.

Generating references to virtual registers

The high-level instructions will have many references to virtual registers.

vr0 is the return value register, so any references to vr0 should be translated to refer to %rax, or a sub-register (e.g., %eax) of the appropriate size.

vr1 through vr6 are the function argument registers, and these should be translated to refer to %rdi, %rsi, %rdx, %rcx, %r8, and %r9, respectively. Again, use the appropriate sub-register (e.g., %edi) depending on the operand size.

You can assume that vr7 through vr9 will not be used, because the test programs will not have any functions with more than 6 parameters.

Virtual registers vr10 and above will be used as storage for local variables, or as storage for temporary values. These should be allocated in memory. So, the local storage area will need to be large enough for both local variables allocated in memory, and the virtual registers used for locals and temporaries. You can reserve 8 bytes of local storage for each virtual register.

Testing your low-level code generator

The public test repository is intended to help you validate your low-level code generator.

Start by making sure that your clone of the test repository is up to date by running git pull, then change directory into the assign04 subdirectory.

Set the ASSIGN04_DIR environment variable to wherever your compiler project is located, e.g.

export ASSIGN04_DIR=~/compilers/assign04

The build.rb script is used to invoke your compiler on a C source file, and then assemble and link the code into an executable. You can run it using the command

./build.rb testname

where testname is the stem of one of the test programs. For example, use the command

./build.rb example01

to assemble and link the input/example01.c test program. The build.rb script will create a directory called out (if it doesn’t already exist), and use this directory to store the generated assembly languge source code. For example, if you are compiling the example01 test, the file out/example01.S will contain the low-level code generated by your compiler. If the executable is assembled and linked successfully, it will also be in the out directory. For the example01 test, the executable will be called out/example01, so you could run the executable using the command

./out/example01

As with the previous assignments, a script called run_test.rb runs a test and indicates whether the test passed or failed. So,

./run_test.rb example01

would invoke your compiler on input/example01.c, assemble and link the generated code, invoke the program (providing input if the test requires it), and then check the program exit code and generated output against the expected exit code and output. A successful test will look something like this:

$ ./run_test.rb example01
Generated code successfully assembled, output exe is out/example01
Test passed!

If the executable does not exit with the correct exit code (i.e., if it does not return the correct value as the return value from the main function), you will see something like this:

$ ./run_test.rb example01
Generated code successfully assembled, output exe is out/example01
Executable exited with exit code 0, expected 3

If the executable does not produce the expected output, you will see something like this:

$ ./run_test.rb example18
Generated code successfully assembled, output exe is out/example18
10a11
> 0
Output did not match expected output

When the output does not match the expected output, you will see output from the diff program summarizing which parts of the output did not match.

Function calls, “built-in” functions

Function calls should be fairly straightforward to handle, as long as the high-level code generator follows the recommendations mentioned previously. Each occurrence to the vr0 result register should be translated to become an occurrence to %rax or a sub-register, and each occurrence of the vr1 through vr6 argument registers should become occurrences of the appropriate x86-64 argument registers (%rdi, %rsi, etc.) or sub-registers.

call instructions in the high-level code should become call instructions in the low-level code, with the operand being a label naming the called function. In fact, it is sufficient to handle them in the low-level code generator as follows:

  if (hl_opcode == HINS_call) {
    ll_iseq->append(new Instruction(MINS_CALL, hl_ins->get_operand(0)));
    return;
  }

The build.rb script generates some “helper” functions for test programs to use to read input values and print output values. These have the following prototypes:

void print_str(const char *s);
void print_i32(int n);
int read_i32(void);
void print_i64(int n);
int read_i64(void);
void print_nl(void);
void print_space(void);

As long as your compiler can handle function declarations correctly, you will not need to do anything special to allow calls to these functions to work.

You can examine the code for the build.rb script to see the assembly code for these functions.

Other hints and tips

Development strategy. You should develop your low-level code generator incrementally. Start with the simplest test program, which is example01.c, and get it working. Then move on to more complicated ones. The test programs are numbered in a way that roughly indicates an order from less complex to more complex, although there are some exceptions. For example, example14 is a better place to start when working on arrays and array subscript operations than example09.

Think very carefully about how local memory storage is addressed. In this version of your compiler, most values will (ultimately) be stored in memory, because virtual registers will have their storage allocated in memory. You will need to allocate space in memory for the virtual registers, and make sure that it does not overlap the memory used for local variables allocated in memory (such as arrays and struct instances.)

Debugging. You can run gdb on an executable that is misbehaving. Step through the generated instructions line by line. Inspect values in memory. For example, if -48(%rbp) contains a 64-bit integer value, you could print it using either of the following commands:

print *((long *)($rbp - 48))
print/x *((long *)($rbp - 48))

The first command would print the value in decimal, and the second in hexadecimal.

Address computations should be 64 bit. Because pointers are 64 bits, all address computations should be 64 bit computations. You will need to think about how to ensure this. In a source program, it is quite common for an array index to be a 32-bit int; for example,

int n;
int arr[3];
n = 0;
arr[n + 1] = 42;

The reference implementation uses the strategy of generating code for an index value using whatever type is appropriate based on the original source code, and then promoting that value to 64 bits before it is used to compute an address. You will see movslq instructions in some of the example low-level outputs, where a 32-bit index value is promoted to 64 bits.

README.txt

For both milestones, please submit a README.txt with your submission which briefly discusses anything interesting or significant you want us to know about your implementation. If there are any features you weren’t able to get fully working, or if you implemented extra functionality, this is a place you could document that.

Submitting

Create a zipfile using the command

make solution.zip

Note: make sure your README.txt is in the top-level directory (i.e., the one that contains the Makefile.)

Upload the solution.zip zipfile to Gradescope as Assignment 4 MS1 or Assignment 4 MS2 (depending on which milestone you are submitting.)