Gettysburg College

CS 221
Computer Organization and Assembly Language Programming

Spring 2023

Assignment 11

Due: Thu, Apr 27, by 11:59pm

Readings

Description

This assignment will focus on Assembler and Translator programs. The goal is to build a Translator from ARM Assembly to Hack Assembly.

Create a C++ project called ArmToHack with files:

Download the following files into the project (Right-Click, Save As). Read the comments inside token_io.h to get familiar with the API:

token_io.h | token_io.cpp

Make sure your code has the relevant includes:

#include <string>
#include <map>
#include "token_io.h"

Testing

A number of example programs, named programX.arm, have been provided for testing the code.

ArmToHack Translator (No Jump)

Class ArmToHack will have the following data members:

const methods and const& params

Apply const and const& to methods and parameters, respectively, where appropriate.

file streams

File streams for the current input and output programs:
  • ifstream: the ARM Assembly input file
  • ofstream: the Hack Assembly output file

Both streams use the same include:

#include <fstream>

line number

Current line number in the Hack program.

lookup tables

Hash Map for associating registers with their addresses in RAM:
  • R0..R15 in memory cells 0..15
  • FP≡R12, SP≡R13, LR≡R14, PC≡R15 in memory cells 12..15
    (special registers have two names that map to the same RAM cell)

To check if a C++ hash map has a given key:

if ( myMap.count(myKey) == 0 ) {
    // key was not found
}
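The register table and the key check above can be sketched together as follows. This is a minimal sketch, not the required design: the names make_register_table and is_register are hypothetical, and std::map is used only because <map> appears in the include list above.

```cpp
#include <map>
#include <string>

// Builds the register-name -> RAM-address table described above.
// The function name is hypothetical, chosen for illustration.
std::map<std::string, int> make_register_table() {
    std::map<std::string, int> regs;
    for (int i = 0; i <= 15; ++i) {
        regs["R" + std::to_string(i)] = i;   // R0..R15 -> cells 0..15
    }
    // Special registers are second names for the same RAM cells.
    regs["FP"] = 12;
    regs["SP"] = 13;
    regs["LR"] = 14;
    regs["PC"] = 15;
    return regs;
}

// Returns true when the token names a known register (hypothetical helper).
bool is_register(const std::map<std::string, int>& regs,
                 const std::string& token) {
    return regs.count(token) != 0;
}
```

With this table, regs.at("LR") and regs.at("R14") both give 14, which is what lets the translators treat the two names identically.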

Here is the list of required methods:

constructor()

Does the appropriate initialization. At this point there is nothing to be done with the two streams.

reset()

Clears the relevant data members in preparation for another translation. Still nothing to be done with the two streams.

write_line(line)

Writes a single complete line of Hack Assembly.

Important: There should not be any direct writes in the code. All writes to the output should be done by calling this method. (Why? It also needs to do something important that is otherwise easy to overlook.)

Writing to a file stream in C++ is the same as writing to the screen.
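To illustrate the point about streams, here is a small sketch in which the same function writes to the screen, to a file, or to an in-memory buffer, because all three are std::ostream objects. The names emit_line and render_lines are hypothetical and are not part of the assignment; in the project this role belongs to write_line.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: writes one line to any output stream.
// std::cout, std::ofstream and std::ostringstream are all accepted.
void emit_line(std::ostream& out, const std::string& line) {
    out << line << '\n';
}

// Collects lines in an in-memory stream, which is handy for testing.
std::string render_lines(const std::vector<std::string>& lines) {
    std::ostringstream buffer;
    for (const std::string& line : lines) {
        emit_line(buffer, line);
    }
    return buffer.str();
}
```

Calling emit_line(std::cout, "D=M") prints to the screen, while passing an open std::ofstream sends the same text to a file.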

void translate(in-filename, out-filename)

For now mostly calls translateFirstPass.

void translateFirstPass(in-filename, out-filename)

Opens the two streams with the given filenames and carries out the translation.

At this point there will be no Branch instructions.

Skip empty lines and don't forget to close the streams.

Here is how to process a file line at a time:

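std::string line;
myInputStream.open( filename );
// getline returns the stream, which tests false once no line could be read
while ( std::getline( myInputStream, line ) ) {
    // line now holds a single line from the input
    // do something with the line
}
myInputStream.close();

void translate(line)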

Simply dispatches to the relevant translator for the given line.

void translateXXX(line)

Translates the given line, which is an ARM Assembly instruction, to the sequence of Hack Assembly instructions.

At this point it should support the following (using only registers; #N is done later):

MOV, ADD, SUB, RSB, CMP, END

Note that END involves an unconditional jump to the same line. Even though jumps are not supported yet, the code for END can be written, since we know the current line.
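The remark about END can be made concrete with a short sketch. It assumes the translator knows the Hack line number at which the END code will begin; the helper name end_code is hypothetical, and it returns the lines rather than writing them so that it can be tested in isolation.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: Hack code for END, given the Hack line number
// at which this code will be placed.
std::vector<std::string> end_code(int line) {
    return {
        "@" + std::to_string(line),  // A = address of this very instruction
        "0;JMP"                      // unconditional jump back to it
    };
}
```

For example, if END begins on Hack line 17, the two lines are @17 and 0;JMP, which loop forever on themselves.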

testing

Here are simple test programs:

program1.arm | program2.arm | program3.arm

ArmToHack Translator (Numeric Constants)

Add support for numeric constants for any instructions that allow Operand2 (see ARM Quick Reference).

You could consider adding a method write_oper2(token) that writes the corresponding Hack Assembly code for one of the possible tokens:

Rz, #N, #+N, #-N

Recall that the handout code has a strip function that can remove characters, and you can use [i] to access a character in a C++ string.
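One possible shape for the Operand2 handling is sketched below. It is an illustration under assumptions: the helper name load_oper2 is hypothetical, it returns the Hack lines instead of writing them, it uses substr rather than the handout's strip function, and it chooses to leave the operand value in the D register.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: Hack lines that load an Operand2 token into D.
// `regs` maps register names to RAM addresses.
std::vector<std::string> load_oper2(const std::string& token,
                                    const std::map<std::string, int>& regs) {
    if (!token.empty() && token[0] == '#') {
        bool negative = token.size() > 1 && token[1] == '-';
        bool hasSign = token.size() > 1 &&
                       (token[1] == '+' || token[1] == '-');
        std::string digits = token.substr(hasSign ? 2 : 1);
        // Hack @ instructions accept only non-negative constants,
        // so a negative value is loaded as @N followed by D=-A.
        return { "@" + digits, negative ? "D=-A" : "D=A" };
    }
    // Otherwise the token is assumed to be a register name found in regs.
    return { "@" + std::to_string(regs.at(token)), "D=M" };
}
```

With this sketch, #-7 becomes @7 followed by D=-A, and R3 becomes @3 followed by D=M. Constants wider than 15 bits do not fit in a Hack @ instruction and are not handled here.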

Here is a simple test program:

program4.arm

ArmToHack Translator (With Back Jumps)

Add support for basic do-while jumps via the following data members and methods:

map1

Hash Map for associating ARM BXX mnemonics with Hack JXX mnemonics.
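A possible content for this map is sketched below. Which BXX mnemonics the test programs actually use is an assumption, and the pairing is only meaningful if the code emitted for CMP Rn, Rm leaves Rn - Rm in the D register. The function name make_jump_table is hypothetical.

```cpp
#include <map>
#include <string>

// Hypothetical builder for the BXX -> JXX table.
// Assumes D holds (Rn - Rm) from the preceding CMP.
std::map<std::string, std::string> make_jump_table() {
    return {
        {"BEQ", "JEQ"}, {"BNE", "JNE"},
        {"BGT", "JGT"}, {"BGE", "JGE"},
        {"BLT", "JLT"}, {"BLE", "JLE"},
        {"BAL", "JMP"}, {"B",   "JMP"}   // unconditional branches
    };
}
```

A conditional jump can then be written as "D;" followed by the mapped mnemonic, for example D;JGT for BGT.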
map2

Hash Map for associating ARM labels with their addresses in Hack programs.

See below for more detailed explanation.

translateJumps(line)

Translates all BXX commands (ignore BL for now). Our convention is to emit the value of the D register and use that for the jump decision.

Since we are only handling back jumps, the address of the jump is already known at that point in the translation process.

translateFirstPass(...)

Modify this method so that when labels are encountered they are associated with the corresponding line in the Hack program.

Our convention is that labels are always written on separate lines (only comments allowed). Thus any line that has no second component is considered a label.

(There is one exception to the rule above.)

testing

Here is a simple test program. Upon completion you should see R0=5, R1=15, R2=6.

program5.arm

More on map2

Here is the meaning of the second map. In the listings below, upon reading LABEL in the ARM program, the translator stores map2["LABEL"]=8 to indicate that LABEL is associated with ADD, which happens to start on line 8 in the Hack program.

Both ARM programs given below should produce the same Hack program. Blank lines are simply ignored and have no effect.

The line numbers are given only for reference. They are not part of the input/output.

ARM Code                Hack Code
-----------------       ---------------
0: MOV Rx, Ry            0: code for MOV
1: SUB Rx, Rz, Ry        1: code for MOV
2: LABEL                 2: code for MOV
3: ADD Rx, Ry, Rz        3: code for SUB
4: CMP Rx, Ry            4: code for SUB
5: BGT LABEL             5: code for SUB
                         6: code for SUB
                         7: code for SUB
ARM Code                 8: code for ADD
------------------       9: code for ADD
0: MOV Rx, Ry           10: code for ADD
1:                      11: code for ADD
2: SUB Rx, Rz, Ry       12: code for ADD
3:                      13: code for CMP
4: LABEL                14: code for CMP
5:                      15: @8 (go back to ADD)
6: ADD Rx, Ry, Rz       16: code for BGT
7: CMP Rx, Ry
8:
9: BGT LABEL
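The bookkeeping in the listings above can be checked with a small sketch. It is an illustration only: the helper name label_addresses is hypothetical, the per-instruction line counts are passed in rather than produced by real translators, std::istringstream stands in for the handout's token_io API, and neither comments nor the one exception to the label rule are handled.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: computes label -> Hack line for an ARM program.
// `sizes` says how many Hack lines each mnemonic expands to.
std::map<std::string, int> label_addresses(
        const std::vector<std::string>& armLines,
        const std::map<std::string, int>& sizes) {
    std::map<std::string, int> labels;
    int hackLine = 0;                        // next free line in the Hack program
    for (const std::string& line : armLines) {
        std::istringstream words(line);
        std::string first, second;
        if (!(words >> first)) continue;     // blank line: no effect
        if (!(words >> second)) {            // no second component: a label
            labels[first] = hackLine;
            continue;
        }
        hackLine += sizes.at(first);         // throws for an unknown mnemonic
    }
    return labels;
}
```

Using the line counts shown in the listing (MOV 3, SUB 5, ADD 5, CMP 2, BGT 2), both ARM programs give LABEL the address 8, with or without the blank lines.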

ArmToHack Translator (With Full Jumps)

Add support for full jump functionality via the following data members and methods:

Here is the meaning of the new map:

ARM Code                Hack Code
-----------------       ---------------
0: MOV Rx, Ry            0: code for MOV
1: SUB Rx, Rz, Ry        1: code for MOV
2: CMP Rx, Rz            2: code for MOV
3: BEQ LABEL             3: code for SUB
4: ADD Rx, Ry, Rz        4: code for SUB
LABEL                    5: code for SUB
5: RSB                   6: code for SUB
                         7: code for SUB
                         8: code for CMP
                         9: code for CMP
                        10: @-1 (unknown address)
                        11: code for JXX
                        12: code for ADD
                        13: code for ADD
                        14: code for ADD
                        15: code for ADD
                        16: code for ADD
                        17: code for RSB
                        18: code for RSB
                        19: code for RSB
                        20: code for RSB
                        21: code for RSB

After translateFirstPass the maps will have:

translateJumps(line)

Modified to load the A register with the not yet known address of the jump. For now:
  • write the invalid instruction @-1
  • associate the current line in the Hack program with the ARM label

translateSecondPass(in-filename, out-filename)

This method reads the input file one line at a time and prints each line to the output file unchanged.

The only exceptions are lines whose index appears in the map. The text for these lines will currently be @-1, but now we can fix them by getting the correct line number from the other map.

The input file is assumed to be in Hack Assembly, i.e., the partially created final output.
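The patching step can be sketched on in-memory lines as follows. This is a simplified illustration, not the required file-based method: the helper name patch_jumps and the parameter names pending and labels are hypothetical, with pending standing for the map from Hack line index to ARM label and labels for the map from ARM label to Hack address.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: replaces placeholder lines with real addresses.
// pending: Hack line index -> ARM label whose address was unknown
// labels:  ARM label -> Hack line where the code after the label begins
std::vector<std::string> patch_jumps(
        const std::vector<std::string>& hackLines,
        const std::map<int, std::string>& pending,
        const std::map<std::string, int>& labels) {
    std::vector<std::string> out;
    for (int i = 0; i < static_cast<int>(hackLines.size()); ++i) {
        auto it = pending.find(i);
        if (it == pending.end()) {
            out.push_back(hackLines[i]);     // ordinary line: copy unchanged
        } else {
            // placeholder line: look up the label's real address
            out.push_back("@" + std::to_string(labels.at(it->second)));
        }
    }
    return out;
}
```

In the listing above, line 10 holds @-1 for LABEL and the code after LABEL starts on line 17, so the second pass would rewrite line 10 as @17.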

void translate(in-filename, out-filename)

Modify this method to do the following:

translateFirstPass(in-filename, tmp-filename)
translateSecondPass(tmp-filename, out-filename)

Here tmp-filename is in-filename with the added suffix ".tmp"

testing

Here is a simple test program. It is the same as program5.arm but uses a while loop, so you should see the same values in RAM.

program6.arm

ArmToHack Translator (Jumps Misc.)

Make the necessary updates to handle the following:

void translateJumps(line)

Handles BL which is a jump that also stores the address of the next instruction in LR.
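One way to picture BL is sketched below for the case where the target address is already known. The helper name bl_code is hypothetical, and the particular six-line sequence is an assumption; what matters is that the value stored in LR is the Hack line immediately after the sequence.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: Hack code for BL with a known target address.
// startLine is the Hack line at which this sequence begins.
std::vector<std::string> bl_code(int startLine, int target) {
    const int length = 6;                          // lines emitted below
    return {
        "@" + std::to_string(startLine + length),  // address of next instruction
        "D=A",
        "@14",                                     // LR is RAM[14]
        "M=D",                                     // LR = return address
        "@" + std::to_string(target),
        "0;JMP"                                    // unconditional jump
    };
}
```

For a forward jump the fifth line would initially be the @-1 placeholder, fixed later by the second pass.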

Here is a simple test program. It is the same as program6.arm, but you should also see a value in LR that corresponds to the address where END begins.

program7.arm

void translateXXX(line)

Update any ARM instruction that has a destination register to handle writing to PC.

These instructions proceed as before, but if the destination register is PC-equivalent, a jump is carried out to the address held in PC.

Consider adding a method write_pcjump(regRd). If the given register is PC-equivalent, it writes the relevant instructions; otherwise it does nothing. The modifications to the existing code should be fairly minimal.
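A sketch of the idea behind write_pcjump follows. It is written as a hypothetical free function pcjump_code that returns the lines instead of writing them, and it assumes PC-equivalent means the names PC and R15, both stored in RAM cell 15.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: extra Hack lines to emit after an instruction
// whose destination register is regRd.
std::vector<std::string> pcjump_code(const std::string& regRd) {
    if (regRd != "PC" && regRd != "R15") {
        return {};                     // not PC-equivalent: nothing to emit
    }
    return {
        "@15",                         // PC is RAM[15]
        "A=M",                         // A = value just written to PC
        "0;JMP"                        // jump to that address
    };
}
```

Because the helper emits nothing for ordinary registers, each translator can call it unconditionally after its normal code.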

Here are sample test programs. The programs should jump back and forth and conclude with R0=1 in RAM.

program8.arm | program9.arm | edit for other cases

Final Test

Convert the following program and execute it on the Computer. The program adds the values computed during the iteration of the 3n+1 problem.

When the program is done, you should see R5=518.

Upload a screenshot named 3n1.png of the Hardware Simulator / CPU Emulator that shows the contents of the RAM.

program10.arm


What to turn in

Upload ArmToHack.h, main.cpp, 3n1.png to the Moodle dropbox.