Because Assembly Language must be a good language for compiling or interpreting many different languages, it is intentionally very simple. It provides all of the necessary capabilities required for high level languages, but does so using simple instructions that can be combined in many different ways to support different languages. This is called FLEXIBILITY.
See table on page 44 of textbook for details.
Pseudo-operations (pseudo-ops) - These directives do not
generate any machine instructions, but only provide information
to the assembler. They usually start with a period (.), e.g.
.global (makes a label accessible from outside the file),
.word (initializes a memory location), etc.
How does assembly work??
All "C" programs have a .c extension (source code)
The compilation results in the .o file (object file)
The .o files are linked and we get the executable file, a.out
The compilation is a two-step process:
Step 1: C program converted into .s file (assembly language)
Step 2: .s file converted to .o file (using the assembler)
To convert a C program into the corresponding assembly
language, we use the command
gcc -S program.c
To assemble and link the .s file into an executable,
gcc -g file.s -o <filename>
So, the entire process of creating an executable can be summed up as:

Source Code (filename.c)
      ||
      ||  (compiler, such as gcc -S filename.c)
      \/
Assembly Language (filename.s)
      ||
      ||  (assembler, such as gcc -c filename.s)
      \/
Object Code (filename.o)
      ||
      ||  (linker/loader, such as ld filename.o)
      \/
Binary file (executable, such as a.out)
Each assembly language instruction has two parts:
Opcode (the name of the instruction)
Operands (the values or data manipulated by the instruction)
The opcode is necessary in every instruction, but the operands may or may not be present.
Most of the instructions in SPARC have three operands: three registers, or two registers and a literal constant.
opcode   operand 1 (optional), operand 2 (optional), operand 3 (optional)

add %r1, %r2, %r3
sub %r1, %r2, %r3
mov %r1, %r2
The contents of the first register are combined with those of the second register or literal constant, and the result is stored in the third register.
Some instructions are:
clr r1 (clear register r1, r1 has to be a valid register name)
mov r1(or c), r2 (move the contents of r1 or the literal constant c to r2)
add r1, r2(or c), r3 (add r1 and r2(or c), store result in r3)
sub r1, r2(or c), r3 (subtract r2(or c) from r1, store result in r3)
The format of an instruction is:
op reg, reg_or_imm, reg
clr reg                        ! sets a register to all zeros
mov reg_or_imm, reg            ! copies a register or an immediate value into a register
add reg1, reg2_or_imm, reg3    ! reg1 + reg2_or_imm = reg3
sub reg1, reg2_or_imm, reg3    ! reg1 - reg2_or_imm = reg3

Note: Anything after the "!" on a line is ignored by the assembler, and is treated as a comment.
An immediate value is data encoded right in the instruction, like this:
add %o1, 13, %o2
When the program is run, some memory has to be allocated for
it on the stack. This is done using the instruction
save %sp, -64, %sp
Some other simple instructions:
labels -- why are they used, how are they defined?
e.g.
_main: save %sp, -64, %sp
Here, _main is the label for the main function. All programs need to have the main function, as control starts from this function.
Consider the following C program:

/* first.c */
#include <stdlib.h>   /* for exit() */

void main()
{
    int x, y;   /* x is left uninitialized in this listing */

    y = (x - 1) * (x - 7) / (x - 13);
    exit(0);
}

We can compile this code with gcc (the GNU C compiler). To get the assembly code corresponding to this code, we use the -S option, i.e.

gcc -S first.c
We now get a file called first.s in the current directory, which is the assembly file. This is what it looks like:

/* first.s */
gcc2_compiled.:
___gnu_compiled_c:
.text
        .align 4
        .global _main
        .proc 020
_main:
        !#PROLOGUE# 0
        save %sp, -112, %sp
        !#PROLOGUE# 1
        call ___main, 0
        nop
        ld [%fp-12], %o1
        add %o1, -1, %o0
        ld [%fp-12], %o2
        add %o2, -7, %o1
        call .umul, 0
        nop
        ld [%fp-12], %o2
        add %o2, -13, %o1
        call .div, 0
        nop
        st %o0, [%fp-16]
        mov 0, %o0
        call _exit, 0
        nop
L1:
        ret
        restore

This code looks quite complicated, so we will take up a simplified version of it and go through it step by step. The simplified version does the same thing, i.e. it calculates the value of the expression, but it is well commented and easier to understand.
For convenience, we first create a file called first.m which is not the assembly code, but very close to it. It is the code written using some macros, and needs to be processed by a macro-processor before it can be converted into assembly language code.
P.S. If you are wondering what macros are, remember #define in C/C++?
Look at the following file:

/* first.m */
/* This program computes the expression:
   y = (x - 1) * (x - 7) / (x - 13) for x = 9
   The polynomial coefficients are: */
define(a2, 1)
define(a1, 7)
define(a0, 13)
/* Variables x and y are stored in %l0 and %l1 */
define(x_r, l0)
define(y_r, l1)

        .global _main
_main:  save %sp, -64, %sp
        mov 9, %x_r            !initialize x
        sub %x_r, a2, %o0      !(x - a2) into %o0
        sub %x_r, a1, %o1      !(x - a1) into %o1
        call .mul
        nop                    !result in %o0
        sub %x_r, a0, %o1      !(x - a0) into %o1, the divisor
        call .div
        nop                    !result in %o0
        mov %o0, %y_r          !store it in y
        mov 1, %g1             !trap dispatch
        ta 0                   !trap to system

This is not actually assembly language, as there is no instruction called define in assembly language. The define statements just give symbolic names to the constants, i.e. a2 stands for 1, a1 stands for 7, etc. in the code given.

To convert this code into assembly language, we need to pass it through the m4 macro processor. To do that, we use the following command:

m4 < first.m > first.s

This creates the new assembly language file from the .m file. The assembly language file is given below:

/* first.s */
/* This program computes the expression:
   y = (x - 1) * (x - 7) / (x - 13) for x = 9
   The polynomial coefficients are: */
/* Variables x and y are stored in %l0 and %l1 */

        .global _main
_main:  save %sp, -64, %sp
        mov 9, %l0             !initialize x
        sub %l0, 1, %o0        !(x - 1) into %o0
        sub %l0, 7, %o1        !(x - 7) into %o1
        call .mul
        nop                    !result in %o0
        sub %l0, 13, %o1       !(x - 13) into %o1, the divisor
        call .div
        nop                    !result in %o0
        mov %o0, %l1           !store it in y
        mov 1, %g1             !trap dispatch
        ta 0                   !trap to system

If you notice, all the define statements are gone, and the names a0, a1, a2 are all replaced by their corresponding values in this code. The only disadvantage is that this file is a lot more difficult to read and debug than the previous file, especially when several names have the same value, i.e. if a0, a1, a2 all had the same value, say 1, you would not know which one is being referred to.
Also, x_r and y_r are replaced by l0 and l1 throughout the file. We will look at the program in more detail in the next class. For now, let's just compile it to get the executable. To do that, type the following command:
gcc first.s -o first
This just produces the executable called first. To execute the file, just type the filename (first) on the command prompt.

colossus>first
colossus>

What happened? Well, your program ran successfully, but there was no output, the reason being that there are no input/output statements in your code. So it just executed the segment of code, calculated the value of y, and then ended. So how do we know if we did the right thing or not? Well, to find out, we use something called a debugger. We'll talk about it in more detail in one of the later classes. For now, this much is enough!! Bye :-)