Microprocessor assembly language programming pdf
Typical first exercises in assembly language include printing the message 'hello world' and comparing two numbers and outputting the lower one. When printing one character on the screen, you have to write 2 bytes. To write your first 80x86 assembly language program, open your text editor and type in the lines that make up the source code of program HELLO.
There are three main reasons for writing this book. While several assembly language books are on the market, almost all of them cover only an older Intel processor. A modern computer organization or assembly language course requires treatment of a more recent processor like the Pentium, a later member of the Intel family.
INC' ;include an assembly library.
We will first define what macros are, how they are useful, and how they are implemented in an assembly language program for the microprocessor. Assembler directives can also be used to manipulate the presentation of a program to make it easier to read and maintain.
Data sections
Data definition instructions define data elements that hold data and variables. They define the type of the data, its length, and its alignment. Some assemblers classify these as pseudo-ops.
Macros
Many assemblers support predefined macros, and others support programmer-defined and repeatedly re-definable macros involving sequences of text lines in which variables and constants are embedded. This sequence of text lines may include opcodes or directives. Once a macro has been defined, its name may be used in place of a mnemonic. When the assembler processes such a statement, it replaces the statement with the text lines associated with that macro, and then processes them as if they existed in the source code file.
Since macros can have 'short' names but expand to several or indeed many lines of code, they can be used to make assembly language programs appear to be far shorter, requiring fewer lines of source code, as with higher level languages.
They can also be used to add higher levels of structure to assembly programs and, optionally, to introduce embedded debugging code via parameters and other similar features.
Code section
In the code section we write the actual program, containing the instructions that perform the required task.
Compiler Dependencies
Emu is a microprocessor emulator and disassembler. The problems are that Emu can open only one file at a time, and that I dislike its text editor. I used another editor for writing my program and then opened the main file of my project in Emu every time I wanted to compile it.
That's boring and repetitive, so I made a build environment that uses the standard make command for building my program.
It is easy to use and efficient.
Runtime Dependencies
To run your application, you will need to install a few pieces of supporting software.
Debugging Dependencies
Debugging also varies from system to system and from OS to OS. Different debuggers are required depending on the OS.
Linking Dependencies
In the Emu build repository, the extension used when launching emu depends on your code. If your code is meant to be a flat binary, set this variable to com; if your code has different segments, set it to exe.
For Linux and Ubuntu you have to set.
Assembling, Linking and Executing
Assembling: converts the source program into an object program if it is syntactically correct, generating an intermediate object module.
Linking: converts the .OBJ module into an .EXE executable module. The linker can also produce other files, including .MAP files, some of which are optional.
Loading and Executing: loads the program into memory for execution.
All assembler directives begin with a period. Some of them are described here.
Integer must be a positive integer expression and must be a power of 2. If specified, pad is an integer byte value used for padding. The default value of pad for the text section is 0x90 (nop); for other sections, the default value of pad is zero (0).
When issued with arguments, the integer must be positive.
Each byte must be an 8-bit value.
The storage is referenced by the identifier name. Size is measured in bytes and must be a positive integer. Name cannot be predefined. Alignment is optional. If alignment is specified, the address of name is aligned to a multiple of alignment.
String specifies the name of the source file associated with the object file.
Each symbol is either defined externally or defined in the input file and accessible in other files. A global symbol definition in one file satisfies an undefined reference to the same global symbol in another file. Multiple definitions of a defined global symbol are not allowed.
If a defined global symbol has more than one definition, an error occurs. All references to symbol within a dynamic module bind to the definition within that module.
Symbol is not visible outside of the module.
String is any sequence of characters, not including the double quote (").
The storage is referenced by the symbol name, and has a size of size bytes. Name cannot be predefined, and size must be a positive integer. If alignment is specified, the address of name is aligned to a multiple of alignment bytes. If alignment is not specified, the default alignment is 4 bytes.
Each symbol is defined in the input file and not accessible to other files.
Default bindings for the symbols are overridden. Because local symbols are not accessible to other files, local symbols of the same name may exist in multiple files.
Each expression must be a 32-bit value and must evaluate to an integer value.
Each expression must be a bit value, and must evaluate to an integer value.
If section does not exist, a new section with the specified name and attributes is created. If section is a non-reserved section, attributes must be included the first time the section is specified by the directive.
Expression can be any legal expression that evaluates to a numerical value.
Outside of the module, symbol is treated as global. Default bindings of the symbol are overridden by the directive. A weak symbol definition in one file satisfies an undefined reference to a global symbol of the same name in another file. Unresolved weak symbols have a default value of zero. The link editor does not resolve these symbols. If a weak symbol has the same name as a defined global symbol, the weak symbol is ignored and no error results.
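The alignment rule in the directive descriptions above (an address is rounded up to a multiple of a power-of-two alignment, and pad bytes fill the gap) can be sketched in C. This is only an illustration of the arithmetic; the function names `align_up` and `pad_bytes` are hypothetical and are not part of any assembler.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of alignment.
   alignment must be a positive power of two, as the directive requires. */
uint32_t align_up(uint32_t addr, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr + alignment - 1) & ~(alignment - 1);
}

/* Number of pad bytes that would be emitted to reach the aligned address. */
uint32_t pad_bytes(uint32_t addr, uint32_t alignment)
{
    return align_up(addr, alignment) - addr;
}
```

For example, an item that would otherwise start at address 13 with an alignment of 4 is moved to address 16, which costs three pad bytes.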
A machine language instruction is divided into two parts: an operation code (or opcode) and an operand. Assembly language is a low-level language that maps closely onto the machine instructions the hardware executes. A program written in assembly language consists of a series of processor instructions, meta-statements (known variously as directives, pseudo-instructions and pseudo-ops), comments and data.
An x86 instruction can have zero to three operands. In assembly language programming, a short mnemonic code is written for each instruction.
Assembly language is just an abbreviated form of machine language, so it is not user friendly either. Programmers still have to write long code for small programs. Nevertheless, many programs are written in assembly language because it is close to machine language and the resulting code executes quickly. Assembly language is also processor dependent, so programs are incompatible across different machines. Assemblers translate mnemonics into machine language, that is, binary codes.
Advantages of assembly language
Since mnemonics replace machine instructions, assembly language is easier to write, debug and understand in comparison to machine code. It is useful for writing lightweight applications in embedded systems, such as a traffic light controller, because it needs less code than a high level language.
Disadvantages of assembly language
Mnemonics are abbreviated and numerous, so they are hard to remember. Programs written in assembly language are machine dependent, so they are incompatible with different types of machines.
A program written in assembly language must first be translated by an assembler, an extra step compared with the same program written directly in machine language. Mnemonics can differ between machines, depending on the manufacturer.
Assembly language does not provide complex control structures. Instead it uses the infamous goto, which, if used inappropriately, can result in spaghetti code! However, it is possible to write structured assembly language programs. The basic procedure is to design the program logic using the familiar high level control structures and translate the design into the appropriate assembly language, much like a compiler would do.
Comparisons
Control structures decide what to do based on comparisons of data. The 80x86 provides the CMP instruction to perform comparisons. The operands are subtracted and the FLAGS are set based on the result, but the result is not stored anywhere. For unsigned integers, two flags matter: the zero flag (ZF) is set (1) if the resulting difference would be zero, and the carry flag (CF) is used as a borrow flag for subtraction.
Consider a comparison like:
cmp vleft, vright
The difference vleft - vright is computed and the flags are set accordingly. For signed integers, there are three flags that are important: the zero flag (ZF), the overflow flag (OF) and the sign flag (SF). The overflow flag is set if the result of an operation overflows or underflows. The sign flag is set if the result of an operation is negative.
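To make the flag rules concrete, here is a minimal C model of a 16-bit comparison. It is a sketch, not processor documentation: the `Flags` struct and the function `cmp16` are names invented for this illustration, and the operand width of 16 bits is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* The four flags the text discusses. */
typedef struct { bool zf, sf, cf, of; } Flags;

/* Model of "cmp vleft, vright" on 16-bit operands: the difference
   vleft - vright is computed only to derive the flags, then discarded. */
Flags cmp16(uint16_t vleft, uint16_t vright)
{
    uint16_t diff = (uint16_t)(vleft - vright);
    Flags f;
    f.zf = (diff == 0);                 /* difference is zero           */
    f.sf = (diff & 0x8000u) != 0;       /* top bit of the difference    */
    f.cf = vleft < vright;              /* unsigned borrow was needed   */
    /* Signed overflow: the operands have different signs and the sign of
       the difference differs from the sign of vleft. */
    f.of = (((vleft ^ vright) & (vleft ^ diff)) & 0x8000u) != 0;
    return f;
}
```

With this model, comparing 3 with 5 sets CF and SF but not OF, while comparing 0x8000 (the most negative signed value) with 1 sets OF and clears SF, so SF and OF differ in both cases and a signed reading gives "less than".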
Branch instructions transfer execution to another point in the program. In other words, they act like a goto. There are two types of branches: unconditional and conditional. If a conditional branch does not make the branch, control passes to the next instruction. The JMP (short for jump) instruction makes unconditional branches. Its single argument is usually a code label to the instruction to branch to.
The assembler or linker will replace the label with the correct address of the instruction. It is important to realize that the statement immediately after the JMP instruction will never be executed unless another instruction branches to it! There are several variations of the jump instruction. A short jump is very limited in range. It can only move a short distance up or down in memory. The advantage of this type is that it uses less memory than the others. It uses a single signed byte to store the displacement of the jump.
The displacement is how many bytes to move ahead or behind. The displacement is added to EIP. A near jump is the default type for both unconditional and conditional branches; it can be used to jump to any location in a segment. Actually, the processor supports two types of near jumps.
One uses two bytes for the displacement. This allows one to move up or down roughly 32,000 bytes. The other type uses four bytes for the displacement, which of course allows one to move to any location in the code segment. The four byte type is the default in protected mode. A far jump allows control to move to another code segment. This is a very rare thing to do in protected mode. Valid code labels follow the same rules as data labels.
Code labels are defined by placing them in the code segment in front of the statement they label. A colon is placed at the end of the label at its point of definition. The colon is not part of the name.
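The relationship between a label, the displacement, and the single signed byte of the shortest jump form can be sketched in C. This is an illustration only: the function names are hypothetical, and it assumes the displacement is measured from the address of the instruction that follows the jump, which is the value the instruction pointer holds when the displacement is added.

```c
#include <stdbool.h>
#include <stdint.h>

/* Displacement the assembler would encode: target minus the address
   of the instruction following the jump. */
int64_t jump_displacement(uint32_t next_instr_addr, uint32_t target_addr)
{
    return (int64_t)target_addr - (int64_t)next_instr_addr;
}

/* True when the displacement fits in a single signed byte. */
bool fits_in_signed_byte(int64_t displacement)
{
    return displacement >= INT8_MIN && displacement <= INT8_MAX;
}
```

An assembler makes essentially this decision when it chooses between a one-byte displacement and a larger one.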
There are many different conditional branch instructions. They also take a code label as their single operand. The simplest ones just look at a single flag in the FLAGS register to determine whether to branch or not. See the above table for a list of these instructions. PF is the parity flag, which indicates whether the number of bits set in the lower 8 bits of the result is odd or even.
Fortunately, the 80x86 provides additional branch instructions to make these types of tests much easier. There are signed and unsigned versions of each. The following table shows these instructions. Each of the other branch instructions has two synonyms.
Table: Signed and Unsigned Comparison Instructions
Using these new branch instructions, the pseudo-code above can be translated to assembly much more easily. Each of these instructions takes a code label as its single operand.
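Why separate signed and unsigned branches are needed can be seen by comparing the same bit pattern both ways. The C sketch below is only an analogy: the function names are hypothetical, and the comments relate them to the x86 mnemonics JB (jump if below, unsigned) and JL (jump if less, signed). The cast to `int16_t` assumes the usual two's complement conversion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned reading of the operands: the condition a JB-style branch
   would test after "cmp a, b". */
bool below_unsigned(uint16_t a, uint16_t b)
{
    return a < b;
}

/* Signed reading of the same 16 bits: the condition a JL-style branch
   would test after "cmp a, b". */
bool less_signed(uint16_t a, uint16_t b)
{
    return (int16_t)a < (int16_t)b;
}
```

The pattern 0xFFFF is 65535 when read as unsigned but -1 when read as signed, so compared with 1 it is "not below" yet "less".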
A procedure is a set of instructions that compute some value or take some action such as printing or reading a character value.
The definition of a procedure is very similar to the definition of an algorithm. A procedure is a set of rules to follow which, if they conclude, produce some result.
An algorithm is also such a sequence, but an algorithm is guaranteed to terminate whereas a procedure offers no such guarantee. Procedures are implemented with a call/return mechanism. That is, some code calls a procedure, the procedure does its thing, and then the procedure returns to the caller. The calling code calls a procedure with the call instruction; the procedure returns to the caller with the ret instruction. A simple procedure may consist of nothing more than a sequence of instructions ending with a ret instruction.
The proc and endp directives do not generate any code. To the 80x86, the last two examples are identical; however, to a human being, the latter is clearly a self-contained procedure, while the other could simply be an arbitrary set of instructions within some other procedure.
The 80x86 microprocessor returns from a procedure by executing a ret instruction, not by encountering an endp directive. Without the ret instruction at the end of each procedure, the 80x86 will fall into the next subroutine rather than return to the caller. After executing ZeroBytes above, the 80x86 will drop through to the DoFFs subroutine, beginning with its mov cx instruction. The procedure name must be on both the proc and endp lines. The procedure name must be unique in the program.
Every proc directive must have a matching endp directive. Failure to match the proc and endp directives will produce a block nesting error. Near calls and returns transfer control between procedures in the same code segment.
Far calls and returns pass control between different segments. The two calling and return mechanisms push and pop different return addresses. You generally do not use a near call instruction to call a far procedure or a far call instruction to call a near procedure. The proc directive handles that chore.
The proc directive has an optional operand that is either near or far. Near is the default if the operand field is empty. The assembler assigns the procedure type (near or far) to the symbol. Whenever MASM assembles a call instruction, it emits a near or far call depending on the operand. Therefore, declaring a symbol with proc or proc near forces a near call. Likewise, using proc far forces a far call. If a procedure has the near operand, then all return instructions inside that procedure will be near.
MASM emits far returns inside far procedures. Procedure definitions may be nested. That is, one procedure definition may be totally enclosed inside another. Whenever you nest one procedure within another, it must be totally contained within the nesting procedure. That is, the proc and endp statements for the nested procedure must lie between the proc and endp directives of the outside, nesting, procedure. The following is not legal:
OutsideProc proc near
...
InsideProc proc near
...
OutsideProc endp
Functions
The difference between functions and procedures in assembly language is mainly a matter of definition. The purpose of a function is to return some explicit value, while the purpose of a procedure is to execute some action. All the rules and techniques that apply to procedures apply to functions.
From here on, procedure will mean procedure or function. Consider a program whose main loop calls a PrintSpaces procedure to print lines of 40 spaces. Unfortunately, there is a subtle bug that causes it to print 40 spaces per line in an infinite loop. The main program uses the loop instruction to call PrintSpaces 10 times. PrintSpaces uses cx to count off the 40 spaces it prints. PrintSpaces returns with cx containing zero.
Preserving a register means you save it upon entry into the subroutine and restore it before leaving. Had the PrintSpaces subroutine preserved the contents of the cx register, the program above would have functioned properly. Also, note that this code pops the registers off the stack in the reverse order that it pushed them.
The operation of the stack imposes this ordering. Either the caller (the code containing the call instruction) or the callee (the subroutine) can take responsibility for preserving the registers. In the example above, the callee preserved the registers. The following example shows what this code might look like if the caller preserves the registers:
        mov cx, 10
Loop0:  push ax
        push cx
        call PrintSpaces
        pop cx
        pop ax
        putcr
        loop Loop0
If the caller saves the values in the registers, the program needs a set of push and pop instructions around every call. Not only does this make your programs longer, it also makes them harder to maintain. Remembering which registers to push and pop on each procedure call is not something easily done.
On the other hand, a subroutine may unnecessarily preserve some registers if it preserves all the registers it modifies. The first loop (Loop0) only preserves the cx register. Immediately after the first loop, this code calls PrintSpaces again.
Since the final loop Loop1 uses ax and cx, it saves them both. One big problem with having the caller preserve registers is that your program may change.
You may modify the calling code or the procedure so that they use additional registers. Such changes, of course, may change the set of registers that you must preserve.
Worse still, if the modification is in the subroutine itself, you will need to locate every call to the routine and verify that the subroutine does not change any registers the calling code uses.
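The PrintSpaces bug described above, and the callee-preserves fix, can be imitated in C by letting one global variable stand in for the cx register. This is a sketch under stated assumptions: the variable `cx`, the counter `spaces_printed`, the function names, and the `max_calls` safety cap are all invented for the illustration, and no real output is produced.

```c
#include <stdint.h>

static uint16_t cx;              /* stands in for the shared CX register */
static unsigned spaces_printed;  /* counts "printed" spaces              */

/* Uses cx as its counter and returns with cx == 0, like the buggy routine. */
void print_spaces_clobbering(void)
{
    for (cx = 40; cx != 0; cx--)
        spaces_printed++;
}

/* Callee-preserved version: save cx on entry, restore it before returning. */
void print_spaces_preserving(void)
{
    uint16_t saved_cx = cx;      /* push cx */
    for (cx = 40; cx != 0; cx--)
        spaces_printed++;
    cx = saved_cx;               /* pop cx  */
}

/* Imitates "mov cx, 10 / again: call proc / loop again".
   max_calls caps the runaway case so the demonstration terminates. */
unsigned run_main_loop(void (*proc)(void), unsigned max_calls)
{
    unsigned calls = 0;
    cx = 10;
    do {
        proc();
        calls++;
        cx--;                    /* loop: decrement cx ...              */
    } while (cx != 0 && calls < max_calls);  /* ... repeat while nonzero */
    return calls;
}
```

With the preserving routine the main loop makes exactly ten calls. With the clobbering routine cx comes back as zero, the decrement wraps it around to a nonzero value, and the loop only stops because of the cap.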
You can also push and pop variables and other values that a subroutine might change. Since the 80x86 allows you to push and pop memory locations, you can easily preserve these values as well.
Parameters
Although there is a large class of procedures that are totally self-contained, most procedures require some input data and return some data to the caller. Parameters are values that you pass to and from a procedure.
There are many facets to parameters. Questions concerning parameters include: Where is the data coming from?
How do you pass and return data? There are six major mechanisms for passing data to and from a procedure: pass by value, pass by reference, pass by value-returned, pass by result, pass by name, and pass by lazy evaluation. You also have to worry about where you can pass parameters. Finally, the amount of data has a direct bearing on where and how to pass it.
The following sections take up some of these issues.
Pass by Value
A parameter passed by value is just that: the caller passes a value to the procedure. Pass by value parameters are input only parameters. That is, you can pass them to a procedure but the procedure cannot return them. In HLLs, like Pascal, the idea of a pass by value parameter being an input only parameter makes a lot of sense.
Since you must pass a copy of the data to the procedure, you should only use this method for passing small objects like bytes, words, and double words. Passing arrays and strings by value is very inefficient since you must create and pass a copy of the structure to the procedure.
Pass by Reference
To pass a parameter by reference, you must pass the address of a variable rather than its value. In other words, you must pass a pointer to the data. The procedure must dereference this pointer to access the data.
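Passing parameters by reference is useful when you must modify the actual parameter or when you pass large data structures between procedures. Passing parameters by reference can produce some peculiar results. For example, if a procedure takes two reference parameters i and j and the caller passes the same variable for both, a change made through one is visible through the other. This is because the parameters i and j are pointers to the actual data and they both point at the same object.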
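The aliasing effect can be reproduced with C pointers. The procedure body below is a hypothetical stand-in chosen for the illustration (the names `bletch_by_reference` and `alias_demo` are assumptions), so treat it as a sketch of the effect rather than a reconstruction of any particular listing.

```c
/* Both parameters are passed by reference (as pointers). */
void bletch_by_reference(int *i, int *j)
{
    *i = *i + 2;
    *j = *j - *i;   /* if i and j alias, this computes m - m */
}

/* Passes the same variable for both parameters and returns its final value. */
int alias_demo(int start)
{
    int m = start;
    bletch_by_reference(&m, &m);
    return m;
}
```

When both pointers refer to the same variable, the second statement subtracts the variable from itself, so the result is zero whatever the starting value was. With two distinct variables holding 5, the results would be 7 and -2.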
Pass by reference is usually less efficient than pass by value. You must dereference all pass by reference parameters on each access; this is slower than simply using a value.
However, when passing a large data structure, pass by reference is faster because you do not have to copy a large data structure before calling the procedure.
Pass by Value-Returned
Pass by value-returned (also known as value-result) combines features from both the pass by value and pass by reference mechanisms.
You pass a value-returned parameter by address, just like pass by reference parameters. However, upon entry, the procedure makes a temporary copy of this parameter and uses the copy while the procedure is executing.
When the procedure finishes, it copies the temporary copy back to the original parameter. The Pascal code presented in the previous section would operate properly with pass by value-returned parameters. Of course, when Bletch returns to the calling code, m could only contain one of the two values, but while Bletch is executing, i and j would contain distinct values. In some instances, pass by value-returned is more efficient than pass by reference; in others it is less efficient.
If the procedure accesses the parameter only rarely, the cost of copying it in and out is wasted effort. On the other hand, if the procedure uses this parameter often, the procedure amortizes the fixed cost of copying the data over many inexpensive accesses to the local copy.
Pass by Result
With pass by result, you pass in a pointer to the desired object, and the procedure uses a local copy of the variable and then stores the result through the pointer when returning. The only difference between pass by value-returned and pass by result is that when passing parameters by result you do not copy the data upon entering the procedure.
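The copy-in, copy-out behaviour of pass by value-returned can be sketched in C as follows. The procedure body and all names here are hypothetical, and the order in which the two copies are written back is an assumption of this sketch; the text only says that the caller's variable ends up with one of the two values.

```c
/* Value-returned: copy the parameters in on entry, work on the local
   copies, and copy them back out on exit. seen_i and seen_j report the
   local values so a caller can inspect them. */
void bletch_value_returned(int *i_addr, int *j_addr, int *seen_i, int *seen_j)
{
    int i = *i_addr;   /* copy in  */
    int j = *j_addr;

    i = i + 2;
    j = j - i;         /* i and j are distinct locals, even if aliased */

    *seen_i = i;
    *seen_j = j;
    *i_addr = i;       /* copy out: i first, then j */
    *j_addr = j;
}
```

If the same variable holding 5 is passed for both parameters, the locals become 7 and -2 while the procedure runs, and the variable ends up as -2 because j is written back last. Pass by result would use the same exit code but skip the two copy-in statements.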
Pass by result parameters are for returning values, not passing data to the procedure. Therefore, pass by result is slightly more efficient than pass by value-returned since you save the cost of copying the data into the local variable.
Pass by Name
Pass by name is the parameter passing mechanism used by macros, text equates, and the #define macro facility in the C programming language.
This parameter passing mechanism uses textual substitution on the parameters. However, implementing pass by name using textual substitution in a compiled language like ALGOL-68 is very difficult and inefficient.
Basically, you would have to recompile a function every time you call it. So, compiled languages that support pass by name parameters generally use a different technique to pass those parameters. Consider the following Panacea procedure:
PassByName: procedure (name item: integer; var index: integer);
begin PassByName;
foreach index in ...
Were you to substitute the pass by name parameter item, you would obtain the following code:
begin PassByName;
foreach index in ...
High level languages like ALGOL and Panacea compile pass by name parameters into functions that return the address of a given parameter.
So in one respect, pass by name parameters are similar to pass by reference parameters insofar as you pass the address of an object. The major difference is that with pass by reference you compute the address of an object before calling a subroutine; with pass by name the subroutine itself calls some function to compute the address of the parameter. So what difference does this make? Well, reconsider the code above. Had you passed A[I] by reference rather than by name, the calling code would compute the address of A[I] just before the call and pass in this address.
Inside the PassByName procedure the variable item would have always referred to a single address, not an address that changes along with I. With pass by name parameters, item is really a function, known as a thunk, that computes the address of the parameter into which the procedure stores the value zero. It is worth noting that most HLLs supporting pass by name parameters do not call thunks directly like the call above. Generally, the caller passes the address of a thunk and the subroutine calls the thunk indirectly.
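The difference between a fixed address and a thunk can be sketched with a C function pointer. Everything named here is an assumption made for the illustration: the array size `N`, the globals `A` and `I`, and the helper `count_zeros_after`. The subroutine calls the thunk indirectly, through a pointer, as described above.

```c
#define N 5                 /* hypothetical array size */
static int A[N];
static int I;

/* Thunk: recomputes the address of A[I] every time it is called. */
int *thunk_A_of_I(void)
{
    return &A[I];
}

/* item is passed by name (as a thunk), index by reference. */
void pass_by_name(int *(*item)(void), int *index)
{
    for (*index = 0; *index < N; (*index)++)
        *item() = 0;        /* address is re-evaluated on each iteration */
}

/* For contrast: item is an address computed once, before the call. */
void pass_by_reference(int *item, int *index)
{
    for (*index = 0; *index < N; (*index)++)
        *item = 0;          /* always the same location */
}

/* Fills A with ones, runs one of the two routines with A[I] and I as
   arguments, and reports how many elements ended up zero. */
int count_zeros_after(int by_name)
{
    int zeros = 0;
    for (int k = 0; k < N; k++)
        A[k] = 1;
    I = 0;
    if (by_name)
        pass_by_name(thunk_A_of_I, &I);
    else
        pass_by_reference(&A[I], &I);
    for (int k = 0; k < N; k++)
        zeros += (A[k] == 0);
    return zeros;
}
```

Called by name, the loop clears every element because the thunk follows I as it changes; called by reference, only the element selected before the call is cleared.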
This allows the same sequence of instructions to call several different thunks corresponding to different calls to the subroutine.
Passing Parameters in Registers
Having touched on how to pass parameters to a procedure, the next thing to discuss is where to pass parameters. Where you pass parameters depends, to a great extent, on the size and number of those parameters. If you are passing a small number of bytes to a procedure, then the registers are an excellent place to pass parameters.
The registers are an ideal place to pass value parameters to a procedure. If you are passing a single parameter to a procedure you should use the following registers for the accompanying data types:
Data Size      Pass in this Register
Byte           al
Word           ax
Double Word    dx:ax, or eax where 32-bit registers are available
This is, by no means, a hard and fast rule.
If you find it more convenient to pass 16 bit values in the si or bx register, by all means do so. However, most programmers use the registers above to pass parameters. If you need more than six words, perhaps you should pass your values elsewhere.
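Passing a double word in dx:ax means the value is split across two 16-bit registers, with dx holding the high word and ax the low word. A minimal C sketch of that split follows; the `RegPair` type and both function names are hypothetical.

```c
#include <stdint.h>

/* Two 16-bit halves, named after the registers that would carry them. */
typedef struct { uint16_t dx, ax; } RegPair;

/* Split a double word: high word goes to dx, low word to ax. */
RegPair split_dword(uint32_t value)
{
    RegPair r = { (uint16_t)(value >> 16), (uint16_t)(value & 0xFFFFu) };
    return r;
}

/* Reassemble the double word on the receiving side. */
uint32_t join_dword(RegPair r)
{
    return ((uint32_t)r.dx << 16) | r.ax;
}
```

For example, 0x12345678 travels as dx = 0x1234 and ax = 0x5678.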
The UCR Standard Library package provides several good examples of procedures that pass parameters by value in the registers. For example, puti expects the value of a signed integer in the ax register. You can pass 32 bit segmented addresses in dx:ax like other double word parameters. However, you can also pass them in ds:bx, ds:si, ds:di, es:bx, es:si, or es:di and be able to use them without copying into a segment register.
The UCR Stdlib routine puts, which prints a string to the video display, is a good example of a subroutine that uses pass by reference. It wants the address of a string in the es:di register pair. It passes the parameter in this fashion, not because it modifies the parameter, but because strings are rather long and passing them some other way would be inefficient. As another example, consider the following strfill(str, c) that copies the character c (passed by value in al) to each character position in str (passed by reference in es:di) up to a zero terminating byte:
; strfill- copies value in al to the string pointed at by es:di
;          up to a zero terminating byte.
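The behaviour described for strfill can be sketched in C, where the string is passed by reference as a pointer and the fill character by value. This is an illustrative equivalent under those assumptions, not the library's own code.

```c
/* Overwrite each character of the zero-terminated string str with c.
   The terminating zero byte itself is left in place. */
void strfill(char *str, char c)
{
    while (*str != '\0')
        *str++ = c;
}
```

Because the routine receives the address of the string, the caller sees the modified characters after the call, while the character c is only a copy.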
To pass a parameter by value-returned or by result in a register, load the register with the address of the parameter. Inside the procedure you would copy the value pointed at by this register to a local variable (value-returned only). Just before the procedure returns to the caller, it could store the final result back to the address in the register. The following code requires two parameters. The first is a pass by value-returned parameter and the subroutine expects the address of the actual parameter in bx. The second is a pass by result parameter whose address is in si.
This routine increments the pass by value-returned parameter and stores the previous result in the pass by result parameter:
; CopyAndInc- BX contains the address of a variable. This routine
;             copies that variable to the location specified in SI
;             and then increments the variable BX points at.
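The routine described by the comment above can be sketched in C, with two pointers standing in for the addresses held in bx and si. The function and parameter names are hypothetical; the body follows the copy-in and copy-out pattern for the value-returned parameter and the copy-out-only pattern for the result parameter.

```c
/* value_returned plays the role of the address in BX,
   result plays the role of the address in SI. */
void copy_and_inc(int *value_returned, int *result)
{
    int local = *value_returned;  /* copy in (value-returned only)      */
    int previous = local;         /* local that will become the result  */

    local = local + 1;

    *result = previous;           /* copy out the by-result parameter   */
    *value_returned = local;      /* copy out the incremented value     */
}
```

After calling it with a variable holding 41, that variable holds 42 and the result parameter holds 41.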
If you are willing to trade a little space for some speed, there is another way to achieve the same results as pass by value-returned or pass by result when passing parameters in registers. Consider the following implementation of CopyAndInc:
CopyAndInc proc
           mov cx, ax      ;Make a copy of the 1st parameter,
           inc ax          ; then increment it by one.
           ret
CopyAndInc endp
Both versions increment I and store the pre-incremented version into J. Clearly the latter version is faster, although your program will be slightly larger if there are many calls to CopyAndInc in your program (six or more). You can pass a parameter by name or by lazy evaluation in a register by simply loading that register with the address of the thunk to call. Consider the Panacea PassByName procedure. One implementation of this procedure could be the following:
;PassByName- Expects a pass by reference parameter index
;            passed in si and a pass by name parameter, item,
;            passed in dx (the thunk returns the address in bx).
To pass parameters on the stack, push them immediately before calling the subroutine. The subroutine then reads this data from the stack memory and operates on it appropriately. Procedures typically reach their parameters through the bp register. This is one of the reasons the disp[bp], [bp][di], [bp][si], disp[bp][si], and disp[bp][di] addressing modes use the stack segment rather than the data segment.
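How a subroutine finds its parameters on the stack can be modelled in C with a small byte array. This sketch assumes a common 16-bit convention that goes beyond what the text states: a near call that pushes a 2-byte return address, followed by a `push bp` / `mov bp, sp` entry sequence, so the last parameter pushed sits at offset 4 from bp. All names, the stack size, and the dummy return address are hypothetical.

```c
#include <stdint.h>
#include <string.h>

enum { STACK_BYTES = 64 };
static uint8_t  stack_mem[STACK_BYTES];  /* stands in for the stack segment   */
static uint16_t sp = STACK_BYTES;        /* stack grows toward lower offsets  */
static uint16_t bp;

void push16(uint16_t v)
{
    sp = (uint16_t)(sp - 2);
    memcpy(&stack_mem[sp], &v, sizeof v);
}

uint16_t pop16(void)
{
    uint16_t v;
    memcpy(&v, &stack_mem[sp], sizeof v);
    sp = (uint16_t)(sp + 2);
    return v;
}

uint16_t read16(uint16_t offset)
{
    uint16_t v;
    memcpy(&v, &stack_mem[offset], sizeof v);
    return v;
}

/* Simulates: push first / push second / call Sub / remove 4 parameter bytes,
   where Sub computes first - second from its stack parameters. */
uint16_t call_sub(uint16_t first, uint16_t second)
{
    push16(first);                            /* caller pushes parameters     */
    push16(second);
    push16(0xBEEF);                           /* near call: dummy return addr */

    push16(bp);                               /* entry: push bp               */
    bp = sp;                                  /*        mov bp, sp            */
    uint16_t b = read16((uint16_t)(bp + 4));  /* last parameter pushed        */
    uint16_t a = read16((uint16_t)(bp + 6));  /* first parameter pushed       */
    uint16_t result = (uint16_t)(a - b);

    bp = pop16();                             /* exit: pop bp                 */
    (void)pop16();                            /* ret discards return address  */
    sp = (uint16_t)(sp + 4);                  /* 4 bytes of parameters removed */
    return result;
}
```

Offsets 0 and 2 from bp hold the saved bp and the return address in this model, which is why the parameters start at offset 4.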
The following code segment gives the standard procedure entry and exit code:
StdProc proc near
        push bp
        mov bp, sp
        ...
In the CallProc procedure there were six bytes of parameters pushed onto the stack, so ParmSize would be six. Take a look at the stack immediately after the execution of mov bp, sp in StdProc.
Summary
In an assembly language program, all you need is a call and ret instruction to implement procedures and functions.
This chapter begins with a review of what a procedure is, how to implement procedures with MASM, and the difference between near and far procedures on the 80x86. Functions are a very important construct in high level languages like Pascal.
Detailed memory maps assist the reader with tricky areas of code. Math routines are carefully dissected to enhance understanding of minute code changes. Appendices are provided on basic math routines to supplement the reader's background.
This book is written for an audience with a broad range of skill levels, relevant to both the absolute beginner and the skilled C embedded programmer. A supplemental appendix on 'Working with a Consultant' provides advice on working with consultants, in general, and on selecting an appropriate consultant within the microchip design consultant program.
With this book you will learn: the symbols and terminology used by programmers and engineers in microprocessor applications; how to program using assembly language through examples and applications; how to program a microchip microprocessor, selecting the processor with minimal memory, and therefore minimal cost options; how to locate resources for more in-depth material content; and how to convert higher level language ICs to a lower level language.
It teaches how to start writing simple code. Topics covered range from assembly language and microprocessor design to the Motorola microprocessor, programming techniques, control of peripheral devices, and high-level languages.
Emphasis is given to the computer-like aspects of microprocessors. This text is comprised of 12 chapters; the first of which provides a general overview of microprocessors, differences between hardwired and programmed devices, and different kinds of microprocessors. The reader is then introduced to the basic types of information inside a microprocessor, including Boolean information, numerical information, character codes, and the machine code.
The chapters that follow focus on the intellectual and practical tools that the designer of a microprocessor system will need. The basic structure of a microprocessor is analyzed, with particular reference to a simple hypothetical computer and some programs for this machine.
This book also discusses assembly language; some of the features that give microprocessors their flexibility as well as generality and power; and the Motorola microprocessor as an example of machine architecture.