4. Using ML

Now that we've gone through the process of using DEBUG, let's walk through the creation of the second program using ML.

Using ML and LINK

Although you can create a finished program with a single DOS command, using ML, there are actually two programs at work. These are ML.EXE and LINK.EXE. ML will assemble a source code file written by you into what is called an "object file." This object file is not the completed program, though. It's an intermediate step.

An intermediate step is used because programmers often prefer to write a program using more than one source file -- one program, many source files. So when your source code is first assembled, it is assembled into an object file with the extension .OBJ. As a second step, the linker (LINK.EXE) is told which object files are supposed to be combined into a single program, and it combines them. This process of compiling into object files and then later linking those object files into the final program is called "separate compilation." Some languages support it, some don't. Assemblers almost always support it.

When you use ML, by default it invokes LINK.EXE to finish up the process and make the final program. If you don't want that, you have to add a command line option, "/c", to the DOS command you use to start ML.EXE. But for short programs (and particularly, for your first programming efforts), it's a lot easier to simply let ML invoke the linker after it is done assembling the source code file.

To create the source file, I could have specified some fancy editor. But that would have involved requiring more programs and having to document yet another program. I'm lazy, so I just used a standard DOS command called COPY to create the file. To tell COPY to use the keyboard as input for the file, the special name CON: is used. COPY recognizes this special name. I then also specified the name of the file I wanted this input written into -- in this case, lesson02.asm. The extension .ASM is commonly used for assembly source code files.

C>copy con: lesson02.asm «

That starts the file. We now start entering the source code. I'll examine it line by line:

.model tiny «

This first line tells the assembler I'm planning on writing a .COM type program. There's another kind of program called a .EXE program, which DOS versions 2.0 and above understand. But .COM programs are the easiest to understand, so that's a good place to start. If we'd wanted to write a .EXE file, we could have specified "small" or "medium" or "compact" or "large" or "huge" here. (There are a number of kinds of .EXE files.) If I were writing protected mode code (Windows code, for example), I could also use "flat" here. But we are making a .COM, so I used "tiny."

.code «

This second line informs the assembler that I want to start writing some code, instead of (for example) writing data. A fuller explanation of this kind of assembly statement will have to wait. But for now, it simply means I want to start writing code. The assembler merely makes a 'mental note' of this.

.startup «

This third line tells the assembler that I want to start entering my code where DOS normally starts the program. Since DOS normally starts .COM programs at offset address 0100h, this instruction simply tells the assembler to place any code following this directive starting at 0100h. Nothing magic, really. Just a handy way to make sure the code goes where you want it.
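
For comparison, here is a minimal sketch of doing by hand roughly what .startup arranges in a tiny-model program: an ORG directive sets the starting offset to 0100h, and a label marks the entry point. The label name start is my own choice, not something the assembler requires, and the program prints nothing. It uses the .exit directive, explained a little further on, to return to DOS.

```asm
; Hand-written stand-in for .startup in a .COM program (a sketch).
.model tiny
.code
        org     100h            ; DOS starts .COM programs at offset 0100h
start:                          ; illustrative label marking the entry point
.exit                           ; return to DOS (explained further on)
end     start                   ; tell the linker where execution begins
```

With .startup you don't have to remember the 0100h or invent a label, which is why this lesson uses it.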

mov dx, OFFSET msg «
mov ah, 9 «
int 21h «

I won't repeat what I said on the prior page: these three lines set up the DX and AH registers and call DOS. I've already discussed this, and included documentation on the DOS function call, earlier. The key difference to note here is that I didn't specify an exact number to put into the DX register this time. Instead, I used what is called a "label." The keyword OFFSET, used just before this label, msg, tells the assembler that I want to put the offset address of msg into the DX register, not the value located at msg. There is a difference. In this case, I haven't yet even told the assembler about the label msg, so the assembler must simply take it on faith that the label will be defined later on.
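
To make the difference between an address and a value concrete, here is a small sketch that loads both. It is not part of lesson02.asm and it prints nothing; it only shows the two forms side by side. The directives around the two MOV instructions (.exit, .data, and the msg line) are the ones explained in the rest of this lesson.

```asm
; Sketch: the offset address of msg versus the byte stored at msg.
.model tiny
.code
.startup
        mov     dx, OFFSET msg  ; DX = where the string is located
        mov     al, msg         ; AL = first byte stored there, 'H' (48h)
.exit
.data
msg     DB      "Hello out there!", 13, 10, '$'
end
```

DOS function 09h wants the first form, because it needs to know where the string starts, not what its first character is.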

.exit «

Now, I use a special directive that the assembler has pre-programmed into it to cause the program to exit back to DOS. Here, the assembler will generate the same two instructions we used earlier in DEBUG. It's just a different way of saying it. We could have, if we wanted to, written this just like we did in DEBUG. But I decided to use this special directive, just so you'd know about it, too.
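
As a sketch of that alternative, the short program below writes the exit by hand in place of the directive. It assumes the two instructions in question are a call to DOS function 4Ch (terminate program) through int 21h, which is what I expect .exit to produce when no return code is given. The listing file will show what your copy of ML actually generates.

```asm
; Sketch: returning to DOS with explicit instructions instead of .exit.
.model tiny
.code
.startup
        mov     ah, 4Ch         ; DOS function 4Ch: terminate program
        int     21h             ; call DOS; control goes back to the prompt
end
```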

.data «

The above line tells the assembler that we want to stop writing code and want to switch to data, now. We need to do this so that we can enter in our literal string. Again, the assembler just makes a mental note.

msg DB "Hello out there!", 13, 10, '$' «

This line is very similar to the one we used in DEBUG. However, note that I've added a label at the beginning. Also, note that I used 13 and 10 here, instead of 0D and 0A, because the assembler normally interprets numbers in decimal notation while DEBUG uses only hexadecimal notation. These values are actually exactly the same, just in different notation. They represent the carriage return and line feed characters, by the way. Finally, the last character on this line tells DOS where the literal string ends, so that DOS Function 09h will know when to stop printing.
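
If you'd rather keep the hexadecimal values from the DEBUG session, the assembler accepts those too. The sketch below is the same program with the two control characters written in hexadecimal. In assembler source, a hexadecimal number takes an h suffix and must begin with a digit, which is why there's a leading zero in 0Dh and 0Ah.

```asm
; Sketch: lesson02.asm with the line ending written in hexadecimal.
.model tiny
.code
.startup
        mov     dx, OFFSET msg  ; DX = offset address of the string
        mov     ah, 9           ; DOS function 09h: display string
        int     21h             ; call DOS
.exit
.data
msg     DB      "Hello out there!", 0Dh, 0Ah, '$'  ; same bytes as 13, 10
end
```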

end «

The above line tells the assembler that this is the end of the source code. Every assembler source code file needs this directive placed at the end. Without it, ML will complain.

^Z
        1 file(s) copied

C>

The DOS COPY command, when given CON: to tell it to use the keyboard for input, needs to know when you intend to stop typing lines into the file. It uses the special character, control-Z, for this. When you type that special character, the COPY command displays ^Z, tells you it did the job, and finally returns to the DOS command line prompt.

That's it for the source code file. It's created. What's left to do is to actually assemble (and link) it:

C>ml /Fl /Sa lesson02.asm «

Let's look at the above command line closely. I've told DOS to run ML, obviously. But I've added some "switch options" to the command line, before the actual source code filename. The first one, /Fl, tells the assembler I want it to generate a listing file. The listing file provides some interesting details about what the assembler actually did, when assembling the source code. You should definitely look at this file. It will go by the name lesson02.lst, in this case. The second switch option, /Sa, just says to make the listing file very detailed. This makes it all the more useful to read. (If you want to see all the options and a terse description of them, you can use the switch option, /?, on the command line with ML.)

Then, the assembler starts up and displays this:

Microsoft (R) Macro Assembler Version 6.15.8803
Copyright (C) Microsoft Corp 1981-2000.  All rights reserved.

That's just the program banner telling you what the program actually is and what version it is. You can use this to double check that you are running the right program.

 Assembling: lesson02.asm

The above line is then shown, telling you that it is assembling the file you mentioned. Good news -- it's the file we wanted!

Microsoft (R) Segmented Executable Linker  Version 5.60.339 Dec  5 1994
Copyright (C) Microsoft Corp 1984-1993.  All rights reserved.

At this point, we see the LINK.EXE banner. This tells you that the ML.EXE program has automatically called up the LINK.EXE program to finish making a program. If we didn't want to see this, we'd need to add the /c switch option to the ML command line. But we didn't do that, so ML invoked the linker after it was done assembling the source code.

Object Modules [.obj]: lesson02.obj /t

This line shows the normal LINK.EXE prompt for the object filename(s). ML has automatically placed the name of the object file it generated here. It also added the /t switch option, which stands for /TINY, to tell the linker that it should build a .COM program. The ML program figured this out because we'd used the ".model tiny" directive in our source code.

Run File [lesson02.com]: "lesson02.com"

The above line shows the normal LINK prompt for the final program name. Again, ML has automatically filled in the usual name for this program.

List File [nul.map]: NUL

Here, LINK is asking if a map file is desired. NUL is the special name that says "no map file." ML automatically filled that in, too. A map file can, at times, be handy. It goes hand in hand with listing files. At some point, you might want to actually generate one and see how it can be of use. But for now, we didn't ask for it, so the ML program told the LINK program to simply not produce one.

Libraries [.lib]:

Libraries are essentially fancy collections of object (.OBJ) files. They hold pre-compiled/pre-assembled code that may be generally useful. You can look at a library as a kind of ZIP file of object files, though it's usually not compressed. It's more of a simple archive, in a sense.

In any case, our program doesn't need a library, and here ML tells the LINK program so.

Definitions File [nul.def]:

This line is a little more complex to discuss. For most DOS programming, this line will always be blank. It's useful for making Windows routines, though.

Well, that's about it. At this point, LINK.EXE knows what to do: it reads in the object file and creates the final program, which is the same as the one made using DEBUG.EXE.

Last updated: Tuesday, January 18, 2005 02:03