

I've probably been moving pretty quickly, for some. Maybe even way
too quickly. If it is only a little bit too fast, perhaps just take a break and let things
soak in. Otherwise, look at some other web pages, now that you know some 'keywords' to try
out in a search. I may, at some other point in time, try to expand the previous
explanations but not for now. So regardless, I'm going to proceed to .EXE type programs
under DOS -- with apologies to those I may leave behind.
DOS .EXE Programs
Let's take the second program, the one that printed a short message,
and turn it into a DOS .EXE program. We can then see the necessary differences in the
source code. There aren't many.
Here's the original source code:
        .model  tiny
        .code
        .startup
        mov     dx, OFFSET msg
        mov     ah, 9
        int     21h
        .exit
        .data
msg     DB      "Hello out there!", 13, 10, '$'
        END
You can see some differences in what I entered to the COPY command,
earlier. The differences are in spacing, not words. I've reformatted it a little here
because I think it might be a little easier to see the pieces, this way. At this point,
it's probably a good idea that you start using an editor of some kind for your assembly
source code. You can find a good editor for free in the MASM32 installation mentioned on
my PC Tools web page, called QEDITOR. It works just fine under Windows and it's pretty
easy to use as an editor. It includes features for writing Windows programs in its menus,
but it's still a fine program for writing DOS assembly programs. Or just use whatever text
editor you are used to using.
By the way, you can also see that it often doesn't matter what case
you use in writing your programs. 'END' is the same as 'end' and so on. For the most part,
your choice of CASE is your own. An obvious exception is in your literal ASCII text, of
course, where the case you use is the case you will see displayed. Another exception is
the labels, such as 'msg'. The linker needs to match up some of these labels when it is
busy linking a final program together. If the linker treats upper case labels and lower
case labels as different (this is usually a switch option for the linker), then it will
simply not match up labels that differ in their case.
So let's see what the source code would look like when writing the
same program, but for making a .EXE instead of a .COM program.
        .model  small
        .code
        .startup
        mov     dx, OFFSET msg
        mov     ah, 9
        int     21h
        .exit
        .data
msg     DB      "Hello out there!", 13, 10, '$'
        .stack
        END
Can you find the differences? They are pretty small, to be sure --
no pun intended. The .model directive has been changed to specify a small model
program. These are always .EXE programs. The other change is to add a .stack
directive at the bottom.
Normally, .COM programs are entirely contained within a single
memory segment. This means that the PSP, the code for the program, its constants, data and
variables, and its stack are all in the same memory segment -- all fitting within 65536
bytes of memory. The size of the stack for .COM programs is simply whatever is in between
the end of the code and data and the end of the memory segment. For example, if the code
and data for the .COM program takes up 1000 bytes of memory and the PSP takes (by
definition) 256 bytes, then the stack has assigned to it the difference, or simply 65536 -
1000 - 256 = 64280 bytes. In other words, one doesn't really need to specify a stack or
its size for .COM programs. The stack gets the left-overs, by default.
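If you'd like to check that arithmetic for other program sizes, here is a minimal sketch in Python. Python isn't part of the DOS tools used in these lessons; I'm only using it as a calculator. The function name com_stack_bytes is made up for this example, and the model is simplified: it just subtracts the PSP and the program from one full segment.

```python
SEGMENT_BYTES = 65536   # one 8086 memory segment (64K)
PSP_BYTES = 256         # the PSP is 256 bytes, by definition


def com_stack_bytes(code_and_data_bytes):
    """Bytes left over for the stack in a single-segment .COM program.

    Hypothetical helper: simply the segment size minus the PSP and
    minus the program's own code and data.
    """
    left_over = SEGMENT_BYTES - PSP_BYTES - code_and_data_bytes
    if left_over <= 0:
        raise ValueError("program does not fit in a single segment")
    return left_over


print(com_stack_bytes(1000))   # 64280, matching the example in the text
```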
In the case of an .EXE program, the stack needs to exist and you
need to tell the assembler to allocate some or else the linker will fail to do so and will
probably complain about it, as well. You can just use a .stack directive without
specifying a size, in which case MASM uses its default of 1024 bytes, or you can add a
size to that directive, as in ".stack 100" to allocate 100 bytes of stack space. Part of
the reason for all this is that DOS tries to
allocate only the memory actually needed by an .EXE program, unlike the case with the .COM
program where DOS usually allocates everything available. So DOS needs to know how much
stack you really need. And to tell DOS, the linker needs to know, too. And to tell the
linker, you need to tell the assembler. So there it is.
The registers that DOS sets up for .EXE programs are
mentioned here, below. The EXE header structure is mentioned in these entries and
documented at the bottom of this page.
- CS
- This segment register is loaded with the starting segment address
of the code, as indicated in the EXE header structure and then adjusted by DOS once it
selects an available memory segment.
- IP
- The starting offset address of the EXE program, taken from the
EXE header structure without modification.
- SS
- This segment register is loaded with the segment address of the
stack, as indicated in the EXE header structure and then adjusted by DOS once it selects
an available memory segment.
- SP
- The starting offset of the stack pointer within the stack segment, taken from the
EXE header structure without modification.
- DS, ES
- These two segment registers are initialized by DOS (when used
with an offset of 0) to point to the beginning of the Program Segment Prefix (PSP).
- BX:CX
- This register pair is usually set to the size of the starting
segment for the .EXE program, treating BX as the upper 16-bits of a 32-bit value. I don't
think these values are of much use, though.
- AX, DX, SI, DI
- These registers are set to 0, when the .EXE program starts. I
wouldn't rely on this behavior, though. Just set them as you need them and don't count on
them being zero when DOS starts the .EXE program.
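The list above can be sketched as a small model in Python (used here only for illustration, not something you run under DOS). It assumes the usual case where DOS places the program image directly after the 256-byte PSP, so the "base segment" is the PSP segment plus 10h paragraphs. The function name initial_registers and its parameter names are hypothetical, and the sample numbers are invented.

```python
PSP_PARAGRAPHS = 0x10   # the 256-byte PSP takes up 16 paragraphs of 16 bytes


def initial_registers(psp_segment, init_cs, init_ip, init_ss, init_sp):
    """Model how EXE header values become the starting register values.

    psp_segment is whatever segment DOS picked for the PSP (made-up input).
    Assumes the program image is loaded right after the PSP.
    """
    base_segment = psp_segment + PSP_PARAGRAPHS
    return {
        "CS": (init_cs + base_segment) & 0xFFFF,   # adjusted by the base segment
        "IP": init_ip,                             # taken without modification
        "SS": (init_ss + base_segment) & 0xFFFF,   # adjusted by the base segment
        "SP": init_sp,                             # taken without modification
        "DS": psp_segment,                         # points at the PSP
        "ES": psp_segment,                         # points at the PSP
    }


# Invented sample values, just to show the adjustment at work.
regs = initial_registers(psp_segment=0x1234, init_cs=0x0000, init_ip=0x0010,
                         init_ss=0x0002, init_sp=0x0400)
for name, value in regs.items():
    print(f"{name} = {value:04X}h")
```

With these sample numbers, CS comes out as 1244h and SS as 1246h, while IP and SP are passed through untouched.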
Oh, a final note. The directive, .startup, will actually
generate some code this time. If you get a chance, compare the listing files for this
third lesson and the second one. It'll be interesting to note, if not entirely clear.
How are .EXE Programs Different?
Well, let's do a quick test. Enter the new program shown above using
an editor or, if you want, just use the COPY command. Name it lesson03.asm. Now,
enter the following command:
C>ml /Fl /Sa lesson03.asm «
Microsoft (R) Macro Assembler Version 6.15.8803
Copyright (C) Microsoft Corp 1981-2000. All rights reserved.
Assembling: lesson03.asm
Microsoft (R) Segmented Executable Linker Version 5.60.339 Dec 5 1994
Copyright (C) Microsoft Corp 1984-1993. All rights reserved.
Object Modules [.obj]: lesson03.obj
Run File [lesson03.exe]: "lesson03.exe"
List File [nul.map]: NUL
Libraries [.lib]:
Definitions File [nul.def]:
C>
Notice that the "/t" option is now missing
from the linker's "Object Modules" prompt. This is because ML knows now that
this isn't a .COM program. So it takes away that switch option. The result should be a new
file called lesson03.exe, among some others. This is the new executable program.
But its format is definitely different. Notice the size?
.EXE programs are usually bigger than .COM programs,
even when they do exactly the same thing. Part of the reason is that an .EXE program has a
special header section -- the first part of the program file is reserved for some
information that DOS will need once it tries to run the program. DOS will use this
information and, in some cases (most of them, actually), DOS will also modify the program
code after it loads it into memory. This header takes up some space. Some of the early
linker programs from Microsoft would attempt to keep this header to a minimum size, but
their more modern linkers will always set aside 512 bytes for the header. After this, the
code and data follow. This is why .EXE programs are almost always larger than 512 bytes
in size, regardless of how short the actual program is.
You can test out the new .EXE program by running it. Be
absolutely sure that there is no .COM file sitting in your directory with the same name,
though. DOS will prefer to execute the .COM file over the .EXE file, if both have the same
name.
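Here is a rough model, in Python, of how that preference plays out when you type a program name without an extension. It assumes the usual command interpreter search order of .COM, then .EXE, then .BAT within one directory. The function name pick_program is hypothetical and the sketch ignores the PATH search entirely.

```python
SEARCH_ORDER = (".COM", ".EXE", ".BAT")   # assumed order tried by the command interpreter


def pick_program(command, files_in_directory):
    """Return the file that would be run for a bare command name, or None.

    Simplified model: one directory only, names compared without regard to case.
    """
    names = {name.upper() for name in files_in_directory}
    for extension in SEARCH_ORDER:
        candidate = command.upper() + extension
        if candidate in names:
            return candidate
    return None


# With both files present, the .COM file wins.
print(pick_program("lesson03", ["lesson03.exe", "lesson03.com", "lesson03.asm"]))
```

So if an old lesson03.com is still lying around, delete or rename it before testing the new .EXE file.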
By the way, here are the .EXE header details. I won't
go into a deep explanation of them all, but this list will give you an idea of what DOS
needs when it tries to run a .EXE program.
EXE Header Fields Table

| Name | Description |
| --- | --- |
| exeSignature | EXE Header Signature. This value is set to the two initials of an MS-DOS developer, 'MZ'. This word value is 0x5A4D, since this is a little-endian machine. This is just a "magic" value that is placed at the beginning of every .EXE file. If the file isn't identified with these two bytes, then it probably isn't an .EXE file and DOS will not load it (hopefully). |
| exeExtraBytes | Last Page Byte Count. Each disk block (or page) of the EXE file is exactly 512 bytes in size. EXE files are not, however, always neatly divisible by 512. They might be 100 bytes or 10,000 bytes long, but rarely do they work out to an exact multiple of 512 bytes. This value specifies how many bytes in the last block (or page) are valid, if the value is other than zero. If zero, then the entire last block is considered valid. |
| exePages | Page Count of EXE. This specifies how many blocks (pages) are used by the entire EXE program. This value includes the size of the header, itself. This should be equal to: FLOOR( (exefilesize+511) / 512 ). |
| exeRelocItems | Pointer Count in Relocation Table. This is the number of entries in the relocation table, provided elsewhere in the EXE file. |
| exeHeaderSize | Header Size. This value is the size, in paragraphs (16-byte "chunks"), of the EXE header. Even though the fixed size part of the header is 28 bytes, this value allows the EXE file to include other information after the 28-byte header, but before the beginning of the program, itself. For example, the relocation entries may be located directly after the 28-byte header. |
| exeMinAlloc | Minimum Memory Allocation. This is the minimum number of memory paragraphs needed beyond the amount required to actually load the program. Often, this value is 0. DOS will not load the program if there isn't enough memory available for both the actual program size plus this additional amount. |
| exeMaxAlloc | Maximum Memory Allocation. This is the maximum number of memory paragraphs to allocate for the program. DOS will allocate this much, if available, falling back to the minimum allocation, if less is available. This value helps accommodate stack and heap memory space desired by the program. |
| exeInitSS | Initial SS Value. This is the initial value of the SS segment register. The DOS loader will adjust this value by the base segment value of the memory allocated to run the program. |
| exeInitSP | Initial SP Value. This is the initial value of the SP stack pointer. This value isn't changed by the DOS loader. |
| exeChecksum | Checksum. This may have originally been intended to provide DOS with a further check on the validity of a program, before trying to run it. However, it was never implemented. Any value may be placed here, including zero. |
| exeInitIP | Initial IP Value. This is the initial value of the IP register. Basically, this sets the starting point for an EXE program. This value isn't changed by the DOS loader. |
| exeInitCS | Initial CS Value. This is the initial value of the CS segment register. The DOS loader will adjust this value by the base segment value of the memory allocated to run the program. |
| exeRelocTable | Relocation Table Offset. This is the byte position, within the EXE file, of the relocation table. Set this to the address just at the end of the 28-byte header, usually, even if the relocation table is empty. |
| exeOverlay | Overlay Number. This is usually 0, for resident programs. This value isn't always included in descriptions of EXE header structures and isn't used by the DOS program loader, I believe. |
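
If you'd like to peek at these fields in a real file, here is a minimal sketch in Python (again, only as a convenient tool, not part of the DOS toolchain). It assumes the fourteen fields are stored as consecutive little-endian 16-bit words in the order listed above, which is what adds up to the 28-byte fixed header. The function names parse_exe_header, file_bytes and pages_for are made up for this example, and the sample header values are invented rather than taken from lesson03.exe.

```python
import struct

FIELDS = ("exeSignature", "exeExtraBytes", "exePages", "exeRelocItems",
          "exeHeaderSize", "exeMinAlloc", "exeMaxAlloc", "exeInitSS",
          "exeInitSP", "exeChecksum", "exeInitIP", "exeInitCS",
          "exeRelocTable", "exeOverlay")
HEADER_FORMAT = "<14H"   # fourteen little-endian 16-bit words = 28 bytes


def parse_exe_header(data):
    """Unpack the fixed 28-byte header into a dictionary keyed by field name."""
    if len(data) < struct.calcsize(HEADER_FORMAT):
        raise ValueError("too short to hold an EXE header")
    header = dict(zip(FIELDS, struct.unpack_from(HEADER_FORMAT, data)))
    if header["exeSignature"] != 0x5A4D:          # 'M', 'Z' read as one word
        raise ValueError("missing 'MZ' signature")
    return header


def file_bytes(header):
    """File size implied by exePages and exeExtraBytes."""
    total = header["exePages"] * 512
    if header["exeExtraBytes"]:                   # zero means the last page is full
        total -= 512 - header["exeExtraBytes"]
    return total


def pages_for(file_size):
    """The FLOOR( (exefilesize+511) / 512 ) formula from the table."""
    return (file_size + 511) // 512


# An invented header: 2 pages, 37 valid bytes in the last page, 32-paragraph header.
sample = struct.pack(HEADER_FORMAT, 0x5A4D, 37, 2, 1, 0x20, 0, 0xFFFF,
                     2, 0x0400, 0, 0x0010, 0, 0x001C, 0)
header = parse_exe_header(sample)
size = file_bytes(header)
print(size, pages_for(size), size - header["exeHeaderSize"] * 16)
```

For the invented header, that works out to a 549-byte file occupying 2 pages, with 37 bytes of actual program after the 512-byte header. To try it on a real file, pass in the file's contents, for example parse_exe_header(open("lesson03.exe", "rb").read()).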
Last updated: Tuesday, January 18, 2005 02:14