

Well, we've covered the process of producing a .COM file for the first time in some
detail. But what exactly IS a .COM program?
DOS .COM Programs
The .COM type of program was the first (and only) type
of program that DOS 1.0 understood. It wasn't until DOS 2.0 that a new type of program
called a .EXE was invented. DOS 1.0 was a floppy-based system and these floppies didn't
hold a lot of data (160kbyte at first, and 320kbyte once DOS 1.1 added double-sided
disks), so there was no real worry about very big programs at that point. When DOS 2.0
arrived and supported a hard disk in the IBM PC/XT, there was more memory on a typical PC
and disk space had expanded to some 10Mbyte, so the need for a new type of program had
become clearer.
In DOS 1.0, a very simple set of rules applied to .COM
programs. When a program started, it owned all of the available memory in the computer. If
it tried to use more than was present, it wouldn't run right, of course. But there was no
memory management in DOS 1.0 and so any program running could simply do whatever it wanted
to do, use whatever memory it wanted to use. DOS 1.0 simply kept track of the first
available memory location it could use to load up programs. When DOS would run a .COM
program, it would simply prepare a special area called the Program Segment Prefix (PSP) in
the first 256 bytes of this available memory and then blindly load the entire .COM file
starting just after that point. After loading the .COM file (as simple data bytes) into
the memory just after the PSP, DOS would set up a few register values, place a return
address (to return to DOS) onto the stack, and then jump straight into the program --
always starting that program at the first byte after the PSP. Since 256 bytes corresponds
to 0100h in hexadecimal, this is the offset address that was always used. The CS (code
segment) register would always point to the beginning of the available memory, though,
which was at the beginning of the PSP.
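The loading steps just described can be modeled in a few lines. This is only a sketch of the idea, not DOS's actual code: the segment value 1000h, the one-byte program image, and the helper names `physical_address` and `load_com` are all made up for illustration.

```python
PSP_SIZE = 0x100  # the PSP occupies the first 256 bytes of the program's memory


def physical_address(segment, offset):
    """Real-mode address arithmetic: the segment counts 16-byte units."""
    return (segment << 4) + offset


def load_com(memory, psp_segment, com_image):
    """Copy the raw .COM bytes just after the PSP and return the entry point."""
    start = physical_address(psp_segment, PSP_SIZE)
    memory[start:start + len(com_image)] = com_image
    # CS points at the start of the PSP; IP is the offset just past it.
    return psp_segment, PSP_SIZE


memory = bytearray(1 << 20)   # 1 Mbyte of simulated real-mode memory
com_image = bytes([0xC3])     # hypothetical one-byte program: a lone RET
cs, ip = load_com(memory, 0x1000, com_image)  # 0x1000 is an assumed free segment
entry = physical_address(cs, ip)
print(hex(cs), hex(ip), hex(entry))
```

Note that the file's first byte lands at CS:0100h, not CS:0000h, which is why the program's starting offset is always 0100h.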
When DOS 2.0 arrived, it included memory management
features. To remain compatible with DOS 1.0's style of running .COM programs, DOS 2.0
would first allocate the largest available block of memory (often, the only one) and
assign it to the .COM program. It would also prepare the first 256 bytes of that block as
the PSP and would load the .COM file exactly after that point, quite similarly to DOS 1.0.
The main issue with .COM programs running on new versions of DOS is that all of the
available memory (or at least, the largest block of memory available) was allocated to the
.COM program. So, if the .COM program wants to avail itself of any of the newer DOS
memory management functions to allocate more memory, it usually gets a "no way"
response. So, .COM programs will sometimes return all of the unused part of their
allocation back to DOS when they first start up. We didn't do that in our early examples,
because that process would have greatly complicated my explaining them.
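To give a feel for what "returning the unused part" involves, here is a small sketch of the arithmetic only. DOS counts memory block sizes in 16-byte paragraphs measured from the start of the PSP, and the resize request (INT 21h, function 4Ah) takes the new size in paragraphs. The function name `paragraphs_to_keep` and the sizes used below are assumptions for illustration, not values from our earlier examples.

```python
PSP_SIZE = 0x100   # bytes taken by the PSP at the start of the block
PARAGRAPH = 16     # DOS measures memory blocks in 16-byte paragraphs


def paragraphs_to_keep(image_size, stack_size):
    """Paragraphs needed for the PSP, the loaded .COM image, and a stack."""
    bytes_needed = PSP_SIZE + image_size + stack_size
    # Round up so a partly used paragraph is still kept.
    return (bytes_needed + PARAGRAPH - 1) // PARAGRAPH


# Hypothetical program: 1000 bytes of code and data, 256 bytes of stack.
keep = paragraphs_to_keep(1000, 256)
print(keep)
```

A real program would also have to move its stack pointer inside the region it keeps before shrinking the block, since SP normally starts out near the top of the segment.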
The registers that DOS sets up for .COM programs are:
- CS, DS, ES, SS
- These are the four segment registers available to the earlier x86
CPUs (DOS programs do not usually deal with the newer FS and GS segment registers). DOS
initializes all four so that, when used with an offset of 0, they point to the beginning
of the Program Segment Prefix (PSP).
- IP
- This is the instruction pointer and represents the offset address
used to run programs. For .COM programs, this starting address is 0100h, so the IP is
always set to 0100h when a .COM program first starts. (The IP register is always paired
with the CS segment register to determine the complete address for the running code.)
- SP
- This is the stack pointer and represents the offset address of
the current point in the stack area. For .COM programs, this value is usually 0FFFEh when
the program starts, which is the highest value possible once a word of 0 has been
pushed. However, if DOS knows that less than 65536 bytes are available for the .COM
program, it will set the SP register to the end of the actual available memory before
pushing a word of 0 and then running the .COM program. So the SP register isn't always
0FFFEh -- it just usually is these days, with very large DOS program areas.
- BX:CX
- This register pair is usually set to the size of the .COM file,
treating BX as the upper 16 bits of a 32-bit value. However, I really haven't tested what
happens when the .COM file is close to or larger than 65536 bytes, so I'd recommend
testing this before relying on it.
- AX, DX, SI, DI
- These registers are set to 0 when the .COM program starts. I
wouldn't rely on this behavior, though. Just set them as you need them and don't count on
them being zero when DOS starts the .COM program.
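The rule for the starting SP value in the list above can be written out as a small calculation. This is only a sketch of the behavior as described here, not a transcription of DOS's code, and the function name `initial_sp` is made up for illustration.

```python
SEGMENT_SIZE = 0x10000  # 65536 bytes: the most a single segment can address


def initial_sp(available_bytes):
    """SP as the .COM program sees it on entry, after DOS pushes a word of 0."""
    # The stack starts at the end of the segment or the end of the
    # available memory, whichever comes first.
    top = min(available_bytes, SEGMENT_SIZE)
    # Pushing one 16-bit word lowers SP by 2; SP itself is a 16-bit register.
    return (top - 2) & 0xFFFF


print(hex(initial_sp(0x10000)))  # a full 64K segment is available
print(hex(initial_sp(0x8000)))   # assumed case: only 32K is available
```

With a full 64K segment the result is 0FFFEh, the usual value; with less memory the stack simply starts lower.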
Summary of .COM Programs
DOS 1.0 only knew how to run .COM programs. In this
first version of DOS, there were also no memory management functions and no concept of
allocated or free blocks of memory. .COM programs were free to use any or all of the
available memory when they ran. Their stack was simply set up at the end of their program
segment (there was only one such segment) or the end of memory, whichever came first, and
then they were simply started. DOS just loaded the .COM program into the first
available memory segment and ran it.
DOS 1.0 would function on a machine with only 16k of RAM -- the first PC from IBM provided
a minimum of 16k, with an option to increase this to 64k provided on the motherboard and
up to 256k with a memory expansion card.
With the advent of the IBM PC/XT, standard RAM was increased to 128k, with an option to
increase this to 256k on the motherboard and up to 640k with memory expansion cards. The
XT also added a 10Mbyte hard disk. Microsoft came out with DOS version 2.0 to accommodate
this new machine and included a number of features, including support for larger disks
such as the hard disk, the .EXE program type, and the memory allocation functions used to
support it.
For backward compatibility, while also conforming to the requirements of the new
memory allocation functions, DOS versions 2.0 and later allocate the largest free memory
block (usually, this means all of memory) when starting .COM programs. This was the
safer way to proceed under the new guidelines.
.COM programs running under DOS 2.0 and later that want to free up unused memory for DOS
need to add code designed for that purpose. For example, if a .COM program wishes to use
DOS to load and run another program, it will need to free up some of the DOS memory
first.
The .COM file format is really quite simple. DOS
doesn't interpret it; it just loads it into memory and starts it. The first byte in the
.COM file is the first byte of the program that executes when DOS runs it. It's quite a
simple process.
It's time to talk a little about the Program Segment
Prefix (PSP.)
Last updated: Thursday, July 08, 2004 15:02