You can use the and instruction to selectively force certain bits to zero in a value without affecting other bits. This is called masking out bits.

Just as you can use the and instruction to force selected bits to zero, you can use the or instruction to force selected bits to one. This operation is called masking in bits.

Recall the earlier masking-out example with the and instruction, in which we wanted to convert an ASCII code for a digit to its numeric equivalent. You can use the or instruction to reverse this process, that is, to convert a numeric value in the range 0..9 to the ASCII code for the corresponding digit, i.e., '0'..'9'. To do this, logically or the specified numeric value with 30h.
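As a rough model of both masking operations, the following Python sketch uses the & and | operators in place of the and and or instructions. The function names are purely illustrative, and the 0Fh mask used to strip the high four bits of the ASCII digit is an assumption about the earlier example; only the 30h value appears in the text above.

```python
def ascii_digit_to_value(ch):
    """Mask out the high four bits: '0'..'9' (30h..39h) becomes 0..9.
    The 0Fh mask is an assumption, not taken from the text."""
    return ord(ch) & 0x0F


def value_to_ascii_digit(n):
    """Mask in 30h: a value in 0..9 becomes the ASCII digit '0'..'9'."""
    return chr(n | 0x30)


if __name__ == "__main__":
    for digit in "0123456789":
        value = ascii_digit_to_value(digit)
        print(digit, value, value_to_ascii_digit(value))
```

Because the low four bits of 30h are all zero, or-ing a value in 0..9 with 30h gives the same result as adding 30h.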
For example, the Pascal array
B:packed array[0..31] of boolean;
requires only four bytes when packed one value per bit. When packed one value per byte, this array requires 32 bytes.

Inserting a value into a packed variable uses the and, or, and shift instructions. The first step is to mask out the corresponding bit in the destination operand; use an and instruction for this. Then shift the source operand so that it is aligned with the destination position. Finally, or the source operand into the destination operand. For example, if you want to insert bit zero of the ax register into bit five of the cx register, the following code could be used:
and cl, 0DFh ;Clear bit five (the destination bit)
and al, 1 ;Clear all AL bits except the src bit.
ror al, 1 ;Move to bit 7
shr al, 1 ;Move to bit 6
shr al, 1 ;Move to bit 5
or cl, al ;Merge the source bit into bit five
This code is somewhat tricky. It rotates the data to the right rather than shifting it to the left since this requires fewer shift and rotate instructions.

To extract bit five of the cx register, leaving the single boolean value in bit zero of the ax register, you'd use the following code:
mov al, cl
shl al, 1 ;Bit 5 to bit 6
shl al, 1 ;Bit 6 to bit 7
rol al, 1 ;Bit 7 to bit 0
and ax, 1 ;Clear all bits except 0
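The two sequences above can be modeled in a few lines of Python. This is only a sketch under the assumption of 16-bit values; insert_bit and extract_bit are hypothetical helper names, and the shifts go directly to the target position rather than using the rotate trick.

```python
def insert_bit(dest, src, pos):
    """Insert bit 0 of src into bit `pos` of the 16-bit value dest."""
    dest &= ~(1 << pos) & 0xFFFF   # mask out the destination bit
    dest |= (src & 1) << pos       # align the source bit and or it in
    return dest


def extract_bit(value, pos):
    """Return bit `pos` of value, right justified in bit 0."""
    return (value >> pos) & 1


if __name__ == "__main__":
    cx = insert_bit(0x0000, 1, 5)
    print(hex(cx), extract_bit(cx, 5))
```

With pos equal to five, the mask computed by insert_bit has the same low byte as the 0DFh constant in the assembly code.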
To test a boolean variable in a packed array you needn't extract the bit
and then test it; you can test it in place. For example, to test the value
in bit five to see if it is zero or one, the following code could be used:
test cl, 00100000b
jnz BitIsSet
Other types of packed data can be handled in a similar fashion, except you
need to work with two or more bits. For example, suppose you've packed five
different three-bit fields into a sixteen-bit value, with the third field, value3, occupying bits 6..8.
If the ax register contains the data to pack into value3,
you could use the following code to insert this data into field three:
mov ah, al ;Do a shl by 8
shr ax, 1 ;Reposition down to bits 6..8
shr ax, 1
and ax, 111000000b ;Strip undesired bits (keep bits 6..8)
and DATA, 0FE3Fh ;Set destination field to zero.
or DATA, ax ;Merge new data into field.
Extraction is handled in a similar fashion. First you strip the unneeded
bits and then you justify the result:
mov ax, DATA
and ax, 1C0h ;Keep only bits 6..8 (field three)
shr ax, 1 ;Six shifts move bits 6..8 down to bits 0..2
shr ax, 1
shr ax, 1
shr ax, 1
shr ax, 1
shr ax, 1
This code can be improved by using the following code sequence:
mov ax, DATA
shl ax, 1 ;Move bits 6..8 up to bits 8..10
shl ax, 1
mov al, ah ;Do a shr by 8
and ax, 07h ;Strip undesired bits
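The same insertion and extraction can be sketched in Python to check the masks. This assumes, as the comments in the assembly code indicate, that field three occupies bits 6..8 of the sixteen-bit value; the constant and function names are hypothetical.

```python
FIELD_SHIFT = 6        # field three starts at bit 6
FIELD_MASK = 0x01C0    # bits 6..8 set (111000000b)


def insert_field3(data, value):
    """Store the low three bits of value into bits 6..8 of data."""
    cleared = data & ~FIELD_MASK & 0xFFFF           # same as and DATA, 0FE3Fh
    return cleared | ((value << FIELD_SHIFT) & FIELD_MASK)


def extract_field3(data):
    """Strip the unneeded bits, then right justify the field."""
    return (data & FIELD_MASK) >> FIELD_SHIFT


if __name__ == "__main__":
    packed = insert_field3(0xFFFF, 0b101)
    print(hex(packed), extract_field3(packed))
```

Note that the complement of the field mask within sixteen bits is 0FE3Fh, the constant used to zero the destination field.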
Additional uses for packed data will be explored throughout this book.

Consider a Pascal if statement that converts the character variable character
from lower case to upper case if character is in the range 'a'..'z'. The
80x86 assembly language code that does the same thing is
mov al, character
cmp al, 'a'
jb NotLower
cmp al, 'z'
ja NotLower
and al, 05fh ;Same operation as SUB AL,32
NotLower: mov character, al
Had you buried this code in a nested loop, you'd be hard pressed to improve
the speed of this code without using a table look up. Using a table look
up, however, allows you to reduce this sequence of instructions to just
four instructions:
mov al, character
lea bx, CnvrtLower
xlat
mov character, al
CnvrtLower is a 256-byte table which contains the values 0..60h at indices
0..60h, 41h..5Ah at indices 61h..7Ah, and 7Bh..0FFh at indices 7Bh..0FFh.
Often, using this table look up facility will increase the speed of your
code.
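To make the table contents concrete, here is a minimal Python sketch that builds a 256-entry table with the layout described above and uses plain indexing as a stand-in for xlat. The Python names are illustrative only.

```python
# Entry c holds the upper case code for 'a'..'z' and c itself otherwise.
CNVRT_LOWER = bytes(
    c - 0x20 if 0x61 <= c <= 0x7A else c
    for c in range(256)
)


def to_upper(code):
    """Table look up: index the table with the character code."""
    return CNVRT_LOWER[code]


if __name__ == "__main__":
    print(chr(to_upper(ord('q'))), chr(to_upper(ord('Q'))), chr(to_upper(ord('5'))))
```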
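For example, the following code swaps the case of an alphabetic character (lower case to upper case and vice versa) via computation: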
mov al, character
cmp al, 'a'
jb NotLower
cmp al, 'z'
ja NotLower
and al, 05fh
jmp ConvertDone
NotLower: cmp al, 'A'
jb ConvertDone
cmp al, 'Z'
ja ConvertDone
or al, 20h
ConvertDone:
mov character, al
The table look up code to compute this same function is:
mov al, character
lea bx, SwapUL
xlat
mov character, al
As you can see, when computing a function via table look up, no matter what
the function is, only the table changes, not the code doing the look up.
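One way to see this is to generate the table from the computed version of the function. The Python sketch below does so for the case-swapping function; the contents of the SwapUL table are not listed in the text, so the table built here is an assumption derived from the computation shown earlier.

```python
def swap_case_code(c):
    """Computed version: mirrors the compare-and-branch sequence."""
    if 0x61 <= c <= 0x7A:      # 'a'..'z'
        return c & 0x5F
    if 0x41 <= c <= 0x5A:      # 'A'..'Z'
        return c | 0x20
    return c


# Assumed contents of SwapUL: one entry per possible byte value.
SWAP_UL = bytes(swap_case_code(c) for c in range(256))


def look_up(table, code):
    """The look up itself never changes; only the table does."""
    return table[code]


if __name__ == "__main__":
    print("".join(chr(look_up(SWAP_UL, ord(ch))) for ch in "Hello, World"))
```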
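Functions whose domain and range both fall within 0..255 are all computed with the same code (lea bx,table / xlat) shown above. The only thing that ever changes is the look up table.

The xlat instruction cannot be (conveniently) used to compute a function value once the range or domain of the function takes on values outside 0..255. There are three situations to consider: the domain is outside 0..255 but the range is within 0..255; the domain is within 0..255 but the range is outside 0..255; and both the domain and the range take on values outside 0..255.

If the domain is outside 0..255 but the range is within 0..255, the look up table requires more than 256 entries but each entry is a single byte. Next to look ups using the xlat instruction, functions falling into this class are the most efficient. The following Pascal function invocation,
B := Func(X);
where Func is
function Func(X:word):byte;
consists of the following 80x86 code: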
mov bx, X
mov al, FuncTable [bx]
mov B, al
This code loads the function parameter into bx, uses this value
(in the range 0..??) as an index into the FuncTable table,
fetches the byte at that location, and stores the result into B.
Obviously, the table must contain a valid entry for each possible value
of X. For example, suppose you wanted to map a cursor position
on the video screen in the range 0..1999 (there are 2,000 character positions
on an 80x25 video display) to its X or Y coordinate on the screen. You could
easily compute the X coordinate via the function X:=Posn mod 80
and the Y coordinate with the formula Y:=Posn div 80 (where
Posn is the cursor position on the screen). This can be easily
computed using the 80x86 code:
mov bl, 80
mov ax, Posn
div bl ;AX / BL: quotient in AL, remainder in AH
; X is now in AH, Y is now in AL
However, the div instruction on the 80x86 is very slow. If you need to do
this computation for every character you write to the screen, you will seriously
degrade the speed of your video display code. The following code, which
realizes these two functions via table look up, would improve the performance
of your code considerably:
mov bx, Posn
mov al, YCoord[bx]
mov ah, XCoord[bx]
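The XCoord and YCoord tables themselves are not listed here. As a sketch of how they might be generated, the following Python fragment precomputes one byte per screen position, assuming the 80x25 layout described above; the Python names are illustrative.

```python
COLUMNS = 80
POSITIONS = 2000   # 80 columns x 25 rows

# Byte tables indexed by cursor position (domain 0..1999, range within 0..255).
X_COORD = bytes(posn % COLUMNS for posn in range(POSITIONS))
Y_COORD = bytes(posn // COLUMNS for posn in range(POSITIONS))


def cursor_xy(posn):
    """Two look ups replace the division."""
    return X_COORD[posn], Y_COORD[posn]


if __name__ == "__main__":
    print(cursor_xy(0), cursor_xy(80), cursor_xy(1999))
```

The two tables occupy 4,000 bytes in total, which is the price paid for avoiding the division.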
If the domain of a function is within 0..255 but the range is outside this
set, the look up table will contain 256 or fewer entries but each entry
will require two or more bytes. If both the range and domain of the function
are outside 0..255, each entry will require two or more bytes and the table
will contain more than 256 entries.

Indexing into such a table follows the usual formula for the address of an array element:
Address := Base + index * size
If elements in the range of the function require two bytes, then the index must be multiplied by two before indexing into the table. Likewise, if each entry requires three, four, or more bytes, the index must be multiplied by the size of each table entry before being used as an index into the table. For example, suppose you have a function, F(x), defined by the following (pseudo) Pascal declaration:
function F(x:0..999):word;
You can easily create this function using the following 80x86 code (and, of course, the appropriate table):
mov bx, X ;Get function input value and
shl bx, 1 ; convert to a word index into F.
mov ax, F[bx]
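The index scaling in this sequence can be modeled in Python by storing the table as raw bytes and computing the byte offset by hand. The function stored in the table below (x squared, truncated to sixteen bits) is a made-up placeholder, since the text does not say what F computes.

```python
import struct

ENTRY_SIZE = 2   # each table entry is a 16-bit word


def make_word_table(values):
    """Pack the values as consecutive little-endian words."""
    return struct.pack("<%dH" % len(values), *values)


def fetch_word(table, index):
    """Offset := index * size, then fetch the word at that offset."""
    offset = index * ENTRY_SIZE
    return struct.unpack_from("<H", table, offset)[0]


# Hypothetical F(x) for x in 0..999; any word-valued function would do.
F_TABLE = make_word_table([(x * x) & 0xFFFF for x in range(1000)])

if __name__ == "__main__":
    print(len(F_TABLE), fetch_word(F_TABLE, 3), fetch_word(F_TABLE, 999))
```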
The shl instruction multiplies the index by two, providing
the proper index into a table whose elements are words.

Consider a (computer) function SIN(x) that is equivalent to the (mathematical)
function sin x only when x lies within a limited interval.
As we all know, sine is a circular function which will accept any real
valued input. The formula used to compute sine, however, only accepts a small
set of these values.
This range limitation doesn't present any real problems; by simply computing
SIN(X mod (2*pi)) we can compute the sine of any input value.
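A small Python sketch illustrates the idea. Python's math.sin already accepts any input, so it merely stands in here for a routine with a limited domain; math.fmod plays the role of the mod operation.

```python
import math


def conditioned_sin(x):
    """Reduce x modulo 2*pi before calling the (stand-in) sine routine."""
    reduced = math.fmod(x, 2.0 * math.pi)   # result lies strictly between -2*pi and 2*pi
    return math.sin(reduced)


if __name__ == "__main__":
    for x in (0.5, 100.0, -7.5):
        print(x, conditioned_sin(x), math.sin(x))
```

The two columns of output agree up to floating point rounding, since reducing the argument by whole periods does not change the sine.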
Modifying an input value so that we can easily compute a function is called
conditioning the input. In the example above we computed X
mod 2*pi and used the result as the input to the sin
function. This truncates X to the domain sin needs
without affecting the result. Input conditioning can
be applied to table look ups as well. In fact, scaling the index to handle
word entries is a form of input conditioning. Consider the following Pascal
function:
function val(x:word):word;
begin
  case x of
    0: val := 1;
    1: val := 1;
    2: val := 4;
    3: val := 27;
    4: val := 256;
    otherwise val := 0;
  end;
end;

This function computes some value for x in the range 0..4 and
it returns zero if x is outside this range. Since x
can take on 65,536 different values (being a 16 bit word), creating a table
containing 65,536 words where only the first five entries are non-zero seems
to be quite wasteful. However, we can still compute this function using
a table look up if we use input conditioning. The following assembly language
code presents this principle:
xor ax, ax ;AX := 0, assume X > 4.
mov bx, x
cmp bx, 4
ja ItsZero
shl bx, 1
mov ax, val[bx]
ItsZero:
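For comparison, the same range check followed by a look up can be written as a short Python sketch. The five table values come from the Pascal function above; the Python names are illustrative.

```python
VAL_TABLE = [1, 1, 4, 27, 256]   # entries for x = 0..4


def val(x):
    """Condition the input first, then index the five-entry table."""
    if x > 4:          # outside the table's domain: the result is zero
        return 0
    return VAL_TABLE[x]


if __name__ == "__main__":
    print([val(x) for x in range(7)])
```

The table needs only five words instead of 65,536, at the cost of one compare and one branch per call.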
This code checks to see if x is outside the range 0..4. If
so, it manually sets ax to zero; otherwise it looks up the
function value through the val table. With input conditioning, you can implement
several functions that would otherwise be impractical to do via table look
up.

Suppose you want to compute (r * (sin(x) * 1000)) / 1000, where x is an integer in the range 0..359 and
r is an integer. The computer can easily compute this with
the following code:
mov bx, X
shl bx, 1
mov ax, Sines [bx] ;Get SIN(X)*1000
mov bx, R ;Compute R*(SIN(X)*1000)
imul bx ;Signed multiply: table entries can be negative
mov bx, 1000 ;Compute (R*(SIN(X)*1000))/1000
idiv bx ;Signed divide of DX:AX by 1000
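The following Python sketch models this scaled integer computation and shows what happens if the division is done before the multiplication. It assumes signed arithmetic, since the scaled sine is negative for angles above 180 degrees, and it builds its own table so that the example is self-contained; trunc_div is a hypothetical helper that truncates toward zero the way a signed divide instruction does.

```python
import math

# Sine of each whole degree, scaled by 1000 and rounded to an integer.
SINES = [round(math.sin(d * 2.0 * math.pi / 360.0) * 1000.0) for d in range(360)]


def trunc_div(n, d):
    """Integer division that truncates toward zero (Python's // floors)."""
    q = abs(n) // abs(d)
    return q if (n >= 0) == (d > 0) else -q


def r_times_sin(r, x):
    """(r * (sin(x) * 1000)) / 1000: multiply first, then divide."""
    return trunc_div(r * SINES[x], 1000)


def wrong_order(r, x):
    """Dividing first throws away almost all of the scaled sine value."""
    return r * trunc_div(SINES[x], 1000)


if __name__ == "__main__":
    for r, x in ((100, 30), (100, 210), (3, 45)):
        print(r, x, r_times_sin(r, x), wrong_order(r, x))
```

Dividing first reduces nearly every table entry to zero, which is why the scaling by 1000 cannot simply be cancelled.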
Note that integer multiplication and division are not associative. You cannot
remove the multiplication by 1000 and the division by 1000 just because they
seem to cancel one another out. Furthermore, this code must compute this
function in exactly this order. All that we need to complete this function
is a table containing 360 different values corresponding to the sine of
the angle (in degrees) times 1,000. Entering a table into an assembly language
program containing such values is extremely boring and you'd probably make
several mistakes entering and verifying this data. However, you can have
the program generate this table for you. Consider the following Turbo Pascal
program:
program maketable;
var
  i:integer;
  r:integer;
  f:text;
begin
  assign(f,'sines.asm');
  rewrite(f);
  for i := 0 to 359 do begin
    r := round(sin(i * 2.0 * pi / 360.0) * 1000.0);
    if (i mod 8) = 0 then begin
      writeln(f);
      write(f,' dw ',r);
    end
    else write(f,',',r);
  end;
  close(f);
end.

This program produces the following output:
 dw 0,17,35,52,70,87,105,122
 dw 139,156,174,191,208,225,242,259
 dw 276,292,309,326,342,358,375,391
 dw 407,423,438,454,469,485,500,515
 dw 530,545,559,574,588,602,616,629
 dw 643,656,669,682,695,707,719,731
 dw 743,755,766,777,788,799,809,819
 dw 829,839,848,857,866,875,883,891
 dw 899,906,914,921,927,934,940,946
 dw 951,956,961,966,970,974,978,982
 dw 985,988,990,993,995,996,998,999
 dw 999,1000,1000,1000,999,999,998,996
 dw 995,993,990,988,985,982,978,974
 dw 970,966,961,956,951,946,940,934
 dw 927,921,914,906,899,891,883,875
 dw 866,857,848,839,829,819,809,799
 dw 788,777,766,755,743,731,719,707
 dw 695,682,669,656,643,629,616,602
 dw 588,574,559,545,530,515,500,485
 dw 469,454,438,423,407,391,375,358
 dw 342,326,309,292,276,259,242,225
 dw 208,191,174,156,139,122,105,87
 dw 70,52,35,17,0,-17,-35,-52
 dw -70,-87,-105,-122,-139,-156,-174,-191
 dw -208,-225,-242,-259,-276,-292,-309,-326
 dw -342,-358,-375,-391,-407,-423,-438,-454
 dw -469,-485,-500,-515,-530,-545,-559,-574
 dw -588,-602,-616,-629,-643,-656,-669,-682
 dw -695,-707,-719,-731,-743,-755,-766,-777
 dw -788,-799,-809,-819,-829,-839,-848,-857
 dw -866,-875,-883,-891,-899,-906,-914,-921
 dw -927,-934,-940,-946,-951,-956,-961,-966
 dw -970,-974,-978,-982,-985,-988,-990,-993
 dw -995,-996,-998,-999,-999,-1000,-1000,-1000
 dw -999,-999,-998,-996,-995,-993,-990,-988
 dw -985,-982,-978,-974,-970,-966,-961,-956
 dw -951,-946,-940,-934,-927,-921,-914,-906
 dw -899,-891,-883,-875,-866,-857,-848,-839
 dw -829,-819,-809,-799,-788,-777,-766,-755
 dw -743,-731,-719,-707,-695,-682,-669,-656
 dw -643,-629,-616,-602,-588,-574,-559,-545
 dw -530,-515,-500,-485,-469,-454,-438,-423
 dw -407,-391,-375,-358,-342,-326,-309,-292
 dw -276,-259,-242,-225,-208,-191,-174,-156
 dw -139,-122,-105,-87,-70,-52,-35,-17

Obviously it's much easier to write the Turbo Pascal program that generated this data than to enter (and verify) this data by hand. This little example shows how useful Pascal can be to the assembly language programmer!
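Any high level language will do for this kind of table generation. As a cross-check, here is a minimal Python sketch of the same idea; it returns the dw lines as a list of strings rather than writing sines.asm, and the function name is illustrative. Python's round and Pascal's round treat exact halves differently, which is assumed not to matter for these 360 values.

```python
import math


def sine_table_lines(per_line=8):
    """Return the 'dw' lines for sin(degrees) * 1000, eight values per line."""
    values = [
        round(math.sin(i * 2.0 * math.pi / 360.0) * 1000.0)
        for i in range(360)
    ]
    lines = []
    for start in range(0, len(values), per_line):
        chunk = values[start:start + per_line]
        lines.append(" dw " + ",".join(str(v) for v in chunk))
    return lines


if __name__ == "__main__":
    print("\n".join(sine_table_lines()))
```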