Day 15: Advanced Math
Multiplication
Way, way back on Day 5, you learned a rudimentary form of multiplying. It wasn’t a very versatile technique, since you were always multiplying by the same number. Now that we know about loops, we can create more dynamic routines.
Simple Multiplication
The arithmetic operation of multiplying is simply repeated addition, so we can use a DJNZ loop to add a number repeatedly.
D_Times_E: ; HL = D times E
LD HL, 0 ; Use HL to keep track of the product
XOR A ; Need to check if either factor is zero
OR D
RET Z
OR E
RET Z
LD B, D ; Store one of the factors in the loop counter
LD D, H ; Clear D so DE hold the other factor
Loop:
ADD HL, DE
DJNZ Loop
RET
This looks like a nice, simple routine that does the job. However, it has a little inconvenience. When D is a very large number, there are a lot of additions and the product is calculated very slowly.
Fast Multiplication
The fast method of assembly multiplication is, suprisingly, a nearly direct translation of the everyday base-10 method. Since it’s a mostly automatic process for us, I’ll give an explanation of the algorithm.
$$ \begin{array}{rrl} & 579 \\ \times & 163 \\ \hline & 1728 & = 579 \times 3 \\ & 34740 & = 579 \times 60 \\ + & 57900 & =579 \times 100 \\ \hline & 94368 \end{array} $$Each digit of the multiplicand is multiplied by each digit of the multiplier. The partial products are then added together to give the result.
If we do this in base-2, we follow the same procedure, but it looks less complicated:
$$ \begin{array}{rrl} & 0000\mathbf{1101}_2 & (13) \\ \times & 00000110_2 & (6) \\ \hline & 00000000_2 \\ & 000\mathbf{1101}0_2 \\ & 00\mathbf{1101}00_2 \\ + & 00000000_2 \\ \hline & 01001110_2 & (78) \end{array} $$You can see that the multiplicand is being multiplied by either zero or one, so each partial product is either zero, or the original multiplicand itself, shifted an appropriate amount.
To convert this procedure for assembly:
- Shift the multiplier right to check the least-significant bit.
- If the carry flag is set, add the multiplicand to our running total.
- Regardless of whether there was an addition, shift the original multiplicand left.
- Repeat for each bit in the multiplier.
A possible routine to do this:
.module DE_Times_A
DE_Times_A: ; HL = DE × A
LD HL, 0 ; Use HL to store the product
LD B, 8 ; Eight bits to check
_loop:
RRCA ; Check least-significant bit of accumulator
JR NC, _skip ; If zero, skip addition
ADD HL, DE
_skip:
SLA E ; Shift DE one bit left
RL D
DJNZ _loop
RET
This routine will run much faster than the previous one, since the speed
isn’t based on the value of the multiplier, but rather the amount of
1
s.
If we limit the factors to 8 bits, we can make an even faster routine by storing the multiplier and the product in one register:
.module H_Times_E
H_Times_E: ; HL = H × E
LD D, 0 ; Zero D and L
LD L, D
LD B, 8
_loop:
ADD HL, HL ; Get most-significant bit of HL
JR NC, _skip
ADD HL, DE
_skip:
DJNZ _loop
RET
You know from Day 9 that ADD HL, HL
effectively shifts HL one bit to
the left. We are therefore checking the multiplier (H) from its
most-significant end rather than the least-significant. In other words,
we perform DE×128, DE×64… instead of DE×1, DE×2…
So, if we add DE on the first iteration, the result will get shifted over 7 times, i.e. DE×27.
You might have an uneasy feeling that repeated addition of DE will corrupt our factor in H. However, this is impossible because the result of an 8-bit number plus another 8-bit number can never be more than 9 bits. By the time this can happen, the multiplier has vacated the lower two bits of H:
For example, if we tried 2552 (the original value of H is blue):
Iteration | Command | Binary Value of HL |
---|---|---|
1 | ADD HL, HL ADD HL, DE |
11111110 00000000 11111110 11111111 |
2 | ADD HL, HL ADD HL, DE |
11111101 11111110 11111110 11111101 |
3 | ADD HL, HL ADD HL, DE |
11111101 11111010 11111110 11111001 |
4 | ADD HL, HL ADD HL, DE |
11111101 11110010 11111110 11110001 |
And so on.
Division
In the longhand version of division,
$$ \require{enclose} \begin{array}{r} 8026 \\[-3pt] 12 \enclose{longdiv}{96315} \\[-3pt] \underline{-96\phantom{000}} \\[-3pt] \hline 03\phantom{00} \\[-3pt] - 0 \phantom{00} \\[-3pt] \hline 31\phantom{0} \\[-3pt] -24\phantom{0} \\[-3pt] \hline 75 \\[-3pt] -72 \\[-3pt] \hline 3 \end{array} $$We take the first digit of the dividend and subtract the largest multiple of the divisor that will fit. We then take the next digit of the dividend and weld it to the remainder. This is repeated until we run out of digits.
If done in binary, we only have to subtract zero or the divisor:
$$ \begin{array}{r} 00101010_2 \\[-3pt] 101_2 \enclose{longdiv}{11010110_2} \\[-3pt] -101_2\phantom{00000} \\[-3pt] \hline 110_2\phantom{000} \\[-3pt] -101_2\phantom{000} \\[-3pt] \hline 111_2\phantom{0} \\[-3pt] -101_2\phantom{0} \\[-3pt] \hline 100_2\phantom{0} \end{array} $$The general algorithm, in English, for dividing a number in n bits by a number in m bits,
- Shift the dividend left one bit.
- Shift the carry out into a temp area of size m+1 bits.
- See if the value of the temp area is greater than or equal to the divisor.
- If it is, subtract the divisor from the temp area and set the lsb of the dividend.
- Repeat n times.
The result is a quotient in the former dividend, and the remainder in the temp area.
.module Div_HL_D
Div_HL_D: ; HL = HL ÷ D, A = remainder
XOR A ; Clear upper eight bits of AHL
LD B, 16 ; Sixteen bits in dividend
_loop:
ADD HL, HL ; Do a SLA HL
RLA ; This moves the upper bits of the dividend into A
JR C, _overflow
CP D ; Check if we can subtract the divisor
JR C, _skip ; Carry means D > A
_overflow:
SUB D ; Do subtraction for real this time
INC L ; Set bit 0 of quotient
_skip:
DJNZ _loop
RET
The dividend is HL and the temp area is A. This is not strictly in keeping with the requirement that the temp area be m+1 bits (unless the divisor is restricted to seven or fewer bits), so in cases where D > L > $80, there will be an overflow.
For example, if HL = $8C00 and D = $90,
Iteration | Commands | Binary value of HL | Binary value of A |
---|---|---|---|
1 | ADD HL, HL RLA |
0001100000000000 | 00000001 |
2 | ADD HL, HL RLA |
0011000000000000 | 00000010 |
3 | ADD HL, HL RLA |
0110000000000000 | 00000100 |
4 | ADD HL, HL RLA |
1100000000000000 | 00001000 |
5 | ADD HL, HL RLA |
1000000000000000 | 00010001 |
6 | ADD HL, HL RLA |
0000000000000000 | 00100011 |
7 | ADD HL, HL RLA |
0000000000000000 | 00100011 |
8 | ADD HL, HL RLA |
0000000000000000 | 01000110 |
8 | ADD HL, HL RLA |
0000000000000000 | 10001100 |
9 | ADD HL, HL RLA SUB D INC L |
0000000000000000 0000000000000001 |
00011000 10001000 |
10 | ADD HL, HL RLA SUB D INC L |
0000000000000010 0000000000000011 |
00010000 10000000 |
11 | ADD HL, HL RLA SUB D INC L |
0000000000000110 0000000000000111 |
00010000 10000000 |
12 | ADD HL, HL RLA SUB D INC L |
0000000000001110 0000000000001111 |
00000000 01110000 |
13 | ADD HL, HL RLA SUB D INC L |
0000000000011110 0000000000011111 |
10100000 00010000 |
14 | ADD HL, HL RLA |
0000000000111110 | 0010000 |
15 | ADD HL, HL RLA |
0000000001111100 | 0100000 |
15 | ADD HL, HL RLA |
0000000011111000 | 1000000 |
Multiprecision Arithmetic
If there’s one thing about HLLs that’s really annoying, it’s that you can never process an integer with more than 4 bytes, you have to use slow, inaccurate floating-point numbers. Wouldn’t it be nice if you could do arithmetic on an integer of any arbitrary size?
Multiprecision Addition
We need a new instruction:
ADC A, { reg8 | imm8 | (HL) }
: Adds the operand and the carry flag to the accumulator.ADC HL, reg16
: Adds reg16
and the carry flag to HL
.- S
- affected
- Z
- affected
- P/V
- detects overflow
- C
- affected
To begin with, let’s add 7695 and 2182 on “paper”:
$$ \begin{array}{r} 1\phantom{00} \\[-3pt] 7695 \\[-3pt] + 2182 \\[-3pt] \hline 9877 \end{array} $$In the tens position, 9 + 8 = 17, which “overflowed”. So you write down ‘7’ and carry the ‘1’. Add 6 + 1 with the carry to compensate, and everything works out all right. This is exactly how ADC is meant to work (amazing, eh? All those years in elementary school, you were learning assembly and didn’t even know it). In an assembly implementation, you work on bytes or words instead of digits, but the theory is the same. So let’s try it.
Example: Add 32-bit number dword1 with 32-bit number dword2.
LD HL, (dword1) ; Get least-significant word of dword1
LD DE, (dword2) ; Get least-significant word of dword2
ADD HL, DE ; Add them
LD (result), HL ; Store least-significant result
LD HL, (dword1 + 2) ; Get most-significant word of dword1
LD DE, (dword2 + 2) ; Get most-significant word of dword2
ADC HL, DE ; Add them with the carry from the previous addition
LD (result + 2), HL ; Store most-significant result
RET
dword1: .DB $B3, $90, $12, $32 ; Each dword is stored with the least-significant
dword2: .DB $F1, $15, $06, $B8 ; bytes first. You could just as easily have stored
result: .DB $00, $00, $00, $00 ; them in big-endian, but because of how registers are
; loaded from RAM, it wouldn't work.
This will end up adding $321290B3 + $B80615F1.
Multiprecision Subtraction
As you’d probably figured, you need the subtraction equivalent of ADC. Why look, it’s our old friend from Day 5!
SBC A, { reg8 | imm8 | (HL) }
: Subtracts the operand and the carry flag from the accumulator.SBC HL, reg16
: Subtracts reg16
and the carry flag from HL
.- S
- affected
- Z
- affected
- P/V
- detects overflow
- C
- affected
Again, we’ll start with subtracting on paper:
$$ \begin{array}{r} 7\phantom{0}17 \\[-3pt] 1876\phantom{0} \\[-3pt] - 691\phantom{0} \\[-3pt] \hline 1185\phantom{0} \end{array} $$Ok, 6 - 1 = 5. Next is a problem, can’t do 7 - 9. Solution: add 10 and
subtract 1 from the next pair to compensate. SBC works in the same way.
When a subtraction result is negative, 256 is effectively added to the
byte in the minuend and the carry flag is set.
Interestingly enough, the addition routine will work fine if you just
replace ADC with SBC.
LD HL, (dword1) ; Get least-significant word of dword1
LD DE, (dword2) ; Get least-significant word of dword2
SBC HL, DE ; Add them
LD (result), HL ; Store least-significant result
LD HL, (dword1 + 2) ; Get most-significant word of dword1
LD DE, (dword2 + 2) ; Get most-significant word of dword2
SBC HL, DE ; Add them with the carry from the previous addition
LD (result + 2), HL ; Store most-significant result
RET
dword1: .DB $B3, $90, $12, $32
dword2: .DB $F1, $15, $06, $B8
result: .DB $00, $00, $00, $00
This routine looks okay, but it has a subtle bug in it, and maybe the
more observant of you have noticed. Maybe you’re thinking, “Gee, Sean,
what happens if the carry flag is set at the start?” and the answer to
that is, “The answer’s gonna be off by one.” And now maybe you’re
thinking “Gee, Sean, doesn’t that make us screwed?” and the answer to
that is, “Yes, it does.”
Hmmm… it seems that the best way to fix this problem is to ensure
that the carry flag is always reset before going into the loop. How do
we do that? Maybe you’d like a hint?
Ah. It appears that boolean operations will reset the carry. An OR A before should set things right.
Multiprecision Compare (Unsigned)
There is no such thing as a “compare with carry” instruction, but since CP and SUB perform the same operation, you’d figure that you could use the multiprecision subtraction procedure to compare two numbers. This would work, but there is a much better way.
Take the two values $38A4 and $9B4C. Just by comparing the MSBs tells you which one is bigger. In fact, only when the MSBs are the same do you need to compare both bits, and the carry is reset in such a case.
; Do a jump to "success" if the dword at HL is greater than the dword at DE
LD DE, dword1
LD HL, dword2
LD B, 4
CmpLoop:
LD A, (DE)
CP (HL)
JR C, success
JR NZ, failure
INC HL
INC DE
DJNZ CmpLoop
failure:
; Code here deals with (DE) >= (HL)
Multiprecision Compare (Signed)
If you want a multiprecision signed compare, then naturally you have a lot more work to do.
; Do a jump to "success" if the dword at HL is greater than the dword at DE
LD DE, dword1
LD HL, dword2
LD B, 4
CmpLoop:
LD A, (DE)
SUB (HL)
JP PO, $+5
XOR $80
JP M, success
; This code snippet restores the Z flag that got changed by XOR
JP PO, $+5
XOR $80
JR NZ, failure ; Since the byte (DE) is >= the byte (HL),
; then an inequality means we failed
INC HL
INC DE
DJNZ CmpLoop
failure:
; Code here deals with (DE) >= (HL)
Multiprecision Boolean
Boolean operations (and one’s complement) are the simplest. Just perform the operation, and store the value to memory. No messing with flags, shifts, or other crap.
LD HL, qword1
LD DE, qword2
LD B, 8
BoolLoop:
LD A, (DE)
AND (HL)
LD (DE), A
INC HL
INC DE
DJNZ BoolLoop
Multiprecision Negation
Probably the simplest way to negate a multibyte integer is to subtract each element from zero.
#define MAX 4 ; Number of bytes
LD HL, dword
AND A ; Clear carry
LD B, MAX - 1
NegLoop:
LD A, 0 ; Cannot use XOR A because it would disturb carry
SBC A, (HL)
LD (HL), A ; Store result
INC HL ; Next byte
DJNZ NegLoop
Multiprecision Shifting
A shift across many bytes is done with a combination of shift instructions and rotate instructions. Keep in mind that the entire number must be shifted no more than one bit at a time.
LD B, 3 ; Number of bits to shift
ShiftLoop:
LD HL, dword
SRL (HL)
INC HL
RR (HL)
INC HL
RR (HL)
INC HL
RR (HL)
DJNZ ShiftLoop
dword: .DB $B3, $90, $12, $32
Multiprecision Rotation
The code to do a rotation depends on the type of rotation wanted.
For RL-type rotation:
LD HL, dword+3
RL (HL)
DEC HL
RL (HL)
DEC HL
RL (HL)
DEC HL
RL (HL)
dword: .DB $B3, $90, $12, $32
For RLC-type rotation:
LD HL, dword+3
PUSH HL
SLA (HL)
DEC HL
RL (HL)
DEC HL
RL (HL)
DEC HL
RL (HL)
POP HL
JR NC, $+3
INC (HL) ; Set last bit of (HL) if carry was set
Multiprecision Multiplication
The process of a multiprecision multiplication is similar to that for the other multiprecision operations. The trickiest thing is that you have to perform multiprecision additions (on all the partial products) at the same time as you do the multiplications.
The routine that is given below is certainly not the most efficient, only the most general. Regardless, it is probably one of the most complicated pieces of coding in this entire guide.
.module XMul
;B = Size of multiplier
;C = Size of multiplicand
;DE = Address of multiplier
;HL = Address of multiplicand
;IX = Address of product buffer (B + C bytes, you can use logarithms to see why this is so.)
;
;All registers including IY are destroyed
XMul:
LD IYH, B
LD IYL, C
First of all, we will have to use the size counters multiple times, so we save them into IY.
XOR A
PUSH IX
_Clear1:
LD (IX), A
INC IX
DJNZ _Clear1
LD B, C
_Clear2:
LD (IX), A
INC IX
DJNZ _Clear2
POP IX
Now we initialize the product area of memory by setting it all to zeros.
_LoopA:
LD C, (HL)
LD B, IYH
PUSH HL
PUSH IX
PUSH DE
Get one byte of the multiplicand into C. Then restore the size of the multiplier into B. Finally save all the pointers.
_LoopB:
LD A, (DE)
LD H, A
CALL mul_hc
LD A, L
ADD A, (IX)
LD (IX), A
LD A, H
ADC A, (IX + 1)
LD (IX + 1), A
Okay, we get one byte of the multiplier into H and multiply H by C using the routine below, getting the product in HL. Now we take this partial product and integrate it into the current full product.
JR NC, _EndB
PUSH IX
POP HL
INC HL
INC HL
_CyLoop:
INC (HL)
INC HL
JR Z, _CyLoop
Now this definately requires some explanation. We have just added HL to two bytes of the product, but there might have been a carry out of this addition, so the next byte of the product is incremented. However the product could be something like $12FFFFFFFFFF, so we need to keep propagating the carry as far as necessary. This brings up a slight problem in that INC does not affect the carry flag. But this can be remedied with a little trick. Imagine for a second that INC actually did affect carry. You should quickly discover that the only circumstance under which carry will be set is when the incremented byte goes from $FF to $00, and in this case zero will be set! Sooooo, all we need to do is to just blindly increment bytes as long as INC (HL) sets Z.
_EndB:
INC IX
INC DE
DJNZ _LoopB
We’re done with the current byte of the multiplier so we increment the pointer, and we increment the product pointer to work in the next partial product. And do our calculations again.
POP DE
POP IX
POP HL
INC HL
INC IX
DEC IYL
JP NZ, _LoopA
RET
So we’re finished with the entire multiplier at this point and we pop the original values of all our pointers back. Now we increment our multiplicand pointer to get the next byte, and increment the product pointer (analogous to indenting a partial product when multiplying on paper).
.module mul_hc
mul_hc:
PUSH BC
LD L, 0
LD B, L
LD A, 8
_loop:
ADD HL, HL
JR NC, _skip
ADD HL, BC
_skip:
DEC A
JP NZ, _loop
POP BC
RET
This is the H_Times_E routine from the beginning of this chapter with some modifications to work in this particular context.
Multiprecision Division
An extended precision division for integers of arbitrary sizes cannot be built up from a basic division routine like extended precision multiplication can. That must be done by taking the logic behind the fast division routine algorithm and extending it. When you consider that such a method would involve a multiprecision shift, rotate, compare, and subtract, it becomes apparent that it would be extremely messy and slow. What is possible is the division of an arbitrary-size integer by an eight-bit or sixteen-bit divisor. This is very easy.
- Store the remainder of the previous division into the MSB of the dividend.
- Store a byte from memory into the LSB of the dividend.
- Divide by the divisor.
- Store the LSB of the quotient into memory (because you can never get a 16-bit quotient).
- Repeat until done.
For the first time you divide, the remainder is considered zero. Note that you need to start from the most-significant byte of the number.
.module XDiv
;IX = Address of dividend
;BC = Size of dividend
;E = Divisor
XDiv:
LD D, 0
_loop:
LD H, D
LD L, (IX)
CALL DivHLByE
LD (IX), L
LD D, A
DEC IX
DEC BC
LD A, B
OR C
JR NZ, _loop
RET
.module DivHLByE
DivHLByE:
PUSH BC
XOR A
LD B, 16
_loop:
ADD HL, HL
RLA
JR C, _overflow
CP E
JR C, _skip
_overflow:
SUB E
INC L
_skip:
DJNZ _loop
POP BC
RET
Signed Multiplication and Division
All of the multiplication and division routines that have been presented will only calculate correct results if the inputs are unsigned. To perform a signed operation takes a little more work, but fortunately the same routines can be used. And hey, you don’t need to code for an overflow of A when dividing. Bonus!
- Take the absolute value (i.e. negate if the sign bit is set) of both inputs.
- Perform the multiplication or division.
- Based on the signs of the inputs, modify the sign of the result
You can find the sign of the result by taking the XOR of the signs of the original inputs.
Sign-Extension
Here is how to sign extend 8-bit and 16-bit registers into 16 and 32 bits.
; Sign-extends E into DE
LD A, E
RLCA ; Move sign bit into carry flag
SBC A, A ; A is now 0 or 11111111, depending on the carry
LD D, A
; Sign-extends DE into HLDE
LD H, D
LD L, E
ADD HL, HL ; Move sign bit into carry flag
SBC HL, HL ; HL is now 0 or 11111111 11111111, depending
Fixed-Point Arithmetic
There are many programming tasks for which pure integers are just not sufficient, and we are forced to delve into the sordid world of the real numbers. The usual option in such cases is to use floating-point numbers. However, this is only feasible for ultra-fast computers with coprocessors. The reason is that floating-point calculations are very complex and usually have very elaborate error detection so as to maintain a high precision and numeric range. In 99 percent of the cases likely to be encountered, this high precision goes wasted, so maybe there is a way to use fractions with an almost indiscernable speed loss? The answer is a resounding yes! Since time immemorial, programmers have used a computational trick to use integers to simulate fractions.
So what exactly is fixed-point? It is a form of computer math that uses an integer to contain both the characteristic and the mantissa by means of scaling. To fully understand how this works, let us take a short detour into some more place value theory.
Take the decimal number 605.916. We know without thinking that the 605 is really a shorthand way to depict 6×102 + 0×101 + 5×100. The decimal half can easily be figured by taking geometric progression of the 10ns and extending it:
$$ 6×10^2 + 0×10^1 + 5×10^0 + 9×10^{-1} + 1×10^{-2} + 6×10^{-3} $$Since the fractional part of the number is just more terms of place value, we can actually apply it to any radix. We might, therefore, encounter a duodecimal (base 12) number such as 12B0.293. This can be converted to a more familiar decimal number:
$$ 1×12^3 + 2×12^2 + 11×12^1 + 0×12^0 + 2×12^{-1} + 9×12^{-2} + 3×12^{-3} = 2149.123699 $$How does this relate to programming. Let’s consider for a moment, that for the binary base there is an infinite continuum of place values that can model all binary real numbers, and that an eight-bit register can be superimposed on this continuum to reveal an 8-place “snapshot” of a binary number, and as has normally been the case, that snapshot is of the first eight positive places:
2∞ … 29 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 2-1 2-2 2-3 2-4 … 2-∞ |
---|---|---|---|---|---|---|---|---|---|
Zeros continuing to infinity | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | Zeros continuing to infinity |
Now if we were to apply a shift operation, we could view it as a translation of the snapshot instead of a rudimentary arithmetic operation.
2∞ … | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 2-1 | 2-2 | 2-3 | 2-4 | … 2-∞ |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Zeros continuing to infinity | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | Zeros continuing to infinity |
And as is clearly seen, we now have a piece of the mantissa portion in our register snapshot! This is the fundamental mechanic of fp, by shifting a number, we gain a part of the fractional part at the expense of a part of the integer. We would actually never use just eight bits (unless we could get away with it), sixteen gives us a nice foundation.
Now we should set up some ground rules so that we are all on the same wavelength. First, while we could place the binary point in any of the 17 positions, I will set it to between bits 7 and 8. This decision has two clear advantages: we can easily extract either the whole or fractional part, and it provides the most balanced compromise between numerical range and fractional precision. Rule #2 is more of a procedure, but we can convert a number to fixed point by either a left shift by 8 or a multiplication by 256.
Operations on Fixed-Point Numbers
Our fixed-point format uses an internal scaling of 256, so we will algebraically represent an FP number as 256(a) so as to simplify analysis of arithmetic operations.
If we were to add (or subtract) two FP numbers, the internal calculation would be 256(a) + 256(b) = 256(a + b). The result is valid since the scaling factor stayed constant. Thus, fixed point numbers may be added or subtracted using ordinary integer arithmetic instructions.
Suppose we were to get a product of two FP numbers: 256(a) × 256(b) = 65536(ab). The answer has a scaling factor of 65536, which means we get only the fractional part of the result. We must correct this somehow:
- Shift a factor right by 8 bits to calculate 1(a) × 256(b) = 256(ab).
- Shift the product right by 8 bits to calculate 65536(ab) ÷ 256 = 256(ab).
- Calculate a full 32-bit product and use bits 8 through 24.
Options 1 and 2 are too extreme since they will destroy the accuracy of the result, though we might compromise on (1) by scaling both factors down by 16. But option 3 is the best, albeit the slowest (and the routine to do the multiplication is left as an exercise to the reader :)
Contrary to multiplication, division suffers an almost inverse scaling problem: 256(a) ÷ 256(b) = 1(a÷b). The solutions to this problem are similar to multiplication, but again the best method may well be to scale the dividend to 32 bits and craft a routine to divide it by a 16-bit divisor.
Constant Division
I remember from a while back that I promised to show you a way to divide by a constant number perfectly, so here it comes. You really need to know about fixed point if you want to have any hope of understanding this dreck.
The whole premise is based on the fact that if you have a number x, then division by x is the same operation as multiplication by 1⁄x. This ain’t that pansy quotient of 256 that gave only approximate results, but the bona fide x-1 multiplicative inverse. The reciprocal is in, you guessed it, fixed point.
E.g., let’s use a divisor of 15. The reciprocal is 1⁄15, or 0.0666.
Now I’m asking you, how many bits should we use so that (a) we get a
result that is accurate enough, and (b) involves no more than the
absolute minimum of arithmetic?
- Given the constant divisor, d, find the exact value of r = 1⁄d.
- Find the integer z such that 0.5 < ( r × 2z ) + 1. z is the number of leading zeros between the binary point and the first one in the mantissa.
- For an n-bit dividend, take the first n+z+1 bits after the binary point, and round.
Continuing, z = 3, because 0.6 × 2~3~ = 0.53, and so we use only the first 12 bits of the fixed point number, which is 0.000100010001.
We are now going to calculate the product as this:
q = x >> 4;
q = (q + x) >> 4;
q = (q + x) >> 4;
Did you get all that? Heh-heh-heh, okay, here’s a blow-by-blow analysis of each operation.
q = x >> 4
Okay, when we start, q = x × m, where m is initially 1 and will eventually become the reciprocal. A right shift by four will make m = .0001.
q = (q + x) >> 4
If we add the original number, we get a multiplier m = 1.0001, see? If this is then shifted right by four, where the first shift is shifting in the carry from (q + x), m now equals 0.00010001. If we do it again, we get a multiplier of 0.000100010001. Isn’t that just magickal?
Were you the cautious type you might want to quit while we’re ahead, but forget it! The next step is to go the whole nine yards and get the remainder too. It’s straightforward enough, remultiply the remainder by 15.
r = (q << 4) - q;
r = x - r;
Enough of all this C, let’s see it in assembler already.
Div_15:
; IN A dividend
; OUT C quotient
; A remainder
LD B, A
RRA
RRA
RRA
RRA
AND $0F ; A = q * 0.0001
ADD A, B ; A = q * 1.0001
RRA
RRA
RRA
RRA
AND $1F ; A = q * 0.00010001
ADD A, B ; A = q * 1.00010001
RRA
RRA
RRA
RRA
AND $1F ; A = q * 0.000100010001
LD C, A
ADD A, A
ADD A, A
ADD A, A
ADD A, A
SUB C ; A = r * 15
SUB B
NEG ; A = -(r * 15 - x) = x - r * 15
RET