Update 1st March 2021: This post has been updated to reflect refined understanding of TRAP 64 arising from analysis of the floating point division code.
In this post I will be exploring how integer-to-ASCII conversion is done in PDP-11 BASIC. Surprisingly, this turned out to be very complicated indeed, and so the topic warrants an entire post on its own.
This post will describe the operation of TRAP 64, TRAP 14 and TRAP 12. TRAP 12 is the integer-to-ASCII conversion subroutine, and the other two TRAPs support its operation.
For context and a list of other posts on this topic, see the PDP-11 BASIC reverse engineering project page.
TRAP 64
The purpose of TRAP 64 is to divide the 32 bit value passed in R2/R3, by the 32 bit value passed in R4/R5.
This code is also used for dividing the significands of floating point numbers, in which case the significand of the dividend is passed in the value R0/R1/R2/R3. The significand of the dividend will have been rotated right by one bit before being passed into this TRAP, so R3 will be zero and R2 may contain one bit of the significand in the highest position but is otherwise zero. The significand of the divisor is passed in the value R4/R5.
The resulting value is stored in R2/R3 and any remainder from the division is stored in R0/R1.
The basics of dividing binary numbers is well explained elsewhere. If you are not already familiar with it, spending a few minutes to understand the basic concepts of binary division will really help you appreciate what's going on here.
Here's the code for TRAP 64:
012036 012746 MOV #40, -(SP)
012042 010446 MOV R4, -(SP)
012044 010546 MOV R5, -(SP)
012046 005466 NEG 2(SP)
012052 005416 NEG (SP)
012054 005666 SBC 2(SP)
012060 061601 ADD (SP), R1
012062 005500 ADC R0
012064 066600 ADD 2(SP), R0
012070 103445 BCS 12204
012072 005046 CLR -(SP)
012074 006103 ROL R3
012076 006102 ROL R2
012100 006101 ROL R1
012102 006100 ROL R0
012104 005716 TST (SP)
012106 001410 BEQ 12130
012110 005016 CLR (SP)
012112 066601 ADD 2(SP), R1
012116 005500 ADC R0
012120 005516 ADC (SP)
012122 066600 ADD 4(SP), R0
012126 000404 BR 12140
012130 060501 ADD R5, R1
012132 005500 ADC R0
012134 005516 ADC (SP)
012136 060400 ADD R4, R0
012140 005516 ADC (SP)
012142 005716 TST (SP)
012144 001401 BEQ 12150
012146 005203 INC R3
012150 005366 DEC 6(SP)
012154 003347 BGT 12074
012156 006003 ROR R3
012160 103404 BCS 12172
012162 060501 ADD R5, R1
012164 005500 ADC R0
012166 060400 ADD R4, R0
012170 000241 CLC
012172 006103 ROL R3
012174 062706 ADD #10, SP
012200 000242 CLV
012202 000207 RTS PC
012204 062706 ADD #6, SP
012210 104773 TRAP 373
012212 000262 SEV
012214 000207 RTS PC
Let's see how all this works.
012036 012746 MOV #40, -(SP)
Firstly the value 40 (32 in decimal) is pushed onto the stack. This value is used as a loop counter so that we know when we have finished iterating through all of the bits in R2 and R3, the dividend (which are treated as a single 32 bit value by this code).
012042 010446 MOV R4, -(SP)
012044 010546 MOV R5, -(SP)
R4 and R5, the high and low words of the divisor respectively, are pushed onto the stack.
012046 005466 NEG 2(SP)
012052 005416 NEG (SP)
012054 005666 SBC 2(SP)
The two values just pushed onto the stack are now negated. So, for example, if the divisor passed in R5 was 12 (i.e. decimal 10) then this value on the stack will now be -10 (decimal). In case negating the value in SP (i.e. R5) causes a carry flag to be set, which it will unless a negative value was passed, the carry is subtracted from SP+2, which is the value passed from R4. This additional step is required to properly negate the 32 bit value stored across two, effectively independent, 16 bit words.
012060 061601 ADD (SP), R1
012062 005500 ADC R0
012064 066600 ADD 2(SP), R0
012070 103445 BCS 12204
The negated value of the divisor, currently stored in two words on the stack, is added to the values in R1 and R0. This functionality appears to be used when dividing floating point numbers but when dividing integers, R0 and R1 values always appear to be zero.
If this assignment causes the carry flag to be set, this is an error, and so control jumps to the following error code:
012204 062706 ADD #6, SP
012210 104773 TRAP 373
012212 000262 SEV
012214 000207 RTS PC
Three words are popped off the stack by adding 6 to the stack pointer and a non-fatal error is generated. The overflow flag is set and control returns from the subroutine.
012072 005046 CLR -(SP)
A zero word is now pushed onto the stack.
This concludes the setup instructions and we now move into the loop for cycling through the bits of the dividend, which are stored in R2 and R3.
The loop that follows performs the following steps:
Bits are moved from R3 into R2 and from R2 into R1 until the value of R0/R1 exceed the value of the divisor.
This is detected by adding the divisor to the value of R0/R1 each time around the loop and checking for an overflow. Overflow is detected by adding the carry bit to the zero word at the top of the stack.
If the value of R0/R1 is sufficient to cause an overflow, the value at R3 is incremented, indicating that the divisor has been divided into the dividend at this bit position. The remainder of the division will be left in R0/R1.
The loop counter (stored at SP+6) is decremented and tested to make sure we have not shifted all 32 bits yet.
If we have, the loop exits.
Otherwise, the loop continues with the next bit.
Let's look at the code in detail.
012074 006103 ROL R3
012076 006102 ROL R2
012100 006101 ROL R1
012102 006100 ROL R0
Firstly all four registers are rotated left. The highest bit, which is rotated out of the word by the ROL instruction, is moved into the carry bit and then rotated in as the lowest bit in the next rotate instruction. So, in other words, bits are successively rotated upwards from R3 into R2, and then from R2 into R1, and from R1 into R0.
012104 005716 TST (SP)
012106 001410 BEQ 12130
The value at the top of the stack is then tested. If it is equal to zero that means we did not cause an overflow the last time around the loop. Otherwise, we did cause an overflow with the bit from the previous iteration, in which case the following code is executed:
012110 005016 CLR (SP)
012112 066601 ADD 2(SP), R1
012116 005500 ADC R0
012120 005516 ADC (SP)
012122 066600 ADD 4(SP), R0
012126 000404 BR 12140
The value at the top of the stack (the overflow flag) is cleared and then the negation of the divisor (now stored at SP+4 and SP+2 respectively) are added to R0 and R1, which will contain any remainder from the previously overflowing calculation, plus the bit that has been shifted in as part of the current loop iteration. The carry bits are appropriately managed to make sure the two words act as a single 32 bit value. Control then jumps to address 12140 to test whether another overflow occurs with the new bit that has been shifted in this loop iteration.
Otherwise, no overflow occurred on the previous loop iteration, in which case the following code is executed:
012130 060501 ADD R5, R1
012132 005500 ADC R0
012134 005516 ADC (SP)
012136 060400 ADD R4, R0
012140 005516 ADC (SP)
The divisor (stored in R5 and R4) is added to the value stored in R1 and R0, with appropriate management of the carry bit and setting the value at the top of the stack pointer to 1 if there has been an overflow.
If the value in R0/R1 is less negative than the divisor, then when the divisor is added, the value will flip from being negative to being positive, which will cause a carry bit to be set and therefore the value at the top of the stack to be incremented. In addition, when the value flips from negative to positive, the resulting positive value is the remainder of dividing by the divisor.
012142 005716 TST (SP)
012144 001401 BEQ 12150
012146 005203 INC R3
The value at the top of the stack is now tested. If it is not equal to zero, the value in R3 is incremented, representing the fact that the divisor has been successfully divided into the divident at this bit position. If the value at the top of the stack is equal to zero, that means no overflow occurred so the bit in R3 is not set by skipping ahead to 12150.
012150 005366 DEC 6(SP)
012154 003347 BGT 12074
The loop counter (stored at SP+6) is now decremented and if it is still greater than zero, control branches back up to the top of the loop to handle the next bit of the dividend.
012156 006003 ROR R3
012160 103404 BCS 12172
012162 060501 ADD R5, R1
012164 005500 ADC R0
012166 060400 ADD R4, R0
012170 000241 CLC
012172 006103 ROL R3
When the loop counter has decremented to zero, there may be some bits left in R0/R1. This will happen in cases where the dividend is not evenly divisible by the divisor. In order to return the remainder correctly, an additional number of steps are carried out.
Firstly R3 is rotated right, and if it equals 1 we jump ahead to 12172 and rotate R3 right. This situation arises when we incremented R3 on the last iteration through the loop, representing the situation where the dividend is evenly divisible by the divisor.
Otherwise, there may be a remainder left in R1 and R0 so the value of the divisor is added to R1 and R0 (with appropriate handling of the carry bit) and then R3 is rotated left again. The clear carry flag is set before the rotate left of R3 because we know for certain if this code is executing that the bit that was rotated out of R3 was zero, so we don't want to accidentally rotate in the incorrect value due to the carry bit being set.
012174 062706 ADD #10, SP
012200 000242 CLV
012202 000207 RTS PC
Finally, five values are popped off the stack by adding 10 onto the stack pointer, the overflow flag is cleared, and then control returns from the subroutine.
TRAP 14
TRAP 14 is used to perform integer-to-ASCII conversion. The absolute value of the input integer is stored in the memory location pointed to by R1. This is followed by a second word, at 2(R1), that is zero if the value at (R1) is positive, and -1 if the value at (R1) is negative.
The output from this TRAP if a fixed-width (total 12 characters), whitespace padded ASCII representation of the input number, stored at memory locations starting with R0, with one whitespace after the number.
This TRAP is only invoked by TRAP 12 (see below).
Here's the code:
011566 010546 MOV R5, -(SP)
011570 010046 MOV R0, -(SP)
011572 005046 CLR -(SP)
011574 012103 MOV (R1)+, R3
011576 011102 MOV (R1), R2
011600 002004 BGE 11612
011602 005402 NEG R2
011604 005403 NEG R3
011606 005602 SBC R2
011610 005216 INC (SP)
011612 012705 MOV #12, R5
011616 005004 CLR R4
011620 012746 MOV #177777, -(SP)
011624 005000 CLR R0
011626 005001 CLR R1
011630 104464 TRAP 64
011632 010146 MOV R1, -(SP)
011634 050200 BIS R2, R0
011636 050300 BIS R3, R0
011640 005700 TST R0
011642 001370 BNE 11624
011644 010605 MOV SP, R5
011646 005204 INC R4
011650 005725 TST (R5)+
011652 002375 BGE 11646
011654 005304 DEC R4
011656 012703 MOV #13, R3
011662 160403 SUB R4, R3
011664 005303 DEC R3
011666 016500 MOV 2(R5), R0
011672 005703 TST R3
011674 003404 BLE 11706
011676 112720 MOVB #40, (R0)+
011702 005303 DEC R3
011704 000772 BR 11672
011706 005715 TST (R5)
011710 001403 BEQ 11720
011712 112720 MOVB #55, (R0)+
011716 000402 BR 11724
011720 112720 MOVB #40, (R0)+
011724 062716 ADD #60, (SP)
011730 112620 MOVB (SP)+, (R0)+
011732 005716 TST (SP)
011734 002373 BGE 11724
011736 112710 MOVB #40, (R0)
011742 062706 ADD #6, SP
011746 012605 MOV (SP)+, R5
011750 000207 RTS PC
Let's see how this works.
011566 010546 MOV R5, -(SP)
011570 010046 MOV R0, -(SP)
011572 005046 CLR -(SP)
Firstly, R5, R0 and a zero word are pushed onto the stack.
011574 012103 MOV (R1)+, R3
011576 011102 MOV (R1), R2
R1 points at the integer value to be converted to ASCII, so this is moved to R3. R1 is then incremented, so it now points a word representing the sign of the value to be converted, and that value is moved to R2. If R2 is 0 then the value in R3 is positive, but if R2 is -1, then the value in R3 is negative.
011600 002004 BGE 11612
011602 005402 NEG R2
011604 005403 NEG R3
011606 005602 SBC R2
011610 005216 INC (SP)
If the value in R2 is greater than or equal to zero, then these lines are skipped, but otherwise the lines above are executed.
The value at R2 is negated, so it will contain the two's complement value of -1, which will be 1. Then the value at R3 is negated, so it will be the two's complement of the value to be displayed. The negation of R3 will lead to the carry flag being set which is subtracted from R2, leaving R2 with the value zero.
The value at the top of the stack, currently zero, is then incremented.
011612 012705 MOV #12, R5
011616 005004 CLR R4
011620 012746 MOV #177777, -(SP)
011624 005000 CLR R0
011626 005001 CLR R1
These lines set up the registers for the call to TRAP 12. The value 12 (decimal 10) is stored in R5 and the value in R4 is cleared. Minus one is pushed onto the stack, and the values in R0 and R1 are cleared.
011630 104464 TRAP 64
TRAP 64 is then used to divide the value contained in R2/R3 by the value in R4/R5 (which is decimal 10). In other words, the value in R2/R3 is divided by 10. Afterwards, the result will be in R2/R3 and the remainder, i.e. the value of the lowest digit of the original number, will be in R1.
011632 010146 MOV R1, -(SP)
The value from R1 is then pushed onto the stack.
011634 050200 BIS R2, R0
011636 050300 BIS R3, R0
011640 005700 TST R0
011642 001370 BNE 11624
Words R2 and R3 are AND'd together into R0 and then the value in R0 is tested. This checks whether we have divided up the entire number into individual digits on the stack, or whether there are still digits remaining. If the value of R0 is non-zero, control branches up to 11624 and the loop continues, dividing by 10 and pushing the resulting digit onto the stack, until R0 equals zero.
When the value in R0 is zero, that means that the entire number has been broken up into digits. For example, if the value to be converted to ASCII was 123 (decimal) then the stack would look like this:
157410: 000001
157412: 000002
157414: 000003
157416: 177777
In this example, SP would point at 157410.
011644 010605 MOV SP, R5
011646 005204 INC R4
011650 005725 TST (R5)+
011652 002375 BGE 11646
011654 005304 DEC R4
Now we count the number of characters we need to display. First, the stack pointer is moved into R5. Then, R4 is incremented and the value at R5 is tested, with R5 being auto-incremented afterwards. If the value at R5 is greater than or equal to zero, control branches back up to 11646 to count another digit. When this loop ends, R4 will contain one more than the number of digits to be displayed, so R4 is decremented.
011656 012703 MOV #13, R3
011662 160403 SUB R4, R3
011664 005303 DEC R3
The total length of the resulting string, 13 (octal) or 11 (decimal) is moved into R3. The number of digits, stored in R4, is subtracted from this value. This gives the number of non-digit characters required to pad the string to the fixed length. At this point we have not yet checked whether a minus sign is required to indicate that the number is negative, so the value in R3 is decremented by one in case a minus sign needs to be added to the string.
011666 016500 MOV 2(R5), R0
The location for the output string, pushed onto the stack at the beginning of this TRAP, is moved back into R0.
011672 005703 TST R3
011674 003404 BLE 11706
011676 112720 MOVB #40, (R0)+
011702 005303 DEC R3
011704 000772 BR 11672
Now we loop, padding the output string with whitespace. The value in R3 is tested. If it is less than or equal to zero, then control jumps out of this loop to 11706. Otherwise, another space is added to the string.
A byte with value 40 (the ASCII code for the space character) is moved to R0, and then R0 is incremented.
The value at R3, containing the number of whitespace characters to add, is decremented and control branches back up to 11672.
011706 005715 TST (R5)
011710 001403 BEQ 11720
011712 112720 MOVB #55, (R0)+
011716 000402 BR 11724
011720 112720 MOVB #40, (R0)+
The next part of the display code determines whether a minus sign is needed, or another whitespace character. The value at R5 is tested. This value will be zero if the number to be displayed is positive, or -1 if the number to be displayed is negative. If the value is equal to zero, control jumps to 11720, and another space character is added to the output string at R0, and R0 is incremented.
If the value equals -1 then a byte with the value 55 (the ASCII code for "-") is added to the output string at R0, and R0 is incremented. After this, control jumps to 11724.
011724 062716 ADD #60, (SP)
011730 112620 MOVB (SP)+, (R0)+
011732 005716 TST (SP)
011734 002373 BGE 11724
011736 112710 MOVB #40, (R0)
Now, at last, we add the digits into the output string. The value 60 is added to the value pointed to by the stack pointer. This will convert the value from a number, to the ASCII character representing that number. This value is then moved into the output string and both the stack pointer and R0 are incremented. The value at the stack pointer is then tested, to see if we have reached the end of the digits. If the value at the stack pointer is greater than zero, we loop up to address 11724 and add the next digit to the output string. Otherwise, a final whitespace character is added to the output string.
011742 062706 ADD #6, SP
011746 012605 MOV (SP)+, R5
011750 000207 RTS PC
Finally, six is added to the stack pointer, popping three values off the stack. The value from the top of the stack is then popped into R5 and control returns from the TRAP.
TRAP 12
Finally for this post we have TRAP 12, which is the main integer-to-ASCII (ITOA) entry point. This TRAP works by receiving a numerical value, passed in R1 and converting it to its corresponding ASCII decimal representation. The resulting string is stored in the memory locations starting at the value in R0.
Here's the code:
011506 010046 MOV R0, -(SP)
011510 005046 CLR -(SP)
011512 010146 MOV R1, -(SP)
011514 002002 BGE 11522
011516 005166 COM 2(SP)
011522 010601 MOV SP, R1
011524 162706 SUB #14, SP
011530 010600 MOV SP, R0
011532 104414 TRAP 14
011534 010601 MOV SP, R1
011536 016600 MOV 20(SP), R0
011542 062701 ADD #5, R1
011546 012702 MOV #7, R2
011552 112120 MOVB (R1)+, (R0)+
011554 005302 DEC R2
011556 003375 BGT 11552
011560 062706 ADD #22, SP
011564 000207 RTS PC
Let's see how this works:
011506 010046 MOV R0, -(SP)
011510 005046 CLR -(SP)
011512 010146 MOV R1, -(SP)
First R0 is pushed onto the stack, then a zero word is pushed onto the stack and finally R1 is pushed onto the stack.
011514 002002 BGE 11522
011516 005166 COM 2(SP)
If the value in R1 is greater than or equal to zero, the COM line is skipped. However, if the value in R1 is negative, all the bits in the value at SP+2 are flipped using the COM instruction. This will turn the value zero on the stack into a -1.
011522 010601 MOV SP, R1
011524 162706 SUB #14, SP
011530 010600 MOV SP, R0
The stack pointer is backed up into R1. Six words are then allocated on the stack by subtracting 14 (12 in decimal) from the value of the stack pointer. The updated value of the stack pointer is then moved to R0.
011532 104414 TRAP 14
TRAP 14 is then used to perform the integer-to-ASCII conversion, as described above. The output from this is a 12 character fixed-width, whitespace padded ASCII representation of the numeric value, stored in the memory location pointed to by R0.
011534 010601 MOV SP, R1
011536 016600 MOV 20(SP), R0
The value of the stack pointer, where the output string is stored, is moved to R1. Then, the value at location SP+20 is moved to R0. This is the value of the output location that was pushed onto the stack at the beginning of the TRAP.
There are a total of 12 bytes in the output from TRAP 14, pointed to by R1. Firstly, we skip the first five characters. Then 7, representing the remaining seven characters is moved into R2.
011552 112120 MOVB (R1)+, (R0)+
011554 005302 DEC R2
011556 003375 BGT 11552
Now, a byte is copied from R1 (the output from TRAP 14) to R0 (the output location for this TRAP), and both registers are then incremented. Then the value in R2 is decremented and if it is greater than zero, control jumps back up to 11552 to copy another byte.
011560 062706 ADD #22, SP
011564 000207 RTS PC
22 is added to the stack pointer, to pop 11 values off the stack and then control is returned from this TRAP.
Conclusion
The integer-to-ASCII conversion code was way, way more complicated than I expected it to be, but hopefully this walkthrough provides you with a decent understanding of how it works.
Feel free to submit any follow-up questions in the comments. Thanks for reading!
Odd choice of instructions in trap 14.
011620 012746 MOV #177777, -(SP)
could have been
SET -(SP)
"Words R2 and R3 are AND'd together into R0"
Correction: BIS is an OR
The TST below is not needed as the flags are already set.:
011634 050200 BIS R2, R0
011636 050300 BIS R3, R0
011640 005700 TST R0
011642 001370 BNE 11624
This:
011664 005303 DEC R3
011666 016500 MOV 2(R5), R0
011672 005703 TST R3
011674 003404 BLE 11706
Would be more efficiently written as this:
011666 016500 MOV 2(R5), R0
011664 005303 DEC R3
011674 003404 BLE 11706