Saturday, 3 July 2021

Fig-Forth At PC=Forty (Part 1)

Fig-Forth was a dialect of Forth, an up-and-coming systems language of the late 1970s and early 1980s. Forth itself was notable for becoming the standard language for controlling radio telescopes, but Fig-Forth was an attempt to produce an open-source version of the language that could easily be ported to different processors.

And indeed that's what happened: within a few years it had been ported from the PDP-11 to a number of 8-bit CPUs like the 6502, the RCA 1802 (which was used on a number of space probes), the Intel 8080 and the Zilog Z80, and later back up to a new generation of 16-bit microprocessors like the Intel 8086 and Motorola 68000. You can see the list of popular (for the time) CPUs it used to run on here.

The point of this blog (series, maybe), though, is to experiment with Fig-Forth for the original 8088-based IBM PC, because that landmark computer is 40 years old this year and I thought it would be interesting to plunge back into a development environment from that era.

The first step is to find a suitable emulator. I chose an IBM PC 5150 emulator from PcJs.org. I chose this emulator above some other small machine emulators, partly because it means you can try out my experiments for yourself under the browser you're reading this on, but primarily because the PcJs emulators run at the actual speeds of the computers they're emulating (roughly - I think the disk access is faster). Finally, PcJs works with raw image files for disks and they can be created with the dd command under Linux.

Imaging Fig-Forth

I could download the PC version of Fig-Forth from the Fig-Forth website, but my MacBook (under Catalina) wouldn't expand it using the GUI-based archiver. It turns out you can use the command line unzip FIGFORTH.ZIP to expand the contents.

This Fig-Forth is designed for MS-DOS version 2.0 onwards, which came out two years later than the PC itself, so my quest for authenticity isn't perfect. I'll just have to console myself with the idea that an original PC could have run it.

So, the next step is to generate a suitable disk image. Again that's fairly easy. All I need to do is boot the PCJS 5150 PC with a disk image and then save it. I can then mount it on the Mac and drop the FORTH.EXE into it.



Finally, I can run Fig-Forth. Unfortunately, that doesn't work. Fig-Forth only uses 16kB and the PC has 64kB, which would be plenty on an 8-bit machine, but here it's not enough for Fig-Forth!

The next memory size up is 96kB, and that does work. It turns out I needed to refresh the page too; pressing the reset button wasn't good enough.

Running Fig-Forth

The 8088 is a much better processor for running Fig-Forth than pretty much any 8-bit CPU, because it has far more 16-bit registers than 8-bit CPUs and a full set of 16-bit ALU operations. However, the 8088 still needs far more instructions than e.g. a 6809 to execute the inner interpreter:

NEXT:   LODSW                   ; AX <- (IP)
        MOV  BX,AX
NEXT1:  MOV  DX,BX              ; (W) <- (IP)
        INC  DX                 ; (W) <- (W) + 1
        JMP  WORD PTR [BX]      ; TO `CFA'
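To make the job of NEXT a bit more concrete, here is a toy model of an indirect-threaded inner interpreter in Python. It's only a sketch under my own assumptions, not a port of FIG-Forth: memory is a Python list addressed by cell rather than by byte, code fields hold Python callables instead of machine-code addresses, and the names ToyForth, primitive, colon and run are invented for illustration.

```python
# Toy model of an indirect-threaded inner interpreter (NOT real FIG-Forth).
# Assumptions: memory is a list of cells, and a word's code field holds a
# Python callable standing in for the machine code that JMP [BX] would reach.

class ToyForth:
    def __init__(self):
        self.mem = []        # cell-addressed memory
        self.stack = []      # data stack
        self.rstack = []     # return stack
        self.ip = 0          # interpreter pointer (SI on the 8088)
        self.w = 0           # word pointer: the CFA just fetched
        self.running = False
        # A few one-cell primitives; each returns its code field address.
        self.DUP = self.primitive(lambda: self.stack.append(self.stack[-1]))
        self.PLUS = self.primitive(self._plus)
        self.SEMIS = self.primitive(self._semis)
        self.HALT = self.primitive(self._halt)

    def primitive(self, fn):
        """Lay down a code-only word and return its CFA."""
        self.mem.append(fn)
        return len(self.mem) - 1

    def colon(self, *cfas):
        """Lay down a colon definition: DOCOL, the thread of CFAs, then ;S."""
        cfa = len(self.mem)
        self.mem.append(self._docol)
        self.mem.extend(cfas)
        self.mem.append(self.SEMIS)
        return cfa

    def _plus(self):
        b = self.stack.pop()
        a = self.stack.pop()
        self.stack.append(a + b)

    def _docol(self):
        # Nest: save the caller's IP, continue at the parameter field (W + 1).
        self.rstack.append(self.ip)
        self.ip = self.w + 1

    def _semis(self):
        # Unnest: resume the calling definition.
        self.ip = self.rstack.pop()

    def _halt(self):
        self.running = False

    def next(self):
        self.w = self.mem[self.ip]   # like LODSW / MOV BX,AX: fetch next CFA
        self.ip += 1                 # LODSW also advances IP
        self.mem[self.w]()           # like JMP WORD PTR [BX]

    def run(self, cfa):
        # Temporary two-cell thread: the word to run, then HALT.
        self.ip = len(self.mem)
        self.mem.extend([cfa, self.HALT])
        self.running = True
        while self.running:
            self.next()
        del self.mem[-2:]            # discard the temporary thread


vm = ToyForth()
DOUBLE = vm.colon(vm.DUP, vm.PLUS)   # : DOUBLE DUP + ;
QUAD = vm.colon(DOUBLE, DOUBLE)      # : QUAD DOUBLE DOUBLE ;
vm.stack.append(5)
vm.run(QUAD)
print(vm.stack)                      # [20]
```

Every word executed, primitive or colon definition, goes through next() once, which is why the length of that handful of instructions matters so much for overall speed.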

That longer inner interpreter, coupled with the fact that the 8088 is a crippled 8086 (it has only an 8-bit data bus) and that individual instructions are longer, means that PC Forth isn't particularly fast. We can compare it with the Jupiter ACE / FIGnition Forth benchmarks I published a little while ago:

They are (and thankfully, the code can be copied and pasted into the PC emulator):

: BM1 CR ." S"  10000 0 DO LOOP  ." E" ;
: BM2  CR ." S"  0 BEGIN  1+ DUP 9999 >  UNTIL DROP  ." E" ;
: BM3  CR ." S"  0 BEGIN  1+ DUP DUP / OVER  * OVER + OVER -
  DROP DUP 9999 >  UNTIL ." E"  DROP ;
: BM4  CR ." S"  0 BEGIN  1+ DUP 2 / 3  * 4 + 5 - DROP DUP
  9999 > UNTIL  ." E" DROP ;
: BM5SUB ;
: BM5  CR ." S" 0 BEGIN  1+ DUP 2 / 3  * 4 + 5 - DROP BM5SUB
  DUP 9999 > UNTIL  ." E" DROP ;
: BM6  CR ." S" 0 BEGIN  1+ DUP 2 / 3 * 4 + 5 -  DROP BM5SUB
  5 0 DO LOOP  DUP 9999 > UNTIL  ." E" DROP ;
5 ARR M [*]
: BM7  CR ." S" 0 BEGIN  1+ DUP 2 / 3 * 4 + 5 -  DROP BM5SUB
  5 0 DO DUP I M ! LOOP  DUP 9999 > UNTIL  ." E" DROP ;
: BM1F  10000 0 DO  10.9 9.8 F+ 7.6 F-  5.4 F* 3.2 F/  2DROP  LOOP ;
: BM3L  0 10000 0 DO  I + NEG [**] I AND  I OR I XOR  LOOP DROP ;

[* this needs an extra definition:  : ARR <BUILDS DUP + ALLOT DOES> OVER + + ; ]
[** MINUS is used instead of NEG]
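The ARR definition is terse, so here is a small Python sketch of what I understand it to do. It assumes 16-bit (2-byte) cells and that the DOES> part runs with the stack holding ( index pfa ); the function names arr_bytes and arr_address are made up for this illustration.

```python
# Sketch of  : ARR <BUILDS DUP + ALLOT DOES> OVER + + ;
# Assumption: the DOES> part starts with ( index pfa ) on the stack.

def arr_bytes(n):
    """DUP + ALLOT: reserve n + n bytes, i.e. n two-byte cells."""
    return n + n

def arr_address(index, pfa):
    """OVER + + : turn ( index pfa ) into the address pfa + 2*index."""
    stack = [index, pfa]
    stack.append(stack[-2])                   # OVER -> index pfa index
    stack.append(stack.pop() + stack.pop())   # +    -> index pfa+index
    stack.append(stack.pop() + stack.pop())   # +    -> pfa+2*index
    return stack.pop()

print(arr_bytes(5))                                # 10 bytes for 5 ARR M
print([arr_address(i, 1000) for i in range(5)])    # [1000, 1002, 1004, 1006, 1008]
```

So 5 ARR M reserves ten bytes, and I M inside the BM7 loop yields the address of cell I, ready for ! to store into.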

BMx             Jupiter ACE, fast mode (s)  FIGnition 1.0.0 (s)  PC FIG-Forth (s)  PC vs ACE  PC FIG-Forth vs BASICA
BM1             1.6                         0.0116               0.43              3.7        22
BM2             0.54                        0.046                0.22              2.5        15.7
BM3             7.66                        0.218                2.09              2.6        4.2
BM4             6.46                        0.228                2.04              2.4        4.4
BM5             6.52                        0.252                2.09              3.1        4.7
BM6             7.38                        0.320                2.43              3.0        7.0
BM7             12.98                       0.660                3.30              3.9        8.2
BM3L            1.0                         0.034                0.27              3.7        N/A
BM1F            14.18                       0.33                 N/A               N/A        N/A
Mean                                                                               3.11       9.46
Mean (subsets)  BM1..BM7 22.4  BM1..BM3L 23.3  N/A
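As a quick sanity check on the Mean row, the two ratio columns can be averaged in a few lines of Python. This is just a sketch: the dictionaries copy the published ratios from the table (N/A entries are left out), and the variable names are my own.

```python
from statistics import mean

# Ratios copied from the "PC vs ACE" column; BM1F is N/A and omitted.
pc_vs_ace = {
    "BM1": 3.7, "BM2": 2.5, "BM3": 2.6, "BM4": 2.4,
    "BM5": 3.1, "BM6": 3.0, "BM7": 3.9, "BM3L": 3.7,
}

# Ratios copied from the "PC FIG-Forth vs BASICA" column; N/A rows omitted.
pc_vs_basica = {
    "BM1": 22, "BM2": 15.7, "BM3": 4.2, "BM4": 4.4,
    "BM5": 4.7, "BM6": 7.0, "BM7": 8.2,
}

mean_vs_ace = mean(pc_vs_ace.values())        # about 3.11
mean_vs_basica = mean(pc_vs_basica.values())  # about 9.46

print(f"PC vs ACE:    {mean_vs_ace:.3f}")
print(f"PC vs BASICA: {mean_vs_basica:.3f}")
```

Both averages agree with the Mean row to two decimal places.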

And so it looks like an IBM PC running FIG-Forth is about 3x faster than a Jupiter ACE. Probably the most telling tests are BM1 and BM3. BM1 runs 10K loops in 0.43s, which means a single loop takes about 43µs, or about 200 clock cycles. BM3 adds a division, multiplication, addition and subtraction and 4 stack ops, i.e. 8 extra instructions per loop. That adds 2.96 - 0.43 = 2.53s over 80,000 operations, or about 31.6µs per operation.
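The arithmetic in that paragraph can be reproduced with a short Python sketch. The one figure not taken from the text is the clock rate: I'm assuming the standard 4.77MHz IBM PC clock (14.31818MHz / 3), and the variable names are only illustrative.

```python
# Assumption: standard IBM PC 5150 clock, 14.31818 MHz / 3 (about 4.77 MHz).
CLOCK_HZ = 14_318_180 / 3

LOOPS = 10_000
BM1_SECONDS = 0.43     # empty DO ... LOOP, from the text
BM3_SECONDS = 2.96     # figure used in the paragraph above
EXTRA_OPS = 8          # extra words per iteration, as counted in the text

per_loop_us = BM1_SECONDS / LOOPS * 1e6
cycles_per_loop = BM1_SECONDS / LOOPS * CLOCK_HZ

per_op_us = (BM3_SECONDS - BM1_SECONDS) / (LOOPS * EXTRA_OPS) * 1e6
cycles_per_op = per_op_us * CLOCK_HZ / 1e6

print(f"BM1: {per_loop_us:.1f} us per loop, ~{cycles_per_loop:.0f} cycles")
print(f"BM3: {per_op_us:.1f} us per extra op, ~{cycles_per_op:.0f} cycles")
```

On those assumptions an empty loop iteration costs a little over 200 cycles and each extra word about 150 cycles.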

Overall, it means FIG-Forth is perhaps 10x slower than assembler, but perhaps 5 to 10x faster than BASIC, or rather BASICA (which means Advanced BASIC). FIG-Forth is relatively slow, in part because of the seemingly pointless MOV DX,BX; INC DX instructions.

Conclusion

FIG-Forth was an early public-domain systems language designed to run as both the language and the operating system for a small computer, in as little as 16kB or so. Thanks to the increased memory needs of 16-bit CPUs and their OSes, and the trend towards running Forth on a mainstream OS, 64kB isn't enough for a 16kB Fig-Forth (though it is large enough for BASICA, which is a larger executable). I had to increase the PC's memory to 96kB (though I would have guessed 80kB would be big enough).

Before using PCJS I tried a few different emulators, including DosBox and Tiny8086, but neither seemed to offer accurate timings that were easy to set up.

Development environments from the past, and the computers themselves, were always fast enough and lean enough because developers adjusted the implementations to match the capabilities of the hardware, despite the machines being around 100,000x slower than modern computers. Nevertheless, it's fair to use modern facilities (such as being able to paste code into the emulator) to make our lives easier.
