One Week Wonder: Line editor


Sunday, 25 July 2021

Fig-Forth At PC=Forty (Part 4)

In part 1, I talked about how to get FIG-Forth for the IBM PC running on PCjs. FIG-Forth was a popular and very compact public-domain implementation of Forth, a medium-speed systems programming language and environment, during the early 1980s. Part 2 covered how to implement a very rudimentary disk-based line editor as a precursor to an interactive full-screen editor, and part 3 dived into machine code routines and a PC BIOS interface, because the existing commands offered no real screen cursor control.

Let's Edit!

Now, at last, we can implement a screen editor. One of the constraints I'll impose is to keep the editor within 1kB of source code. At first that'll be easy, because all I want to support is normal characters, cursor control, return, and escape to update. However, I know that I'll probably want to add the ability to copy text from a marker point using <ctrl-c>. But even a simple decision like this raises possible issues. Consider this: if I type VLIST, I find I can press <ctrl-c> to stop the listing.

However, if I type this definition:

: T1 BEGIN KEY DUP . 27 = UNTIL ;

I find that <ctrl-c> doesn't break out of T1; instead it simply displays 3, and I really do have to press <esc> to quit the routine. It could still be that Forth checks for <ctrl-c> automatically, just not with the sequence of commands in T1. But no: the FIG-Forth source shows that the breakout from VLIST is simply due to it executing ?TERMINAL and exiting if any key has been pressed. So, that potential problem is sorted!

VLIST Source Code:

DB 85H
DB 'VLIS'
DB 'T'+80H
DW UDOT-5
VLIST DW DOCOL
DW LIT,80H
DW OUTT
DW STORE
DW CONT
DW AT
DW AT
VLIS1 DW OUTT ;BEGIN
DW AT
DW CSLL
DW GREAT
DW ZBRAN ;IF
DW OFFSET VLIS2-$
DW CR
DW ZERO
DW OUTT
DW STORE ;ENDIF
VLIS2 DW DUPE
DW IDDOT
DW SPACE
DW SPACE
DW PFA
DW LFA
DW AT
DW DUPE
DW ZEQU
DW QTERM
DW ORR
DW ZBRAN ;UNTIL
DW OFFSET VLIS1-$
DW DROP
DW SEMIS
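The difference between the two loops can be sketched outside Forth. This is only a Python model of the control flow, not the FIG-Forth code: `vlist`, `t1` and the `key_pressed` callback are names made up for the illustration. VLIST polls for "any key waiting" after each name, while T1 reads every key and only stops on 27.

```python
def vlist(names, key_pressed):
    """Show names until the list ends or any key is waiting (like ?TERMINAL)."""
    shown = []
    for name in names:
        shown.append(name)      # stands in for printing the name
        if key_pressed():       # polled once per name; any key stops the listing
            break
    return shown


def t1(keys):
    """Read keys one at a time (like KEY), echo each code, stop after ESC (27)."""
    echoed = []
    for code in keys:
        echoed.append(code)     # DUP .
        if code == 27:          # 27 = UNTIL
            break
    return echoed


if __name__ == "__main__":
    presses = iter([False, False, True])
    print(vlist(["ED", "DOKEY", "EDLIM", "EDYX&"], lambda: next(presses)))
    print(t1([3, 65, 27, 66]))
```

In this model <ctrl-c> (code 3) is just another key: it stops `vlist` because any key does, and it is merely echoed by `t1`.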

The Editor Itself

The first goal in the editor is to convert Y X coordinates in the current screen to a memory location in a block buffer. This is a minor change from the initial part of the code in EDL:

: EDYX& ( Y X -- ADDR)

>R 15 AND 8 /MOD SCR @ B/SCR * +

BLOCK SWAP C/L * + R> +

;
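To make the arithmetic in EDYX& easier to follow, here is a rough Python model of it. It assumes 512-byte blocks (8 lines of 64 characters, which is what the 8 /MOD implies) and therefore 2 blocks per screen; the function name `edyx` and the constants are illustrative only, and the result is a block number plus an offset rather than a real buffer address.

```python
C_L = 64              # characters per line (C/L)
LINES_PER_BLOCK = 8   # assumption: 512-byte blocks hold 8 lines of 64 chars
B_SCR = 2             # assumption: two such blocks make up a 1kB screen


def edyx(scr, y, x):
    """Return (block number, byte offset inside that block) for row y, column x."""
    # 15 AND keeps the row inside the 16-line screen; 8 /MOD splits it into
    # which block of the screen it is in and which line inside that block.
    block_in_screen, line_in_block = divmod(y & 15, LINES_PER_BLOCK)
    block = scr * B_SCR + block_in_screen   # SCR @ B/SCR * +
    offset = line_in_block * C_L + x        # SWAP C/L * + R> +
    return block, offset


if __name__ == "__main__":
    print(edyx(50, 0, 0))   # first character of screen 50
    print(edyx(50, 9, 5))   # row 9 falls in the second block of the screen
```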

We want to be able to constrain Y and X coordinates to within the bounds of the screen (in this case with wrap around):

: 1- 1 - ; ( oddly enough missing from FIG-Forth, but I use it quite a bit in the editor)

: EDLIM ( Y X -- Y' X')

DUP 0< IF

DROP 1- 0

THEN

DUP C/L 1- > IF

DROP 1+ 0

THEN

SWAP 15 AND SWAP

OVER 2+ OVER 4 + AT

;
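The wrap-around rules in EDLIM can be checked with a small Python model. This is a sketch of the logic only: `edlim` here returns the cursor position that AT would be given instead of moving a real cursor, and the reading of the 2 and 4 offsets (skipping the heading and line-number column printed by LIST) is my assumption.

```python
C_L = 64  # characters per line (C/L)


def edlim(y, x):
    """Return (y, x, cursor) after wrapping, where cursor is the (row, col) for AT."""
    if x < 0:            # moved left past column 0: previous row, column 0
        y, x = y - 1, 0
    if x > C_L - 1:      # moved right past the last column: next row, column 0
        y, x = y + 1, 0
    y &= 15              # 15 AND wraps the row within the 16-line screen
    cursor = (y + 2, x + 4)
    return y, x, cursor


if __name__ == "__main__":
    print(edlim(0, -1))    # wraps from the top row to row 15
    print(edlim(15, 64))   # wraps from the bottom row back to row 0
```

Note that moving left from column 0 lands on column 0 of the previous row, not on its last column; that is how the definition above is written.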

The key part of a screen editor is processing characters. For cursor movement I've picked the control-key versions of the vi cursor keys:

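: DOKEY ( R C K -- R C K )

>R ( park the key on the return stack; R copies it back)

R 8 = IF ( CTRL-H, Left)

1-

THEN

R 12 = IF ( CTRL-L, Right)

1+

THEN

SWAP R 11 = IF ( CTRL-K, Up )

1-

THEN

R 10 = IF ( CTRL-J, Down)

1+

THEN

SWAP R 13 = IF ( CR or CTRL-M)

64 + ( push the column past C/L so EDLIM wraps to the next row)

THEN

EDLIM

R 31 > R 128 < AND IF ( PRINTABLE)

2DUP EDYX& R SWAP C!

R EMIT 1+ UPDATE EDLIM

THEN

R> ( hand the key back so ED can test for ESC)

;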

Finally we want to put it all together in a top-level function:

: ED ( scr -- )

CLS LIST 0 0 EDLIM

BEGIN

KEY DOKEY

27 = UNTIL

DROP DROP

;
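Here is how I read the key handling and the top-level loop, as a Python simulation on a 16x64 grid rather than a disk block. It is a sketch under assumptions: left/right adjust the column by one, up/down adjust the row by one, and the key is handed back to the loop so it can test for escape. The names `dokey`, `ed` and `edlim` mirror the Forth words but are otherwise hypothetical.

```python
C_L, ROWS = 64, 16


def edlim(y, x):
    """Wrap the column onto the neighbouring row and the row within the screen."""
    if x < 0:
        y, x = y - 1, 0
    if x > C_L - 1:
        y, x = y + 1, 0
    return y & (ROWS - 1), x


def dokey(screen, y, x, k):
    """Apply one key code to the cursor and screen; return the new (y, x)."""
    if k == 8:       # CTRL-H, left
        x -= 1
    if k == 12:      # CTRL-L, right
        x += 1
    if k == 11:      # CTRL-K, up
        y -= 1
    if k == 10:      # CTRL-J, down
        y += 1
    if k == 13:      # CR: push the column past the line end so it wraps
        x += C_L
    y, x = edlim(y, x)
    if 31 < k < 128:             # printable: overtype, then advance the cursor
        screen[y][x] = chr(k)
        y, x = edlim(y, x + 1)
    return y, x


def ed(screen, keys):
    """Feed key codes to dokey until ESC (27); return the final cursor position."""
    y, x = 0, 0
    for k in keys:
        y, x = dokey(screen, y, x, k)
        if k == 27:
            break
    return y, x


if __name__ == "__main__":
    screen = [[" "] * C_L for _ in range(ROWS)]
    print(ed(screen, [ord("H"), ord("I"), 13, ord("!"), 27]))
    print("".join(screen[0]).rstrip(), "".join(screen[1]).rstrip())
```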

Interestingly, once I'd cleared up an initial bug where the cursor wasn't advanced when I typed a character, and another where I'd missed an AND when checking for printable characters, I was able to use the editor itself to edit improvements (namely, putting the initial cursor at the right location instead of (0,0)).

Finally, although Forth isn't always as compact as proponents like me often claim, this editor uses a mere 306 bytes. It's probably the most compact interactive editor I've seen, and there's still over half the screen left for improvements. For example, there's no support for delete (left, space, left), for inserting a line, or for copying text or blocks. But for the moment, it's far more enjoyable than the line editor it replaces.

Exercise For The Reader

The biggest user-interface problem I've found with extensions to the editor has been deciding which control character to use to mark the text location for copying. To explain: many archaic screen editors, including some early word processors, used a mark-then-edit sequence for text manipulation. The user would move the cursor to where they wanted to perform an 'advanced' edit operation; mark the initial location; move the cursor to where they wanted the edit operation to finish; mark the end of that text; then finally, possibly, move the cursor to some other location and complete the edit. For example, on the Quill word processor for the Sinclair QL, you'd type F3, 'E' (for Erase); move the cursor to where you wanted to erase a block; press enter; move to where the erase should finish (and it would highlight the text as you went along); press enter; confirm you wanted to erase it, and then it would. Or in the Turbo C editor you'd Mark an initial starting location, then Mark an ending location, and then perform an operation like 'Copy' to duplicate the text, 'Delete' to remove it, or 'Move' to move it.

Or on a BBC Micro, the BASIC editor was essentially a line editor which supported two cursor positions (!!). If you needed to manipulate a line instead of just retyping it, you'd list the line (if it wasn't on the screen), then press the cursor keys and a second cursor would appear, which you'd move to where you wanted on the screen, while the cursor at your editing position would remain. You'd then hit COPY and it would copy from the second cursor to your editing position, advancing both cursors.

So, in my system, which is similar to how editing works on FIGnition, I'd want to Mark the position where I wanted to copy / erase from; move to where I wanted to paste or finish a delete to and then COPY / MOVE a character at a time from the source to the destination (or Erase the text).

However, the most obvious control character to use, <ctrl-m>, is already used for Return, and everything else seems rather contrived. Then I thought: what happens if I use <ctrl-symbol> instead? Do they produce interesting control codes? I found that quite a number produce 0s, but some actually generate control codes in the range 0..31 that you can't generate from <ctrl-a> to <ctrl-z>.

This is what I found out:

Ctrl+	Code	Ctrl+	Code
\	28	]	29
6	30	-	31

So, what I'd like to know is whether this is just an artefact of the simulator being used on a Mac or whether it's common to other emulators and an actual IBM PC?
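For what it's worth, the four codes in the table are what the conventional ASCII control mapping gives: take the character and keep only its low five bits. The Python sketch below just shows that arithmetic. It assumes a US-style layout where ^ is shift-6 and _ is shift-minus, and it doesn't settle what any particular emulator or real keyboard BIOS does.

```python
def ctrl_code(ch):
    """Conventional ASCII control code for a character: keep the low five bits."""
    return ord(ch) & 0x1F


if __name__ == "__main__":
    # \ ] ^ _ are the four characters that follow Z and [ in the ASCII table.
    for ch in "\\]^_":
        print(repr(ch), ctrl_code(ch))
    # The letter keys follow the same rule: C gives 3, M gives 13 (Return).
    print(ctrl_code("C"), ctrl_code("M"))
```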

Conclusion

Once I'd implemented some earlier, critical definitions it turned out to be quite easy and satisfying to write a full-screen editor. The biggest challenges were in making sure certain key presses wouldn't collide with any system behaviour and finally thinking about some user-interface decisions for some future enhancements.

The editor also nicely illustrates some key Forth aspirations: the editor turns out to be very compact (though of course it's very rudimentary too); and I was able to use it to debug and improve itself once I'd reached some critical level of functionality. It was so easy and tiny, I wonder why it wasn't a standard part of FIG-FORTH, given that it was designed in an era when cursor addressable VDUs were already the norm.

Sunday, 18 July 2021

Fig-Forth At PC=Forty (Part 3)

In part 1, I talked about how to get FIG-Forth for the IBM PC running on PCjs. FIG-Forth was a popular and very compact public-domain implementation of Forth, a medium-speed systems programming language and environment, during the early 1980s. Then part 2 covered how to implement a very rudimentary disk-based editor as a precursor to an interactive full-screen editor.

It would be possible to implement a full-screen editor entirely using the existing word set, if it provided definitions that could control the position of the cursor on the screen and the ability to clear the screen.

Unfortunately, it's not possible to do that, either by sending display control codes via EMIT or via any other special commands. EMIT does support some control codes: carriage return is 13 EMIT, backspace is 8 EMIT, cursor right is 9 EMIT and cursor down is 10 EMIT.

The rest just produce graphics characters.

Let's Do Some Machine Code!

It's inevitable I'd have to get onto some machine code at some point, and it turns out, pretty early on. That means I need some useful Forth and 8086 resources.

Firstly, there's an indispensable guide to the FIG-Forth core: The Systems Guide To FIG-Forth. In it, it says you can write machine code definitions using the ;CODE command. The idea is that you'd write something of the form:

: myMachineCodeDef ;CODE opCode0 C, opCode1 C, etc... ;

But that doesn't work as I imagined. Instead I found you need to do:

CREATE myMachineCodeDef opCode0 C, opCode1 C, etc... SMUDGE

Here, CREATE generates a CFA which points to the parameter field (by default), and because FIG-FORTH is an Indirect Threaded Forth, that's the machine code that gets executed. It's not quite the only way of doing it. The Jupiter Ace's method for executing machine code is to define a CODE word which jumps to machine code in the parameter field:

DEFINER CODE DOES> CALL ;
CODE Noop 253 C, 233 C,

And Direct threaded Forths merely need to build a header without a CFA, because in these cases, the CFA is machine code itself.

Probably the most compact resource for translating 8086 instructions is the 8086 datasheet itself. I obtained a copy from Carnegie Mellon University (which incidentally did some pioneering work in parallel processors in the 1970s).

The most critical action a machine code definition must perform is to jump to the next command. My solution is to use the NOOP word whose behaviour does nothing but jump to the next word to execute. We find the CFA of NOOP and take the contents to find the first executable 8086 instruction:

' NOOP CFA @

NOOP's code is just a short jump: a single-byte opcode followed by an 8-bit displacement. Because the displacement is relative, we need to add it to the address following the jump, and because it is a signed 8-bit integer, we need to sign-extend it first to find the true address of NEXT. Finally, we'll need the jump instruction that can handle a 16-bit displacement, which is code 233 (E9 hex). This gives us the following new definitions:

: SXT DUP 127 > IF 256 - THEN ;

: NEXT [ ' NOOP CFA @ DUP 1+ C@ SXT ( 2+ ) + ] LITERAL 233 C, HERE - , ;
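The address arithmetic is easy to get wrong by two, so here is a Python sketch of it. The addresses are invented for the example and the function names are mine. It also shows why the 2+ can stay commented out: the two bytes skipped when finding the short jump's target cancel against the two displacement bytes that follow HERE when the near jump is compiled.

```python
def sxt(byte):
    """Sign-extend an 8-bit value, as SXT does."""
    return byte - 256 if byte > 127 else byte


def short_jump_target(jmp_addr, disp8):
    """Address reached by a 2-byte short jump located at jmp_addr."""
    return jmp_addr + 2 + sxt(disp8)


def near_jump_disp(opcode_addr, target):
    """16-bit displacement for a 3-byte near jump (opcode 233) at opcode_addr."""
    return (target - (opcode_addr + 3)) & 0xFFFF


def forth_next_disp(jmp_addr, disp8, opcode_addr):
    """The same value computed the way the NEXT definition does it."""
    literal = jmp_addr + sxt(disp8)   # the commented-out 2+ is not applied
    here = opcode_addr + 1            # HERE after 233 C,
    return (literal - here) & 0xFFFF


if __name__ == "__main__":
    target = short_jump_target(0x1000, 0xF0)   # made-up address and displacement
    print(hex(target), hex(near_jump_disp(0x2000, target)))
    print(hex(forth_next_disp(0x1000, 0xF0, 0x2000)))
```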

A simple, obvious pair of machine code definitions to add to FIG-FORTH is the shift operations: shifts are really common in systems languages, but strangely absent from FIG-FORTH.

To write a workable machine code definition we also need to know what 8086 registers must be preserved in Forth and which can be overwritten. The Forth.ASM assembler code from the original FIGFORTH.ZIP file tells us that SI=IP, SP points to the parameter stack, BP points to the return stack; AX must be preserved and CS, DS, SS all point to the same segment for the Forth executable. However, DX, BX, CX, DI, ES can all be freely modified. So, the shift operations will involve popping the count from the top of the stack into CX (which can be trashed); then the value into BX (which can be trashed); shifting BX by CL and then pushing the result. This gives us:

CREATE << HEX 59 C, ( pop cx) 5B C, ( pop bx) 0D3 C, 0E3 C, ( shl bx,cl) 53 C, ( push bx)

NEXT DECIMAL SMUDGE

CREATE >> HEX 59 C, ( pop cx) 5B C, ( pop bx) 0D3 C, 0EB C, ( shr bx,cl) 53 C, ( push bx)

NEXT DECIMAL SMUDGE

This means we can now e.g. multiply or divide by a power of 2 over 100 cycles (20µs) faster than before :-) .
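As a quick sanity check of what the two shift words compute, here is a 16-bit model in Python. It assumes shift counts of 0 to 15, and it treats the right shift as logical (zeros shifted in at the top), which is what SHR does; that means it only divides correctly for unsigned values.

```python
MASK = 0xFFFF  # Forth cells are 16 bits wide here


def shl16(value, count):
    """Model of the left shift: shift left and discard bits above bit 15."""
    return (value << count) & MASK


def shr16(value, count):
    """Model of the right shift: logical shift, zeros come in at the top."""
    return (value & MASK) >> count


if __name__ == "__main__":
    print(shl16(3, 4), shr16(48, 4))    # multiply and divide by 16
    print(hex(shl16(0x8001, 1)))        # the top bit falls off the end
    print(hex(shr16(-16, 2)))           # -16 is FFF0 hex; the sign is not kept
```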

BIOS Functions

Let's go back to cursor control now. The easiest way to do that is via the BIOS functions on an IBM PC. It turns out all the screen control functions are INT 10H BIOS functions, so by creating a generic INT10H BIOS definition, we can then simply supply all the parameters to it in a higher level Forth definition. This function will be simple and only involves popping the registers DX through to AX; then calling INT10H. It isn't documented, but INT10H can foul up BP.

CREATE INT10H ( AX BX CX DX --)

HEX

05A C, ( POP DX )

059 C, ( POP CX )

05B C, ( POP BX)

8B C, 0F8 C, ( MOV DI,AX: opcode 8B = MOV reg,r/m; mod=11 reg=111=DI r/m=000=AX )

058 C, ( POP AX )

057 C, ( PUSH DI)

1E C, ( PUSH DS)

55 C, ( PUSH BP [101])

0CD C, 10 C, ( INT10H)

5D C, ( POP BP)

1F C, ( POP DS)

058 C, ( POP AX)

NEXT DECIMAL SMUDGE

The only real complexity is that we need to load AX, but we also need to save AX too. It doesn't matter if DI gets trashed as Forth doesn't use it.

There are now quite a number of fun things we can add that use INT10H:

: AT ( R C ) SWAP 8 << + >R ( DX) 512 0 0 R> INT10H ; ( jupiter ace command for gotoxy; AX=512 BX=0 CX=0 DX=row*256+col)

: VMODE ( n -- ) 0 0 0 INT10H ; ( 0= 40 column 2= 80 column 4=cga)

: CLS 1536 3840 0 6223 INT10H 0 0 AT ; ( AH=6 clear window; BH=0F hex attribute; DX=184F hex, row 24 col 79)

So, we can do 0 VMODE then 1536 HEX 1E00 0 1827 DECIMAL INT10H to put the screen into a 40 column mode with yellow text on a blue background (attribute 1E hex: background 1 = blue, foreground E = yellow).
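The magic numbers are just pairs of 8-bit register halves packed into 16-bit values. The Python sketch below shows the packing for the two INT 10H calls used here: function 2 (set cursor: page in BH, row in DH, column in DL) and function 6 (scroll up, where AL=0 clears the window, BH is the fill attribute, CX is the top-left corner and DX the bottom-right). The helper names are mine, and nothing here calls the BIOS.

```python
def pack(high, low):
    """Combine two 8-bit halves (e.g. AH and AL) into one 16-bit register value."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


def set_cursor_regs(row, col):
    """Register values for INT 10H function 2: set cursor position on page 0."""
    return {"AX": pack(2, 0), "BX": 0, "CX": 0, "DX": pack(row, col)}


def clear_window_regs(attr, rows, cols):
    """Register values for INT 10H function 6 with AL=0: clear the whole screen."""
    return {"AX": pack(6, 0), "BX": pack(attr, 0), "CX": 0,
            "DX": pack(rows - 1, cols - 1)}


if __name__ == "__main__":
    print(set_cursor_regs(2, 4))               # AX=512, DX=row*256+col
    print(clear_window_regs(0x1E, 25, 40))     # matches 1536, 1E00, 0, 1827 hex
```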

And with these commands, we can now write a full-screen editor!

Wednesday, 7 July 2021

Fig-Forth At PC=Forty (Part 2)

In part 1, I talked about how to get FIG-Forth for the IBM PC running on PCjs. FIG-Forth was a popular public-domain version of the Forth systems programming language and environment during the early 1980s. It offered a high degree of control, incredible compactness, and performance much better than the ubiquitous BASIC; although quite a bit slower than assembler, it was comparable with the high-level language compilers of the day.

I Need An Editor

I found it was possible to copy and paste text from an editor into PCjs (actually, I simply copied it from the blog post as I was writing it), but it's quite an awkward way to program in Forth. I really want to be able to write code and store it on the emulated PC's disk.

And that's a problem for two reasons. Firstly, the FIG-Forth implementation I have is fairly minimalistic, with no text editor and just the raw disk block operations.

FIG-Forth is weird in that sense, because it was designed with a view to be the OS, language and editor. It doesn't really have any concept of a file system, just raw, absolutely addressed 0.5kB (or however big the disk sector is) disk blocks that can be read (into an in-memory cache) and written. Yet the executable is an MS-DOS program which is dependent on a file system, that absolutely can't be messed up by FORTH itself.

I Once Had a PC FIG-Forth

Some slightly later FORTHs fixed this by allowing users to create files and then access blocks within those files. I picked up one of those during the public-domain disk mail-ordering era of the later 1980s. It came on a single 360kB MS-DOS disk (maybe 2) and was actually very complete, with a substantial editor; maths libraries; libraries for handling more than 64kB; a string library and possibly hooks into MS-DOS. I think maybe it even had a full-screen editor.

In FIG-Forth a screen always refers to 1kB of editable text, made from consecutive blocks, which is roughly the size of typical microcomputer screens (64x16 or e.g. 40x25). This Forth's screen editor was an overtyping editor, which meant that typing didn't insert characters, but simply replaced whatever was at the cursor position. However, I think you could copy lines around the screen, which helped. Because the screen editor worked on a fixed character grid, it would have been extremely wasteful to type code in with the kind of indentation we would use today; instead, definitions would spread out as much as possible.

I ordered that PC-based FIG-Forth while at University on a whim, because I liked Forth; having learned it on a Jupiter Ace and played with a version or two on a ZX Spectrum (Abersoft Forth). However, at the time, I was at the University Of East Anglia where the computer science course wasn't PC based. Instead we did all our course material on a DEC Vax (or Micro Vax I) or on early Macintosh computers (512kB, Mac Plus and later Mac II); or on Sun Workstations. Literally, nobody was interested in PCs even though the rest of the world had largely switched to them. And why would we? We already had access to a variety of graphical environments and PCs just felt like a step into the past.

But FIG-Forth for the PC did pique my interest which is why I tried it out.

The Solution

So, my solution is fairly simple. To avoid this new FIG-Forth on PCjs from overwriting MS-DOS files I'll simply swap the disk to a different one once I've run FORTH.EXE. It doesn't matter what I use for the image, I can just clone the existing disk, because Forth just cares about the sectors and I can just overwrite what I want.

There are standard line-oriented editors for Forth, but I'm not really interested in them (they're a pain); I think a screen-oriented editor would be OK. However, to bootstrap that I'll have to write a simple line editor on one screen, then write my full-screen editor on another screen, so that I can then load my full-screen editor without needing the line editor.

All my line editor will be able to do is copy the rest of the command line of text from the input terminal to a specified line of text in the currently edited block and update it. Super-simple! Because screens map to multiple blocks, we need to specify the screen we want to edit and that's held in the variable SCR, which gets updated when we type n LIST.

To modify line l of the current block I'll add a word called EDL which would be used as:

l EDL : BM1 ." S" 10000 0 DO LOOP ." E" ;

At the end, EDL updates the block to say it's been edited. I can choose other screens to edit using LIST as much as I want: Forth will cache them in its block buffers and write them back to disk as needed.

I'll also need to know how much I can write on the current line so text doesn't get truncated, so I'll add a command EDMAR which displays the margins. This means I need two commands as follows:

: EDMAR
SPACE
55 49 DO ." ....:...." I EMIT LOOP
." ...."
;

: EDL ( l --)
15 AND 8 /MOD SCR @ B/SCR * + BLOCK
SWAP C/L * + DUP C/L BL FILL ( clear line )
IN @ TIB @ + DUP 64 + SWAP DO ( dst I= maxTib tib )
I C@ -DUP IF
OVER C! 1+
ELSE
I TIB @ - IN ! LEAVE ( IN is an offset into TIB, so convert the address back)
THEN
LOOP
UPDATE DROP
;
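To show what the two words are meant to produce, here is a small Python model. It is a sketch only: `edmar` returns the ruler as a string rather than printing it, and `edl` works on a byte string standing in for the terminal input buffer, with `in_offset` standing in for IN. Exactly where IN points when EDL starts is not modelled; the example simply picks an offset.

```python
C_L = 64  # characters per line (C/L)


def edmar():
    """Ruler: a space, then '....:....' followed by a digit 1..6, then '....'."""
    return " " + "".join("....:...." + chr(i) for i in range(49, 55)) + "...."


def edl(tib, in_offset):
    """Copy text from tib[in_offset:] into a blank 64-byte line.

    Copying stops at a NUL byte or after 64 characters. Returns the line and
    the updated input offset (only moved when a NUL was found, as in EDL).
    """
    line = bytearray(b" " * C_L)          # C/L BL FILL
    dst = 0
    for i in range(in_offset, in_offset + C_L):
        ch = tib[i]
        if ch == 0:                       # end of the typed line
            in_offset = i
            break
        line[dst] = ch
        dst += 1
    return bytes(line), in_offset


if __name__ == "__main__":
    print(edmar())
    tib = b"3 EDL : T ;\x00" + bytes(80)  # padded so the scan stays in range
    line, new_in = edl(tib, 6)
    print(repr(line.rstrip()), new_in)
```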

So, a bit of an editing session might look like: [screenshot of an editing session]

Of course, I'll want to put these two commands as my first two definitions on my first editable block, though in reality I chose block 50. Does this code work? Yes, because I corrected the bugs before publishing it ;-) . When I've finished editing though I should type FLUSH to copy any remaining buffers back to disk and save the disk on my local computer so nothing gets lost. When I boot up Forth again, I'll mount that disk; then I'll type LOAD to compile the code back from source.

In the future I might modify FIG-FORTH to be standalone (or add commands so that I can use it with MS-DOS). If it's standalone, I'll allow 1kB at the beginning of the disk as a bootstrap, then 16kB for the Forth executable; so the first editable block will be number 17.

Conclusion

FIG-FORTH and most early Forth editors were crude line-oriented things which I hate, so I've no intention of just loading up those early Forth editors even though it might be relatively easy. Instead I've written a minimalistic editor which I'll use to bootstrap the better editor. That's also one of the good things about Forth: if you don't like what you've got, roll something you do.

FIG-FORTH for the PC (and early FORTHs generally) were also, in my opinion, very clumsy systems for handling files, with zero integration with the operating system, in this case MS-DOS 2.0. This version of FIG-FORTH is odd, because it runs under MS-DOS, but can't edit code on MS-DOS disks. So, I'll use an empty disk for this purpose.
