
Wednesday, 8 January 2025

Basic Blitz: A Surprisingly Addictive VIC-20 Remake

The game Blitz was written and self-published by Simon Taylor for the unexpanded VIC-20 in 1981, then later sold to Mastertronic.

https://www.eurogamer.net/lost-and-found-blitz

I always thought it looked like a game that must have been written in Basic, but I never got around to testing that until the beginning of 2025.

So, here's my version in all its glorious 64 lines of code!


Mine seems to be based on the later Mastertronic version, because my plane is just one graphic character instead of 2 or 3 and my buildings are multicoloured instead of just black. Multicoloured buildings add to the fun, given that most actual buildings are grey.

Also, mine doesn't speed up during each flight, but it does get faster with each level, while the number of buildings it generates also increases by 2. My current high score is 533. Game control is pretty simple: you just press 'v' to drop a bomb as the plane flies across the screen. Only one bomb can be dropped at a time.

Design

Enough of the gameplay, let's discuss the software design. The outline of the game is pretty simple:
  • Line 5 reserves memory for the graphics characters then calls a subroutine at line 9000 to generate them.
  • Line 7 defines a function to simplify random number generation.
  • Line 8 is a bit of debug, see later.
  • Line 9 resets the high score. So, this only happens once.
  • Line 10 starts a game with a width of 5 (so 5x2=10 buildings are generated) and a delay of 100 between each frame.
  • Lines 30 to 60 are the main loop of the game. It really is that tiny. The loop terminates when the plane lands or hits a building. Within the loop the plane is drawn by displaying it in its next position, then erasing the previous position, to avoid flicker.
  • Bomb handling is done in lines 45 to 50, but the explosion is handled in lines 200-300.
  • End of game is handled in lines 66 to 80 including displaying "Landed" or "Crashed", updating the high score and handling the user wanting to quit.
  • Line 99 resets the graphics characters back to the normal character set so that you can carry on editing it.
  • The subroutine at line 100 performs the equivalent of a PRINT AT.
  • The subroutine at lines 200 to 250 handles a bomb hitting a building (a random number of floors are destroyed).
  • The subroutine at lines 8000 to 8070 generates a new level based on W, the width of the cityscape.
  • The subroutine at lines 9000 to 9010 generates the graphics characters and sets the sound level to 5.
  • The data from lines 9012 to 9090 are the graphics characters themselves, in the sequence: 'blank', 'solid square', 3x building types, 2x roofs, plane, grass.
  • The subroutine from lines 9500 to 9520 waits for a key to be released, then pressed, returning the key in A$.

Graphics

Because VIC-20 graphics are weird, programmers end up with bespoke graphics routines, so it's always worth discussing them. Firstly, VIC-20 graphics are tile-based, somewhat like the Nintendo Entertainment System. Video memory contains character codes between 0 and 255, and each character code points to an 8x8 bit pattern at CharacterMemoryBaseAddress+(CharCode*8). Usefully, the base address for the character bit patterns (and the video base address too) can be set by poking 36869. That base address can be set to RAM (which gives the programmer 256 tiles to play with), ROM (which is the default and provides caps+graphics or a caps+lowercase+some graphics option) or can be made to straddle both (which gives the programmer up to 128 tiles to play with + an upper case character set). This works even though the user defined graphics (UDGs) have addresses below 8192 while the ROM tiles are above 32768, because of the way the VIC chip's 14-bit address space is mapped to the VIC-20's full 16-bit address space.


In practical, unexpanded VIC-20 applications, programmers will want to use as few UDGs as possible to maximise program space while retaining much of the conventional character patterns. In Basic Blitz we therefore set the graphics to straddle mode (value 0xf, giving a CharacterMemoryBaseAddress of 0x1c00) which means characters 0..127 are in RAM and 128..255 are in ROM.

Intuitively, you might imagine that you'd want to start using tile 0 first, but that would waste most of the tile space, so in fact we always count the UDGs we need backwards from tile 63, because tiles 64 to 127 overlap with video memory itself by default (and are therefore unusable!). Also, because the VIC-20 ROM characters aren't in ASCII order and, amazingly enough, don't include the filled-in graphics character, I have to provide that. When Basic Blitz is run, it first shows the entire usable character set.
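The address arithmetic behind straddle mode can be sketched in Python. This is an illustration, not code from Basic Blitz: the function names are invented here, and the VIC-to-CPU address mapping is my understanding of the hardware rather than something taken from the listing.

```python
# Sketch of VIC-20 tile addressing in "straddle" mode (36869 low nibble = 0xF).
# All names are illustrative; none come from the Basic Blitz listing.

CHAR_BASE = 0x1C00      # CharacterMemoryBaseAddress for value 0xF
SCREEN_BASE = 0x1E00    # default unexpanded screen memory (7680)

def vic_to_cpu(vic_addr):
    """Map a 14-bit VIC address to a CPU address (assumed mapping)."""
    vic_addr &= 0x3FFF
    if vic_addr & 0x2000:
        return vic_addr & 0x1FFF      # low RAM, 0x0000-0x1FFF
    return 0x8000 | vic_addr          # character ROM region, 0x8000-0x9FFF

def tile_address(char_code):
    """CPU address of the 8-byte pattern for a character code."""
    vic_base = 0x2000 | CHAR_BASE     # 0x1C00 as the VIC chip sees it
    return vic_to_cpu(vic_base + char_code * 8)

def usable_udg_codes(count):
    """Allocate `count` UDG codes backwards from tile 63."""
    return list(range(63, 63 - count, -1))
```

With these definitions tile 63 sits just below the screen, tile 64 lands exactly on SCREEN_BASE (hence the overlap), and tile 128 wraps round into ROM at 0x8000. Ten UDGs counted back from 63 end at code 54, which is the character '6'.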


I added this as a bit of debug, because I initially wasn't sure the ROM characters would print out OK. Also, I then made it print Hello in red to test both my PRINT AT subroutine and embedded colour control codes.

Graphics characters can easily be printed, because they're the normal characters '6', '7', '8', '9', ':', ';', '<', '=', '>', '?'. Normal text can be displayed, but you have to force 'inverse' characters, which is achieved by preceding each print statement with <ctrl>+9 and ending it with <ctrl>+0 to return to true characters.

Colours

Colours on a VIC-20 are strangely limited. There's a block of colour attribute memory, one location for each video byte, but each one is only 4 bits, which means you can only select an INK colour for on pixels. The PAPER colour is global, defined by bits 4..7 of 36879. The VIC-20 partially gets around this by normally making characters 128 to 255 inverse characters, but also by defining bit 3 of 36879 as normal or inverse mode.

The upshot though is that with the ROM character sets you can choose a common PAPER colour with any INK, or the common PAPER colour as INK, with any INK colour as PAPER. But when you select the character set to straddle RAM and ROM, you can only choose any INK colour + the common PAPER colour.

Hence in Basic Blitz, the background is white (as that seems most useful) and I have to define a UDG just so that I can get a filled in green character for grass with a building on top.

Sound

In Basic Blitz, sound is pretty simple. The initialisation routine switches audio on to level 5 (POKE 36878, 5) and leaves it there. There are 3 voice channels, which are individually switched on if bit 7 is set. In practical terms, each voice has a range of about 2 octaves, the first one having values from 128 to 65; then the next octave from 64 to 33. Beyond 32, the frequency ratio between each note is 1.03 to 1.06, close to that of a semitone (1.059), making most note intervals unusably out of tune.

The plane makes a drone sound using the lowest pitch audio channel (address 36874) OR'd with the bottom 4 bits of the jiffy clock at PEEK(162).

The bomb uses the high octave channel (at 36876), just generating an ascending tone. If the bomb hits a building it's silenced and the noise channel is switched on with a fixed low pitch of 129. The important thing, finally, is to turn off all the sounds when they're done, by poking the channels with 0.

Playing The Game

You can run this VIC-20 Javascript emulator and type in the code (if the keyboard mapping allows it):


I've found this emulator is better than the Dawson one for .prg files. Here's how to load the .prg on a desktop/laptop. First download the BasicBlitz.prg from my Google Drive. Then drag the file from wherever you downloaded it to onto the emulator in the browser. It will automatically load and run!

However, it's also useful to be able to type in code directly for editing, debugging and other stuff.

The keyboard on my MacBook M4 doesn't map correctly to VIC-20 keys, because the emulator does a straight translation from character codes to VIC-20 keys rather than from key codes. This means that pressing Shift+':' gives you ';' on this emulator rather than '[' as marked on a VIC-20 keyboard.

Mostly this makes typing easier, but the VIC-20 uses a number of embedded attribute key combinations. Basic Blitz doesn't use many, but typing the ones it does use isn't easy. Here's how:

In Chrome, you need to enable console mode by pressing function key F12, then clicking on the Console tab. In Safari, you need to choose Safari:Settings..., select the 'Advanced' tab and click on "Show features for web developers" at the bottom. The "Develop" menu then appears on the menu bar and you can choose Develop:Show JavaScript Console.

So far so good. Now, you can type most of the text as normal, but whenever you need to type a special code, type pasteChar(theCode) in the console followed by Enter (e.g. pasteChar(147) for the clear screen code). Here are the codes you'll need:
  • Inverse 'R' => 18. This is for Reverse text, which ends with inverse nearly underline => 146.
  • Inverse '£' (Red) => 28.
  • Inverse '┓' (Black) => 144.
  • Inverse 'S' => 19 (this is the home code).
  • Inverse heart => 147 (this is the clear screen code).
  • Inverse up-arrow => 30 (this is green).
  • Inverse 'Q' and inverse '|' can be typed directly, just using the down cursor and left cursor respectively.
  • The codes in line 8045 are more colour codes used for the buildings. They are 144 (Black), 28 (Red), 159 (Inverse filled diagonal=cyan), 156 (checkered-black character=purple), 30 (inverse up arrow = green), 31 (inverse left arrow=blue), 158 (inverse 'π' = yellow).

Conclusion

The original VIC-20 Blitz program, though derivative in its own way, is so simple it could have been written in BASIC, as this version proves. The arcane design of the VIC-20 hardware and its lousy BASIC implementation mean there's a lot of subtle complexity even in a simple game. Finally, although there are many emulators for the VIC-20, both the Javascript implementations I know of have limitations and bugs which make distributing this game and/or modifying it non-trivial.




Tuesday, 31 December 2024

Wobbly-Blue: An Optical Illusion on a ZX Spectrum And VIC-20

 Introduction

I recently came across an interesting optical illusion whereby a speckled-blue sphere on a random checkerboard pattern will appear to wobble relative to the checkerboard if you move your head. When viewed on a mobile device you can move the device itself, and the effect is even more pronounced.

I figured I could write a simple version of a program that generated it for the ZX Spectrum, and this is the result (you need to make the image occupy a fair amount of your field-of-view):

The program is fairly small:

The ZX Spectrum has character-level colour resolution, but because the blue sphere is surrounded by a black border, it doesn't cause any clashes. I originally wanted to produce a sphere where the blue coverage in the centre was obviously larger than at the edges, but it turned out that simply taking the sine of a random angle creates a distribution that looks spherical, because the rate of change is greatest near 0º so the dots are spread out more there and concentrated near the edges. If I let it run for about 2000 points it'd probably look more prominent.
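The claim about the sine distribution is easy to check numerically. The Python sketch below is not the Spectrum program; it just samples the sine of a uniformly random angle and counts how many samples fall into each of ten equal-width bands. The function names, the seed and the band count are my own choices for illustration.

```python
import math
import random

def sample_offsets(n, seed=1):
    """Return n values of sin(random angle), each in the range -1..1."""
    rng = random.Random(seed)
    return [math.sin(rng.uniform(0.0, 2.0 * math.pi)) for _ in range(n)]

def band_counts(values, bands=10):
    """Count how many values fall into equal-width bands across -1..1."""
    counts = [0] * bands
    for v in values:
        index = min(int((v + 1.0) / 2.0 * bands), bands - 1)
        counts[index] += 1
    return counts

# 2000 points, the figure mentioned above for a more prominent sphere.
counts = band_counts(sample_offsets(2000))
```

The two outermost bands end up holding several times as many points as the two central ones, which is the "sparse in the middle, dense at the edges" look described above.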

Unexpanded VIC-20 Version

Quite frequently I like to create Unexpanded VIC-20 conversions of ZX Spectrum programs, because they're fairly contemporary machines with some similar characteristics, but the VIC-20 is more challenging to program due to a lack of support in its version of BASIC.

Here's the VIC-20 version:

The VIC-20 version is full of POKEs to do what the ZX Spectrum version can do with PLOT, INK and PAPER. Also, the character set is squished and there are only 22 characters per line. But it's the techniques needed to perform hi-res graphics on a VIC-20 (particularly an unexpanded VIC-20) that are the real challenge.

Firstly, the VIC-20 can only really do hi-res graphics by modifying a character set of up to 256 characters. So, if you fill the screen with unique characters and update the pixels in each character then a full bitmap display is possible. However, on a standard VIC-20 screen there are 22 x 23 characters = 506 character positions which is far more than the number of characters in the character set! The VIC-20 'fixes' this by supporting double-height characters of 16-rows each, which means you only need 253 characters to fill the screen.

The second problem is that an unexpanded VIC-20 only has 3581 bytes free when you turn it on, and 253 double-height characters + the 506 screen bytes would need 4554 bytes, which is clearly more than what's available.

However, in this case, we don't need to fill the whole screen with bitmapped graphics, only the sphere in the centre! And in fact I would only need 172 single-height characters if I also reduced the screen size to 20x20 characters! 172 characters needs just 1882 bytes including the screen bytes. This leaves just over 2kB for my program!
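The byte counts quoted above can be reproduced with a few lines of Python. The variable names are mine. Note that the 1882 figure comes out when the full 506 screen bytes are counted, so that is what this sketch assumes.

```python
# Memory needed for bitmapped graphics on an unexpanded VIC-20 (sketch).
CHAR_BYTES = 8             # bytes per single-height character
DOUBLE_CHAR_BYTES = 16     # bytes per double-height character
SCREEN_CELLS = 22 * 23     # 506 character positions on the standard screen

# Full-screen bitmap: 253 double-height characters plus the screen bytes.
full_bitmap = 253 * DOUBLE_CHAR_BYTES + SCREEN_CELLS

# Sphere only: 172 single-height characters plus the screen bytes.
sphere_only = 172 * CHAR_BYTES + SCREEN_CELLS

print(full_bitmap, sphere_only)
```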

How is this done? Well, I could work out which characters in the centre will be filled with the sphere's pixels and print unique characters for them, but it's easier to use the kind of tile-allocation technique you might use for a video game. You use some characters as background (in this case we only need one: character 255, which is filled with 8x $ff's).

Then whenever you want to plot properly on the screen you find out which character is being used at the character location at (INT(x/8),INT(y/8)). If it's not 255, then you can then look up the character in the character set memory; select the right row (Y & 7) then set the right pixel (128>>(X & 7)). Otherwise you allocate the next character code (denoted by UG% in the listing) and then fill in the pixel as before. It doesn't matter if the characters on the screen aren't allocated in order, because one simply gets the correct bitmap address from the character code itself.

The resultant program is similar to the ZX Spectrum version, except that a couple of subroutines are added. Line 1 allocates space for the new character set by setting the end of BASIC (and string stack) to 6143. Then 6144 onwards can be used. Lines 500 to 540 create the initial graphics setup. The screen size is set to 20x20 instead of 22x23. PAPER is set to black with the screen in non-inverse mode; the new character set is filled with 0s; the screen is filled with character code 255 and character 255 is filled with 255s too (all done with one loop). Finally, UG% is set to 0, as that's the first character code we'll allocate.

Lines 1000 to 1040 are very similar to the ZX Spectrum version except it works by setting the INK colour of each character to white or black for each checkerboard location.

Lines 200 to 220 are the plot routine discussed above. It also has to POKE colour memory (from 38400 onwards) to make the INK colour at that location Blue (colour 6). Finally we can run the program:


Pixels on a ZX Spectrum are square, but on a VIC-20 they're squished, so the sphere is oblate. Total video memory is 2048 bytes, and allowing for 400 bytes of screen memory that leaves room for 206 bitmap characters. So, I could have increased the checkerboard resolution by allocating 16 more characters, taking the total up to 188.

What Causes The Effect?

I did a bit of searching for how the illusion works, but all I found were articles on how colour aberration can cause red or green colours to stand out in a sort-of 3D effect. But here's a simple theory. Human eyes are 10x less sensitive to blue than other primary colours and much more sensitive to luminance than colour. So, it's likely that the brain needs to do more processing for colour than for monochrome images; more processing for blue and more processing for sparse images (like this sphere).

That would make sense: the least sensitive blue cones might take longer to fire than the more sensitive rods (because it takes longer for enough energy to make them fire). There might be more neurone layers for making sense of a colour image; more neurone layers for making sense of a sphere than a set of blocks and finally more neurone layers for making sense of a sparse image than a solid one.

All the extra processing causes a lag in processing; which means that when you move your head (or move the screen); the checkerboard pattern moves quickly in your field of view, but your other neurones take time to reconstruct the sparse, blue, sphere.



Sunday, 8 May 2022

A Tale Of Two Banners: VIC-20

 I previously posted about a Banner program I wrote for the 40th anniversary of the ZX Spectrum. One of my friend's followers tweeted that he always thought my friend was a Commodore C64 owner, who could PEEK and POKE with the best of them.

This set me thinking - what would a Commodore C64 version be like? And then, because a C64 version would be too easy, what would an unexpanded VIC-20 version be like? For sure, it's more challenging than a ZX Spectrum version.

Here's a bunch of reasons why:

  • The VIC-20 has a smaller screen area, just 22 x 23 characters; so an 8x5 character banner can't be done with 8x8 pixel characters.
  • The VIC-20 has no graphics commands. It can redefine the character set and that's about it. It can't easily PRINT AT any location on the screen.
  • The VIC-20 supports an ink colour per character, but only a global paper colour. That's because it only has 4-bits per colour byte instead of 8-bits (which gives room for an ink + paper per character). Therefore, I can't use the same colour trick as the ZX Spectrum.
  • The VIC-20's INKEY$ function (GET stringVar$) doesn't return proper ASCII upper and lower case characters, but PETSCII codes (weird).
  • The VIC-20 fouls up the character set pointer when you press Shift+[C=].
Nevertheless, I was able to do it, and here I'll describe how:














Smaller Screen Area

The VIC-20 has a smaller screen area, and if I understand it correctly, the screen can't be more than 512 characters (though they can be double-height!). Normally the screen is 22x23 characters, which isn't enough to fit 8 characters across made up from Battenberg, 4x4 pixels each. You'd need 32 characters across for that. However, it's almost enough to support 8 characters across made up from 6x6 block graphic fonts from 4x4 pixel Battenberg graphics.

And... the VIC-20 screen size can be redefined. By making it 24x21 there's room for 8 characters across x 7 characters down, even more than the ZX Spectrum!

Of course, on a VIC-20 it has to be done using POKEs:

1000 A=7504:POKE 36864,10: POKE 36867,42: POKE 36866,152


So, address 36867 holds the number of rows * 2 (i.e. in bits 1..6) and address 36866 holds the number of columns in bits 0..6. The default values were 46 and 150 respectively, so I changed them to 42 (21*2) for 21 rows and 152 for 24 columns.
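As a cross-check on those POKE values, here is a small Python sketch that builds the two register values from a column and row count. The function names are invented for illustration. Keeping bit 7 of 36866 from its default value is my assumption, and the 512-character limit is the one mentioned earlier.

```python
def columns_register(columns, default=150):
    """Value for 36866: columns in bits 0..6, bit 7 kept from the default."""
    return (default & 0x80) | (columns & 0x7F)

def rows_register(rows):
    """Value for 36867: rows * 2, i.e. the row count in bits 1..6."""
    return (rows * 2) & 0x7E

def fits(columns, rows, limit=512):
    """True if the screen needs no more than `limit` character cells."""
    return columns * rows <= limit

print(columns_register(24), rows_register(21), fits(24, 21))
```

The same functions give back the defaults, 150 and 46, for the standard 22x23 screen.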

Where does all the information about the POKEs come from? Well the most concise information I've found is from here, an extensive resource on the VIC-20 memory map.

The values can be directly poked in, though I'd start with changing the dimension that gets smaller, so that the screen area is always <512b.

Finally, we need to adjust the left-hand side of the screen so that it's better centred. Address 36864 does that and changing it to 10 was found by experimentation.

There Are No Graphics Commands

However, the VIC-20 can display graphics characters, and there are Battenberg block graphics characters inherited from the Commodore PET. Strangely, and unlike the ZX81 or ZX Spectrum, they don't have a very logical order. Instead, in the sequence I'd use, the codes are:

9000 DATA 32, 124, 126, 226, 108, 225, 127, 251

9010 DATA 123, 255, 97, 236, 98, 254, 252, 160


Given that we have all the Block character graphics selected, all we need to do now is define the character set in terms of them. Unfortunately, that's not trivial either.  The first thing I did was to take a 6x6 bitmapped character set I'd used for a FIGnition example program:


 

















I have a Java program which opens the image as a .png and then copies the pixels to an array, where they can subsequently be transformed into a different image format.

I needed to be able to transform the character bitmaps so they could be represented in VIC-20 BASIC. I couldn't encode them as proper full bytes, because all 256 symbols can't be typed. I could have encoded them as 6-bit text, but again, the odd non-ASCII use of VIC-20 characters made that more complex. So, I simply encoded them as 4-bit text using the characters A..P and then indexing each character (-65) in an array of Battenberg graphic characters. This meant the 96 printable characters would take up 864 bytes in themselves + some overhead for the individual lines and BASIC commands, a good chunk of the unexpanded VIC-20's 3.5kB memory space! Encoding as 6 bits would have saved 33%, about 288 bytes.
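To make the 4-bit text encoding concrete, here is a Python sketch of the decoding step as described: each letter, minus 65, indexes a table of 16 block-graphic codes, and nine letters make one 3x3 glyph. The table is copied from the DATA lines at 9000 and 9010; the function name and the row-by-row layout are my assumptions, and the sample string is the second group of nine letters in line 9100 of the listing.

```python
# Block-graphic character codes, in the order given in DATA lines 9000/9010.
BLOCK_CODES = [32, 124, 126, 226, 108, 225, 127, 251,
               123, 255, 97, 236, 98, 254, 252, 160]

def decode_glyph(text):
    """Turn 9 letters into a 3x3 grid of block-graphic character codes."""
    if len(text) != 9:
        raise ValueError("a glyph is 9 letters (3 rows of 3 cells)")
    codes = [BLOCK_CODES[ord(ch) - 65] for ch in text]
    return [codes[0:3], codes[3:6], codes[6:9]]

FONT_BYTES = 96 * 9     # 96 printable characters, 9 letters each

print(decode_glyph("AKAACAACA"), FONT_BYTES)
```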

Unfortunately, it wasn't likely to be feasible to just store the whole font in strings, so I figured that I could store them in DATA statements and then do RESTORE line to point to the right data statement where the character I wanted was defined.

Unfortunately, the VIC-20 only supports RESTORE to the beginning of the program. So, instead - yet again (and this is a common theme) I had to use memory PEEKing. I placed the data statements at the end, and when I'd read all the other data in the setup, I stored the system variable for where the DATA statement pointer was, and then literally PEEKed the right memory location to get the bytes.
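Line 120 of the listing at the end of this post does the lookup with C0+(C AND 3)*9+INT(C/4)*44. A Python sketch of the same arithmetic is below. The constant names are mine, and reading 44 as "36 letters plus 8 bytes of per-line overhead" is my interpretation rather than something the listing states.

```python
GLYPH_BYTES = 9         # 3x3 letters per glyph
GLYPHS_PER_LINE = 4     # each font DATA line holds 36 letters
LINE_STRIDE = 44        # bytes from one DATA line's letters to the next

def glyph_address(c0, ascii_code):
    """Address of a glyph's first letter, given C0 and an ASCII code >= 32."""
    index = ascii_code - 32
    within_line = (index % GLYPHS_PER_LINE) * GLYPH_BYTES
    line_start = (index // GLYPHS_PER_LINE) * LINE_STRIDE
    return c0 + within_line + line_start

# Offsets relative to C0 for ' ', '!' and '$' (first glyph on the 2nd line).
print(glyph_address(0, 32), glyph_address(0, 33), glyph_address(0, 36))
```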

It's possible to do a PRINT AT on a VIC-20 by printing the home and cursor control characters. Home is an inverse S, which you can display by literally typing PRINT " and then the home key, because the VIC-20 re-interprets keystrokes within quotes and similarly, you can move the print position to different locations by typing PRINT " and then the cursor keys themselves, for the same reason. This means that the VIC-20's screen editor, which is usually easy to use turns into a pain within quotes, because moving the cursor starts overwriting the rest of the text, so you have to wrestle with it to get it back into proper cursor mode (typing " usually works).

And colours work the same way, you type PRINT " and then a colour key and it will change the INK colour.

So, you can assign these to strings and then print "[HomeKey]";LEFT$(CD$,Y);LEFT$(CR$,X); to get to the right location, but it's fairly slow compared with poking directly into screen memory at 7680+22*row+column and of course, the cursor key technique doesn't work when the screen dimensions have been changed!

So, POKEing the screen is the best solution and you have to poke the colour attribute byte too, because the VIC-20 for some reason doesn't fill it in when it displays spaces. Clear screen, for example (PRINT "[Shift+Home]"; ) doesn't fill the attribute bytes with the current ink colour; it just clears the text bytes.
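The two addresses that need poking for each cell can be worked out as follows. This Python sketch uses the 7680 screen base from above and the 30720 offset that the listing adds to reach colour memory; the function names and the width parameter are my own.

```python
SCREEN_BASE = 7680
COLOUR_OFFSET = 30720   # added to a screen address, as in POKE P+30720

def screen_address(row, column, width=22):
    """Screen memory address of a cell; width is the characters per line."""
    return SCREEN_BASE + width * row + column

def colour_address(row, column, width=22):
    """Colour attribute address for the same cell."""
    return screen_address(row, column, width) + COLOUR_OFFSET

# Last cell of the 24x21 banner screen.
print(screen_address(20, 23, width=24), colour_address(20, 23, width=24))
```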

This is why in the real code I have to clear them explicitly:

FOR F=7680 TO 8183:POKE F,42:NEXT F

And the reason why it's code 42 and not 32 will be explained next:

Producing The Diagonal Stripes

I was pleased with how I generated the diagonal stripes on the ZX Spectrum, as it's a challenge when only 2 colours are allowed per character, and, helpfully enough, the VIC-20 does have a diagonal character!

Yet, doing the same thing on a VIC-20 is several times harder, because only 1 unique foreground colour can be defined per character and clearly we need two. Yet, it is just about possible, but only just!

The solution is that the VIC-20 supports 2-bits per pixel colours on a character-by-character basis, by setting bit 3 of every colour attribute byte. Each bit pair then selects one of four possible colours:

00: Which is the paper colour, bits 7-4 of location 36879.
01: Which is the border colour, bits 2-0 of location 36879.
10: Which is the auxiliary colour, bits 7-4 of location 36878 (the bottom 4 bits are the sound volume level).
11: Which is the ink colour of the character.

This means that one diagonal half can have a choice of 3 possible colours, while the other diagonal half (ink) can have a choice of 7 possible colours. We need to handle 5 colours: the black background (paper), Red, Yellow, Green and Cyan.
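To see how the values poked in the listing (36878 set to 112, 36879 set to 11, and attribute bytes such as 10 and 13) break down into these colour sources, here is a Python sketch. The function names are invented, and the colour numbering (0 black, 2 red, 3 cyan, 5 green, 7 yellow) is the standard VIC-20 palette as I understand it.

```python
def decode_36879(value):
    """Split 36879 into paper (bits 7-4), mode bit 3 and border (bits 2-0)."""
    return {"paper": (value >> 4) & 0x0F,
            "mode_bit": (value >> 3) & 1,
            "border": value & 0x07}

def decode_36878(value):
    """Split 36878 into auxiliary colour (bits 7-4) and volume (bits 3-0)."""
    return {"auxiliary": (value >> 4) & 0x0F, "volume": value & 0x0F}

def decode_attribute(value):
    """Colour attribute: bit 3 selects multi-colour, bits 2-0 are the ink."""
    return bool(value & 8), value & 7

def pixel_source(bit_pair):
    """Which colour a 2-bit pixel value selects in multi-colour mode."""
    return ("paper", "border", "auxiliary", "ink")[bit_pair & 3]

print(decode_36879(11), decode_36878(112), decode_attribute(10))
```

With those values the paper is black (0), the border is cyan (3) and the auxiliary colour is yellow (7), leaving red and green to come from the per-character ink.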





Using pairs of pixels also forces us to pair up the rows in the UDGs giving us an effective resolution of 4x4 for each character. You can see that the stripes are more blocky than an ideal 8x8 diagonal would be.

It also means we can't use the standard VIC-20 diagonal graphics character, because we actually need 5 different types of diagonal characters with bit pair combinations of xx/11 and 11/xx. This means we have to allocate space for a character set and in turn that means we can't use the built-in block graphics characters, we have to define copies of those too. In total we need 16+5 characters (though in fact I used 16+6). In essence, then we need to first allocate space for the graphics characters:

5 POKE 52,29:POKE 51,80:POKE 56,29:POKE 55,80:PRINT CHR$(8);:CLR

Allocate the character set pointer to give us 64 graphics (thus the first character will be at code 64-6-16 = 42) and assign the Auxiliary and background colours.

1100 POKE 36878,112:POKE 36869,255:POKE 646,1:POKE 36879,11:P=7680
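The numbers in line 5 follow from the character count. This Python sketch recomputes them, assuming the character set base of 0x1c00 that goes with the low nibble of 36869 being 15; the variable names are mine.

```python
CHAR_BASE = 0x1C00                      # character set base address (7168)
FIRST_UDG = 64 - 6 - 16                 # first character code used: 42

udg_start = CHAR_BASE + FIRST_UDG * 8   # address of character 42's pattern
high, low = divmod(udg_start, 256)      # the two bytes poked in line 5

print(udg_start, high, low)
```

So POKE 52,29:POKE 51,80 (and the same for 56 and 55) puts the top of BASIC's memory at 7504, which is also the value A starts with in line 1000.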

Copy over the block graphics from ROM (we could calculate them, but this is easier).

1010 READ P:P=P*8+32768

1015 FOR F=0 TO 7:POKE F+A,PEEK(P+F):NEXT F

1020 A=A+8:IF A<7632 THEN 1010

...

9000 DATA 32, 124, 126, 226, 108, 225, 127, 251

9010 DATA 123, 255, 97, 236, 98, 254, 252, 160


Generate the stripes characters:

1030 READ N,M:FOR F=0 TO 6 STEP 2:POKE A+F,N:POKE A+F+1,N:N=(N*4+M)AND 255:NEXT F

1040 A=A+8:IF A<7680 THEN 1030

...

9020 DATA 2,2,168,0,86,2,169,1,254,2,171,3
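What lines 1030 and 9020 produce is easier to see when modelled in Python. The sketch below repeats the same arithmetic (each row value used twice, then multiplied by 4 with M added, modulo 256) and splits a row into its four 2-bit pixels. The function names are mine.

```python
def stripe_rows(n, m):
    """The 8 row bytes line 1030 pokes for one DATA pair (N, M)."""
    rows = []
    for _ in range(4):
        rows += [n, n]              # rows are paired up
        n = (n * 4 + m) & 255       # shift left by one pixel pair, add M
    return rows

def bit_pairs(row):
    """Split a row byte into four 2-bit pixel values, leftmost first."""
    return [(row >> shift) & 3 for shift in (6, 4, 2, 0)]

for row in stripe_rows(2, 2):
    print(bit_pairs(row))
```

For the first DATA pair (2,2) the pixel pairs go from 0,0,0,2 to 2,2,2,2 over the four row pairs, which is the blocky 4x4 diagonal.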


Clear the screen the hard way:

1105 FOR F=7680 TO 8183:POKE F,42:NEXT F:PRINT "[Home]";


Then read the character codes for the stripes and place them at the right locations.

1120 FOR X=0 TO 4:READ N,M:P=8176+X

1130 FOR F=0 TO 7-X:POKE P+30720,M:POKE P,N:P=P-23:NEXT F

1140 NEXT X

...

9030 DATA 58,10,63,10,62,13,61,13,60,8


Ascii Code Conversions & Stopping Case Swapping

You can swap between Capitals + Graphics and Capitals and Lower Case (+ a few graphics) on the VIC-20 using Shift+[C=]. However, this doesn't affect what character codes are read by GET x$. Normal lower-case characters return upper-case ASCII characters and holding down shift gives the same codes + 128.

Fortunately, that's just a simple case of mapping the characters:

111 K$=CHR$((ASC(K$)+(32 AND (K$>="A" AND K$<="Z")))AND 127)

Also, fixing the case swapping issue is fairly easy, it's done by printing a control character: PRINT CHR$(8) in line 5.
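The mapping in line 111 can be written out in Python to make its effect clearer. This is a sketch of the same arithmetic on character codes; the function name is mine. In BASIC a true comparison is -1, so 32 AND -1 is 32, and that is what the if statement stands in for.

```python
def petscii_key_to_ascii(code):
    """Mirror line 111: unshifted letters become lower case, then bit 7 is
    stripped so shifted letters (code + 128) become upper case."""
    if 65 <= code <= 90:        # K$ >= "A" AND K$ <= "Z"
        code += 32
    return code & 127

print(chr(petscii_key_to_ascii(65)), chr(petscii_key_to_ascii(65 + 128)))
```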

Conclusion

Early 80s computers had to be creative with graphics hardware, because the relatively high memory costs limited graphics detail, and lower memory bandwidth limited the range of colours. The ZX Spectrum and VIC-20, at first sight provided a very similar style of graphics, using 1 bit per pixel + an attribute byte for colour per character, but short-cuts in the colour memory (only 4-bits per character instead of 8) added even more limitations.

Consequently, porting a program from one architecture to another often involved a lot of additional work to map or work around the respective limitations. In the case of the VIC-20, a critical aspect of the Banner program (the diagonal red, yellow, green and cyan stripes against a black background) was only made possible by the VIC chip's ability to support 2 bit per pixel multi-colour graphics, plus the ability of one of those colours to be the ink colour at the character. An ordinary 2 bit per pixel graphics mode, such as that offered by the 6847 graphics chip, could not have reproduced the stripes, even though, at a minimum, 96x84 pixel graphics would need 2kB of RAM vs the 932 bytes of RAM actually used.

Finally, even the differences in the implementation of what was accepted as the standard microcomputer language, BASIC, could have serious ramifications, and often hacking directly into the OS or memory map was the only solution.

The Banner program is a great and simple way of exploring the architectural differences, and at the end of it, it's fun to type out colourful chunky characters across the whole screen!

The Listing

Finally, here's the listing! There's about 1kB free on the unexpanded VIC-20 once it's been typed in. In VICE it's possible to copy and paste a line at a time, but you need to convert the characters to lower-case first!

5 POKE 52,29:POKE 51,80:POKE 56,29:POKE 55,80:PRINT CHR$(8);:CLR

10 POKE 36869,240:GOSUB 1000

100 FOR X=0 TO 2:BG(X)=PEEK(P+48+X):POKE P+48+X,54:NEXT X

110 GET K$:IF K$="" THEN 110

111 K$=CHR$((ASC(K$)+(32 AND (K$>="A" AND K$<="Z")))AND 127)

112 F=ASC(K$):IF F<32 AND F<>13 THEN 110

113 FOR X=0 TO 2:POKE P+48+X,BG(X):NEXT X

116 IF ASC(K$)=13 THEN P=INT((P-7680)/24)*24+7752:GOTO 160

120 C=ASC(K$)-32:C=C0+(C AND 3)*9+INT(C/4)*44

130 I=INT(RND(0)*7)+1

140 FOR Y=0 TO 2: FOR X=0 TO 2:POKE P+X,PEEK(C+X)-23:POKE P+X+30720,I:NEXT X

150 P=P+24:C=C+3:NEXT Y:P=P-69

155 P=P-7680:P=(P-INT(P/24)*24)+INT((P+48)/72)*72+7680

160 IF P>=8184 THEN P=7680

170 GOTO 100

999 POKE 36869,240:POKE 36864,12:POKE 36866,150:POKE 36867,174:STOP

1000 A=7504:POKE 36864,10: POKE 36867,42: POKE 36866,152

1005 DIM BG(3)

1010 READ P:P=P*8+32768

1015 FOR F=0 TO 7:POKE F+A,PEEK(P+F):NEXT F

1020 A=A+8:IF A<7632 THEN 1010

1030 READ N,M:FOR F=0 TO 6 STEP 2:POKE A+F,N:POKE A+F+1,N:N=(N*4+M)AND 255:NEXT F

1040 A=A+8:IF A<7680 THEN 1030

1100 POKE 36878,112:POKE 36869,255:POKE 646,1:POKE 36879,11:P=7680

1105 FOR F=7680 TO 8183:POKE F,42:NEXT F:PRINT "[Home]";

1120 FOR X=0 TO 4:READ N,M:P=8176+X

1130 FOR F=0 TO 7-X:POKE P+30720,M:POKE P,N:P=P-23:NEXT F

1140 NEXT X

1150 C0=PEEK(65)+256*PEEK(66)+7:P=7680

1999 RETURN

9000 DATA 32, 124, 126, 226, 108, 225, 127, 251

9010 DATA 123, 255, 97, 236, 98, 254, 252, 160

9020 DATA 2,2,168,0,86,2,169,1,254,2,171,3

9030 DATA 58,10,63,10,62,13,61,13,60,8

9100 DATA"AAAAAAAAAAKAACAACAFFAAAAAAANNINNIBBA"

9110 DATA"AOIBOADKAPECEGICBCJIAJMCBCCAKAAAAAAA"

9120 DATA"AJAAKAABAAGAAFAACAIIIFPACCCAKADLCACA"

9130 DATA"AAAAEAACAAAADDCAAAAAAAAAACAAECECACAA"

9140 DATA"JHIOCKBDAEKAAKABDADDIJDADDCDDIBDIDDA"

9150 DATA"EHAONIABALDCDDIDDAEDCLDIBDADDKAJAACA"

9160 DATA"JDIJDIBDAJDIBHCBCAAIAAIAAAAAAAACAECA"

9170 DATA"AJABIAABAMMIMMIAAABIAAJABAAJDIADAACA"

9180 DATA"JLIKDCBDCJDILDKCACLDILDIDDAJDCKAABDC"

9190 DATA"LGAKECDCALDCLDADDCLDCLDACAAJDCKDKBDC"

9200 DATA"KAKLDKCACBLAAKABDAAHCAFABCAFECFGABAC"

9210 DATA"FAAFAABDCOEKKCKCACOAKKGKCACJDIKAKBDA"

9220 DATA"LDILDACAAJDIKGKBDCJDILLACBCJDABDIBDA"

9230 DATA"DLCAKAACAKAKKAKBDAKAKGECACAKAKKKKBBA"

9240 DATA"GECEGACACGECAKAACADHCECADDCALAAKAADA"

9250 DATA"GAAAGAAACAHAAFAADAEGAAAAAAAAAAAAAMMM"

9260 DATA"EDAHCADDCEMAKFABDAOIAKFADCAEIAKAABCA"

9270 DATA"ENAKFABDAEIALDABDAEDAFDABAAEMAGNAEJA"

9280 DATA"OIAKFACBAAIAAIAACAACAAKAECAKAALLACBA"

9290 DATA"KAAKAABCAEIAPFACBAMIAKFACBAEIAKFABCA"

9300 DATA"JGAOJACAAJGAGNAABCAMAFAABAAAMABIADAA"

9310 DATA"FIAFAAADAIEAKFABCAIEAOCACAAIEAPPABCA"

9320 DATA"IEAFKACBAIEAGNAEJAMMAECADDAAJABKAABA"

9330 DATA"AKAAKAACABIAALABAAJJAAAAAAAJHIKHKBDA"



Wednesday, 8 March 2017

uxForth: Unexpanded forth for a standard VIC-20. Part 3, the memory map

I'm the developer of the DIY 8-bit computer FIGnition, but it doesn't mean I'm not interested in other retro computers and the idea of developing a minimal Forth for the ancient, but cute Commodore VIC-20 is irresistible!

Part 1 talks about the appeal of the VIC-20 and what a squeeze it will be to fit Forth into its meagre RAM.

In Part 2 I discussed choices for the inner interpreter and found out that a token Forth model could be both compact and about as fast as DTC.

Now I'm going to allocate the various parts of the Forth system to VIC-20 memory to make the best of what's there. Some of it will be fairly conventional and some somewhat unorthodox.

(An Aside, The Slow uxForth Development Process)

From the presentation of the blog entries it looks like I'm working these things out as I'm going along. For example, it's worthwhile asking why it looks like I can leap to fairly concrete decisions about the inner interpreter or even that I think I'll be able to fit the entire system into the available space.

The simple answer is that I've already done much of the work to make this possible. I've already written the code that implements the primitives (in fact I've written, modified and rewritten it a few times as I've improved it). I've made use of the wonderful resources at 6502.org, particularly the idea of splitting the instruction pointer (called gIp in my implementation) into a page offset and using the Y register to hold the byte offset: it really does improve the performance of the core Next function.

Similarly, I've written the non-primitive code and accounted for the space. It's written in Forth with a home-brew meta-forth compiler written in 'C'. So, there will be a future blog on that too!

However, it's not a cheat as such. The code is not tested yet; nor even loaded into a real VIC-20 nor emulator (I don't have a real VIC-20 :-( ). I have real decisions to make as the blog continues, which means I can make real mistakes too and have to correct them. What I've done, really, is basically a feasibility study, so that you don't waste your time reading the blog. And of course, the whole of uxForth will be released publicly, on a GPL licence via my GitHub account.

Admittedly, it's being released slowly, a 2.75Kb program I hope to release over the course of 2017!

The Memory Map

Page 0

Page 0 is the gold dust of every 6502 system: versatile and in short supply. BASIC uses the first 0x90 bytes and the KERNAL uses the rest. We'll use all 0x90 bytes for the data stack and some key system variables:


Addr  Size  Name      Comment
$00   2     gIp       Instruction pointer, lower byte always 0.
$02   1     gTmpLo    Temporary byte
$03   1     gTmpHi    Temporary byte used for indirect access.
$04   2     gILimit   The limit for the innermost do.. loop. uxForth (and FIGnition Forth) differ from most Forths in that the innermost loop's values, the limit and the current value, are held in global locations. do causes the previous gILimit and gICount to be pushed to the stack; thus r is equivalent to j on other Forths.
$06   2     gICount   The current loop count for the innermost do.. loop.
$08   1     gUpState  The current compilation state.
$09   1     gUpBase   The current number base
$0a   2     gUpDp     The current dictionary pointer.
$0c   2     gUpLast   A pointer to the header of the most recent dictionary entry compiled
$0e   2     gUpTib    The pointer to the input buffer (I'm not sure if we need this)
$10   128   gDs       The data stack
$fb   2     gTmpPtr0  Spare pointer 0
$fd   2     gTmpPtr1  Spare pointer 1
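As a quick sanity check on the table above, here is a minimal Python sketch that transcribes the layout and confirms that no two entries overlap and that the first block exactly fills the 0x90 bytes taken over from BASIC. The helper names (`find_overlaps`, `bytes_used`) are made up for this illustration and are not part of uxForth.

```python
# Zero-page layout transcribed from the table: (address, size in bytes, name).
ZERO_PAGE = [
    (0x00, 2, "gIp"),
    (0x02, 1, "gTmpLo"),
    (0x03, 1, "gTmpHi"),
    (0x04, 2, "gILimit"),
    (0x06, 2, "gICount"),
    (0x08, 1, "gUpState"),
    (0x09, 1, "gUpBase"),
    (0x0A, 2, "gUpDp"),
    (0x0C, 2, "gUpLast"),
    (0x0E, 2, "gUpTib"),
    (0x10, 128, "gDs"),
    (0xFB, 2, "gTmpPtr0"),
    (0xFD, 2, "gTmpPtr1"),
]


def find_overlaps(layout):
    """Return (name, name) pairs for neighbouring entries whose bytes overlap."""
    ordered = sorted(layout)
    clashes = []
    for (addr0, size0, name0), (addr1, _size1, name1) in zip(ordered, ordered[1:]):
        if addr0 + size0 > addr1:
            clashes.append((name0, name1))
    return clashes


def bytes_used(layout, lo, hi):
    """Count the bytes that the layout occupies inside the range [lo, hi)."""
    return sum(
        min(addr + size, hi) - max(addr, lo)
        for addr, size, _name in layout
        if addr < hi and addr + size > lo
    )


print("overlaps:", find_overlaps(ZERO_PAGE))
print("bytes used in $00..$8F:", bytes_used(ZERO_PAGE, 0x00, 0x90))
print("data stack cells (16-bit):", 128 // 2)
```

The 128-byte data stack works out to 64 sixteen-bit cells, assuming each stack entry is a conventional 16-bit Forth cell.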

Page 1

Page 1 is the return stack as you might expect. Oddly enough, we only get 192b, because the KERNAL uses $100 to $13F.

Page 2

There are 89 bytes available here, because BASIC normally uses them and uxForth doesn't need BASIC. I plan to use them for the byte code vectors, which are:

#    Name       #    Name     #    Name  #    Name
$00  (nop)      $0b  (+loop)  $16  u/    $21  rp!
$01  ;s         $0c  0<       $17  @     $22  drop
$02  exec       $0d  0=       $18  c@    $23  dup
$03  (native)   $0e  +        $19  !     $24  over
$04  (lit8)     $0f  neg      $1a  c!    $25  swap
$05  (lit16)    $10  and      $1b  r>    $26  (vardoes)
$06  0          $11  or       $1c  >r    $27  (constdoes)
$07  (0branch)  $12  xor      $1d  r     $28  inkey
$08  (branch)   $13  >>       $1e  sp@   $29  emit
$09  (do)       $14  <<       $1f  sp!   $2a  at
$0a  (loop)     $15  *        $20  rp@   $2b

The codes that are greyed out have no names in the dictionary to save space; the way you'd insert them into code would be with [ nn c, ] sequences.

Page 3 and Page 4

There are a total of 116 bytes free from $2A0 to $313; I'll fill that area with some of the actual native definitions.

The cassette buffer is at $33c to $3fb. We'll be using the cassette for storage so we can't use it for code. 

Pages 16 to 31 ish ($1000 to $1dff)

This is the area of RAM reserved for BASIC. It will contain the rest of the Forth system.

The screen RAM ($1e00 to $1ff9)

The end of RAM for an unexpanded VIC-20 is used for the screen. The plan here is to use that area for the editing space. Instead of implementing a line editor (ACCEPT in FIG-forth and early FIGnition Forth), we use key to call the KERNAL editor and allow it to manage the editing of a line, including cursor movement. Pressing Return doesn't execute the command line; instead, pressing F1 exits the editor and sets the interpretation point to the current cursor position. The end of the interpretation point is set to the end of the screen and emit is turned off until interpretation gets to the end of the screen.

In addition, pressing F2 saves the screen bytes onto cassette.

This is how I'll implement storage in a fairly minimal way. By implementing save via F2 I can save a block (actually the 506 screen bytes are roughly half a traditional block), but LOAD is a normal word, so multiple blocks can be loaded (you just add load to the end of the block).

So, this is how you'd do normal editing operations. For normal words you would place the cursor near the end of the screen and edit to the end of the screen, then cursor back to the first character you want to interpret and press F1. In a sense this is easy, because you can just press Return and then cursor up until you get there. The same method would also work if you wanted to compile a whole screen's worth of code. Load itself would reset the cursor position to [home] and then return to the interpreter, so placing a load at the end of the screen would load the next screen without any recursion. That way you'd be able to develop programs that were longer than just one screen without manual reloading.

Conclusion

In the memory allocation of uxForth, we've squirrelled away about 1053 bytes of RAM, embedding the line buffer in the screen and a number of system variables in page 0. We've also included 212 bytes of what we'd use for the program proper. It won't get much better than this!

In the next post I hope to talk in more detail about the implementation of the primitive words and the code used to test them.

Sunday, 6 November 2016

uxForth: Unexpanded forth for a standard VIC-20. Part 2, the inner interpreter.

I'm the developer of the DIY 8-bit computer FIGnition, but it doesn't mean I'm not interested in other retro computers, and the idea of developing a minimal Forth for the ancient, but cute Commodore VIC-20 is irresistible!

In the first part I talked about the appeal of the VIC-20 and how much usable RAM I thought I could squeeze out of it.

That turned out to be between 3947 bytes and 4646 bytes depending on whether we count the screen and the CPU stack. And this sounded more credible, except that I want at least 1Kb of RAM for user programs which brings me back to 2923 to 3622 bytes. A terrible squeeze after all.

There's one obvious way to tackle that: use the Token Forth model. The definitive articles covering all the trade-offs in developing a Forth are in the series "Moving Forth" by Brad Rodriguez, but here we just need to recap the most popular Forth models.

Forth Execution Models

Forth normally represents its programs as lists of addresses which eventually point to machine code. The mechanism is handled by the inner Forth interpreter called "Next". The traditional Forth model implements what's called Indirect Threaded Code.



Here, each forth command (in blue) points to an indirect address (in green) which points to some machine code (in pink). Primitive commands in Forth (like DUP, >R, SWAP and R> here) have an indirect address which points to the next RAM location where the machine code starts. But commands written in Forth itself (like ROT) start with an indirect address which points to ENTER which implements a function call and is then followed by more Forth commands (in blue). A Forth command like this then ends in EXIT, which returns Forth execution to the next calling function (MYFUNC).

The next Forth model to consider is Direct Threaded Code. Here's the same thing:


Here, every forth command (in blue) points directly to machine code (in pink). Primitive commands are executed directly, but commands written in Forth itself (like ROT) start with a "JSR Enter" machine code instruction, which saves the return address (to Forth code) on the normal stack. In a DTC Forth, Enter pushes the old IP and then uses this return address as the new Forth Instruction Pointer. We can see that DTC will normally be faster than ITC because there's less indirection.

Token Threaded Forth is essentially a byte-coded Forth, except that in the case of commands written in Forth itself, the NEXT routine uses the top bit of the token to denote an address. Thus, only a maximum of 128 tokens can be supported and only 32Kb of Forth code.
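To make the idea concrete, here is a minimal Python sketch of a token-threaded inner interpreter. It follows the description above: a byte with the top bit clear is a token, and a byte with the top bit set starts a two-byte, 15-bit address of a definition written in Forth. The token numbers are borrowed from the uxForth byte code table; the big-endian address layout, the `run` function and the tiny example program are assumptions made for this illustration, and the real uxForth encoding may differ.

```python
# Token numbers taken from the uxForth byte code table; only a handful are modelled.
EXIT, LIT8, ADD, MUL, DUP = 0x01, 0x04, 0x0E, 0x15, 0x23


def run(mem, start):
    """Execute token-threaded code in mem from start and return the data stack."""
    ip = start
    ds = []  # data stack
    rs = []  # return stack of saved instruction pointers
    while True:
        byte = mem[ip]
        ip += 1
        if byte & 0x80:
            # Top bit set: a call to a colon definition (assumed big-endian, 15 bits).
            target = ((byte & 0x7F) << 8) | mem[ip]
            ip += 1
            rs.append(ip)
            ip = target
        elif byte == EXIT:
            if not rs:
                return ds  # returning from the outermost code ends the run
            ip = rs.pop()
        elif byte == LIT8:
            ds.append(mem[ip])  # inline 8-bit literal follows the token
            ip += 1
        elif byte == DUP:
            ds.append(ds[-1])
        elif byte == ADD:
            b, a = ds.pop(), ds.pop()
            ds.append((a + b) & 0xFFFF)
        elif byte == MUL:
            b, a = ds.pop(), ds.pop()
            ds.append((a * b) & 0xFFFF)
        else:
            raise ValueError(f"unknown token ${byte:02x}")


mem = bytearray(0x2000)

SQUARE = 0x1000  # : square dup * ;
mem[SQUARE:SQUARE + 3] = bytes([DUP, MUL, EXIT])

MAIN = 0x1010  # 7 square 1 +
mem[MAIN:MAIN + 8] = bytes(
    [LIT8, 7, 0x80 | (SQUARE >> 8), SQUARE & 0xFF, LIT8, 1, ADD, EXIT]
)

print(run(mem, MAIN))
```

The call to `square` costs two bytes ($90 $00 here) while every primitive costs one, which is where the space saving comes from when primitives dominate the code.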


In this example, we can see that the Forth code has been reduced from 14 bytes to 8b, but there is a jump table of addresses which is the same size as the indirect entries in ITC (10b used for these entries). DTC used an additional JSR (3 bytes) for the ':' defined word, but TTC didn't need any extra bytes for the ':' definition (it uses a single bit, encoded in the $93A0 address). Here, the overhead of ITC weighs in at 24 bytes, TTC weighs in at 18 bytes and DTC weighs in at 17 bytes.
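The totals in that paragraph can be reproduced with a few lines of Python. The split into "thread" bytes and "extra" bytes is my reading of the figures quoted above (14 or 8 bytes of compiled Forth, plus 10 bytes of indirect entries or jump table, or a 3-byte JSR), so treat it as a sketch of the arithmetic and not a measurement.

```python
# Bytes of compiled Forth code in the example, per model (figures quoted in the text).
THREAD_BYTES = {"ITC": 14, "DTC": 14, "TTC": 8}
# Extra bytes: indirect entries (ITC), one JSR Enter (DTC), jump table (TTC).
EXTRA_BYTES = {"ITC": 10, "DTC": 3, "TTC": 10}

overhead = {model: THREAD_BYTES[model] + EXTRA_BYTES[model] for model in THREAD_BYTES}

for model, total in overhead.items():
    print(f"{model}: {total} bytes")
```

The jump table is a fixed cost, so TTC only pulls ahead of DTC once enough token-coded definitions share it.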

We can see that TTC could significantly reduce the size of Forth code if the forth tokens are used often enough, but traditionally a byte-coded interpreter is slower than a threaded code interpreter. uxForth won't beat a DTC Forth, so the question is whether it can compete with an ITC Forth.

Execution Timings

ITC Forth:

NEXT      LDY #1
          LDA (IP),Y ;Fetch
          STA W+1    ;Indirect
          DEY        ;Addr
          LDA (IP),Y ;to
          STA W      ;W
          CLC        ;Inc
          LDA IP
          ADC #2
          STA IP     ;IP.lo
          BCC L54    ;If CY,
          INC IP+1   ;inc IP.hi
L54       JMP IndW
IndW:  .byte $6c ;JMP () opcode
W         .word 0

This is the implementation from the original 6502 FIG Forth. It uses zero-page for IP and W. The indirection is achieved by jumping to an indirect Jump.  It requires 41 cycles.
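The 41-cycle figure can be checked by adding up the standard 6502 timings for the common path, where the BCC is taken and INC IP+1 is skipped. This Python tally assumes IP and W are in zero page (as stated above) and that no indexed read or branch crosses a page boundary.

```python
# (instruction, cycles) along the common path through the FIG-Forth NEXT.
ITC_NEXT = [
    ("LDY #1", 2),
    ("LDA (IP),Y", 5),
    ("STA W+1", 3),
    ("DEY", 2),
    ("LDA (IP),Y", 5),
    ("STA W", 3),
    ("CLC", 2),
    ("LDA IP", 3),
    ("ADC #2", 2),
    ("STA IP", 3),
    ("BCC L54", 3),   # taken: no carry into IP.hi
    ("JMP IndW", 3),
    ("JMP (W)", 5),   # the indirect jump assembled by .byte $6c
]

itc_cycles = sum(cycles for _instr, cycles in ITC_NEXT)
print("ITC NEXT:", itc_cycles, "cycles")
```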

DTC Forth

NEXT      LDY #1
          LDA (IP),Y ;Fetch
          STA W+1    ;Indirect
          DEY        ;Addr
          LDA (IP),Y ;to
          STA W      ;W
          CLC        ;Inc
          LDA IP
          ADC #2
          STA IP     ;IP.lo
          BCC L54    ;If CY,
          INC IP+1   ;inc IP.hi
L54       JMP (W)
W         .word 0

This is a simple derivation from the original 6502 FIG Forth. As before it uses zero-page for IP and W. The indirection is achieved using a simple indirection.  It requires 36 cycles.

UxForth

Next:
        lda (gIp),Y     ;byte code.
        asl a           ;*2 for gVecs index
                        ;Also 'Enter' bit in Carry
        iny             ;inc gIp.lo
        beq Next10      ;page inc?
                        ;no page inc, fall-through.
Next5:
        bcc Enter       ;Handle Enter.
Next7:
        sta Next7+4     ;modify following Jmp.
        jmp (gVecs)     ;exec byte code.
Next10:
        inc gIp+1       ;inc page
        bcs Next7       ;now handle token/enter.
This is the proposed UxForth implementation. The UxForth version has to handle both multiplying the token by 2 to get the index for the jump table (gVecs) and testing to see if it's a call to another Forth routine (bcc Enter). It requires 22 cycles, so we can see that it's almost twice as fast as the ITC version. This is because it has one natural advantage and uses several techniques to improve the speed:

  • Y is used to hold the low byte of IP; thus when we execute lda (gIp),Y only the upper byte of gIp is used, because the lower byte is always 0.
  • Branches are arranged so that the common case is the fall-through case. Thus when IP increments over a page boundary two jumps are needed.
  • We normally only have to read one instruction byte instead of two. This is the one natural advantage TTC has over ITC or DTC.
  • The vector is stored directly in the code (the second byte of jmp (gVecs) ).
Next is also smaller: 18b vs 26b for ITC Forth and 25b for DTC Forth. It's possible to use most of these techniques to improve the speed of ITC and DTC 6502 Forth, but I'm not so concerned about that, because the easiest to access VIC-20 Forth is the Datatronic Forth (which is an 8Kb ROM cartridge) and Datatronic Forth uses exactly the same version of NEXT as FIG-Forth.
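The 22-cycle and 18-byte figures for the uxForth Next can be tallied the same way. This Python sketch uses standard 6502 instruction sizes and timings for the common path (no page increment, token dispatch), assuming gIp is in zero page and the routine itself sits outside zero page so that sta Next7+4 is an absolute store.

```python
# (instruction, size in bytes, cycles on the common path or None if not executed)
UX_NEXT = [
    ("lda (gIp),Y", 2, 5),
    ("asl a", 1, 2),
    ("iny", 1, 2),
    ("beq Next10", 2, 2),    # not taken: no page increment
    ("bcc Enter", 2, 2),     # not taken: it's a token
    ("sta Next7+4", 3, 4),   # self-modifying store into the jmp operand
    ("jmp (gVecs)", 3, 5),
    ("inc gIp+1", 2, None),  # only on a page crossing
    ("bcs Next7", 2, None),
]

ux_size = sum(size for _instr, size, _cycles in UX_NEXT)
ux_cycles = sum(cycles for _instr, _size, cycles in UX_NEXT if cycles is not None)

ITC_CYCLES = 41  # figure quoted earlier for the FIG-Forth NEXT
print(f"uxForth Next: {ux_size} bytes, {ux_cycles} cycles")
print(f"speed-up over ITC: {ITC_CYCLES / ux_cycles:.2f}x")
```

A ratio of about 1.86 is what "almost twice as fast" refers to; the page-crossing path is slower, but it only happens once every 256 bytes of Forth code.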


Conclusion

RAM is still very tight, but we can reduce its usage by implementing a byte-coded Forth and we should find it's perhaps up to twice as fast as a traditional FIG-FORTH implementation.

In the next post we'll look at how we might map our Forth implementation to the available RAM regions!

Thursday, 3 November 2016

uxForth: Unexpanded Forth For A Standard VIC-20

I'm the developer of the DIY 8-bit computer FIGnition, but it doesn't mean I'm not interested in other retro computers and as far as it goes, the Commodore VIC-20 is one of the cutest to come out of the 1980 stables.

The VIC-20 was cute, because it had a combination of fun and dumb features. Like: a full quality 65 key keyboard - and only two cursor keys!


Or the ability to support business applications with a floppy disk drive, but only having 22 column text. Or multi-colour graphics (and even a 2-bit per pixel mode that can co-exist with a 1-bit per pixel mode) with a near complete lack of support for bitmapped graphics. Or its 16Kb of ROM and only 5Kb of RAM (with just 3583 bytes free when Basic boots).

So, the fun challenge here is to see how much of a Forth I can squeeze into the Basic, unexpanded VIC-20 given the RAM limitations. I'm pretty confident I can do this, given that a super-tiny Forth subset has been crammed into just 1Kb of 8086 code (itsy Forth). I'm aiming for something that's kinda more usable.

Dude, Where's My RAM?

The first step (and this is the topic of this blog) is to find out how much RAM we can really use. A VIC-20 boots up and proudly displays: 

But it actually has 5Kb, so where has that other 1.5Kb gone? Armed with a detailed VIC-20 memory map we can see that areas of the first 1Kb have been nicked by Basic and the Kernal, which is a set of OS services abstracted from Basic and forms part of the ROM. For our purposes we don't want to use Basic, but we do want to use the Kernal, so we can read the keyboard, display to the screen and input/output between peripherals. For some of this 1Kb it's obvious which is used by Basic, but not all. So, here I decided to use the VIC-20 ROM disassembly. I first worked out that the Kernal starts at the address $E475, or thereabouts, by observing that the rest of that code doesn't reference Basic. Then I looked up all the system variables used by that section of code and found this set of addresses:

01,X 0100+1,X 0200,X 0259,Y 0263,Y 026D,Y 0277-1,X 0277
0277+1,X 0281 0282 0283 0284 0285 0286 0287
0288 0289 028A 028B 028C 028D 028E 028F
0290 0291 0292 0293 0293,Y 0294 0297 0298
0299 029A 029B 029C 029D 029E 029F 02A0
0300,X 0314 0314,Y 0315
(EABF) 0314 IRQ vector
(FED2) 0316 BRK vector
(FEAD) 0318 NMI vector
(F40A) 031A open a logical file
(F34A) 031C close a specified logical file
(F2C7) 031E open channel for input
(F309) 0320 open channel for output
(F3F3) 0322 close input and output channels
(F20E) 0324 input character from channel
(F27A) 0326 output character to channel
(F770) 0328 scan stop key
(F1F5) 032A get character from keyboard queue
(F3EF) 032C close all channels and files
(FED2) 032E user function
(F549) 0330 load
(F685) 0332 save

C533
C677

I also searched the Kernal code to find references to addresses within the Basic part of the ROM and found none, which meant that Basic sits properly on top of the Kernal. So, this tells us what areas of RAM we can use and it's as follows:

Address Range  Size  Owner    Available for Forth?
$0000..$008F   $090  BASIC    Yes, it's Page 0 how could we avoid it :-) ?
$0090..$00FA         KERNAL   No.
$00FB..$00FE   $004  Nothing  Yes, not sure yet what for.
$0100..$013E         KERNAL   Unlikely (tape error log/correction buffer)
$013F..$01FF   $0C1  CPU      Yes, for the return stack
$0200..$0259   $05A  BASIC    Yes, for Forth code
$025A..$029F         KERNAL   No.
$02A0..$02FF   $060  None     Free, more Forth code.
$0300..$0313   $013  BASIC    Yes.
$0314..$03FF         KERNAL   No.
$033C..$03FB         KERNAL   Cassette buffer, maybe, but limited usage.
$03FC..$03FF   $004  Free     Some Vars?
$1000..$1DFF   $E00  BASIC    Forth code
$1E00..$1FF9   $1FA  VIC      (Screen)
$1FFA..$1FFF   $006  Free     6b, more vars?

This gives us a total of $F6B (3947) bytes, or 4453 bytes if we can use the screen. Including the CPU stack, which of course we will, raises those figures to 4140 and 4646 bytes respectively.
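These totals can be reproduced from the Size column of the table with a short Python tally. The sizes are copied as printed, not recomputed from the address ranges, and the dictionary labels are just descriptive strings chosen for this sketch.

```python
# Regions marked as usable, with sizes copied from the table's Size column.
BASE_REGIONS = {
    "$0000..$008F page 0 (BASIC)": 0x090,
    "$00FB..$00FE spare": 0x004,
    "$0200..$0259 BASIC": 0x05A,
    "$02A0..$02FF free": 0x060,
    "$0300..$0313 BASIC": 0x013,
    "$03FC..$03FF free": 0x004,
    "$1000..$1DFF BASIC program area": 0xE00,
    "$1FFA..$1FFF free": 0x006,
}
CPU_STACK = 0x0C1  # $013F..$01FF
SCREEN = 0x1FA     # $1E00..$1FF9

base = sum(BASE_REGIONS.values())

print(f"base:                  {base} (${base:X})")
print(f"base + screen:         {base + SCREEN}")
print(f"base + stack:          {base + CPU_STACK}")
print(f"base + stack + screen: {base + CPU_STACK + SCREEN}")
```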

In the next part we'll make some basic decisions about the uxForth model, this will help us decide how to use all these areas.