Wednesday, 31 December 2025

Personal Computer World Sun SparcStation-1 Review

While doing some research on the Sun SparcStation 1, I found it difficult to find pages that gave prices for anything other than the basic diskless configuration. However, the review of the SparcStation in Personal Computer World (June 1989) does. It's a comprehensive review and worth reading!

This post is almost entirely just pictures!








The critical information I was looking for is near the end.

The entry-level price for a SPARCstation 1 with 8Mbytes of RAM, a single 1.44Mbyte floppy drive, a standard 1152x900 video board and a 17in grey-scale screen is £7400, while a system with two 104Mbyte hard disks and the same video board driving a 16in 256-colour screen costs £12,700. With two hard disks, a 19in colour screen and the GX graphics accelerator, the price goes up to £16,400.

Conclusion

The Sun SparcStation 1 was a ground-breaking RISC workstation with a performance and price that matched x86 (i.e. the recently released i486) PCs for at least the next 12 months. I'm writing a series of blog posts on the machine. However, data about the computer is gradually disappearing from the internet, and this search shows that actual hard-copy magazines of the era can provide better and faster answers than Google (or another search engine).












TI-30LCD Says Hello & Does Stats!

There was a wide variety of scientific calculators in our Maths class in 1981. In 2025 I bought a Casio FX-180P to compensate for my original Casio FX-180P that got broken! It's a great calculator with lots of features.

Although Casio calculators were the most popular at school at the time, other schoolmates had different makes. In my mind, I recall the TI-30LCD was one of the worst, but was it really that bad?

So, I bought one off eBay too! It turns out it's quite easy to make it say Hello!



Introduction

This was Texas Instruments' first attempt to update their much earlier TI-30 LED scientific calculator, which had been quite popular. They copied the late 70's brushed-aluminium look to make it seem more trendy and Japanese. The sideways view was pretty ugly though, because they tried to make it really thin, but then stuffed it up by adding a bulky 'AA' battery compartment instead of lithium coin cells!



 
Why? Cheap batteries? To make it incline to compensate for the poor LCD contrast?

It does have one nice touch, the [ON/C] button is slightly recessed: 1mm lower than the others and has a prominent surround to stop users accidentally turning it on when it's bouncing around in a haversack.

The buttons are pretty stiff though, requiring some force before clunking down. These ones may be a bit worn after 44 years, but it's much like I remember from briefly borrowing a school-mate's TI-30. The layout is quite clever though; adding keys on both sides of the numbers makes for a pleasing symmetry.

On the positive side it can do the basic scientific stuff: π, Trig, logs, factorial and BOOBS! It's only got 8-digits + a dedicated '-', so 99-BOOBS- is its limit! Also interestingly, it displays "Error" in text using the LCD segments whereas Casio calculators just said: "E"

It's slow and inaccurate. 69! takes 1.34s on my FX-180P, but 7s on the TI-30 😮 ! The sin⁻¹(cos⁻¹(tan⁻¹(tan(cos(sin(1)))))) test says 1.4756033, whereas my FX-180P says: 1.00020289 (it should be 1).
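For comparison, here's a minimal sketch of the same test in Python. I'm assuming the calculators were in degree mode (the usual way this test is run); the function name forensic is just my own label. With 64-bit floating point the answer comes out very close to 1, which shows how much the result depends on the number of internal digits.

```python
import math

def forensic(x_degrees: float) -> float:
    """Evaluate asin(acos(atan(tan(cos(sin(x)))))) with every angle in degrees."""
    y = math.sin(math.radians(x_degrees))
    y = math.cos(math.radians(y))
    y = math.tan(math.radians(y))
    y = math.degrees(math.atan(y))
    y = math.degrees(math.acos(y))   # acos near 1 is where lost digits get magnified
    y = math.degrees(math.asin(y))
    return y

result = forensic(1.0)
print(result)
```

The cos step produces a value only about 5 parts in 100 million below 1, so a calculator carrying 8 to 11 digits has very little left to work with by the time it reaches cos⁻¹.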

Finally, it lacks a specific stats mode which makes computing standard deviations a real pain and only has 4 levels of brackets (the Casios had 6). OTOH it can convert from degrees to radians if you press a number then [INV] [DRG>].

This calculator travelled 122 miles to get to me, but originally it was bought for a company, just 16 miles from where I live!

Stats

I played around with the calculator for a while and cleaned it up. Then I did a bit of thinking and realised that it is in fact possible to compute statistics fairly easily!

The standard Casio Calculator stats mode could compute all the functions you needed (like averages, or the population and sample standard deviations) from three variables:

  • n: the number of items entered.
  • ∑x: the sum of the items.
  • ∑x²: the sum of the squares of all the items.
In theory this means you need 3 memories. We can reduce it to 2 by simply counting n, the number of items. But the TI-30 only has one memory. Or does it?

The trick is to use the internal memory to compute the sums, and the calculator stack to compute the ∑x² terms. And this is possible, because as well as a [STO] and [RCL] button on the calculator, it also has a [SUM] button.

So, we can use [SUM] to store the ∑x terms as you enter each one and [x²] [+] to compute the running ∑x² total.

For example, let's say we have the data: {5, 4, 8, 3, 4, 5, 7, 4, 3}.

First, you'd press: [0] [STO] to clear memory and the display.

Then

Number   ∑x Term   Generate x²   (Display)   ∑     (Display)
5        [SUM]     [x²]          25          [+]   25
4        [SUM]     [x²]          16          [+]   41
8        [SUM]     [x²]          64          [+]   105
3        [SUM]     [x²]          9           [+]   114
4        [SUM]     [x²]          16          [+]   130
5        [SUM]     [x²]          25          [+]   155
7        [SUM]     [x²]          49          [+]   204
4        [SUM]     [x²]          16          [+]   220
3        [SUM]     [x²]          9           [+]   229

At this point the calculator shows 229 (∑x²). Pressing [RCL] gives you 43 (∑x) and manually you count n=9.

So, then:
  • Average=[RCL]/9=4.78;
  • Standard Deviation=√(∑x²/n-(∑x/n)²)=√(229/9-(43/9)²)=1.62.
It's better to clear the memory first and then use [SUM] all the time even though you could do 5 [STO] for the first one, because it allows you to repeat the pattern and avoid thinking.

I don't think this was actually taught as a technique at school. Instead they expected you to compute ∑x terms, then go back and compute the ∑x² in a new column, before summing each column and then calculating the variance (rather than the Standard Deviation). This technique avoids entering the numbers twice. It's still somewhat slower than using a Casio with its Sd mode, but at least it's quicker than a purely manual mode.
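The key sequence above can be sketched in Python to check the arithmetic. This is only a model of the technique, not of the calculator itself: memory stands in for the [STO]/[SUM] memory, running for the pending [+] chain, and the function name is my own. It uses the population standard deviation formula from the example.

```python
import math

def single_memory_stats(data):
    memory = 0.0    # [0] [STO]: the one memory, holding the running sum of x
    running = 0.0   # the [+] chain on the display, holding the running sum of x squared
    n = 0           # counted by hand on the real calculator
    for x in data:
        memory += x        # [SUM]
        running += x * x   # [x²] [+]
        n += 1
    mean = memory / n                      # [RCL] / n
    variance = running / n - mean ** 2     # sum(x²)/n - (sum(x)/n)²
    return n, memory, running, mean, math.sqrt(variance)

n, sum_x, sum_x2, mean, sd = single_memory_stats([5, 4, 8, 3, 4, 5, 7, 4, 3])
print(n, sum_x, sum_x2, round(mean, 2), round(sd, 2))
```

For the example data this gives n=9, ∑x=43 and ∑x²=229, with a mean of 4.78 and a standard deviation of 1.62, matching the figures worked out by hand.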

Correcting Errors

I've also worked out how to correct data entries too. Let's say you mis-entered 5307 as 5607 by typing:
5607 [SUM] [x²] [+]

On an FX-82 you'd just do 5607 [DEL] and it would correct it. But on a TI-30 there's no such button (just as there isn't an [x] button). But correcting data is almost as easy. You type:

[-] 5607 [+/-] [SUM] [x²] [+]

On a real TI-30 LCD you're much more likely to miss or double-type a digit due to the dodgy debounce (it shares LCD and keypad pins so it can't display and read keys at the same time), so correcting errors is important.

The TI-30 isn't RPN, so the [+] at the end of each line tells the calculator that the next operation will be another addition, and also displays the running total. The initial [-] overrides that pending [+], so that when [+] is hit at the end of the correction, it subtracts 5607². You can't just press [SUM] after entering the number to remove it from the ∑x term, because the number on the display is positive and would be added again. You need to hit [+/-] first to make it negative and then hit [SUM]. When you then hit [x²] it becomes positive again, so the earlier [-] subtracts it, as intended. For the same reason, it wouldn't work to leave out the initial [-] and rely on 5607 [+/-] alone: [x²] makes the value positive, so the final [+] would add 5607² a second time.
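Here's a small Python sketch of why the correction sequence works. The enter and remove functions are my own names for the two key sequences, and the state tuple (∑x, ∑x², n) is just a convenient stand-in for the memory, the display and the hand count.

```python
def enter(state, x):
    """Normal entry: x [SUM] [x²] [+]."""
    sum_x, sum_x2, n = state
    return sum_x + x, sum_x2 + x * x, n + 1

def remove(state, x):
    """Correction: [-] x [+/-] [SUM] [x²] [+]."""
    sum_x, sum_x2, n = state
    negated = -x                   # [+/-] makes the displayed value negative
    sum_x += negated               # [SUM] adds the negative value to memory
    sum_x2 -= negated * negated    # [x²] is positive again; the leading [-] subtracts it
    return sum_x, sum_x2, n - 1

state = (0, 0, 0)
state = enter(state, 5607)    # the mistyped value
state = remove(state, 5607)   # undo it
state = enter(state, 5307)    # the intended value
print(state)
```

After the correction the state is exactly what it would have been if 5307 had been entered correctly the first time.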

Conclusion

The TI-30 LCD was already a poor calculator by the standards of the early 1980s. It lacked functions other calculators had as standard and accuracy was poor. However, a critical statistical mode can be implemented fairly easily, and can roughly halve the number of keypresses to compute.




Monday, 22 December 2025

D'oh, D'oh! Host <-> Target Transfer On MAME's SparcStation 1 Emulator


I'm interested in simulating the real-time experience of a Sun SparcStation 1, because I'm currently interested in the RISC workstation era at the turn of the 1990s and I've never used one. Without an actual SS1, an emulator is the easiest choice.

There are two primary emulators available. QEMU emulates a SparcStation 5, but it runs as fast as possible. MAME emulates a SparcStation 1. It can be used to try out SunOS (e.g. version 4.1.2), but lots of things don't work, at least on the MAME Version 0.255 I'm exploring: e.g. ethernet, serial, colour framebuffers (and accelerated graphics), second drives, floppy disks and tape (I think, but I haven't tried that).

There's a fairly decent introduction to getting MAME's SS1 emulator running. The only major difference I'd make is to create an uncompressed hard disk image just using dd (which is the subject of this blog post). In my case, it's called SunHd320.hd not sunos412.chd.

dd if=/dev/zero of=SunHd320.hd  bs=16k count=19530

I can then run the emulator with this command line:

./mame sun4_60 -window -slot1 "bwtwo" "-scsibus:0" harddisk "-scsibus:1" "" "-scsibus:6" cdrom -hard SunHd320.hd -cdrom SunOS_412.iso

To make it somewhat usable I needed to get data in and out, and I faffed a lot with trying to create floppy disk images or second hard disks or serial transfer, but it was basically a waste of time. I could describe my multiple attempts to use these, but suffice to say, don't go down that rabbit hole until there's good evidence they work! Then, today it occurred to me I could just use dd, which of course a Unix workstation had in 1989 and which I also have on the Mac mini 2012 I'm emulating it on.

Background: Why The SparcStation 1?

The SparcStation 1 is a really interesting Sun Workstation, because it appeared at the peak of RISC's ascendancy (April 12, 1989), just 2 days after the launch of the Intel i486, which proved that CISC CPUs could achieve RISC levels of performance. The first Intel 486 computer didn't actually appear until September. I'm a fan of RISC. I loved PowerPC and I'm super-glad Apple have switched to ARM-based Macs. But the 1990s were an awkward reckoning for the methodology as vendors strove to stay ahead of Intel despite having far fewer engineers, only to find Intel copying their techniques and often doing better than comparable RISC computers.

So, skipping back to 1989. The SparcStation 1 was Sun's attempt to compete with PCs by designing a highly-integrated, low-end Sparc workstation. It ran at a blazing 20MHz, delivering about 13 Dhrystone MIPS (and a SpecInt89 of about 13-14 too). This was over twice as fast as the fastest Intel i386DX at the time and about 30% faster than the just announced Intel i486. So, it was impressive. It had 64kB of external cache (no internal cache) and used the bizarre Sun MMU ported from their earlier 68020-based workstations. This MMU doesn't use a Translation Look-aside Buffer, but dedicated SRAM chips to maintain a large, 2-level cache of page translations.

Finally, it has a Sparc V7 CPU, the epitome of a classic RISC design. For example, it has no multiply or divide instructions (just a multiply step instruction: 32 cycles for a 32x32=>64 bit result, but other multiply sizes can be done quicker).
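To show why a step-based multiply needs 32 steps for a full 32x32=>64 bit result, here's a generic shift-and-add sketch in Python. It deals with one multiplier bit per step, which is the general idea behind a multiply step instruction, but it's only an illustration: it doesn't model the actual SPARC instruction's registers or condition codes, and the function name is my own.

```python
MASK32 = 0xFFFFFFFF

def multiply_by_steps(a: int, b: int) -> int:
    """Unsigned 32x32 -> 64-bit multiply using 32 one-bit steps."""
    a &= MASK32
    low = b & MASK32   # starts as the multiplier, ends as the low half of the product
    high = 0           # accumulates the high half of the product
    for _ in range(32):
        if low & 1:
            high += a  # add the multiplicand when the current multiplier bit is set
        # shift the combined (high:low) pair right by one bit
        low = (low >> 1) | ((high & 1) << 31)
        high >>= 1
    return (high << 32) | low

print(multiply_by_steps(123456789, 987654321) == 123456789 * 987654321)
```

A multiplier known to fit in fewer bits needs fewer steps, which is the reason smaller multiplies can be done quicker.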

Setting Up Data Transfer

The basic principle is that the disk image (.hd) is divided into 8 partitions in SunOS and only 5 of them are used, so assigning some space to a spare partition and then accessing it using dd from either side is relatively easy.

There are several stages to making it work though! I'm using a 320MB HD image (a Sun type 5 drive). This would have been very big for 1989. Standard drives from Sun were 80MB or 104MB. You can install the OS from a CD image as described in the earlier link. It's a horrible textual, menu-driven thing which is easy to get wrong. The critical thing though is that you can install the OS and then later modify the partitions. By convention, disk3 is used for the main HD; partition a is root '/'; partition b is swap (not mounted as such); c is a non-mounted partition that covers the whole drive; g is /usr and h is /home.

For this experiment I didn't quite do an 'easy' setup. Initially one needs to >b cdrom at the bootloader prompt. Then when it says:

Enter 2. It puts you in a # root type prompt, but there isn't a user as such and you can't logout. Enter format and it will take you through the standard setup. You can type ?<cr> at any time to see all the menu options. I typed disk<cr> 0<cr> type<cr> 5<cr> partition<cr> . I knew from an earlier, 207MB HD setup what sizes I needed, and wanted the rest of the space for the /home directory, so I basically duplicated that:

Partition            Cylinder (prompted)              Blocks
(type letter<cr>)    (672 x 512b blocks/cylinder)     (prompted)
a                    0                                16800
b                    25                               102144
c                    0                                623616
g                    177                              201600
h                    477                              303072

You should check they all add up and no partitions overlap (apart from partition c, which covers the whole disk). Then you need to type label<cr> y<cr> quit<cr>; I then typed label<cr> y<cr> quit<cr> again just to make sure.

Back at the # prompt, type reboot cdrom<cr>. On this run through, type 1<cr> and it'll install miniroot; then type 2<cr>, because you've already 'formatted' the disk, and 1<cr> to reboot. Note that vmunix is now a bit bigger, at 802kB instead of about 737kB. If the SS1 doesn't automatically reboot, type >b disk3 at the prompt and it should boot.

At the # prompt type suninstall<cr> and you'll be in the installer proper ("Welcome to SunInstall"). Type 1<cr> then n<cr>. Although this means the partitions will be overwritten, the sizes of partitions a, b, c, g and h will be preserved. I then chose the Programmer option by following the on-screen instructions, then finally y<cr> to start. Note also that the installer runs newfs for each of these partitions, which means it isn't changing the partition sizes.

After the basic install, it will complete the installation by asking you to provide extra information. I pressed 2<cr> <RETURN>, set the hostname to SS1mini and chose GMT for the time zone; then 2<cr> followed by 22/12/2025<cr> for the date (UK format) and 11:34:30<cr> for the time; 1<cr> to accept that; 2<cr> (not on a network) and y<cr> to accept. Finally I chose a password and set up a user account with a full name of Julian Skidmore, a user name of js, 100 as the user ID (because 0..99 are supposedly reserved) and the same password as root. It then finally completed the installation and dropped me into a login prompt.

I logged in as root ; then typed reboot cdrom to reboot into the cd and typed 2<cr> to use the single user shell; ran format again.

The trick is to steal space from another partition. Because a, g and h all hold proper, mounted filesystems, the obvious place to steal from is the swap partition, b. We have about 49MB allocated to that, for a 16MB SS1, so stealing 5 cylinders (3360 blocks, roughly 1.7MB) will be OK. Type 0<cr> then partition<cr> then make these changes:


Partition            Cylinder (prompted)              Blocks
(type letter<cr>)    (672 x 512b blocks/cylinder)     (prompted)
b                    25                               98784
d                    172                              3360

Then you need to check all the partitions add up; write the label (label<cr> y<cr>), then quit<cr> label<cr> quit<cr>. Finally you can reboot back into the emulator: reboot disk3<cr>.
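The "check all the partitions add up" step is easy to get wrong by hand, so here's a minimal Python sketch of it using the final layout from the tables above (with d carved out of b). The check_layout function and its messages are my own invention; format itself doesn't provide anything like this as far as I know.

```python
BLOCKS_PER_CYLINDER = 672   # 512-byte blocks per cylinder, as reported by format

# name: (start cylinder, length in blocks); c is left out because it spans the whole disk
partitions = {
    "a": (0, 16800),
    "b": (25, 98784),
    "d": (172, 3360),
    "g": (177, 201600),
    "h": (477, 303072),
}
WHOLE_DISK_BLOCKS = 623616  # the size of partition c

def check_layout(parts, total_blocks, blocks_per_cyl=BLOCKS_PER_CYLINDER):
    """Return a list of problems; an empty list means no gaps and no overlaps."""
    problems = []
    next_free = 0  # first block not yet claimed by a partition
    for name, (start_cyl, length) in sorted(parts.items(), key=lambda kv: kv[1][0]):
        start = start_cyl * blocks_per_cyl
        if length % blocks_per_cyl:
            problems.append(f"{name}: length is not a whole number of cylinders")
        if start < next_free:
            problems.append(f"{name}: overlaps the previous partition")
        elif start > next_free:
            problems.append(f"{name}: gap of {start - next_free} blocks before it")
        next_free = start + length
    if next_free != total_blocks:
        problems.append(f"layout ends at block {next_free}, disk has {total_blocks}")
    return problems

print(check_layout(partitions, WHOLE_DISK_BLOCKS) or "layout OK")
```

If you forget to shrink b when adding d, the same check reports an overlap, which is exactly the mistake you want to catch before writing the label.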

You only have to do this once and you're now ready to do data transfer!

Performing A Data Transfer

As a test, I modified a version of dhrystone to provide more accurate results when using time() instead of the µsecond accurate system functions, then tar'd the files up into a 40kB tar file. You could use any file as a demonstration. Of course, with this technique you need to know the length of each file you want to transfer, because the raw partition doesn't record it.

From the SS1 emulator I could then copy dhry.tar to d using:

dd if=dhry.tar of=/dev/sd3d bs=1024 count=40

Then I needed to logout, login as root; then /etc/halt the SS1 emulator and quit MAME. Now I could copy from the hard disk image to the host on my Mac mini using:

dd if=SunHd320.hd of=dhry.tar bs=1024 iseek=57792 count=40

(Since partition d was at cylinder 172 according to format: Partition: Print). If you had created a compressed, chd hard disk, you would need the MAME tool chdman and calculate absolute byte offsets and lengths ( e.g. ./chdman extracthd -i sunos412.chd -isb 84951040 -ib 40960 -o dhry.tar ).
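The offset is just arithmetic on the disk geometry, so here's a short Python sketch that derives it and reads the same bytes straight out of an uncompressed image. The function names are mine, and the geometry constants are the ones format reported for this drive type; a different drive type would need different values.

```python
BLOCK_BYTES = 512
BLOCKS_PER_CYLINDER = 672

def partition_offset_bytes(start_cylinder: int) -> int:
    """Byte offset of a partition within the raw image, from its start cylinder."""
    return start_cylinder * BLOCKS_PER_CYLINDER * BLOCK_BYTES

def read_from_image(image_path, start_cylinder, length_bytes):
    """Read length_bytes from the start of the partition at start_cylinder."""
    with open(image_path, "rb") as image:
        image.seek(partition_offset_bytes(start_cylinder))
        return image.read(length_bytes)

offset = partition_offset_bytes(172)
print(offset, offset // 1024)   # byte offset, then the same offset in 1024-byte dd blocks
```

For cylinder 172 this gives 59179008 bytes, i.e. 57792 blocks of 1024 bytes, which is the figure used in the dd command.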

I could then copy back to the SunHd320.hd, but this is a bit non-trivial, because I found dd truncates the output file after the data it writes. I needed to keep a copy of the image first:

cp SunHd320.hd SunHd320B.hd

dd of=SunHd320.hd if=dhry.tar bs=1024 oseek=57792 count=40

Then add the rest of the hard disk image back from the copy using:

dd of=SunHd320.hd if=SunHd320B.hd bs=1024 iseek=57832 oseek=57832
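The copy-and-append steps work around dd truncating its output file. As far as I know dd's conv=notrunc option avoids the truncation, though I haven't changed the commands above to use it. An alternative is to patch the image in place, which this Python sketch does; write_into_image is a hypothetical helper of my own, and the demo at the bottom only touches a temporary file.

```python
import os
import tempfile

def write_into_image(image_path, byte_offset, payload: bytes):
    """Overwrite len(payload) bytes at byte_offset without truncating the image."""
    with open(image_path, "r+b") as image:   # r+b: read/write an existing file, no truncation
        image.seek(byte_offset)
        image.write(payload)

if __name__ == "__main__":
    # Demonstrate on a small scratch file rather than a real disk image.
    fd, scratch = tempfile.mkstemp()
    os.write(fd, bytes(100))
    os.close(fd)
    write_into_image(scratch, 10, b"data")
    print(os.path.getsize(scratch))   # the file keeps its original length
    os.remove(scratch)
```

For the real image the call would look something like write_into_image("SunHd320.hd", 172 * 672 * 512, tar_bytes), with tar_bytes read from dhry.tar beforehand.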

Finally I could boot back into the SS1 emulator and copy the data back (when logged in as root) using:

dd of=dhry.tar if=/dev/sd3d bs=1024 count=40

I could then untar it using tar -xvf dhry.tar.

Conclusion

MAME provides a real-time emulation of a SparcStation 1 on hardware at least a few hundred times faster (in my case, the 2.5GHz Dual-Core i5 on my Mac mini 2012). To make any emulator truly useful you have to get data in and out of it. However, this emulator is only barely working.

After quite a few dead ends, I realised I could use dd on the SparcStation emulator and the hard disk image on the host side to transfer data in a fairly clumsy way. It's very crude. Still, I have bi-directional data transfer and I can start playing around with some other SparcStation software installs.


Saturday, 17 May 2025

SIBlings! 80386 Addressing Modes Simplified

I can't believe it's taken me over 30 years to get around to actually learning the 80386 addressing modes. So long, in fact, that the 32-bit x86 architecture has now been obsolete for about 20 years (slightly earlier for the original AMD64).

This is a blog post about how to make sense of the x86 addressing modes, because it's really not that complex if it's presented in a clear way. You can skip ahead to the end, but first a bit of background on my 8086 coding experiences!

I used to do a significant amount of 8086 programming in the early-mid 1990s, when I was working at Micro Control Systems in Sandiacre, Nottingham. I was assigned to the solid-state storage products (called Silicon Disk) which implemented battery-backed SRAM, EPROM and Flash expansion cards for PC compatibles that emulated bootable hard disks via an Int13h interface and a boot rom. They were versatile in one sense, because you could combine different storage media on a single card or combine multiple cards to make a larger disk - even so, the disks were small by modern standards: a maximum of 3MB per full-length ISA card.

The EPROM and Flash disks were fairly rudimentary though: the user would have to first erase all the storage (using UV in the case of EPROM and using a software erasing tool for Flash); and then copy the data they needed to the Silicon Disk. The firmware hijacked the MSDOS INT23h(?) interface so that writes to the Silicon Disk used our code. Files were written directly to the non-volatile storage, but the firmware kept the FAT table in the PC's RAM so that it could be updated multiple times. Finally, the user was expected to close the disk, which caused the FAT table to be written to the disk properly, along with a boot ROM image so the PC could boot up from the device.

This meant that people were able to develop a full, solid-state PC. Of course, a PC with a read-only disk isn't terribly useful, so most EPROM or Flash disks would also have at least one bank allocated to battery-backed RAM. And in this sense the PC became a large microcontroller with up to 2MB of EPROM/Flash and 1MB of a RAM disk.

All the firmware was written in 8086 code which meant I ended up being pretty familiar with 8086 coding. Coming from a 68000 background that was fairly disappointing, but I learned to make decent use of the CPU. Eventually I persuaded the company that we could improve development time by only having to write the core routines in assembly, while the rest of it could be written in 'C'.

I rarely had to write in 80286 assembly - it was basically the same as 8086 programming with a few more instructions, pusha and popa being the most useful. We never had to write 80386 code, so I never had to learn that and my embedded programming jobs after Micro Control Systems never required it either.

But occasionally I'd come across 80386 code and realise that some of the basic stuff had changed: in i386 mode, addressing modes are more flexible and you can scale index registers so that the CPU can directly address 16-bit or 32-bit arrays. It's possible to read the code OK, but not write it unless you know what the constraints on the address modes really are.

So, finally I've had a go at trying to understand them and it turns out, it's not very complex.

Recap: 8086 Addressing Modes

The x86 series has a byte-oriented instruction set, which means it's just a series of 1 to however many bytes regardless of whether you're dealing with the original 16-bit 8086 or a 64-bit Core i7. Many instructions consisted of a specific initial byte followed by an effective address byte which told the CPU where to find the memory location (or register) to obtain the source or destination data. This was called the MOD:REG:R/M byte. In some cases this byte is sufficient, but in other cases, the MOD bits would indicate 8 or 16-bit literal offsets would then follow and these values were added to whatever memory location was indicated by the R/M bits. The meaning of the R/M bits themselves depended on the MOD bit value too, and could either be one of the 8, 8-bit registers (or 16-bit registers if it was operating on 16-bit data); one of the 4, 16-bit registers that could be used for indexing: BX, BP, SI and DI or a restrictive combination of a pair of those registers: BX+SI (or DI) / BP+SI (or DI).

In addition, although SP addressed memory (because it was the stack pointer) it wasn't possible to index via SP; instead the convention was to use BP to point to a frame of data in the stack and it was possible to use that with an offset. Intel did that, because frame pointers were a fairly common convention for Pascal in the 1970s when the 8086 was designed.

So, it was all pretty restrictive: only 3 address registers were generally available and indexing data with a computed offset was even more limited: only BX+SI (or DI) being the useful mode (because normally you wouldn't stick an entire array on the stack). Still experienced 8086 programmers were used to juggling registers in functions so that the right address registers just so happened to be available when the programmer needed them. It was slow, tedious, but surprisingly efficient.

By comparison, the 68000 was a delight, because you could use any one of 8 address registers (A0..A7) and index them directly; or with a post-increment / pre-decrement (directly implementing 'C's ++ and --); or with a 16-bit displacement; or with an 8-bit displacement and a second register, which could be any of the 16 registers (D0..D7, A0..A7) treated as a 16-bit or 32-bit offset. Very flexible.

The whole set of addressing modes can be summarised below:

MOD R/M:     000       001       010       011       100       101       110       111
00           [BX+SI]   [BX+DI]   [BP+SI]   [BP+DI]   [SI]      [DI]      disp16    [BX]
01 disp8     [BX+SI]   [BX+DI]   [BP+SI]   [BP+DI]   [SI]      [DI]      [BP]      [BX]
10 disp16    [BX+SI]   [BX+DI]   [BP+SI]   [BP+DI]   [SI]      [DI]      [BP]      [BX]
11 (reg:)    AL/AX     CL/CX     DL/DX     BL/BX     AH/SP     CH/BP     DH/SI     BH/DI

This means there are 24 unique addressing modes on the 8086 (8 in each of the three memory rows, with disp16 taking the place of [BP] in the MOD=00 row), ignoring the register mode, as it's not addressing memory.
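The table is regular enough to turn into a few lines of Python. This sketch decodes the MOD and R/M fields into the operand text used in the table and counts the distinct memory forms it produces; the names RM16, DISP16 and decode_modrm16 are just my own labels.

```python
RM16 = ["BX+SI", "BX+DI", "BP+SI", "BP+DI", "SI", "DI", "BP", "BX"]
DISP16 = {0b00: "", 0b01: "+disp8", 0b10: "+disp16"}

def decode_modrm16(mod: int, rm: int) -> str:
    """Return the memory operand selected by the 16-bit MOD and R/M fields."""
    if mod == 0b11:
        raise ValueError("MOD=11 selects a register, not memory")
    if mod == 0b00 and rm == 0b110:
        return "[disp16]"   # the [BP] slot is reused for a bare 16-bit address
    return f"[{RM16[rm]}{DISP16[mod]}]"

modes = {decode_modrm16(mod, rm) for mod in range(3) for rm in range(8)}
print(len(modes))
print(decode_modrm16(0b01, 0b110))
```

The one irregular entry is MOD=00 with R/M=110, which is why a plain [BP] has to be encoded as [BP+disp8] with a zero displacement.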

80386 Addressing Modes

Intel could have kept the same set of addressing modes for the 32-bit 80386, but computer architecture design had advanced between 1978 and 1985 when the 80386 was released. Firstly, the 68000 CPU with its more flexible addressing modes, represented a significant amount of competition, primarily because it was already 32-bit and used for high-end workstations and secondly, RISC processor designs were starting to emerge and these showed that simple addressing modes were used most of the time.

Therefore, the i386 took the drastic step of changing the addressing modes in its 32-bit mode. Instead of the double-index register modes and single-index register modes being allocated to the R/M field, only a wider set of single-index register modes was allocated; and in the slot where ESP would have been used as an index register (R/M=100), a second address extension byte was added, the SIB byte. All the SIB byte does is provide a 3-bit index register field, a 3-bit base register field and a 2-bit scaling field for the index register. These fields can basically be mixed and matched.

MOD R/M:     000         001         010         011         100           101         110         111
00           [EAX]       [ECX]       [EDX]       [EBX]       [Ix*n+Base]   disp32      [ESI]       [EDI]
01 disp8     [EAX]       [ECX]       [EDX]       [EBX]       [Ix*n+Base]   [EBP]       [ESI]       [EDI]
10 disp32    [EAX]       [ECX]       [EDX]       [EBX]       [Ix*n+Base]   [EBP]       [ESI]       [EDI]
11 (reg:)    AL/AX/EAX   CL/CX/ECX   DL/DX/EDX   BL/BX/EBX   AH/SP/ESP     CH/BP/EBP   DH/SI/ESI   BH/DI/EDI

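SIB

Code   Ix:3      Code   Scale:2   Code   Base:3
000    EAX       00     *1        000    EAX
001    ECX       01     *2        001    ECX
010    EDX       10     *4        010    EDX
011    EBX       11     *8        011    EBX
100    (none)                     100    ESP
101    EBP                        101    *
110    ESI                        110    ESI
111    EDI                        111    EDI

(* if Mod=00 and Base=* then the addressing mode is disp32[Ix*n], i.e. a 32-bit displacement and a scaled index register without a base register; with Mod=01 or 10, Base=* is EBP.)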

Some of the SIB encodings overlap with existing addressing modes. E.g. Ix=No Index and Base=EAX..EBX will overlap with R/M= the same Base register. Also, Mod=00, Ix=Index, n=1, Base=* overlaps with disp32[R/M=Ix].

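This means that if we count addressing modes on the 80386 the same way we count them on the 8086, we have 7*3 = 21 main addressing modes (each Mod row has 7 entries that don't need a SIB byte) + the SIB addressing modes * 3. A SIB byte can select 8 bases and 7*4+1 = 29 index forms (7 index registers at 4 scales, plus 'no index', where the scale is irrelevant), which is 8*29 = 232 SIB modes per Mod value. That gives another 696 modes, with the special '*' mode already included as one of the 8 bases, for a grand total of 717 encodable addressing modes (most of which are 2 bytes, yet these are also the least frequently used ones).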
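One way to tally the encodings mechanically is to decode every Mod, R/M and SIB combination into text and count the distinct results. This Python sketch does that, bearing in mind that the totals depend on what you treat as distinct: here [EAX*1+disp32] and [EAX+disp32] count as different forms because they're encoded differently. The function and table names are my own.

```python
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]
DISP32 = {0b00: "", 0b01: "+disp8", 0b10: "+disp32"}

def decode_sib(mod: int, scale: int, index: int, base: int) -> str:
    parts = []
    if index != 0b100:                      # index=100 means "no index"; scale is ignored
        parts.append(f"{REG32[index]}*{1 << scale}")
    if mod == 0b00 and base == 0b101:
        disp = "disp32"                     # the '*' case: displacement, no base register
    else:
        parts.append(REG32[base])
        disp = DISP32[mod].lstrip("+")
    if disp:
        parts.append(disp)
    return "[" + "+".join(parts) + "]"

def decode_modrm32(mod: int, rm: int, sib=None) -> str:
    """sib is a (scale, index, base) tuple, needed only when rm is 100."""
    if mod == 0b11:
        raise ValueError("MOD=11 selects a register, not memory")
    if rm == 0b100:
        scale, index, base = sib
        return decode_sib(mod, scale, index, base)
    if mod == 0b00 and rm == 0b101:
        return "[disp32]"
    return f"[{REG32[rm]}{DISP32[mod]}]"

plain = {decode_modrm32(m, r) for m in range(3) for r in range(8) if r != 0b100}
with_sib = {decode_modrm32(m, 0b100, (s, i, b))
            for m in range(3) for s in range(4) for i in range(8) for b in range(8)}
print(len(plain), len(with_sib), len(plain | with_sib))
```

The union is no bigger than the SIB set on its own, because every non-SIB form can also be written with a SIB byte that has no index, which is the overlap mentioned above.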

A programmer, of course, could restrict themselves to only using the same subset of addressing modes available on the 16-bit x86 CPUs, by merely substituting 32-bit index and base registers. This would have the same syntax, but a longer encoding for all the Index + Base addressing modes.

Conclusion

I did a lot of 8086 assembly programming in the 1990s, because CPUs were slower and compilers were poorer, so developers had to write more assembly. However, the 16-bit x86 CPUs were also simpler devices.

I have never had to do i386+ programming, as embedded programming moved away from x86 and shifted away from assembly, but I was always intrigued by it. The hardest bit would always be the new addressing modes, but every time I read up on them, it just seemed like more effort than it was worth. Finally, after about 30 years I took it seriously and discovered they're simpler than the descriptions I've seen. So, this blog post covers my new understanding.

Of course, the 32-bit Intel era has been over for more than 10 years, even though Intel is unable to delete it from its CPUs! Soon the Intel era itself might be over as ARM CPUs (and perhaps RISC-V CPUs) overtake its performance at all scales.

Tuesday, 21 January 2025

Burns Night Is The Coldest

In winter I usually mark off 4 dates as the yearly cycle starts transforming into a more positive outlook. I call these 'Milestones'.

Milestone 1

This is the earliest evening. Most people are unaware that evenings start getting later, before the shortest day. In 2024, evenings in the UK started getting later from December 12.

Milestone 2

This is the shortest day, the winter solstice. Everyone knows about this. However, even after this day, the mornings are still getting later; it's just that the evenings are getting later quicker than the mornings are, so the days start getting longer.

Milestone 3

This is the latest morning. Again most people are unaware of this, but it happens right at the end of the year. In 2024 it happened around December 30 or 31.

Milestone 4

This is the coldest day of the year on average and the topic for this post. It's difficult to calculate this date, because daily temperatures vary wildly from day to day and also across years for equivalent days. Nevertheless, it's fairly easy to see that after the days start getting longer, they continue to also get colder for a while. This is because other environmental factors such as cloud cover, heat loss from the ground, air temperatures or the Jet Stream can continue to drive temperatures on average down faster than the sun adds energy to the atmosphere and land.

Anecdotally, I used to figure the coldest time of the year was at the end of January / beginning of February, so I set Milestone 4 on January 31. Later, however, I thought to myself that perhaps it's mid-way between the winter solstice and spring equinox, because all of these diurnal patterns tend to follow year-long sine waves.

Winter solstice is on December 21, and Spring equinox is on March 21. So, that's 10 days in December + January (31 days) + February (28.24 days on average) + 21 days of March. This is 10+31+28.24+21=90.24 days. Milestone 4 therefore falls after 90.24/2=45.12 days which, after taking off the 10 days at the end of December and the 31 days in January, leaves 4.12 days into February. So for the past several years I've been setting it on February 4.
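The same mid-point can be checked with date arithmetic. This is a small Python sketch using the nominal solstice and equinox dates above for one particular winter, so it uses that year's actual February (28 days) rather than the 28.24-day average; the midpoint function is my own helper.

```python
from datetime import datetime

def midpoint(start: datetime, end: datetime) -> datetime:
    """Return the moment half-way between two dates."""
    return start + (end - start) / 2

winter_solstice = datetime(2024, 12, 21)
spring_equinox = datetime(2025, 3, 21)
print(midpoint(winter_solstice, spring_equinox))
```

For the 2024/25 winter that's 90 days in total, and the half-way point lands on February 4.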

But neither of these techniques is based on actual evidence. What if it's not symmetrical as I've assumed? What if temperatures simply aren't shifted mid-way? To figure that out I need real data.

Real Data

I was involved in an on-line climate discussion, trying to work out how temperatures had changed in the UK over the past decade or so, and found an open Statista page on it:


You can hover over the months to get the actual figures, but downloading the raw data requires a subscription. It turns out that for nearly all the months in the year there's an upward trend, but for January there's no observable trend.

But as I was looking at it, I realised that I could use my new understanding of Fourier transforms to obtain a better approximation for the coldest day.

The Winding Principle

At University we covered quite a lot of math in the first year including Fourier Transforms (and Laplace Transforms). I was able to do the math, but I didn't remotely understand how one can isolate the set of harmonic frequencies from waveform data. It took a Hackaday article to help me. I can't do justice to the article, nor the associated animated video explainer, but I can précis the idea as far as the fundamental harmonic goes, which is all we care about here.

Every complex, repeating, sampled waveform can be constructed from a set of sine waves at 1x, 2x, 3x.. the fundamental frequency, up to half the sample rate, just added together. However, if I want to isolate the fundamental frequency that turns out to be pretty easy. All you do is multiply each sample by the sine of the corresponding angle within the waveform and add the results together. If the fundamental is present, then its amplitude at any point will cohere with the sine wave itself, but higher frequencies will 'disappear', because their positive phases will end up getting multiplied by both the positive and negative phases of the reference sine wave. For example, consider an 8-sample wave containing a fundamental and 1st harmonic:

Sample#          0        1        2        3        4        5        6        7        Total
Ref Sine         0.000    0.707    1.000    0.707    0.000    -0.707   -1.000   -0.707   0.000
Fundamental      0.000    0.573    0.810    0.573    0.000    -0.573   -0.810   -0.573   0.000
^^^ x Ref Sine   0.000    0.405    0.810    0.405    0.000    0.405    0.810    0.405    3.240
1st Harmonic     0.000    0.210    0.000    -0.210   0.000    0.210    0.000    -0.210   0.000
^^^ x Ref Sine   0.000    0.148    0.000    -0.148   0.000    -0.148   0.000    0.148    0.000

To fully calculate each harmonic you need to consider the phase of each harmonic. That's because a sine wave at any given phase can be generated by a pair of sine + cosine waves with two respective amplitudes; thus the above technique will only recover the sine wave component. For example, if the Fundamental was shifted by +90º, then the Fundamental * the Ref Sine would end up with a total of 0, but here the wave * a Reference cosine wave would have a total of 3.240.
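The 8-sample example is small enough to reproduce in Python. In this sketch the amplitudes 0.81 and 0.21 are read off the table, and wind is just my own name for the multiply-and-total step; it isn't meant as a full Fourier transform.

```python
import math

N = 8
angles = [2 * math.pi * k / N for k in range(N)]
ref_sine = [math.sin(a) for a in angles]
ref_cosine = [math.cos(a) for a in angles]

fundamental = [0.81 * math.sin(a) for a in angles]    # 1x the fundamental frequency
harmonic = [0.21 * math.sin(2 * a) for a in angles]   # 2x, the table's "1st Harmonic" row

def wind(samples, reference):
    """Multiply sample-by-sample with a reference wave and total the products."""
    return sum(s * r for s, r in zip(samples, reference))

print(round(wind(fundamental, ref_sine), 3))   # the fundamental survives
print(round(wind(harmonic, ref_sine), 3))      # the harmonic cancels out

# A fundamental shifted by +90 degrees only shows up against the cosine reference.
shifted = [0.81 * math.sin(a + math.pi / 2) for a in angles]
print(round(wind(shifted, ref_sine), 3), round(wind(shifted, ref_cosine), 3))
```

The fundamental's total is 0.81 x 4 = 3.24, because sin² sums to half the number of samples over a full cycle.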

Finding The Phase

Therefore, a Fourier analysis of the fundamental can tell us not only its amplitude, but also its phase. And it turns out we can obtain an accurate phase from relatively few samples. The phase is simply the angle of the vector formed by ∑ waveform data * the Reference sine wave on the x axis and ∑ waveform data * the Reference cosine wave on the y axis.

This means that even though all we have are monthly values for the temperature data, we can calculate the actual minima, zero-crossings and maxima at a much higher resolution.

The phase calculated is always relative to the reference angles. For example, if we started the reference angle at 30º and the samples were a sine wave starting at 30º, then the phase would still be 0º. If the reference angle was 0º, and the samples were a sine wave starting at -90º, then the relative phase would be reported as 90º, because the zero-crossing for the sine wave would be at 90º.
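
Here's a minimal Python sketch of that vector calculation, assuming evenly spaced samples over exactly one cycle with the reference angle starting at 0º. The function name and the 12-sample test wave are made up for illustration (it isn't real temperature data), and the sign convention is an assumption: the sketch returns φ for a wave of the form sin(θ + φ), so the zero-crossing angle described above is -φ.

```python
import math

def fundamental_phase(samples):
    """Return (amplitude, phase in degrees) of the fundamental.

    Assumes the samples are evenly spaced over exactly one cycle and that
    the reference angle starts at 0 degrees.
    """
    n = len(samples)
    x = y = 0.0
    for k, s in enumerate(samples):
        angle = 2 * math.pi * k / n
        x += s * math.sin(angle)   # sum of data * reference sine
        y += s * math.cos(angle)   # sum of data * reference cosine
    amplitude = 2 * math.hypot(x, y) / n
    phase = math.degrees(math.atan2(y, x))
    return amplitude, phase

# Made-up test wave: mean 10, swing 6, rising zero-crossing at 114 degrees.
samples = [10 + 6 * math.sin(2 * math.pi * k / 12 - math.radians(114))
           for k in range(12)]
amplitude, phase = fundamental_phase(samples)
zero_crossing = -phase % 360   # angle at which the wave crosses its mean
```

Note that the constant offset (the mean of 10) drops out on its own, because the reference sine and cosine both sum to zero over a full cycle.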

The phase therefore tells us the day on which the temperature passes through its average, and the minimum temperature day will be 90º earlier (or 91.31 days earlier). For UK temperatures, the minimum temperature is therefore reported as Jan 25.5. Ironically, this means that Burns Night is the coldest night of the year.

There's one more aspect of the model that's worth mentioning, which is that the reference phases aren't equidistant, because the months don't all have the same number of days in them (though it's close). Therefore, in this calculation, the reference dates are taken from the mid-point of each month, on the basis that the average temperature for that month represents the temperature half-way through the month.
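
A small Python sketch of how mid-month reference angles could be derived. The month lengths assume a non-leap year, and the helper names are hypothetical; the 365.25-day year in degrees_to_days is inferred from the 91.31 days quoted above.

```python
# Month lengths for a non-leap year (an assumption of this sketch).
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
YEAR_LENGTH = sum(DAYS_IN_MONTH)  # 365

def mid_month_angles():
    """Reference angle (in degrees) for the mid-point of each month."""
    angles = []
    start = 0  # day number on which the current month starts
    for days in DAYS_IN_MONTH:
        midpoint = start + days / 2
        angles.append(360 * midpoint / YEAR_LENGTH)
        start += days
    return angles

def degrees_to_days(degrees, year_length=365.25):
    """Convert a phase angle into days, e.g. 90 degrees is about 91.31 days."""
    return degrees * year_length / 360

angles = mid_month_angles()
# The gaps between neighbouring reference angles are close, but not equal.
gaps = [b - a for a, b in zip(angles, angles[1:])]
```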

Minimal Temperature


Wednesday, 8 January 2025

Basic Blitz: A Surprisingly Addictive VIC-20 Remake

The game Blitz was written and self-published by Simon Taylor for the unexpanded VIC-20 in 1981, then later sold to Mastertronic.

https://www.eurogamer.net/lost-and-found-blitz

I always thought it looked like a game that must have been written in Basic, but I never got around to testing that until the beginning of 2025.

So, here's my version in all its glorious 64 lines of code!


Mine seems to be based on the later Mastertronic version, because my plane is just one graphic character instead of 2 or 3 and my buildings are multicoloured instead of just black. Multicoloured buildings add to the fun, given most actual buildings are grey.

Also, mine doesn't speed up during each flight; instead it gets faster with each level, while the number of buildings it generates also increases by 2. My current high score is 533. Game control is pretty simple: you just press 'v' to drop a bomb as the plane flies across the screen. Only one bomb can be dropped at a time.

Design

Enough of the gameplay, let's discuss the software design. The outline of the game is pretty simple:
  • Line 5 reserves memory for the graphics characters then calls a subroutine at line 9000 to generate them.
  • Line 7 defines a function to simplify random number generation.
  • Line 8 is a bit of debug, see later.
  • Line 9 resets the high score. So, this only happens once.
  • Line 10 starts a game with a width of 5 (so 5x2=10 buildings are generated) and a delay of 100 between each frame.
  • Lines 30 to 60 are the main loop of the game. It really is that tiny. The loop terminates when the plane lands or hits a building. Within the loop the plane is drawn by displaying it in its next position and then erasing the previous position, to avoid flicker.
  • Bomb handling is done in lines 45 to 50, but the explosion is handled in lines 200-300.
  • End of game is handled in lines 66 to 80 including displaying "Landed" or "Crashed", updating the high score and handling the user wanting to quit.
  • Line 99 resets the graphics characters back to the normal character set so that you can carry on editing it.
  • The subroutine at line 100 performs the equivalent of a PRINT AT.
  • The subroutine at lines 200 to 250 handles a bomb hitting a building (a random number of floors are destroyed).
  • The subroutine at lines 8000 to 8070 generates a new level based on W, the width of the cityscape.
  • The subroutine at lines 9000 to 9010 generates the graphics characters and sets the sound level to 5.
  • The data from lines 9012 to 9090 are the graphics characters themselves, in the sequence: 'blank', 'solid square', 3x building types, 2x roofs, plane, grass.
  • The subroutine from lines 9500 to 9520 waits for a key to be released, then pressed, returning the key in A$.

Graphics

Because VIC-20 graphics are weird, programmers end up with bespoke graphics routines, so it's always worth discussing them. Firstly, VIC-20 graphics are tile-based, somewhat like the Nintendo Entertainment System. Video memory contains character codes between 0 and 255, and each character code points to an 8x8 bit pattern at CharacterMemoryBaseAddress+(CharCode*8). Usefully, the base address for the character bit patterns (and the video base address too) can be set by poking 36869. That base address can be set to RAM (which gives the programmer 256 tiles to play with), to ROM (which is the default and provides either a caps+graphics or a caps+lowercase+some graphics option), or can be made to straddle both (which gives the programmer up to 128 tiles to play with plus an upper-case character set). Straddling works even though the user-defined graphics (UDGs) have addresses below 8192 while the ROM tiles are above 32768, because of the way the VIC chip's 14-bit address space is mapped onto the VIC-20's full 16-bit address space.


In practical, unexpanded VIC-20 applications, programmers will want to use as few UDGs as possible to maximise program space while retaining much of the conventional character patterns. In Basic Blitz we therefore set the graphics to straddle mode (value 0xf, giving a CharacterMemoryBaseAddress of 0x1c00) which means characters 0..127 are in RAM and 128..255 are in ROM.
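
To make the straddling concrete, here's a minimal Python sketch (not part of the game) of how I understand the address mapping. The function names are hypothetical, and the mapping itself (the lower half of the VIC chip's 14-bit space landing at 0x8000 and the upper half at 0x0000) is stated from my understanding of the hardware rather than taken from the listing.

```python
def vic_to_cpu(vic_addr):
    """Map a 14-bit VIC-chip address onto the VIC-20's 16-bit address space."""
    vic_addr &= 0x3FFF               # the VIC chip only has 14 address bits
    if vic_addr < 0x2000:
        return 0x8000 + vic_addr     # lower half: character ROM and upwards
    return vic_addr - 0x2000         # upper half: RAM from address 0

def char_pattern_address(char_nibble, char_code):
    """Address of the 8-byte pattern for char_code.

    char_nibble is the bottom 4 bits of the value poked into 36869; each
    step moves the character base by 1024 bytes in VIC-chip address space.
    """
    base = (char_nibble & 0x0F) * 0x400
    return vic_to_cpu(base + char_code * 8)

ram_base = char_pattern_address(0xF, 0)        # first UDG in straddle mode
last_safe = char_pattern_address(0xF, 63)      # last tile below screen memory
first_clash = char_pattern_address(0xF, 64)    # 7680: unexpanded screen memory
rom_start = char_pattern_address(0xF, 128)     # wraps round into the ROM set
```

With the value 0xf, character 0 comes out at 0x1c00, character 64 lands on 7680 (where the unexpanded VIC-20 keeps its screen memory) and character 128 wraps round to the start of the ROM patterns at 32768.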

Intuitively, you might imagine that you'd want to start using tile 0 first, but that would waste most of the tile space, so in fact we always count the UDGs we need backwards from tile 63, because tiles 64 to 127 overlap with video memory itself by default (and are therefore unusable!). Also, because the VIC-20 ROM characters aren't in ASCII order and, amazingly enough, don't include the filled-in graphics character, I have to provide that myself. When Basic Blitz is run, it first shows the entire usable character set.


I added this as a bit of debug, because I initially wasn't sure the ROM characters would print out OK. Also, I then made it print Hello in red to test both my PRINT AT subroutine and embedded colour control codes.

Graphics characters can easily be printed, because they're the normal characters '6', '7', '8', '9', ':', ';', '<', '=', '>', '?'. Normal text can also be displayed, but you have to force 'inverse' characters, which is achieved by starting each print statement with <ctrl>+9 and ending it with <ctrl>+0 to return to true characters.

Colours

Colours on a VIC-20 are strangely limited. There's a block of colour attribute memory, one location for each video byte, but each one is only 4 bits, which means you can only select an INK colour for on pixels. The PAPER colour is global, defined by bits 4..7 of 36879. The VIC-20 partially gets around this by normally making characters 128 to 255 inverse characters, but also by defining bit 3 of 36879 as normal or inverse mode.

The upshot though is that with the ROM character sets you can choose a common PAPER colour with any INK, or the common PAPER colour as INK, with any INK colour as PAPER. But when you select the character set to straddle RAM and ROM, you can only choose any INK colour + the common PAPER colour.

Hence in Basic Blitz, the background is white (as that seems most useful) and I have to define a UDG just so that I can get a filled in green character for grass with a building on top.
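
As an illustration of how the value for 36879 is put together, here's a small Python sketch. The function name is hypothetical, and the border colour living in bits 0..2 (along with bit 3 being set for normal mode) is my understanding of the register rather than something shown in the listing.

```python
def screen_colour_value(paper, border, inverse=False):
    """Build a value for location 36879 from its three fields."""
    if not 0 <= paper <= 15:
        raise ValueError("paper must be 0..15")
    if not 0 <= border <= 7:
        raise ValueError("border must be 0..7")
    normal_bit = 0 if inverse else 8    # bit 3: set = normal, clear = inverse
    return (paper << 4) | normal_bit | border

power_on = screen_colour_value(paper=1, border=3)    # white paper, cyan border
all_black = screen_colour_value(paper=0, border=0)   # black paper and border
```

The first call reproduces the VIC-20's power-on value of 27 (white paper with a cyan border), which is a handy sanity check.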

Sound

In Basic Blitz, sound is pretty simple. The initialisation routine switches audio on at level 5 (POKE 36878, 5) and leaves it there. There are 3 voice channels, each of which is switched on individually when bit 7 of its value is set. In practical terms, each voice has a range of about 2 octaves, the first spanning values from 128 down to 65 and the next from 64 down to 33. Beyond 32, the frequency ratio between adjacent values is 1.03 to 1.06, close to that of a semitone (1.059), which makes most note intervals unusably out of tune.
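
The tuning problem can be seen with a little arithmetic. This Python sketch assumes the pitch is inversely proportional to the values quoted above (so a step from 32 to 31 raises the pitch by 32/31); that model and the function names are assumptions for illustration, not taken from the hardware documentation.

```python
import math

SEMITONE = 2 ** (1 / 12)   # about 1.059

def step_ratio(value):
    """Pitch ratio for one step, assuming pitch is proportional to 1/value."""
    return value / (value - 1)

def cents(ratio):
    """Size of a pitch ratio in cents (100 cents = 1 equal-tempered semitone)."""
    return 1200 * math.log2(ratio)

fine_step = step_ratio(128)    # top of the first octave: a tiny step
coarse_step = step_ratio(32)   # about 1.03
widest_step = step_ratio(17)   # about 1.06: a single step exceeds a semitone
```

Once a single step is between half a semitone and a whole one, there's no way to land close to most notes of the scale.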

The plane makes a drone sound using the lowest pitch audio channel (address 36874) OR'd with the bottom 4 bits of the jiffy clock at PEEK(162).

The bomb uses the high-octave channel (at 36876), just generating an ascending tone. If the bomb hits a building it's silenced and the noise channel is used instead, with a fixed low pitch of 129. Finally, the important thing is to turn off all the sounds when they're done, by poking the channels with 0.

Playing The Game

You can run this VIC-20 Javascript emulator and type in the code (if the keyboard mapping allows it):


I've found this emulator is better than the Dawson one for .prg files. Here's how to load the .prg on a desktop/laptop. First download BasicBlitz.prg from my Google Drive. Then drag the file from wherever you saved it onto the emulator in the browser. It will automatically load and run!

However, it's also useful to be able to type in code directly for editing, debugging and other stuff.

The keyboard on my MacBook M4 doesn't map correctly to VIC-20 keys, because the emulator does a straight translation from character codes to VIC-20 keys rather than from key codes. This means that pressing Shift+':' gives you ';' on this emulator rather than '[' as marked on a VIC-20 keyboard.

Mostly this makes typing easier, but the VIC-20 uses a number of embedded attribute key combinations. Basic Blitz doesn't use many, but typing the ones it does use isn't easy! Here's how.

In Chrome, you need to open the developer console by pressing function key F12, then clicking on the Console tab. In Safari, you need to choose Safari:Settings..., select the 'Advanced' tab and click on "Show features for web developers" at the bottom. The "Develop" menu then appears on the menu bar and you can choose Develop:Show JavaScript Console.

So far so good. Now, you can type most of the text as normal, but whenever you need to type a special code, type pasteChar(theCode) in the console followed by Enter (e.g. pasteChar(147) for the clear screen code). Here are the codes you'll need:
  • Inverse 'R' => 18. This is for Reverse text, which ends with inverse nearly underline => 146.
  • Inverse '£' (Red) => 28.
  • Inverse '┓' (Black) => 144.
  • Inverse 'S' => 19 (this is the home code).
  • Inverse heart => 147 (this is the clear screen code).
  • Inverse up-arrow => 30 (this is green).
  • Inverse 'Q' and inverse '|' can be typed directly just using the down cursor and left cursor respectively.
  • The codes in line 8045 are more colour codes used for the buildings. They are 144 (Black), 28 (Red), 159 (Inverse filled diagonal=cyan), 156 (checkered-black character=purple), 30 (inverse up arrow = green), 31 (inverse left arrow=blue), 158 (inverse 'π' = yellow).

Conclusion

The original VIC-20 Blitz program, though derivative in its own way, is so simple it could have been written in BASIC, as this version proves. The arcane design of the VIC-20 hardware and its lousy BASIC implementation mean there's a lot of subtle complexity even in a simple game. Finally, although there are many emulators for the VIC-20, both of the Javascript implementations I know of have limitations and bugs which make distributing this game and/or modifying it non-trivial.




Tuesday, 31 December 2024

Wobbly-Blue: An Optical Illusion on a ZX Spectrum And VIC-20

Introduction

I recently came across an interesting optical illusion whereby a speckled-blue sphere on a random checkerboard pattern will appear to wobble relative to the checkerboard if you move your head. When viewed on a mobile device, you can move the device itself and the effect is even more pronounced.

I figured I could write a simple version of a program that generated it for the ZX Spectrum, and this is the result (you need to make the image occupy a fair amount of your field-of-view):

The program is fairly small:

The ZX Spectrum has character-level colour resolution, but because the blue sphere is surrounded by a black border, it doesn't cause any clashes. I originally wanted to produce a sphere where the blue coverage in the centre was obviously larger than at the edges, but it turned out that simply taking the sine of a random angle creates a distribution that looks spherical, because the rate of change is greatest near 0º so the dots are spread out more there and concentrated near the edges. If I let it run for about 2000 points it'd probably look more prominent.
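
The listing itself isn't reproduced here, so as a rough illustration only: this Python sketch assumes each dot's distance from the centre is the sine of a uniformly random angle between 0º and 90º, and measures what fraction of dots land in the inner half of the radius. The function name, point count and seed are arbitrary choices.

```python
import math
import random

def fraction_inside(limit, points=20000, seed=1):
    """Fraction of dots whose distance from the centre is below limit.

    Each distance is sin(angle) for a uniformly random angle in 0..90
    degrees, which is an assumption about how the dots are generated.
    """
    rng = random.Random(seed)
    inside = 0
    for _ in range(points):
        angle = rng.uniform(0, math.pi / 2)
        if math.sin(angle) < limit:
            inside += 1
    return inside / points

inner_half = fraction_inside(0.5)
```

Under that assumption only about a third of the dots fall within the inner half of the radius (the exact figure is asin(0.5) / 90º = 1/3), so the remaining two thirds crowd towards the edge.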

Unexpanded VIC-20 Version

Quite frequently I like to create Unexpanded VIC-20 conversions of ZX Spectrum programs, because they're fairly contemporary machines with some similar characteristics, but the VIC-20 is more challenging to program due to a lack of support in its version of BASIC.

Here's the VIC-20 version:

The VIC-20 version is full of POKEs to do what the ZX Spectrum version can do with PLOT, INK and PAPER. Also, the character set is squished and there are only 22 characters per line. But it's the techniques needed to perform hi-res graphics on a VIC-20 (particularly an unexpanded VIC-20) that are the real challenge.

Firstly, the VIC-20 can only really do hi-res graphics by modifying a character set of up to 256 characters. So, if you fill the screen with unique characters and update the pixels in each character then a full bitmap display is possible. However, on a standard VIC-20 screen there are 22 x 23 characters = 506 character positions which is far more than the number of characters in the character set! The VIC-20 'fixes' this by supporting double-height characters of 16-rows each, which means you only need 253 characters to fill the screen.

The second problem is that an unexpanded VIC-20 only has 3581 bytes free when you turn it on, and 253 double-height characters + the 506 screen bytes would need 4554 bytes, which is clearly more than what's available.

However, in this case, we don't need to fill the whole screen with bitmapped graphics, only the sphere in the centre! And in fact I would only need 172 single-height characters if I also reduced the screen size to 20x20 characters! 172 characters needs just 1882 bytes including the screen bytes. This leaves just over 2kB for my program!
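
The memory arithmetic from the last few paragraphs can be checked with a few lines of Python. The variable names are made up for illustration; note that the 1882-byte figure corresponds to counting all 506 screen bytes, and it would be 1776 if only the 400 bytes of a 20x20 screen were counted.

```python
SCREEN_COLS, SCREEN_ROWS = 22, 23
FREE_BYTES = 3581                       # free on an unexpanded VIC-20 at power-on

positions = SCREEN_COLS * SCREEN_ROWS   # character positions on a standard screen
double_height_chars = positions // 2    # each 16-row character covers two positions
full_bitmap_bytes = double_height_chars * 16 + positions

sphere_chars = 172                      # single-height characters for the sphere
sphere_bytes = sphere_chars * 8 + positions       # counting all 506 screen bytes
sphere_bytes_20x20 = sphere_chars * 8 + 20 * 20   # counting a 20x20 screen instead

fits_full_bitmap = full_bitmap_bytes <= FREE_BYTES
```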

How is this done? Well, I could work out which characters in the centre will be filled with the sphere's pixels and print unique characters for them, but it's easier to use the kind of tile-allocation technique you might use for a video game. You use some characters as background (in this case we only need one: character 255, which is filled with 8x $ff's).

Then, whenever you want to plot a pixel on the screen, you find out which character is being used at the character location (INT(x/8),INT(y/8)). If it's not 255, you look up that character in the character set memory, select the right row (Y & 7), then set the right pixel (128>>(X & 7)). Otherwise you first allocate the next character code (denoted by UG% in the listing) and then fill in the pixel as before. It doesn't matter if the characters on the screen aren't allocated in order, because one simply gets the correct bitmap address from the character code itself.
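
Here's a minimal Python sketch of that tile-allocation idea, using plain lists in place of the VIC-20's screen and character memory. It isn't a translation of the BASIC listing: the names are hypothetical (next_free plays the role of UG%), and the 20x20 screen size is taken from the description above.

```python
COLS, ROWS = 20, 20
BACKGROUND = 255                       # the shared background character

screen = [BACKGROUND] * (COLS * ROWS)  # one character code per screen cell
charset = bytearray(256 * 8)           # 8 pattern bytes per character code
charset[BACKGROUND * 8:] = b"\xff" * 8 # character 255 is filled with $ff
next_free = 0                          # next unallocated code (UG% in the listing)

def plot(x, y):
    """Set pixel (x, y), allocating a new tile for the cell if needed."""
    global next_free
    cell = (y // 8) * COLS + (x // 8)
    code = screen[cell]
    if code == BACKGROUND:             # nothing plotted in this cell yet
        code = next_free
        next_free += 1
        screen[cell] = code
    # Row within the tile is y & 7; the pixel's bit is 128 >> (x & 7).
    charset[code * 8 + (y & 7)] |= 128 >> (x & 7)
    return code

first = plot(0, 0)      # allocates tile 0
same = plot(7, 0)       # same cell, so tile 0 is reused
second = plot(8, 0)     # next cell along allocates tile 1
third = plot(3, 9)      # first cell of the second character row: tile 2
```

The sketch doesn't guard against running out of character codes, which a real version would need to do.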

The resultant program is similar to the ZX Spectrum version, except that a couple of subroutines are added. Line 1 allocates space for the new character set by setting the end of BASIC (and string stack) to 6143. Then 6144 onwards can be used. Lines 500 to 540 create the initial graphics setup. The screen size is set to 20x20 instead of 22x23. PAPER is set to black with the screen in non-inverse mode; the new character set is filled with 0s; the screen is filled with character code 255 and character 255 is filled with 255s too (all done with one loop). Finally, UG% is set to 0, as that's the first character code we'll allocate.

Lines 1000 to 1040 are very similar to the ZX Spectrum version except it works by setting the INK colour of each character to white or black for each checkerboard location.

Lines 200 to 220 are the plot routine discussed above. It also has to POKE colour memory (from 38400 onwards) to make the INK colour at that location Blue (colour 6). Finally we can run the program:


Pixels on a ZX Spectrum are square, but on a VIC-20 they're squished too, so the sphere is oblate. Total video memory is 2048 bytes and, allowing for 400 bytes of screen memory, that leaves room for 206 bitmap characters. So, I could have increased the checkerboard resolution by allocating another 16 characters, taking the total up to 188.

What Causes The Effect?

I did a bit of searching for how the illusion works, but all I found were articles on how chromatic aberration can cause red or green colours to stand out in a sort-of 3D effect. But here's a simple theory. Human eyes are 10x less sensitive to blue than other primary colours and much more sensitive to luminance than colour. So, it's likely that the brain needs to do more processing for colour than for monochrome images; more processing for blue and more processing for sparse images (like this sphere).

That would make sense: the least sensitive blue cones might take longer to fire than the more sensitive rods (because it takes longer for enough energy to make them fire). There might be more neurone layers for making sense of a colour image; more neurone layers for making sense of a sphere than a set of blocks; and finally more neurone layers for making sense of a sparse image than a solid one.

All that extra processing causes a lag, which means that when you move your head (or move the screen), the checkerboard pattern moves quickly in your field of view, but your other neurones take time to reconstruct the sparse, blue sphere.