Friday, 23 June 2023

QUADFIT: Least Squares Quadratic Curve Fitting on a 1K ZX81

 Yet another snappy blog post title!

The sub-theme for this one is that by rethinking a problem in a different way, we can squeeze the solution into far fewer resources.

In this case, I'd recently been doing some work (at work) which involved fitting a quadratic curve to a set of noisy datapoints (because they're derived from ADCs, which... often need a lot of filtering to get decent results despite their nominal 12-bit accuracy).

In practice I found a website that covered the Least Squares Quadratic fitting algorithm.

https://www.omnicalculator.com/statistics/quadratic-regression

The interesting thing for me is that this algorithm is explained in quite a lengthy webpage; and on top of that there's a whole side box that allows you to enter a set of (x,y) coordinates and it'll figure out the quadratic coefficients (and the correlation).

That's quite a lot of resources, which implies it's fairly involved, but is it really? Looking at the equations and how the coefficients are derived from them:



It looks like any programmer would need to store the set of (x,y) coordinates used, along with the number of coordinates n, and then perform a number of loops to sum all the terms, starting with the averages:

But the format for all the S** terms reminded me of how 1980s Casio Scientific Calculators managed to perform linear regression without needing to store an array of data, and they needed Sx, Sx2, Sxy type terms too. And it was possible to correct the data you'd entered too - even though they didn't store the data itself. How was this possible?

Well, the answer lies in observing that firstly we don't need to calculate the averages for x, x² or y, because you can simply calculate ∑x, ∑x² and ∑y after each new coordinate is entered, and divide by the current value of n. Then the same reasoning applies to calculating all the S** terms: you simply update the sums for each of them whenever you enter a new coordinate. This reduces the problem to evaluating 10 terms on each new iteration (a doesn't need to be stored, it can be merely displayed).

The final question then is how to delete data? That's surprisingly simple too, all you need to do is use the erroneous coordinate (x',y'), but subtract the corresponding x', x'², x'³, x', y', x'y', x'²y' values from the corresponding terms and then decrement n.

And because the computation really can be reduced to that degree, it can be squeezed into a 1K ZX81 (and probably an early 1980s Casio programmable!). Go to the Javascript ZX81, then type POKE 16389,17*4 [Newline] then NEW. The ZX81 is then effectively a 1kB ZX81


You'll find that as you get towards the end of the program, the edit line will start jumping up, as you're running out of memory. In fact there are a whole 78 bytes free (including the screen space used by the program) for any enhancements you want (e.g. correlation!)