2022-03-01 Retro Computing Microsoft's BASIC "ST ERROR"

Doing a little retro computing on an old, original CoCo 1 (Tandy TRS-80 Color Computer 1), writing my own port of "wordle" in BASIC (because of course you would), I managed to trip over ?ST ERROR. I hadn't seen that one before and had to look it up: STRING FORMULA TOO COMPLEX. OK, that's weird. The line in question was just doing some simple string concatenation combined with the STR$ function:

DRAW "BM"+STR$(LC*20+2)+","+STR$(UG*24+4)+"S16C0"

I tried a bunch of things to simplify the expression:

D$="BM"
D$=D$+STR$(LC*20+2)
D$=D$+","
D$=D$+STR$(UG*24+4)
D$=D$+"S16C0"
DRAW D$

Same error. Normally, this error is pretty hard to trigger; it takes nine nested string operands in a single expression:

10 PRINT "A"+("B"+("C"+("D"+("E"+("F"+("G"+"H"))))))
RUN
ABCDEFGH
10 PRINT "A"+("B"+("C"+("D"+("E"+("F"+("G"+("H"+"I")))))))
RUN
?ST ERROR IN 10

So I pulled out the big guns, Color BASIC Unravelled, and started learning what the error really means. There's an 8-entry stack of string descriptors used for holding temporary strings, both program literals and intermediate results of, e.g., concatenation and function calls (like STR$). Of note, though, these temporaries are popped back off as they are consumed, so a chain of concatenations should not cause a deep stack. So what's going on here? From the Unravelled book, TEMPPT (2 bytes at &H0B) points to the current stack entry. On a whim, I dumped this out in the top loop in my code:

PRINT HEX$(PEEK(&H0B)*256+PEEK(&H0C))

As expected, this initially printed &H01A9, which is STRSTK. Sporadically, however, it would increase by 5 bytes (the size of a descriptor): &H01AE, &H01B3, etc. The ?ST ERROR would occur once all 8 entries were consumed. What's odd is that the stack should be clean at the end of each statement, so this is a leak!
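
The addresses above are plain arithmetic on the figures quoted from Unravelled. As a rough sketch in Python (not CoCo code; the function names are made up for illustration, and it assumes TEMPPT sits at STRSTK when the stack is empty and moves up one descriptor per entry in use), the slot addresses and the number of entries in use for a given TEMPPT value can be worked out like this:

```python
# Toy arithmetic for the temporary string descriptor stack.
# The three constants come from the article; everything else is illustrative.
STRSTK = 0x01A9        # base address of the descriptor stack
DESCRIPTOR_SIZE = 5    # bytes per string descriptor
ENTRIES = 8            # number of descriptor slots


def slot_addresses():
    """Return the start address of each of the 8 descriptor slots."""
    return [STRSTK + i * DESCRIPTOR_SIZE for i in range(ENTRIES)]


def entries_in_use(temppt):
    """Return how many slots are occupied for a given TEMPPT value.

    Assumes TEMPPT equals STRSTK when the stack is empty.
    """
    offset = temppt - STRSTK
    if offset < 0 or offset > ENTRIES * DESCRIPTOR_SIZE:
        raise ValueError("TEMPPT is outside the descriptor stack")
    if offset % DESCRIPTOR_SIZE != 0:
        raise ValueError("TEMPPT is not on a descriptor boundary")
    return offset // DESCRIPTOR_SIZE


if __name__ == "__main__":
    print(" ".join(f"&H{addr:04X}" for addr in slot_addresses()))
    print(entries_in_use(0x01B3))  # the second value seen creeping upwards
```

Under that assumption, a TEMPPT of &H01B3 means two descriptors have leaked, and the stack is full when the offset from STRSTK reaches 8 x 5 = 40 bytes.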

So, let's go bug hunting. I scattered copies of this line throughout my code:

IF (PEEK(&H0B)*256+PEEK(&H0C))<>&H01A9 THEN STOP

Long story short, the statement that occasionally leaked a string stack descriptor was:

A$=INKEY$

Weird! There's obviously a bug somewhere, but nothing really stands out in the annotated assembly in Unravelled for INKEY$ and string management. Thankfully, I found a simple workaround that appears to completely stop the leak:

A$="":A$=INKEY$

Update 2022-03-28 - A repro, it's a bug

A bunch of great minds got curious when I mentioned this page on Facebook, which got me interested in exploring this bug further. After some stuffing around, I found a simple reproduction (save this as STERROR/BAS, since the program opens its own file as the record source):

10 B$="AAAAA"
20 OPEN"D",#1,"STERROR/BAS",1
30 FIELD#1,1 AS A$
40 GET#1,1
50 'ABORT ON LEAK
60 'IF (PEEK(&H0B)*256+PEEK(&H0C))<>&H01A9 THEN STOP
70 'MAKE TEMPPT POINT TO A$
80 IF B$=A$ THEN STOP
90 Q$=INKEY$
100 GOTO 50

The TL;DR is that the FIELDed string A$ points into the dedicated record buffer allocated by Disk BASIC. The IF comparison leaves the first entry of the temporary string stack (the one TEMPPT points at) also pointing into that buffer. Then INKEY$, assuming no key is held down, sets the length, and only the length, of that temporary string stack entry to zero; its address still points at Disk BASIC's record buffer. Finally, during assignment, Disk BASIC has a hack to copy FIELDed strings into string space, but it doesn't quite work in this case and fails to pop the string descriptor, causing a leak. Big thanks to William Astle (author of the article on Color Basic and String Handling linked below) for coming up with the theory that allowed me to write a simple reproduction of the issue.
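
To see why a single missed pop per statement ends in ?ST ERROR no matter how simple the expressions are, here is a toy model in Python. It is only a sketch: it counts descriptors and models nothing of Disk BASIC's FIELD handling or the ROM's actual code, and the names TempStringStack, run_statement and statements_until_error are invented for illustration.

```python
class STError(Exception):
    """Stands in for BASIC's ?ST ERROR."""


class TempStringStack:
    """Counts temporary string descriptors; capacity is 8, per the article."""

    CAPACITY = 8

    def __init__(self):
        self.depth = 0

    def push(self):
        # A new temporary needs a free descriptor slot.
        if self.depth >= self.CAPACITY:
            raise STError("?ST ERROR")
        self.depth += 1

    def pop(self):
        if self.depth == 0:
            raise RuntimeError("pop from empty descriptor stack")
        self.depth -= 1


def run_statement(stack, leaks):
    """Model one statement such as A$=INKEY$: create a temporary, then consume it.

    When leaks is True the temporary is never popped, as in the bug.
    """
    stack.push()
    if not leaks:
        stack.pop()


def statements_until_error(leaks, limit=1000):
    """Return how many statements complete before ?ST ERROR, or None if none occurs."""
    stack = TempStringStack()
    for completed in range(limit):
        try:
            run_statement(stack, leaks)
        except STError:
            return completed
    return None


if __name__ == "__main__":
    print(statements_until_error(leaks=False))
    print(statements_until_error(leaks=True))
```

In this model the well-behaved statement can run indefinitely, while the leaking one completes only as many times as there are descriptor slots before the next temporary has nowhere to go.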

See Also