The Eight Bit Algorithmic Language

<p align="center"><img src="pics/Asset 1.png" alt="eightball" height="200px"></p>

The Eight Bit Algorithmic Language for Apple II, Commodore 64 and VIC20

Includes:

Intro

What is EightBall?

EightBall is an interpreter and bytecode compiler for a novel structured programming language. It runs on a number of 6502-based vintage systems and may also be compiled as a 32 bit Linux executable. The system also includes a simple line editor and the EightBall Virtual Machine, which runs the bytecode generated by the compiler.

Design Philosophy

EightBall tries to strike a balance between the following qualities, in 20K or so of 6502 code:

Supported Systems

The following 6502-based systems are currently supported:

EightBall also runs on Linux (built as a 32 bit process using gcc -m32.)

With some small modifications, the code could also be built for any 6502-based system supported by the cc65 compiler. For the interpreter/compiler program, upper and lower case text support is required (so Apple II/II+ would need an 80 column card.) The virtual machine program does not necessarily require lower case (if you do not use it in your EightBall code.)

Licence

Free Software licenced under GPL v3.

Roadmap

EightBall is an ongoing development project. See the project roadmap here.

This is a free software / open source project and I invite anyone interested to participate via GitHub.

Getting Started

There are executables and disk images available to download for Apple II, Commodore 64 and VIC-20. These may be run on real hardware or one of the many emulators that are available.

The language itself is documented in this file. The best way to learn is to study example programs.

Disk images:

Apple II

I used ADTPro to copy eightball.dsk to a real Disk II 140K floppy. A solid state drive such as CFFA3000 should also work.

It is also possible to run the EightBall system using the MAME Apple II emulation under Linux.

To run the main EightBall executable, which includes the line editor, interpreter and bytecode compiler, choose to start EB.SYSTEM from within the ProDOS launcher.

You can then enter and run the test program below.

Once you have entered the test program and run it in the interpreter, you can compile it to bytecode as follows:

comp "test"
quit

The compiled code is written to the file test on the floppy diskette containing the EightBall system.

If you then invoke the EightBall Virtual Machine EBVM.SYSTEM, it will prompt you for the name of the bytecode file to load. Enter test at the prompt to run the code you just compiled. The VM is much faster than the interpreter.

Commodore 64

For the Commodore 64, the file eightball.d64 can be written to a real C1541 floppy, or to a solid state drive such as SD2IEC.

It is also possible to run the EightBall system using the Vice C64 emulator under Linux.

To run the main EightBall executable, which includes the line editor, interpreter and bytecode compiler, run 8BALL64.PRG as follows:

LOAD"8BALL64.PRG",8
RUN

You can then enter and run the test program below.

Once you have entered the test program and run it in the interpreter, you can compile it to bytecode as follows:

comp "test"
quit

The compiled code is written to the file test on the floppy diskette containing the EightBall system. (Note that if this file already exists an error will occur. This is a known deficiency which I will address in due course.)

If you then invoke the EightBall Virtual Machine 8BALLVM64.PRG, it will prompt you for the name of the bytecode file to load. Enter test at the prompt to run the code you just compiled. The VM is much faster than the interpreter.

LOAD"8BALLVM64.PRG",8
RUN

VIC 20

For the Commodore VIC20 (plus 32K expansion RAM), the file eightball.d64 can be written to a real C1541 floppy, or to a solid state drive such as SD2IEC.

It is also possible to run the EightBall system using the Vice VIC20 emulator under Linux.

To run the main EightBall executable, which includes the line editor, interpreter and bytecode compiler, run 8BALL20.PRG as follows:

LOAD"8BALL20.PRG",8
RUN

You can then enter and run the test program below.

Once you have entered the test program and run it in the interpreter, you can compile it to bytecode as follows:

comp "bytecode"
quit

The compiled code is written to the file bytecode on the floppy diskette containing the EightBall system. (Note that if this file already exists an error will occur. This is a known deficiency which I will address in due course.)

If you then invoke the EightBall Virtual Machine 8BALLVM20.PRG, it will load and execute this bytecode. The VM is much faster than the interpreter.

LOAD"8BALLVM20.PRG",8
RUN

Simple Test Program

Here is a simple test program you can enter to play with EightBall when getting started:

:i0
byte b=0
for b=1:10
  pr.msg "Hello world ..."; pr.dec b; pr.nl
endfor
end
.

I have included the line editor commands to begin inserting text :i0 and to leave the editor and return to the interpreter (a single period on its own.)

You can list the program using the :l command (the letter Ell, not the number 1!) and run it in the EightBall interpreter using the run command.

Building the Code

Build Toolchain

I am building EightBall using cc65 v2.15 on Ubuntu Linux.

The Linux version of EightBall is currently being built using gcc v7.3.0. It should build with whatever version of gcc you have to hand.

In order to build Apple diskette images I use the open source Apple Commander tool. ADTPro is an awesome tool for transferring disk images to a real Apple II via a serial (RS-232) cable.

In order to build Commodore 1541 diskette images, I use the c1541 tool that comes with the open source VICE emulator.

I find the VICE emulator useful for testing on the Commodore C64 and VIC20 platforms. MAME provides a useful Apple //e enhanced emulation.

Links to these projects:

Build Procedure

I use Ubuntu Linux (18.04 at the current time.) It should also be possible to build the project using any relatively recent Linux distribution.

First clone the repository from GitHub.

$ git clone https://github.com/bobbimanners/EightBall.git

Then, edit the Makefile to adjust the paths to point to your local installation of the cc65 compiler. If you wish to build disk images for Apple and Commodore machines, you will need to adjust the paths to point to your local installation of Apple Commander or VICE (for the c1541 tool).

$ cd EightBall
$ vi Makefile

Once you are satisfied with the Makefile, building the software is simple:

$ make

This will build executables for Linux using gcc and for 6502 targets using cc65. The build targets are as follows:

First Run (on Linux)

First start the EightBall editor/interpreter/compiler:

$ ./eightball

To load and run the unit test script within the EightBall interpreter:

:r"unittest.8b"
run

Then to compile it, and save the bytecode to the file bytecode:

comp "bytecode"
quit

And finally, to run the bytecode under the VM:

$ ./eightballvm

The bytecode disassembler may be used to examine the bytecode in a human-readable format:

$ ./disass

Both the VM and the disassembler prompt for the name of the bytecode file to load (bytecode in this example.)

Running Apple //e Version with MAME

You will have to find the Apple II ROMs online for use with MAME.

To start MAME and boot from eightball.dsk:

$ mame -w apple2ee -sl6 diskii -floppydisk1 eightball.dsk

Look here for further instructions.

Running C64 Version with VICE

To start the x64 emulator:

$ x64 -8 eightball.d64

Note that EightBall scripts on Commodore platforms must be encoded in PETSCII rather than ASCII. unittest.8bp is a PETSCII version of unittest.8b (created automatically using the Linux tr tool - see Makefile for details of how this is done.)

Look here for further instructions.

Running VIC20 Version with VICE

To start the xvic emulator:

$ xvic -mem all -drive8type 1541 -8 eightball.d64

Note that EightBall scripts on Commodore platforms must be encoded in PETSCII rather than ASCII.

Look here for further instructions.

Unit Tests

There is a unit test script unittest.8b written in EightBall.

It is quite large, so it does not load on all 8-bit platforms. Deleting the comments would help! However, I usually test using the Linux EightBall environment, where large scripts are less of a problem. Currently the script loads and runs on C64, but not on Apple II or VIC20 (due to lack of memory for the source code.)

EightBall Language Reference and Tutorial

Variables

Defined Constants

EightBall allows the programmer to define constant values as follows:

const size = 10

Constant values are represented as 16 bit words internally.

Simple Types

EightBall has two basic types: byte (8 bits) and word (16 bits).

word counter = 1000
byte xx = 0

Variables must be declared before use. Variables must be initialized. A constant may be used as an initializer:

const size = 10*10
word mysize = size+3

The first four letters of the variable name are significant (this may be increased by changing VARNUMCHARS in eightball.c). Any letters after that are simply ignored by the parser.
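A minimal sketch of what this rule means in practice, written in Python rather than EightBall. It assumes the parser simply truncates names to the first `VARNUMCHARS` characters, as the paragraph above describes; the helper names are hypothetical and do not come from eightball.c.

```python
# Number of significant characters, as stated in the text (VARNUMCHARS).
VARNUMCHARS = 4

def significant_name(name: str) -> str:
    """Return the part of a variable name that is described as significant."""
    return name[:VARNUMCHARS]

def same_variable(a: str, b: str) -> bool:
    """True when two spellings would refer to the same variable."""
    return significant_name(a) == significant_name(b)

# 'counter' and 'countdown' share their first four letters, so under this
# rule they would name the same variable.
print(same_variable("counter", "countdown"))
print(same_variable("size", "sizzle"))
```

The practical consequence is that long names should differ within their first four letters.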

Variables of type word are also used to store pointers (there is no pointer type in EightBall).

Arrays

At present, only 1D arrays are supported, but this will be expanded in future releases.

Array Declaration and Initialization

Arrays of byte and word may be declared as follows. The mandatory initializer is used to initialize the elements:

word myArray[100] = {1, 2, 3};             ' 1, 2, 3, 0, 0, 0 ...
byte storage[10] = {100, 200, 250, 25+25}; ' 100, 200, 250, 50, 0, 0, 0 ...

Initializer lists must be no longer than the number of elements in the array. The following is an error:

word bad[3] = {1, 2, 3, 4}; ' INITIALIZER LIST TOO LONG!

If the initializer list is shorter than the number of elements in the array then the remaining elements are set to zero. The empty list initializes all elements to zero:

word allzero[10] = {}
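The initializer rules above can be modelled in a few lines of Python. This is only a sketch of the described behaviour (copy the listed values, zero-fill the rest, reject lists that are too long); `init_array` is a hypothetical helper, not part of EightBall.

```python
def init_array(size, values):
    """Model of list initialization: copy values, then zero-fill the rest."""
    if len(values) > size:
        # Corresponds to the 'initializer list too long' error case.
        raise ValueError("initializer list too long")
    return list(values) + [0] * (size - len(values))

print(init_array(6, [1, 2, 3]))   # listed values followed by zeros
print(init_array(4, []))          # empty list: every element is zero
```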

It is also possible to use string literals as array initializers. This is usually used with arrays of byte to initialize strings, for example:

byte msg[100] = "Please try again!"

The array msg will be initialized to the character values of the string literal, and a null terminator will be appended. Because strings are null-terminated, the string initializer can be no longer than the array size minus one:

byte aa[4] = "ABC";  ' Okay
byte aa[4] = "ABCD"; ' TOO LONG!

Note that string literals may also be used to initialize word arrays:

word vals[10] = "ABCABCABC"
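As an illustration of the string initializer rule, here is a small Python model. It assumes ASCII character codes (Commodore systems use PETSCII instead) and assumes that any elements after the terminator are zero, which the text does not state explicitly. `init_from_string` is a hypothetical name.

```python
def init_from_string(size, text):
    """Model of string initialization: character codes plus a null terminator."""
    codes = [ord(ch) for ch in text] + [0]   # append the null terminator
    if len(codes) > size:
        # The string may be at most size - 1 characters long.
        raise ValueError("string initializer too long")
    # Assumption: the remaining elements are zero.
    return codes + [0] * (size - len(codes))

print(init_from_string(4, "ABC"))
```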

Since the Commodore VIC20 and C64 lack the { and } symbols, [ and ] are used in their place, for example

word commodore[10] = [10, 9, 8 ]

Array Indexing

Array elements begin from 0, so the array storage above has elements from 0 to 9.

storage[0] = 0;  ' First element
storage[9] = 99; ' Last element

Array dimensions must be known at compile time, but expressions made up of constants (both defined constants and literal constants) are allowed for array dimensions and for the members of the initializer list (if any). This is allowed:

word knownsize[10*10+5] = {}

And so is this:

const width = 20
const margin = 4
word knownsize[10*width+margin] = {margin, margin*2, margin*3}

But this is illegal because myvar is a regular variable, not a const:

word myvar = 10
word knownsize[10*myvar] = {1, 2, 3}

Expressions

Literal Constants

Constants may be decimal:

byte a = 10
word w = 65535
word q = -1

or hex:

byte a = $0a
word w = $face

or character:

byte c = 'a'
word w = 'Z'

Character literals assume the ASCII value of the character in the single quotes.
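The three literal forms can be summarised with a short Python sketch. It is not the EightBall parser: it treats a leading minus sign as part of the literal for brevity, assumes ASCII character codes, and masks every value to 16 bits on the assumption that values are held as 16 bit words. `parse_literal` is a hypothetical helper.

```python
def parse_literal(text: str) -> int:
    """Return the 16 bit value of a decimal, hex ($) or character literal."""
    if text.startswith("$"):
        value = int(text[1:], 16)            # hex, e.g. $face
    elif len(text) == 3 and text[0] == "'" and text[2] == "'":
        value = ord(text[1])                 # character, e.g. 'a'
    else:
        value = int(text, 10)                # decimal, e.g. 10 or -1
    return value & 0xFFFF                    # keep the low 16 bits

for literal in ["10", "$0a", "$face", "'a'", "-1"]:
    print(literal, parse_literal(literal))
```

Under these assumptions, -1 and 65535 are the same 16 bit pattern; whether it prints as signed or unsigned depends on the print statement used.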

Operators

EightBall supports most of C's arithmetic, logical and bitwise operators, with the same precedence as in C. Since the Commodore machines do not have all the ASCII characters, some substitutions have been made (shown in parentheses below.)

EightBall also implements 'star operators' for pointer dereferencing which will also be familiar to C programmers.

Arithmetic

Logical

Bitwise

Address-of Operator

The & prefix operator returns a pointer to a variable which may be used to read and write the variable's contents. The operator may be applied to scalar variables, whole arrays and individual elements of arrays.

word w = 123
word A[10] = 0
pr.dec &w;       ' Address of scalar w
pr.dec &A;       ' Address of start of array A
pr.dec &A[2]     ' Address of third element of array A

Note also that for arrays, evaluating just the array name with no index gives the address of the start of the array. (This trick enables the array pass-by-reference feature to work.)

The following code will print "ALL THE SAME" on the console:

word A[10] = 0
word a1 = A
word a2 = &A
word a3 = &A[0]
if ((a1 == a2) && (a1 == a3))
  pr.msg "ALL THE SAME"; pr.nl
endif
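A sketch of the address arithmetic behind these results, in Python. It assumes array elements are stored contiguously with 1 byte per byte element and 2 bytes per word element; the base address used is hypothetical.

```python
# Assumed element sizes in bytes.
ELEMENT_SIZE = {"byte": 1, "word": 2}

def element_address(base, elem_type, index):
    """Address of element 'index' in an array starting at 'base'."""
    return (base + index * ELEMENT_SIZE[elem_type]) & 0xFFFF

base = 0x2000  # hypothetical start address of array A
print(hex(element_address(base, "word", 0)))  # same as the array start
print(hex(element_address(base, "word", 2)))  # third element of a word array
```

With index 0 the element address equals the array start, which is why A, &A and &A[0] compare equal in the example above.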

'Star Operators'

EightBall provides two 'star operators' which dereference pointers in a manner similar to the C star operator. One of these (*) operates on word values, the other (^) operates on byte values. Each of the operators may be used both for reading and writing through pointers.

Here is an example of a pointer to a word value:

word val = 0;     ' Real value stored here
word addr = &val; ' Now addr points to val
*addr = 123;      ' Now val is 123
pr.dec *addr;     ' Recover the value via the pointer
pr.nl

Here is an example using a pointer to byte. This is similar to PEEK and POKE in BASIC.

word addr = $c000; ' addr points to hex $c000
byte val = ^addr;  ' Read value from $c000 (PEEK)
^addr = 0;         ' Set value at $c000 to zero (POKE)
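The two star operators can be pictured as byte and word access to a flat 64K memory. The Python sketch below is a model only: it assumes words are stored low byte first (the usual 6502 convention), and the function names are hypothetical.

```python
memory = bytearray(65536)  # flat 64K address space

def peek(addr):
    """Read a byte through a pointer (reading with the ^ operator)."""
    return memory[addr & 0xFFFF]

def poke(addr, value):
    """Write a byte through a pointer (writing with the ^ operator)."""
    memory[addr & 0xFFFF] = value & 0xFF

def peek_word(addr):
    """Read a word through a pointer (reading with the * operator)."""
    return peek(addr) | (peek(addr + 1) << 8)

def poke_word(addr, value):
    """Write a word through a pointer (writing with the * operator)."""
    poke(addr, value & 0xFF)        # low byte first (assumption)
    poke(addr + 1, value >> 8)      # then high byte

poke_word(0x3000, 123)
print(peek_word(0x3000))
poke(0xC000, 0)
print(peek(0xC000))
```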

Parentheses

Parentheses may be used to control the order of evaluation, for example:

pr.dec (10+2)*3;   ' Prints 36
pr.dec 10+2*3;     ' Prints 16

Operator Precedence

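| Precedence Level | Operators | Example | Example CBM |
|---|---|---|---|
| 11 (Highest) | Prefix Plus | `+a` | |
| | Prefix Minus | `-a` | |
| | Prefix Star | `*a` | |
| | Prefix Caret | `^a` | |
| | Prefix Logical Not | `!a` | |
| | Prefix Bitwise Not | `~a` | `.a` |
| 10 | Power of | `a ^ b` | |
| | Divide | `a / b` | |
| | Multiply | `a * b` | |
| | Modulus | `a % b` | |
| 9 | Add | `a + b` | |
| | Subtract | `a - b` | |
| 8 | Left Shift | `a << b` | |
| | Right Shift | `a >> b` | |
| 7 | Greater Than | `a > b` | |
| | Greater Than Equal | `a >= b` | |
| | Less Than | `a < b` | |
| | Less Than Equal | `a <= b` | |
| 6 | Equality | `a == b` | |
| | Inequality | `a != b` | |
| 5 | Bitwise And | `a & b` | |
| 4 | Bitwise Xor | `a ! b` | |
| 3 | Bitwise Or | `a \| b` | `a # b` |
| 2 | Logical And | `a && b` | |
| 1 (Lowest) | Logical Or | `a \|\| b` | `a ## b` |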
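To make the effect of these levels concrete, here is a small precedence-climbing evaluator in Python that uses the binary operator levels from the table. It is a sketch, not the EightBall expression parser: prefix operators and the CBM spellings are left out, all binary operators are assumed to be left associative, and results are masked to 16 bits.

```python
import re

# Binary operator levels copied from the table (higher binds tighter).
LEVELS = {
    "^": 10, "/": 10, "*": 10, "%": 10, "+": 9, "-": 9,
    "<<": 8, ">>": 8, ">": 7, ">=": 7, "<": 7, "<=": 7,
    "==": 6, "!=": 6, "&": 5, "!": 4, "|": 3, "&&": 2, "||": 1,
}

APPLY = {
    "^": lambda a, b: a ** b,          # power of
    "/": lambda a, b: a // b,
    "*": lambda a, b: a * b,
    "%": lambda a, b: a % b,
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "<<": lambda a, b: a << b,
    ">>": lambda a, b: a >> b,
    ">": lambda a, b: int(a > b),
    ">=": lambda a, b: int(a >= b),
    "<": lambda a, b: int(a < b),
    "<=": lambda a, b: int(a <= b),
    "==": lambda a, b: int(a == b),
    "!=": lambda a, b: int(a != b),
    "&": lambda a, b: a & b,
    "!": lambda a, b: a ^ b,           # bitwise xor is spelled '!'
    "|": lambda a, b: a | b,
    "&&": lambda a, b: int(bool(a) and bool(b)),
    "||": lambda a, b: int(bool(a) or bool(b)),
}

# Multi-character operators must be tried before single characters.
TOKEN = re.compile(r"\d+|<<|>>|>=|<=|==|!=|&&|\|\||[-+*/%^<>&!|()]")

def evaluate(text):
    """Evaluate an expression of decimal literals and binary operators."""
    tokens = TOKEN.findall(text)
    pos = 0

    def primary():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            value = expr(1)
            pos += 1                   # skip the closing ')'
            return value
        return int(tok) & 0xFFFF

    def expr(min_level):
        nonlocal pos
        left = primary()
        while (pos < len(tokens) and tokens[pos] in LEVELS
               and LEVELS[tokens[pos]] >= min_level):
            op = tokens[pos]
            pos += 1
            right = expr(LEVELS[op] + 1)   # left associative (assumption)
            left = APPLY[op](left, right) & 0xFFFF
        return left

    return expr(1)

print(evaluate("(10+2)*3"))
print(evaluate("10+2*3"))
```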

Flow Control

EightBall supports a 'structured' programming style by providing multi-line if/then/else conditionals, for loops and while loops.

Note that the goto statement is not supported!

Conditionals

Syntax is as follows:

if z == 16
  pr.msg "Sweet sixteen!"
  pr.nl
endif

Or, with the optional else clause:

if x < 2
  pr.msg "okay"
  pr.nl
else
  pr.msg "too many"; pr.nl
  toomany = toomany + 1;
endif

For Loops

Syntax is as per the following example:

for count = 1 : 10
  pr.dec count
  pr.nl
endfor
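The loop bounds appear to be inclusive at both ends: other examples in this document iterate `0 : 9` over a ten-element array. Under that reading, the values taken by the loop variable can be sketched in Python as follows (`for_values` is a hypothetical helper, and the case where the first bound exceeds the last is not covered):

```python
def for_values(first, last):
    """Values taken by the loop variable in 'for v = first : last'."""
    return list(range(first, last + 1))  # last value is included

print(for_values(1, 10))
```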

While Loops

These are quite flexible, for example:

while bytes < 255
  call getbyte()
  bytes = bytes + 1
endwhile

Subroutines

Simple Subroutine Declaration

EightBall allows named subroutines to be defined, for example:

sub myFirstSubroutine()
  pr.msg "Hello"; pr.nl
endsub

All subroutines must end with an endsub statement.

A subroutine may return a word value to the caller using the return statement.

sub mySecondSubroutine()
  return 2
endsub

If the flow of execution hits the endsub (without first encountering a return statement) then 0 is returned to the caller.

Simple Subroutine Invocation

The subroutine above can be called as follows:

call myFirstSubroutine()

When myFirstSubroutine hits a return or endsub statement, the flow of execution will return to the statement immediately following the call.

Local Variables

Each subroutine has its own local variable scope. If a local variable is declared with the same name as a global variable, the global will not be available within the scope of the subroutine. When the subroutine returns, the local variables are destroyed.

byte val = 10; ' Global byte variable
sub myThirdSubroutine()
  byte w[10] = 0;  ' Local array
  byte i = 0; ' Local byte iterator
  for i=0 : 9
    w[i] = val; ' Using both local and global variables
  endfor
endsub

Just like in C, a local variable can 'hide' a global of the same name:

word hideme = 10;
call obscuredByClouds()
end

sub obscuredByClouds()
  word hideme = 100;
  pr.dec hideme; pr.nl; ' Prints 100 (val of local), not 10 (val of global)
endsub

Argument Passing

Subroutines may take byte or word arguments, using the following syntax:

sub withArgs(byte flag, word val1, word val2)
  ' Do stuff
  return 0
endsub

This could be called as follows:

word ww = 0; byte b = 0;
call withArgs(b, ww, ww+10)

When withArgs runs, the expression passed as the first argument (b) will be evaluated and the value assigned to the first formal argument flag, which will be created in the subroutine's local scope. Similarly, the second argument (ww) will be evaluated and the result assigned to val1. Finally, ww+10 will be evaluated and assigned to val2.

Argument passing is by value, which means that withArgs can modify flag, val1 or val2 freely without the changes being visible to the caller.

Function Invocation

Subroutines may be invoked within an expression. In this case, the subroutine is executed and the value returned is evaluated within the expression in which it appears.

For example, the following subroutine:

sub adder(word a, word b)
  return a+b
endsub

Could be used in an expression like this:

pr.dec adder(10, 5); ' Prints 15

or like this:

word res = adder(2, 3);
pr.dec res; ' Prints 5

Functions may invoke themselves recursively.

Passing by Reference

Passing by reference allows a subroutine to modify a value passed to it. EightBall does this using pointers, in a manner that will be familiar to C programmers. Here is adder implemented using this pattern:

sub adder(word a, word b, word resptr)
  *resptr = a+b
endsub

Then to call it:

word result = 0
call adder(10, 20, &result)

This code takes the address of variable result using the ampersand operator and passes it to subroutine adder as resptr. The subroutine then uses the star operator to write the result of the addition of the first two arguments (10 + 20 in this example) to the word pointed to by resptr.

Unlike C, there are no special pointer types. Pointers must be stored in a word variable, since they do not fit in a byte. Pointers are dereferenced using the * operator to reference words or the ^ operator to reference bytes.

Here is an example of using a pointer to byte:

word xx = 0
call poke(&xx, 10)
pr.dec xx; pr.nl;   ' Should print 10
end

sub poke(word addr, byte val)
    ^addr = val
endsub

Passing an Array by Reference

It is frequently useful to pass an array into a subroutine. Passing an array by value is not very useful, since it may mean copying a large object onto the stack. For this reason, EightBall implements a special pass by reference mode for array variables, which operates in a manner similar to C.

Here is an example of a function which takes a regular variable and an array:

sub clearArray(byte arr[], word sz)
  word i = 0
  for i = 0 : sz-1
    arr[i] = 0
  endfor
endsub

This may be invoked like this:

const n = 10
byte A[n] = 99
call clearArray(A, n)

Note that the size of the array is not specified in the subroutine definition - any size array may be passed. Note also that the corresponding argument in the call is simply the array name (no [] or other annotation is permitted.)

This mechanism effectively passes a pointer to the array contents 'behind the scenes'.

End Statement

The end statement marks the normal end of execution. This is often used to stop the flow of execution running off the end of the main program and into the subroutines (which causes an error):

call foo()
pr.msg "Done!"; pr.nl
end
sub foo()
  pr.msg "foo"; pr.nl
endsub

Code Format

Whitespace, Semicolon Separators

EightBall code can be arranged however you wish. For example, this:

word w = 0; for w = 1 : 10; pr.dec w; pr.nl; endfor

is identical to this:

word w = 0
for w = 1 : 10
  pr.dec w; pr.nl
endfor

Semicolons must be used to separate multiple statements on a line (even loop constructs, as seen in the first example above.)

Indentation of the code (as shown in the examples in this manual) is optional, but encouraged.

Comments

Comments are introduced by the single quote character. A full line comment may be entered as follows:

' This is a comment

If you wish to comment after a statement, note that a semicolon is required to separate the statement and the comment:

pr.msg "Hello there"; ' Say hello!!!

Bits and Pieces

Run Stored Program

Simple:

run

Program runs until it hits an end statement, an error occurs or it is interrupted by the user.

Compile Stored Program

comp "bytecodefile"

The program in memory is compiled to EightBall VM bytecode, which is written to the specified file.

The bytecode file may be executed using the EightBall Virtual Machine that is part of this package.

Quit EightBall

quit

Returns to ProDOS on Apple II, or to CBM BASIC on C64/VIC20.

Clear Stored Program

new

Clear All Variables

clear

Show All Variables

vars

Variables are shown in tabular form. The letter 'b' indicates byte type, while 'w' indicates word type. For scalar variables, the value is shown. For arrays, the dimension(s) are shown.

Show Free Space

free

The free space available for variables and for program text is shown on the console.

Input and Output

Only console I/O is supported at present. File I/O is planned for a later release.

Console Output

pr.msg

Prints a literal string to the console:

pr.msg "Hello world"

pr.dec

Prints an unsigned decimal value to the console:

pr.dec 123/10

pr.dec.s

Prints a signed decimal value to the console:

pr.dec.s 12-101
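The difference between the two statements comes down to how one 16 bit pattern is read. The Python sketch below assumes that expressions are evaluated as 16 bit values and that pr.dec and pr.dec.s differ only in interpreting that value as unsigned or signed (two's complement); the helper names are hypothetical.

```python
def as_unsigned(value):
    """Interpret the low 16 bits as an unsigned number (0 to 65535)."""
    return value & 0xFFFF

def as_signed(value):
    """Interpret the low 16 bits as a two's complement signed number."""
    v = value & 0xFFFF
    return v - 0x10000 if v & 0x8000 else v

result = 12 - 101
print(as_signed(result))     # how a signed print would read the value
print(as_unsigned(result))   # how an unsigned print would read the same bits
```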

pr.hex

Prints a hexadecimal value to the console (prefixed with '$'):

pr.hex 1234

pr.nl

Prints a newline to the console:

pr.nl

pr.ch

Prints a character to the console:

pr.ch 'A'
pr.ch 65; ' Same as above

pr.str

Prints a byte array as a string to the console. The string is null terminated (so printing stops at the first 0 character):

pr.str A; ' A is a byte array
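As a sketch of the null-termination rule, the following Python function collects characters from a byte array until the first zero. It assumes ASCII codes, and `c_string` is a hypothetical name.

```python
def c_string(buffer):
    """Return the characters in 'buffer' up to the first 0 byte."""
    chars = []
    for code in buffer:
        if code == 0:        # terminator reached: stop
            break
        chars.append(chr(code))
    return "".join(chars)

print(c_string([72, 105, 0, 88]))  # bytes after the terminator are ignored
```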

mode

This is for setting the text video mode on the Apple II only. It only works in the interpreter at present.

mode 40; ' Set 40 column mode
mode 80; ' Set 80 column mode

Console Input

kbd.ch

Allows a single character to be read from the keyboard. Be careful: this function assumes the argument passed to it is a pointer to a byte value into which the character may be stored.

We can print a character obtained from the keyboard as follows:

byte c = 0
while 1
  kbd.ch &c
  pr.ch c
endwhile

kbd.ln

Allows a line of input to be read from the keyboard and to be stored to an array of byte values. This statement takes two arguments - the first is an array of byte values into which to write the string, the second is the maximum number of bytes to write.

byte buffer[100] = 0;
kbd.ln buffer, 100
pr.msg "You typed> "
pr.str buffer
pr.nl

Line Editor

EightBall includes a simple line editor for editing program text. Programs are saved to disk in plain text format (ASCII on Apple II, PETSCII on CBM).

Be warned that the line editor is rather primitive. However we are trying to save memory.

Editor commands start with the colon character (:).

Load from Disk

To load a new source file from disk, use the :r 'read' command:

:r "myfile.8b"

Save to Disk

To save the current editor buffer to disk, use the :w 'write' command:

:w "myfile.8b"

On Commodore systems, this must be a new (non-existing) file, or a drive error will result.

Insert Line(s)

Start inserting text before the specified line. The editor switches to insert mode, indicated by the '>' character (in inverse green on CBM). The following command will start inserting text at the beginning of an empty buffer:

:i0
>

One or more lines of code may then be entered. When you are done, enter a period '.' on a line on its own to return to EightBall immediate mode prompt.

Append Line(s)

Append is identical to the insert command described above, except that it starts inserting /after/ the specified line. This is often useful for adding lines following the end of an existing program.

Delete Line(s)

This command allows one or more lines to be deleted. To delete one line:

:d33

or to delete a range of lines:

:d10,12

Change Line

This command allows an individual line to be replaced (like inserting a new line, then deleting the old line). It differs from the insert and append commands in that the text is entered immediately following the command (not on a new line). For example:

:c21:word var1=12

will replace line 21 with word var1=12. Note the colon terminator following the line number.
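The shape of the change command can be illustrated with a short Python parser. This is a sketch of the syntax as described here, not the editor's actual code; `parse_change` is a hypothetical helper. Only the first colon after the line number acts as the terminator, so the replacement text may itself contain colons.

```python
def parse_change(command):
    """Split ':c<line>:<text>' into (line number, replacement text)."""
    if not command.startswith(":c"):
        raise ValueError("not a change command")
    number, sep, text = command[2:].partition(":")
    if not sep or not number.isdigit():
        raise ValueError("expected :c<line>:<text>")
    return int(number), text

print(parse_change(":c21:word var1=12"))
print(parse_change(":c3:for i = 1 : 10"))
```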

Note that the syntax of this command is contrived to allow the CBM screen editor to work on listed output in a similar way to CBM BASIC. Code may be listed using the :l command and the screen may then be interactively edited using the cursor keys and return, just as in BASIC.

List Line(s)

This allows the program text to be listed to the console. Either the whole program may be displayed or just a range of lines. To show everything:

:l

To show a range of lines:

:l0-20

(The command is the letter Ell, not the number 1!)

EightBall Compiler and Virtual Machine

What is it?

The EightBall Virtual Machine is a simple runtime VM for executing the bytecode produced by the EightBall compiler. The EightBall VM can run on 6502 systems (Apple II, Commodore VIC20, C64) or as a Linux process.

How to use it?

The EightBall system is split into two separate executables:

On Linux, the editor/interpreter/compiler is eightball and the Virtual Machine is eightballvm.

On Apple II ProDOS, the editor/interpreter/compiler is eightball.system and the VM is 8bvm.system.

On Commodore VIC20, the editor/interpreter/compiler is 8ball20.prg and the VM is 8ballvm20.prg.

On Commodore C64, the editor/interpreter/compiler is 8ball64.prg and the VM is 8ballvm64.prg.

Here is how to use the compiler:

The compiler will dump an assembly-style listing to the console and also write the VM bytecode to a binary file called bytecode. If all goes well, no inscrutable error messages will be displayed.

Then you can run the VM program for your platform. It will load the bytecode from the file bytecode and execute it. Running compiled code under the Virtual Machine is much faster than the interpreter (and also more memory efficient.)

VM Internals

VM Architecture

The EightBall Virtual machine has the following features:

The evaluation stack is used for all computations. The VM offers a variety of instructions for manipulating the evaluation stack. All calculations, regardless of the type of the variables involved, are performed using 16 bit arithmetic.

For shorthand, we define the names X, Y, Z, T for the top four slots in the evaluation stack. This notation is stolen from the world of HP RPN calculators.

The call stack is used for all memory allocation within the virtual machine, as follows:

VM Instructions

Note that all the instructions with names ending in 'I' are so-called 'immediate mode' instructions. This means that the operand is the 16 bit word following the opcode, rather than the topmost element of the evaluation stack. The 'immediate mode' operand may be a data value or an address.

Relative mode instructions allow addressing relative to the frame pointer. This is helpful for easy access to local variables.
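The relative-mode entries in the instruction table compute their target as the operand plus FP plus 1. A minimal Python sketch of that arithmetic follows; it assumes that RTOA and ATOR use the same formula and its inverse, which the table does not state outright, and the function names are hypothetical.

```python
def rel_to_abs(offset, fp):
    """Absolute address for an FP-relative offset (operand + FP + 1)."""
    return (offset + fp + 1) & 0xFFFF

def abs_to_rel(address, fp):
    """Inverse conversion: FP-relative offset for an absolute address."""
    return (address - fp - 1) & 0xFFFF

fp = 0x9FF0  # hypothetical frame pointer value
print(hex(rel_to_abs(0, fp)))
print(abs_to_rel(rel_to_abs(5, fp), fp))
```

Because local variable addresses are expressed relative to FP, the same bytecode works at any call depth.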

| Instruction | Description | Imm? | Rel? |
|---|---|---|---|
| END | Terminate execution | | |
| LDI | Pushes the following 16 bit word to the evaluation stack | * | |
| LDAW | Replaces X with 16 bit value pointed to by X. | | |
| LDAWI | Pushes the 16 bit value pointed to by following 16 bit word to evaluation stack. | * | |
| LDAB | Replaces X with 8 bit value pointed to by X. | | |
| LDABI | Pushes the 8 bit value pointed to by following 16 bit word to evaluation stack. | * | |
| STAW | Stores 16 bit value Y in addr pointed to by X. Drops X and Y. | | |
| STAWI | Stores 16 bit value X in addr pointed to by following 16 bit word. Drops X. | * | |
| STAB | Stores 8 bit value Y in addr pointed to by X. Drops X and Y. | | |
| STABI | Stores 8 bit value X in addr pointed to by following 16 bit word. Drops X. | * | |
| LDRW | Replaces X with 16 bit value pointed to by X+FP+1. | | * |
| LDRWI | Pushes the 16 bit value pointed to by following 16 bit word +FP+1 to evaluation stack. | * | * |
| LDRB | Replaces X with 8 bit value pointed to by X+FP+1. | | * |
| LDRBI | Pushes the 8 bit value pointed to by following 16 bit word +FP+1 to evaluation stack. | * | * |
| STRW | Stores 16 bit value Y in addr pointed to by X+FP+1. Drops X and Y. | | * |
| STRWI | Stores 16 bit value X in addr pointed to by following 16 bit word +FP+1. Drops X. | * | * |
| STRB | Stores 8 bit value Y in addr pointed to by X+FP+1. Drops X and Y. | | * |
| STRBI | Stores 8 bit value X in addr pointed to by following 16 bit word +FP+1. Drops X. | * | * |
| SWP | Swaps X and Y | | |
| DUP | Duplicates X -> X, Y | | |
| DUP2 | Duplicates X -> X,Z; Y -> Y,T | | |
| DROP | Drops X | | |
| OVER | Duplicates Y -> X,Z | | |
| PICK | Duplicates stack level specified in X+1 -> X | | |
| POPW | Pop 16 bit value from call stack, push onto eval stack [X] | | |
| POPB | Pop 8 bit value from call stack, push onto eval stack [X] | | |
| PSHW | Push 16 bit value in X onto call stack. Drop X. | | |
| PSHB | Push 8 bit value in X onto call stack. Drop X. | | |
| DISC | Discard X bytes from call stack. Drop X. | | |
| SPTOFP | Copy stack pointer to frame pointer. (Enter function scope) | | |
| FPTOSP | Copy frame pointer to stack pointer. (Release local vars) | | |
| ATOR | Convert absolute address in X to FP-relative address | | |
| RTOA | Convert FP-relative address in X to absolute address | | |
| INC | `X = X+1`. | | |
| DEC | `X = X-1`. | | |
| ADD | `X = Y+X`. Y is dropped. | | |
| SUB | `X = Y-X`. Y is dropped. | | |
| MUL | `X = Y*X`. Y is dropped. | | |
| DIV | `X = Y/X`. Y is dropped. | | |
| MOD | `X = Y%X`. Y is dropped. | | |
| NEG | `X = -X` | | |
| GT | `X = Y>X`. Y is dropped. | | |
| GTE | `X = Y>=X`. Y is dropped. | | |
| LT | `X = Y<X`. Y is dropped. | | |
| LTE | `X = Y<=X`. Y is dropped. | | |
| EQL | `X = Y==X`. Y is dropped. | | |
| NEQL | `X = Y!=X`. Y is dropped. | | |
| AND | `X = Y&&X`. Y is dropped. | | |
| OR | `X = Y\|\|X`. Y is dropped. | | |
| NOT | `X = !X` | | |
| BAND | `X = Y&X`. Y is dropped. | | |
| BITOR | `X = Y\|X`. Y is dropped. | | |
| BITXOR | `X = Y^X`. Y is dropped. | | |
| BITNOT | `X = ~X`. | | |
| LSH | `X = Y<<X`. Y is dropped. | | |
| RSH | `X = Y>>X`. Y is dropped. | | |
| JMP | Jump to address X. Drop X. | | |
| JMPI | Jump to 16 bit word following opcode. | * | |
| BRC | If Y != 0, jump to address X. Drop X, Y. | | |
| BRCI | If X != 0, jump to 16 bit word following opcode. Drop X. | * | |
| JSR | Push PC to call stack. Jump to address X. Drop X. | | |
| JSRI | Push PC to call stack. Jump to 16 bit word following opcode. Drop X. | * | |
| RTS | Pop call stack, jump to the address popped. | | |
| PRDEC | Print 16 bit value in X in decimal. Drop X. | | |
| PRHEX | Print 16 bit value in X in hexadecimal. Drop X. | | |
| PRCH | Print character in X. Drop X. | | |
| PRSTR | Print null terminated string pointed to by X. Drop X. | | |
| PRMSG | Print literal string at PC (null terminated) | | |
| KBDCH | Push character from keyboard onto eval stack | | |
| KBDLN | Obtain line from keyboard and write to memory pointed to by Y. X contains the max number of bytes in buf. Drop X, Y. | | |
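To show how a handful of these instructions interact with the evaluation stack, here is a toy interpreter in Python. It is a simplified model under several assumptions, not the EightBall VM: instructions are (mnemonic, operand) pairs rather than encoded bytes, jump targets are indices into the instruction list rather than memory addresses, and only a few instructions are implemented.

```python
MASK = 0xFFFF  # all arithmetic is 16 bit

# Two-operand instructions: X = Y op X, with Y dropped.
BINARY = {
    "ADD": lambda y, x: y + x,
    "SUB": lambda y, x: y - x,
    "MUL": lambda y, x: y * x,
}

def run(program):
    """Execute the program and return the evaluation stack (X is last)."""
    stack = []
    pc = 0
    while True:
        op, arg = program[pc]
        pc += 1
        if op == "END":
            return stack
        if op == "LDI":                       # push immediate operand
            stack.append(arg & MASK)
        elif op == "DUP":                     # duplicate X
            stack.append(stack[-1])
        elif op == "SWP":                     # swap X and Y
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "DROP":                    # drop X
            stack.pop()
        elif op == "DEC":                     # X = X-1
            stack[-1] = (stack[-1] - 1) & MASK
        elif op in BINARY:
            x = stack.pop()
            y = stack.pop()
            stack.append(BINARY[op](y, x) & MASK)
        elif op == "BRCI":                    # branch if X != 0, drop X
            if stack.pop() != 0:
                pc = arg
        else:
            raise ValueError(f"unsupported instruction: {op}")

# (10 + 2) * 3 expressed as stack operations.
program = [("LDI", 10), ("LDI", 2), ("ADD", None),
           ("LDI", 3), ("MUL", None), ("END", None)]
print(run(program))
```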

VM Memory Organization

cc65 places the VM executable code and the static evaluation stack (32 bytes) in low memory. In an optimized virtual machine implementation, the evaluation stack would be placed in zero page.

Virtual machine addresses correspond to physical machine addresses on 6502 systems.

Under Linux, the virtual machine uses a 64K byte array as workspace, and addresses point into this space.

The call stack grows down from top of memory.
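A downward-growing stack can be sketched in Python as follows. The details are assumptions made for illustration: the stack pointer is taken to point at the next free byte and to be decremented after each write, and words are pushed so that the low byte ends up at the lower address. The class and method names are hypothetical.

```python
class CallStack:
    """Toy model of a call stack growing down from the top of memory."""

    def __init__(self, memory, top):
        self.memory = memory
        self.sp = top                  # next free byte (assumption)

    def push_byte(self, value):
        self.memory[self.sp] = value & 0xFF
        self.sp -= 1                   # grow towards lower addresses

    def pop_byte(self):
        self.sp += 1
        return self.memory[self.sp]

    def push_word(self, value):
        self.push_byte(value >> 8)     # high byte at the higher address
        self.push_byte(value)          # low byte at the lower address

    def pop_word(self):
        low = self.pop_byte()
        high = self.pop_byte()
        return low | (high << 8)

stack = CallStack(bytearray(65536), 0xFFFF)
stack.push_word(0x1234)
print(hex(stack.sp))
print(hex(stack.pop_word()))
```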

The bytecode is loaded at the start of memory. This location differs depending on the platform:

These addresses are chosen to allow space for the EightBall VM executable, which loads below these addresses. These values can be tuned by inspecting the map files generated by cc65.

Interpreter / Compiler Internals

Relationship of Interpreter / Compiler

EightBall was first implemented as an interpreted language (although the language design was always intended to permit compilation.) The bytecode compiler and virtual machine were added with v0.5 in April 2018.

In order to use the least code possible, the compiler uses the same data structures as the interpreter, but in a different way.

Interpreter Memory Organization

cc65 places the executable code of the EightBall line editor / interpreter / compiler in low memory.

There are two storage areas (or 'arenas') which are denoted as HEAP1 and HEAP2 in the eightball.c code. This organization is historical: EightBall originated as a language targeting the VIC20 with 32K expansion. In this configuration, there is an 8K memory block (starting at address $A000, referred to as BLK5 in the VIC20 design) which is not contiguous with the rest of RAM. For the VIC20, BLK5 was designated as HEAP1 and the remainder of RAM (above the executable code) was designated HEAP2. For other 6502 architectures (Apple II, Commodore 64), the HEAP1 / HEAP2 arenas are maintained, but since there is no 'gap' in the memory map, the boundary between them may be adjusted to any arbitrary address.

The division of interpreter memory into two distinct blocks turns out to be quite useful, as we shall see below.

The source code of the program is stored as plain ASCII (or PETSCII on Commodore systems) text at the bottom of HEAP2, immediately above the EightBall executable code (using routine alloc2bttm()). As more lines of source code are added, they are appended to the heap, growing upwards to higher addresses.

Note that the lower bounds of arena HEAP2 have to be adjusted by hand in eightball.c when the code changes size. The size of the code segments generated by cc65 can be determined by inspecting the map file created by the compiler.

Variables

Global and local variables are allocated at the top of HEAP1, from the highest available memory address down. For each variable a small var_t header is stored, consisting of the first four characters of the name, a byte which records whether it is a byte or word variable and also the number of dimensions. If the number of dimensions is zero then this indicates a scalar variable, otherwise it is an array of the specified number of elements. The var_t header also includes a two byte pointer to next, allowing them to be assembled into a linked list.

Following the var_t header the actual variable data is stored:

Normally when a global or local array is allocated, the data block immediately follows. However the pointer to the data block is exploited to allow the 'array pass by reference' feature to be implemented. In this case, the var_t header and the two byte datablock pointer are copied into the local frame (the pointer still refers to the original datablock of the array passed by reference.)

Subroutines: Entry and Return, Local Variables and Parameters

The interpreter maintains a pointer to the beginning of the local stack frame (varslocal) as well as to the beginning of the list (varsbegin) which allows the global variables to be located. When operating at the global scope (ie: not within a subroutine) varslocal points to varsbegin.

When entering a subroutine a special var_t entry is made for a word variable using the otherwise illegal name "----" to mark the stack frame and this is pushed to the call stack. The value of this variable is used to store the current value of varslocal (ie: the previous stack frame). This is used to unwind the stack when a subroutine exits.

Local variables are allocated on HEAP1 in exactly the same way as globals. The variable search routine getintvar() knows to search the local variables and then (if within a subroutine) the globals also. The stack frame marks allow getintvar() to know where the globals end and the stack frame of the first subroutine begins.

The interpreter creates a local variable for each parameter, copying the value provided by the caller. Parameters behave exactly like local variables, because they are local variables like any other.

When leaving a subroutine with return or endsub, the interpreter uses the innermost stack frame (which, remember, records the stack frame of its calling subroutine) to unwind the stack. The local variables and the innermost stack frame are released and varslocal is set to point to the caller stack frame. Finally, the flow of control returns to the statement following the call (or the evaluation of the expression including the function continues, in the case of function invocation.)
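
The frame marking, variable lookup and unwinding described in this section can be modelled in a few lines of Python. This is a sketch under simplifying assumptions, not the code in eightball.c: a Python list stands in for the linked list of var_t headers on HEAP1, indices stand in for pointers, and the class and method names (`Interp`, `create`, `enter_sub`, `leave_sub`, `lookup`) are invented for the example.

```python
FRAME_MARK = "----"  # otherwise illegal name used to mark a stack frame


class Interp:
    """Toy model of the interpreter's variable list and stack frames."""

    def __init__(self):
        self.vars = []      # allocation order; index 0 plays the role of varsbegin
        self.varslocal = 0  # index where the current frame starts

    def create(self, name, value):
        # Only the first four characters of the name are kept.
        self.vars.append([name[:4], value])

    def enter_sub(self):
        # The marker's value remembers the caller's frame start.
        self.vars.append([FRAME_MARK, self.varslocal])
        self.varslocal = len(self.vars) - 1

    def leave_sub(self):
        caller = self.vars[self.varslocal][1]
        del self.vars[self.varslocal:]  # release the locals and the marker
        self.varslocal = caller

    def lookup(self, name):
        key = name[:4]
        # First search the current frame.
        for entry_name, value in self.vars[self.varslocal:]:
            if entry_name == key:
                return value
        # Then search the globals, stopping at the first frame marker.
        # (At global scope this simply repeats the search above.)
        for entry_name, value in self.vars:
            if entry_name == FRAME_MARK:
                break
            if entry_name == key:
                return value
        raise KeyError(name)


it = Interp()
it.create("total", 5)   # global
it.enter_sub()
it.create("val", 3)     # parameter copied into a local
it.enter_sub()          # recursive call
it.create("val", 2)
inner = it.lookup("val")
it.leave_sub()
outer = it.lookup("val")
```

Each recursive call gets its own `val`, and the caller's `val` becomes visible again once the inner frame is released.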

Summary of Interpreter Memory Allocations

Compiler Memory Organization

The compiler shares most of the infrastructure with the interpreter. The source code of the program is still stored at the bottom of HEAP2.

The compiled bytecode is written to the beginning of HEAP1, starting from the lowest address and working up. Since no actual data is stored in HEAP1 when compiling (only var_t headers and addresses), it is hoped that there will be enough space for the compiled code without having it collide with the symbol tables (which are stored from the top of HEAP1 going down).

Variables

The main difference is that instead of storing global and local variables in HEAP1, the compiler uses the var_t data structures to keep track of the variable during compilation only - they serve as temporary symbol tables so the compiler can keep track of the address of all the variables in scope. Instead of the payload described above, the entries created by the compiler contain a pointer to the address of the variable in the virtual machine's address space.

Within the VM there is no 'management overhead' for storing variables - a word is always two bytes, a byte always one byte. All of the housekeeping takes place within the compiler (which has to keep track of the address of every variable in scope.)

The compiler has a simple allocator (managed by rt_push_callstack() and rt_pop_callstack()) that mimics the behaviour of the virtual machine, keeping track of the value of the stack pointer (SP). In the same way that the interpreter allocates all variables (global and local) on the call stack, the compiler uses the same strategy of allocating all variables on the call stack of the virtual machine ("VM call stack" from now on.) Since the compiler target memory allocator functions keep track of the VM SP register, the compiler is able to push values to the call stack and still know the addresses to be able to access them later. This can make the compiler output hard to read for humans however!
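
The idea of mirroring the VM stack pointer at compile time can be sketched as follows. This is an illustrative Python model only: `TargetAllocator`, its methods, the starting address and the symbol names are hypothetical, and the real rt_push_callstack() and rt_pop_callstack() routines may work differently in detail.

```python
class TargetAllocator:
    """Compile-time mirror of the VM stack pointer (illustrative only)."""

    def __init__(self, top=0xFFFF):
        # Assumption: SP starts at the top of memory and points at the
        # next free byte.
        self.sp = top

    def push(self, nbytes):
        """Reserve nbytes on the VM call stack and return their address."""
        self.sp -= nbytes
        return self.sp + 1

    def pop(self, nbytes):
        """Release nbytes from the VM call stack."""
        self.sp += nbytes


# The compiler's symbol table only needs to remember addresses.
symbols = {}
alloc = TargetAllocator()
symbols["coun"] = alloc.push(2)  # a word variable takes two bytes
symbols["flag"] = alloc.push(1)  # a byte variable takes one byte
```

Since the allocator moves in step with the code the compiler emits, every address in `symbols` matches where the VM will actually place the variable at run time.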

Subroutines: Entry and Return, Local Variables and Parameters

The EightBall Virtual Machine has a number of features which are intended to make it easier to implement subroutine call and return, argument passing etc. In particular, there is a special frame pointer (FP) register which is useful for easily accessing parameters and locals.

Before generating code to enter a subroutine, the compiler ensures code has been generated to evaluate any parameters and push the result to the call stack. Then the compiler emits a JSR instruction to call the subroutine entry point. The virtual machine will automatically store the return address on the VM call stack and the VM program counter will be set to the entry point.

On entry to the subroutine, the compiler will emit VM instruction SPFP which pushes the current value of the frame pointer (FP) to the VM call stack and copies the stack pointer (SP) to the frame pointer (FP). This sets up the call frame allowing us to easily refer to the parameters and the local variables.

The virtual machine makes this simple by providing special instructions LDRW, LDRB, STRW and STRB which load and store word and byte values to memory using addressing relative to the frame pointer FP. In this relative addressing mode, the parameters which were pushed to the call stack before entry have small positive valued addresses (FP + offset). Local variables are pushed to the call stack, which grows down as usual. As a result, the local variables will have small negative addresses relative to the frame pointer (FP - offset).

At the same time, absolute addressing via instructions LDAW, LDAB, STAW and STAB can be used to access the global variables.

On exit from the subroutine, the compiler emits code to evaluate the return value and leave it on the evaluation stack in the topmost slot (X). It then emits a FPSP instruction which copies the frame pointer (FP) to the stack pointer (SP) and restores the value of the frame pointer by popping a word from the call stack. Copying FP to SP has the effect of immediately releasing all of the space (local variables) allocated in the topmost stack frame. This leaves the saved frame pointer at the top of the call stack, ready to be popped and restored to FP. The overall effect is to unwind the stack back to the calling stack frame.
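
The interplay of SPFP, FPSP and FP-relative addressing can be traced with a small Python model. It is a sketch under stated assumptions rather than the real VM: SP is taken to point at the next free byte, words are little-endian, offsets are passed as Python arguments instead of coming from the evaluation stack, and the concrete offsets (+5 for the parameter, -1 for the first local) follow from those assumptions only.

```python
class Frames:
    """Toy model of SPFP / FPSP and FP-relative word access."""

    def __init__(self):
        self.mem = bytearray(0x10000)
        self.sp = 0xFFFF  # assumption: points at the next free byte
        self.fp = 0xFFFF

    def push_word(self, value):
        self.sp -= 2
        self.mem[self.sp + 1] = value & 0xFF
        self.mem[self.sp + 2] = (value >> 8) & 0xFF

    def pop_word(self):
        value = self.mem[self.sp + 1] | (self.mem[self.sp + 2] << 8)
        self.sp += 2
        return value

    def spfp(self):
        """Push FP to the call stack, then copy SP to FP."""
        self.push_word(self.fp)
        self.fp = self.sp

    def fpsp(self):
        """Copy FP to SP (releasing locals), then pop the saved FP."""
        self.sp = self.fp
        self.fp = self.pop_word()

    def ldrw(self, offset):
        """Load a word from address FP + offset."""
        addr = self.fp + offset
        return self.mem[addr] | (self.mem[addr + 1] << 8)


vm = Frames()
vm.push_word(7)        # caller pushes the parameter
vm.push_word(0x1234)   # JSR pushes the return address
vm.spfp()              # subroutine entry: save FP, set up the frame
vm.push_word(99)       # a local word variable
param = vm.ldrw(+5)    # parameters sit above FP
local = vm.ldrw(-1)    # locals sit below FP
vm.fpsp()              # subroutine exit: release locals, restore FP
```

After `fpsp()` the return address is once again at the top of the call stack, so a following RTS finds it where JSR left it.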

The return value is left on the evaluation stack. If the calling code does not use it, the compiler must issue a DROP instruction to discard it.

Subroutine Call Linkage

The compiler also maintains a linked list of subroutine calls and a linked list of subroutine entry points which are used for the final step of compilation - internal linkage. Subroutine calls and entry points are both represented using records of type sub_t, each of which contains the first eight characters of the subroutine name, a two byte address pointer and a two byte pointer to the next record.

The compiler allocates these linked lists (anchored by callsbegin and subsbegin) at the end of HEAP2, growing down towards the source code, which grows up from the bottom of this same arena. The linked list of subroutine calls is freed as soon as compilation is completed.

Summary of Compiler Memory Allocations

Compiler Address Fixups

When compiling EightBall code, there are instances where the generated code needs to jump or branch ahead, to some location within code that has yet to be generated. In this case, the compiler will emit the dummy address $ffff and will come back later to insert the correct address, once it is known. This is referred to as an "address fixup."

Conditionals / While loops

When compiling if / endif or if / else / endif conditionals, the compiler needs to generate code to branch forward to jump over the if or else code blocks. Similarly, for while / endwhile loops, the compiler needs to branch forward to jump over the loop body if the condition is false. In all these cases, the address fixup is computed when the destination code is generated.
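
A forward branch with a fixup can be sketched in Python as below. This is a generic backpatching model, not the compiler's actual code: the `Emitter` class, the load address and the opcode byte values are hypothetical, and words are assumed to be stored little-endian.

```python
PLACEHOLDER = 0xFFFF  # dummy address emitted until the target is known


class Emitter:
    """Toy bytecode emitter that supports forward-branch fixups."""

    def __init__(self, origin=0):
        self.origin = origin      # VM address where the code will load
        self.code = bytearray()

    def here(self):
        """VM address of the next byte to be emitted."""
        return self.origin + len(self.code)

    def emit_byte(self, value):
        self.code.append(value & 0xFF)

    def emit_word(self, value):
        self.code += bytes((value & 0xFF, (value >> 8) & 0xFF))

    def emit_forward_branch(self, opcode):
        """Emit opcode plus placeholder; return the address to fix up."""
        self.emit_byte(opcode)
        fixup = self.here()
        self.emit_word(PLACEHOLDER)
        return fixup

    def patch(self, fixup, target):
        """Overwrite the placeholder at fixup with the real target."""
        i = fixup - self.origin
        self.code[i] = target & 0xFF
        self.code[i + 1] = (target >> 8) & 0xFF


e = Emitter(origin=0x2000)
fix = e.emit_forward_branch(0x01)  # hypothetical branch opcode value
e.emit_byte(0x02)                  # body of the block (hypothetical opcodes)
e.emit_byte(0x03)
e.patch(fix, e.here())             # destination reached: resolve the fixup
```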

Subroutine Calls

Another situation where address fixups are required is subroutine calls. When a subroutine is called, a new entry is recorded in the callsbegin linked list, containing the beginning of the subroutine name and a pointer to the VM address of the call address to be fixed up. When a subroutine definition is encountered, a new entry is recorded in the subsbegin linked list, again containing the subroutine name but this time with the address of the entry point.

The final step of compilation involves iterating through the callsbegin list, looking up each subroutine name in the subsbegin list. If the name is found, then the dummy $ffff at the fixup address is replaced with the entry point of the subroutine. Otherwise a linkage error is (cryptically) reported.
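
The linkage pass can be sketched in Python as follows. This is a simplified model: the function name `link` and the example addresses and opcode bytes are hypothetical, plain lists of (name, address) pairs stand in for the sub_t linked lists, and a dictionary replaces the linear search through subsbegin.

```python
def link(code, origin, calls, subs):
    """Resolve every recorded call fixup against the known entry points.

    code   -- bytearray of emitted bytecode
    origin -- VM address at which the bytecode is loaded
    calls  -- list of (name, fixup_address) pairs
    subs   -- list of (name, entry_point) pairs
    """
    entry_points = dict(subs)
    for name, fixup in calls:
        if name not in entry_points:
            raise LookupError("undefined subroutine: " + name)
        target = entry_points[name]
        i = fixup - origin
        # Assumption: addresses are stored low byte first.
        code[i] = target & 0xFF
        code[i + 1] = (target >> 8) & 0xFF


# One call to "fact" whose address field sits at $2001, and the
# entry point of "fact" at $2003 (all values are made up).
code = bytearray([0x10, 0xFF, 0xFF, 0x20])
link(code, 0x2000, [("fact", 0x2001)], [("fact", 0x2003)])
```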

Data Types

A byte variable is one byte everywhere. A word variable is two bytes everywhere, except in the Linux interpreter (where it is a 32 bit word, 4 bytes.)

| Platform | Size in Bytes |
| --- | --- |
| 6502 Interpreter | word 2, byte 1 |
| 6502 VM | word 2, byte 1 |
| Linux Interpreter | word 4, byte 1 |
| Linux VM | word 2, byte 1 |

Code Examples

Hello World

This one is obligatory:

pr.msg "Hello world!"; pr.nl
end

You can omit the end statement if you like.

Recursive Factorial

This example shows how EightBall can support recursion. I should point out that it is much better to do this kind of thing using iteration, but this is a fun simple example:

pr.dec fact(3); pr.nl
end

sub fact(word val)
  pr.msg "fact("; pr.dec val; pr.msg ")"; pr.nl
  if val == 0
    return 1
  else
    return val * fact(val-1)
  endif
endsub

fact(3) calls fact(2), which calls fact(1), then finally fact(0).

See eightballvm.h for technical details.

Prime Number Sieve

Here is the well-known Sieve of Eratosthenes algorithm for finding prime numbers, written in EightBall:

 ' Sieve of Eratosthenes

const sz=20
byte A[sz*sz] = {}
word i=0
pr.msg "Initializing array ..."; pr.nl
for i=0:sz*sz-1
 A[i]=1
endfor
call doall(sz, A)
end

sub doall(word nr, byte array[])
  word n = nr * nr
  pr.msg "Sieve of Eratosthenes ..."
  pr.msg "nr is "; pr.dec nr; pr.nl
  call sieve(n, nr, array)
  call printresults(n, array)
  return 0
endsub

sub sieve(word n, word nr, byte AA[])
  pr.msg "Sieve"
  word i = 0; word j = 0
  for i = 2 : (nr - 1)
    if AA[i]
      j = i * i
      while (j < n)
        AA[j] = 0
        j = j + i
      endwhile
    endif
  endfor
  return 0
endsub

sub printresults(word n, byte AA[])
  word i = 0
  for i = 2 : (n - 1)
    if AA[i]
      if i > 2
        pr.msg ", "
      endif
      pr.dec i
    endif
  endfor
  pr.msg "."
  pr.nl
  return 0
endsub

(See the Wiki for more code examples.)