Awesome
Smalltalk-80
Welcome to my "by the Bluebook" C++ implementation of the Smalltalk-80 system that runs on OS X, Windows, Linux, OpenBSD, and FreeBSD! Since first reading about Smalltalk in the August 1981 issue of Byte magazine, I have always been intrigued by it. At that time all we had were slow 8-bit computers with 4K of RAM barely running fast enough to do anything useful. I was stunned as I read through the article -- this was futuristic alien technology that was surely beyond my reach. In 1988, while attending the University of Washington, I was exposed to two memorable pieces of technology: The first was Steve Job's NeXTCube and the other was a Tektronix 4404 workstation running Smalltalk-80. Both were, and still are, amazing. It was only fitting that I implemented this Smalltalk on a descendent of the NeXTCube -- a MacBook Pro laptop.
In the late 90s, I discovered that my heroes from Xerox PARC got the "band back together" and released Squeak. I was thrilled to finally have a nice system to play with. But, I didn't care for the look and feel and missed the minimalism I had experienced on the Tektronix system. I also had a copy of the Bluebook and Greenbook and always daydreamed about making my own implementation, but I had no access to the porting kit that was used. That changed a few years ago when I found Mario Wolczko's website: http://www.wolczko.com/st80/ He had the Xerox release 2 virtual image (from magnetic tape) as well as the instruction manual and hand-typed smalltalk sources from the Bluebook! He also had his implementation, which was written in pre-ANSI C. But, despite many hours, I was not able to get anything to run. I then started to consider making my own implementation.
I knew this wouldn't be an easy task, as there would undoubtedly be errors in the book (there were). I also knew that C++ likes to index arrays starting from zero whereas Smalltalk starts at 1, so surely there would be off-by-one errors (there were). Smalltalk has a different operator precedence than C/C++, so I was on guard about that as well (to their credit, with a few exceptions, the expressions were parenthesized in a way where that wasn't an issue). Then there is reference counting being used, so there would probably be reference counting errors too (there were) and finally memory was a bunch of 16-bit words so you could accidentally store/fetch the wrong thing without realizing it. To top it off, the Alto is a big-endian system so I had to decide what to do about that (I byte swapped the virtual image as recommended by the Xerox guide -- their guidance was solid, but they missed the need to byte swap floats).
I started working on this in late December of 2019, but didn't really focus on it for a few months until the Corona virus struck and I decided to "quarantine myself" so as not to vector the virus to someone else. I then worked in earnest on it. The Xerox porting kit included reference "trace files" you could compare against your system. I felt an incredible rush when I passed trace2. After that the system ran a little while longer, and ran out of memory. I had no idea what was going on (this is before I had the display stuff working). To gain visibility, I intercepted and logged message sends using the show: and error: selectors as well as big-looking strings being pushed on the stack. Sure enough there was a traceback string that would have been displayed by the smalltalk debugger had only it made it that far. But, I was able to figure where I screwed up and then felt another rush when the screen was drawn into a Form I was able to dump to a .pbm
file and see! After that I implemented the host graphical support for Smalltalk using SDL and implemented the input primitives so I had a working mouse and keyboard. I was finally able to really use the system. At that point, the system helped me find other issues! The graphical Smalltalk inspector and debugger really did the heavy lifting for me and saved me countless hours. The resiliency of the system is amazing.
But... since no file system support was available (the system image used the Alto file system classes), I could only see decompiled code in the source browser (which is still very useful and you could edit that -- it just didn't show temporary names you actually used and you lose your comments). Well, I had to decide what to do about that. Should I try to emulate the Alto file system? I decided against that and followed the approach used by Tektronix in the Greenbook -- I subclassed FileDirectory, File, and Page and created PosixFileDirectory, PosixFile, and PosixPage. I fired up VSCode and wrote an implementation for them that I could copy and paste into my Smalltalk (I added a way to paste characters from the clipboard into the input queue). After I had those implemented, I tested them, and once I was satisfied, I created the SourceFiles array with file streams for Smalltalk-80.sources and Smalltalk-80.changes. Once I did that --the instant I assigned that variable -- the magic happened. BANG! All of a sudden I could browse real source code in the browser! Now that I had a working file system, I re-pasted my implementation which was then stored by the system (to the changes file) and I could finally see my implementation with real temporary names and comments! It was the very delightful bootstrapping experience that makes you grin from ear to ear. I then did Smalltalk condenseChanges
and saved the image. After I got it working on OS X, I moved it over to Windows in under an hour and also got it running under Linux in 15 minutes (for Linux, I tested it in a VirtualBox without graphics acceleration and the performance was bad -- hopefully on a "real" Linux machine it performs better).
Is the system usable? YES! You will definitely be constrained by the size of the object table, but you can still do stuff. I have to admit that I prefer the spartan look of Smalltalk-80 vs Pharo and Squeak (even though they are far superior).
There's a lot more to tell, but that's the gist of it. It was fun to do, and I'm glad that after all these years I was able to make it happen. I am still amazed by what the geniuses at PARC did. To think that amazing system I read about in 1981 was almost ten years old at the time of publication is mind boggling.
Using Smalltalk
Smalltalk-80 uses a three button mouse labeled Red (the left mouse button), Yellow (the middle), and Blue (the right button). The Red button is for clicking and selection. The Yellow is for running commands like "doit" and the Blue for things like closing a window. If you have a three button mouse you can use the -three
command line option to use the traditional mapping. If you do not have a three button mouse, you can use the alternate (and default) mapping:
Two button mouse mapping
Button | Mapping |
---|---|
Left Button | Maps to Red button (selection) |
Right Button | Maps to Yellow button (doit) |
Alt+Left | Maps to blue button (close). On the Macintosh, this is Command+Left |
Note: Ctrl+Left Button can also be used as the "yellow button."
The nice thing about building things by the book is there are books to document how to use the system and how it works!
Arrow Notation
Assignment in Smalltalk is made using the left-arrow which is written with shift -
(_
), the vertical-arrow is entered with shift 6
(^
).
Building and running
OS X
You will need to install SDL 2.0.12 or later. You can download it from http://libsdl.org. Choose the development library .dmg
and install. Then, go into the osx folder and type:
make -f MakefileRT
If you installed SDL2 through a package manager, or built it yourself, so your system has sdl-config
in your path you can simply do:
make
You should then be able to do:
./Smalltalk -directory ../files
See the command line options section below for what this means. There is an xcode project you can open if you like.
Windows
I have included the SDL 2.0.12 headers/binary so all you need to do is open the project in the windows directory and hit Run.
Linux
You need to install SDL 2.0.12 or later. You can either go to http://libsdl.org and download and build it yourself or use a package manager to install it.
Then, cd into the linux subdirectory of this project and type make
.
You should then be able to do:
./Smalltalk -directory ../files
See the command line options section below for what this means.
OpenBSD and FreeBSD
The process is much the same for OpenBSD and FreeBSD as it is for Linux, except cd into the bsd subdirectory instead.
Configuration options
The behavior of the Smalltalk application can be customized through #define
settings in the header file for each module.
Object Memory (objmemory.h) #defines
The Object Memory #defines
determines the garbage collection scheme to be used (all the approaches described in the Bluebook are available).
Define | Meaning |
---|---|
GC_MARK_SWEEP | Use Mark and Sweep garbage collection |
GC_REF_COUNT | Use the reference counting scheme |
RUNTIME_CHECKING | Include runtime checks for memory accesses |
RECURSIVE_MARKING | The book describes a recursive marking algorithm that is simple, but consumes stack space. If this symbol is defined, that algorithm is used. If not the more complicated, and clever, pointer reversal approach is used instead. |
The GC_MARK_SWEEP
and GC_REF_COUNT
flags are not mutually exclusive. If both are defined (and they are), normal reference counting will be performed, with full garbage collection only when memory is exhausted or the object table becomes full. Smalltalk-80 generates a lot of cyclical references (e.g. a MethodContext references, through a temporary field, a BlockContext that back references the context through the sender field). A Mark and Sweep only approach will result in many frequent collections as the object table will quickly fill with MethodContexts and other frequently allocated objects as the system runs. Some Smalltalk implementations recycle contexts for this reason.
Interpreter (interpreter.h) #defines
The interpreter #defines
are provided to allow the inclusion/exclusion of optional primitives as well as a handful of helpful debugging methods. If an optional primitive is not implemented, a Smalltalk fall-back is executed instead. This can have a significant effect on performance.
Define | Meaning |
---|---|
DEBUGGING_SUPPORT | Adds some additional methods for debugging support |
IMPLEMENT_PRIMITIVE_NEXT | Implement the optional primitiveNext primitive |
IMPLEMENT_PRIMITIVE_AT_END | Implement the optional primitiveAtEnd primitive |
IMPLEMENT_PRIMITIVE_NEXT_PUT | Implement the optional primitiveNextPut primitive |
IMPLEMENT_PRIMITIVE_SCANCHARS | Implement the optional primitiveScanCharacters primitive |
Application (main.cpp) #defines
When running under Windows I ran into two problems. First, the mouse cursor wouldn't reliably change if the left mouse button was being held down (e.g. when reframing a window). The second issue was that the mouse cursor was very small on high resolution displays, even when system scaling options were set to compensate for it. For these reasons, I added the option to have the app render the mouse cursor rather than the operating system. The SOFTWARE_MOUSE_CURSOR
can be defined to do this. I set this conditionally if it's a Windows build. It works on the other platforms, but is unnecessary as they behave properly without it.
Command line arguments
The Smalltalk application takes a number of arguments. The only required argument is -directory
path which specifies the directory where the snapshot image can be found. Here is the complete list:
Argument | Meaning | Default |
---|---|---|
-directory path | Specifies the directory that contains the snapshot image, as well as the Smalltalk-80.sources and Smalltalk-80.changes files. Any other files in the directory will be accessible from Smalltalk-80 using the normal file/directory access methods. | required |
-three | Use the three button mapping | two button scheme |
-image | Name of the snapshot file to use. | snapshot.im |
-cycles | Number of VM instructions to run per update loop | 1800 |
-vsync | Turn on vertical sync for synchronizing the frame rate with the monitor refresh rate. This can eliminate screen tearing and other artifacts, at the cost of some input latency. | off |
-delay ms | if vsync is not used a delay can be specified after presenting the next frame to the GPU. This is useful for lowering the CPU usage while still enjoying the benefits of not using vsync | 0 |
-2x | Specifies that the display should be doubled. Helpful for farsighted folks, or people running on very high resolution displays | off |