Varaboy

Varaboy is a Game Boy emulator written in uxntal for the Varvara system.

Special Thanks

Features

Controls

| Key | Function |
| --- | --- |
| Home | Start |
| Shift | Select |
| Ctrl | A |
| Alt | B |
| Arrows | Dpad |
| 0-9 | Set frame skip (default: 0) |
| Escape | Write SRAM to disk and quit |

Stability

Because the MBC implementations are incomplete and SRAM is only repacked when you quit with the Escape key, I strongly discourage playing any game seriously with this emulator at this time. Games will almost certainly break, and save files will almost certainly be lost or corrupted.

Usage

This emulator requires a UXN emulator to run. The name of the Game Boy ROM to run is provided on the command line, for example:

uxnemu varaboy.rom tetris.gb

Note: Since uxngb (my UXN VM written for Game Boy) only supports up to 8KiB UXN ROMs, it's not possible to run this Game Boy emulator inside uxngb. It is entirely possible to run UXN ROMs 8KiB or smaller inside uxngb inside varaboy though!

Screenshots

[Images: cpu_instrs, dmg_acid2, sml, tetris, megaman, sml2, megaman5, ffl3, shocklobster, fruitpursuit, deathplanet, uxn_screen]

Compatibility

Unless otherwise noted, each game was tested for only a few minutes.

| Game | Notes |
| --- | --- |
| Tetris | Playable |
| Super Mario Land | Playable |
| The Legend of Zelda: Link's Awakening | Playable |
| Mega Man: Dr. Wily's Revenge | Playable |
| Mega Man V | Playable |
| Mario's Picross | Playable |
| Final Fantasy Legend III | Playable |
| Wario Land II | Playable |
| Super Mario Land 2: Six Golden Coins | Map corrupt, levels playable |
| Donkey Kong | Freezes after first level, visual glitches in cutscenes |
| Dr. Mario | Freezes when you try to start the game |
| Shock Lobster (Homebrew) | Playable |
| Fruit Pursuit Beta (Homebrew) | Playable |
| Adjustris (Homebrew) | Playable |
| uxngb (Homebrew) | Playable |
| Death Planet (Homebrew) | Playable |
| Libbet (Homebrew) | Playable |
| Geometrix (Homebrew) | Playable |
| Sam Mallard (Homebrew) | Hangs on startup |
| Quartet (Homebrew) | Works, but RNG doesn't function due to the unimplemented joypad interrupt |

How it works

Both the Game Boy and UXN use a 16-bit address space ($0000-$ffff). The Game Boy has a large region of "echo RAM" from $e000 to $fdff, which mirrors the contents of WRAM ($c000-$ddff) and was considered off-limits for Game Boy software by Nintendo.
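To make the address ranges concrete, here is a small Python sketch of the hardware mirroring and of how much space the echo region offers. The function and constant names are made up for illustration; this is not code from Varaboy, which repurposes the region rather than mirroring it.

```python
WRAM_START = 0xC000
ECHO_START = 0xE000
ECHO_END = 0xFDFF  # inclusive

# Number of bytes in the echo region, i.e. the space that is free to reuse.
ECHO_SIZE = ECHO_END - ECHO_START + 1


def resolve_echo(addr: int) -> int:
    """Return the WRAM address that an echo RAM address mirrors on hardware.

    Addresses outside the echo region are returned unchanged.
    """
    if ECHO_START <= addr <= ECHO_END:
        return addr - (ECHO_START - WRAM_START)
    return addr
```

With these ranges, `resolve_echo(0xE000)` gives `0xC000` and `ECHO_SIZE` is 7680 bytes (7.5 KiB).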

Varaboy starts at the UXN entry point ($0100, the same as the Game Boy entry point), sets up some basic stuff, reads the GB ROM header to load the appropriate MBC handler code, and then jumps to the main runtime code inside echo RAM. As long as we can fit all UXN runtime code inside echo RAM the Game Boy code is able to access the rest of memory using native addresses, which I find super fun!

Note that this means we don't have access to the UXN zero page, which contains the Game Boy RST and interrupt vectors.
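The header lookup that picks an MBC handler can be sketched in Python as follows. The cartridge type byte lives at $0147 in the Game Boy ROM header; the handler names and the set of supported types below are assumptions for illustration and do not reflect which MBCs Varaboy actually implements.

```python
CART_TYPE_OFFSET = 0x0147  # cartridge type byte in the Game Boy ROM header


def mbc_for_cart_type(cart_type: int) -> str:
    """Map a cartridge type byte to a (hypothetical) handler name."""
    if cart_type == 0x00:
        return "none"  # plain 32 KiB ROM, no banking
    if 0x01 <= cart_type <= 0x03:
        return "mbc1"
    if cart_type in (0x05, 0x06):
        return "mbc2"
    if 0x0F <= cart_type <= 0x13:
        return "mbc3"
    if 0x19 <= cart_type <= 0x1E:
        return "mbc5"
    return "unsupported"


def select_mbc(rom: bytes) -> str:
    """Read the header of a ROM image and choose a handler name."""
    if len(rom) <= CART_TYPE_OFFSET:
        raise ValueError("ROM too small to contain a header")
    return mbc_for_cart_type(rom[CART_TYPE_OFFSET])
```

A ROM-only cartridge such as Tetris carries type $00 and needs no banking at all, which is why it makes a good first test.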

Performance

This emulator performs fairly well on a recent/fast CPU. With uxnemu on a Ryzen 5600X, most games are playable with no frameskip. Emulation speed in uxn32 is slightly slower, though it's not clear why at this time. An i5-540M with a frameskip of 3 could be considered playable for some games, but action games are pushing it. Performance on the Nintendo DS UXN VM is even slower, which isn't surprising.

I've sped up instruction dispatch by using jump tables, which in certain cases "wastes" as much as 126 bytes for the ~64 "ld r8,r8" instructions, but overall I believe the performance gain is worth it. I've also tried to pre-calculate as much as possible in the PPU scanline renderer to reduce redundant calculations as I'm not considering mid-scanline register writes. Background/window tiles are cached for reuse for up to 8 pixels, which provides a slight performance gain, though the presence of that code also slows things down a bit, so the net gain isn't huge. In addition, several common operations (ticks, reads, etc) have been converted to macros for speed over size, though the gains are minor. The initial release had a very inefficient OAM scan approach, which has since been resolved. I've also reworked the PPU mode advancement to only check a single PPU transition dot per mode, and to only check once per instruction, which speeds things up quite a bit and doesn't affect accuracy with a simple scanline renderer.
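The jump-table idea can be illustrated with a Python sketch: one table slot per opcode, so dispatch is a single indexed lookup instead of decoding operand bits at run time. The names are hypothetical, and `(hl)` is treated as a plain register slot here, whereas a real core would route it through a memory access. The opcode block $40-$7f minus HALT ($76) has 63 entries; at two bytes per table address that is 126 bytes, which matches the figure quoted above, though that reading is an assumption.

```python
# Operand encoding order used by the Game Boy CPU for 8-bit register operands.
REG_NAMES = ["b", "c", "d", "e", "h", "l", "(hl)", "a"]


def unimplemented(regs):
    raise NotImplementedError("opcode not covered by this sketch")


def make_ld(dst, src):
    """Build a dedicated handler for one 'ld dst,src' opcode."""
    def handler(regs):
        regs[dst] = regs[src]
    return handler


def build_table():
    """Return a 256-entry dispatch table with the ld r8,r8 block filled in."""
    table = [unimplemented] * 256
    for opcode in range(0x40, 0x80):
        if opcode == 0x76:  # HALT sits where 'ld (hl),(hl)' would be
            continue
        dst = REG_NAMES[(opcode >> 3) & 7]
        src = REG_NAMES[opcode & 7]
        table[opcode] = make_ld(dst, src)
    return table
```

Dispatch then becomes `table[opcode](regs)`; for example, opcode $41 (`ld b,c`) copies `regs["c"]` into `regs["b"]`.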

Save files are unpacked into a file per bank on startup for faster access during SRAM banking. The individual bank files are repacked on shutdown if you quit by pressing the Escape key. Quitting by closing the VM any other way will not properly write SRAM contents back to the SAV file. Without file seeking, games which use lots of ROM banks (and bank often) could also suffer a notable performance hit, which could be reduced by unpacking ROM banks in a similar manner.
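The unpack/repack cycle can be sketched in Python like this. It assumes 8 KiB SRAM banks (the size of the Game Boy's $a000-$bfff window); the file naming scheme and function names are invented for the example and are not Varaboy's actual layout.

```python
from pathlib import Path

SRAM_BANK_SIZE = 0x2000  # 8 KiB, the size of the $a000-$bfff SRAM window


def unpack_banks(sav_path: Path, bank_dir: Path) -> int:
    """Split a SAV file into one file per SRAM bank; return the bank count."""
    data = sav_path.read_bytes()
    bank_dir.mkdir(parents=True, exist_ok=True)
    count = (len(data) + SRAM_BANK_SIZE - 1) // SRAM_BANK_SIZE
    for bank in range(count):
        chunk = data[bank * SRAM_BANK_SIZE:(bank + 1) * SRAM_BANK_SIZE]
        (bank_dir / f"bank{bank:02x}.bin").write_bytes(chunk)
    return count


def repack_banks(bank_dir: Path, sav_path: Path, count: int) -> None:
    """Concatenate the per-bank files back into a single SAV file."""
    with sav_path.open("wb") as out:
        for bank in range(count):
            out.write((bank_dir / f"bank{bank:02x}.bin").read_bytes())
```

The sketch also shows why an unclean exit loses data: until `repack_banks` runs, the SAV file still holds whatever it contained at startup.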

The only "big" idea I have to speed things up right now is to write 8 full rows of pixels to a buffer of 20 2bpp UXN tiles and draw them to screen with two .Screen/sprite (auto) writes instead of 1280 .Screen/pixel writes. It's unclear if the extra VM instructions to juggle the buffering would be worth the reduction in .Screen calls though, and the benefits may vary by VM implementation. I've tried this in the sprite-ppu branch and my current attempt runs slower than the .Screen/pixel approach.
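As a rough illustration of the buffering step, the following Python sketch packs 8 scanlines of 160 colour indices (0-3) into 20 tiles of 16 bytes each. It assumes the Varvara 2bpp tile layout of 8 low-bitplane row bytes followed by 8 high-bitplane row bytes, with bit 7 as the leftmost pixel; the function name is hypothetical, and the actual `.Screen/sprite` writes are not modelled.

```python
SCREEN_WIDTH = 160
TILE_SIZE = 8
TILES_PER_ROW = SCREEN_WIDTH // TILE_SIZE  # 20 tiles across the screen
BYTES_PER_TILE = 16                        # 8 low-plane + 8 high-plane bytes


def pack_rows_to_tiles(rows):
    """Pack 8 rows of 160 colour indices (0-3) into 20 2bpp tiles."""
    if len(rows) != TILE_SIZE or any(len(r) != SCREEN_WIDTH for r in rows):
        raise ValueError("expected 8 rows of 160 pixels")
    out = bytearray(TILES_PER_ROW * BYTES_PER_TILE)
    for y, row in enumerate(rows):
        for x, color in enumerate(row):
            base = (x // TILE_SIZE) * BYTES_PER_TILE
            bit = 7 - (x % TILE_SIZE)  # bit 7 is the leftmost pixel
            out[base + y] |= (color & 1) << bit                     # low plane
            out[base + TILE_SIZE + y] |= ((color >> 1) & 1) << bit  # high plane
    return bytes(out)
```

The result is 320 bytes covering 160 x 8 = 1280 pixels. The per-pixel bit shuffling in the inner loop is exactly the extra work that has to cost less than the `.Screen/pixel` writes it replaces.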

In addition, I'm still very new at writing uxntal, so there are likely a whole bunch of smart optimizations which could be done to speed things up. Anything that could speed up the main CPU and PPU loops would likely yield huge speed benefits.

Accuracy