Home

Awesome

I thank Eugene Shelwien and Dmitry Bortoq for long discussions regarding these topics. Many key ideas of this project were proposed by them, and without their help, it will never became reality.

Make assembly magic great again!

Modern optimizing C++ compilers almost entirely displaced assembly languages. I don't wrote assembly code for 20 years. Neither I see it used anywhere, except for Intel own libraries.

But C++ optimizers aren't the silver bullet. Each time I write high-optimized algorithm, I go through fight against compilers. I know from the start an assembler code I want to see, but it's hard to force compiler to generate exactly what I need. Modern compilers feel themselves so smart and move code around, allocate registers at their own discretion, and select assembler instructions what they prefer.

Nevertheless, I don't write my own asm code for a few reasons:

Portability

Fortunately, there are various solutions to both classes of problems, in particular for portability:

Overall, we can make assembly code fully portable within single ISA, and highly portable between x86 and x64, using macro packages to abstract register names and to hide ABI. Alternative approach is to link high-quality code produced by GCC into executables produced by other compilers.

Brevity

High-level assemblers

Various approaches to make assembly code more compact and readable are available in MASM, FASM, HLA, ForwardCom, and have the common name of High-level assemblers. A few examples:

PTX is a particularly interesting example of virtual assembly language that allows to declare unlimited amount of typed "registers" and supports legacy ISA instructions by emulating them with command sequences. This allows NVidia to make each next generation of video cards incompatible with previous one, and to adapt to varying amount of registers that depends on compilation options.

Sphinx C--

Sphinx C-- is the language providing C-like syntax for assembly code, including computations and if/while statements. The rest is implemented via usual assembly statements. It looks like ideal high-level assembly language for me, but unfortunately original compiler was 16-bit only and various 32/64-bit clones don't took off.

So my first idea was to make open-source implementation of similar language using a modern parsing approach (such as PEG) in a high-level language (probably, Haskell or OCaml) with massive extensibility features (ability to add new operators and statements).

Turbo: C with benefits

And at this moment I recalled Turbo C - old dumb C compiler that allowed to use register names as variables, f.e. if (_AX > _BX) _AX <<= _CL plus had plain MS-style asm inline statements. These two features made it quite similar to Sphinx C--, but with important benefit - except for these two extensions, it was plain C code. This allowed to write code that contains both portable and optimized low-level implementation, selected depending on the compiler used:

#if TURBO_C
#  define bitbuf _AX
#  define count  _CL
#else
   int bitbuf, count;
#endif

   bitbuf <<= count;

My experience of program optimization using Turbo C was really nice - I started with existing C algorithm and gradually replaced all complex expressions with single-operation assignments, and then added register bindings, similar to the code above, to declarations of hot variables. And the code remained working at each step of this transformation.

This pseudo-variable register syntax is probably available in newer Borland C++ compilers too, including free Borland C++ 5.5 version.

New ideal found

At this point I realized that all I need is just C/C++ "with benefits":

Similar to Turbo C approach, it will allow to develop plain C/C++ code and debug it using any existing C/C++ compiler. Once the code is working, we can rewrite critical loops to use only low-level operations, directly compilable to single assembly commands. At any moment, it still remains usual C/C++ code whose correctness can be checked with usual C/C++ compiler.

Once transformation is done, we can compile the code with our Magus C++ compiler and get exactly the asm code we developed. On platforms not supported by Magus, the code still can be compiled with usual C/C++ compilers.

This approach will provide us all benefits of Sphinx C-- (i.e. code portability, brevity and familiar C syntax), plus allow to share the same code between C/C++ (for portability to any system and debugging) and high-level assembler (for performance).

Now, once I figured what to do, I started to research various approaches to C/C++ compiling which can be extended with Magus code generator: LCC, TCC, ANTLR C++ parser, gcc/clang IR transformations. But every approach I was able to find was either hard to learn and implement (such as IR transformations), or had limited usefulness (such as modification of any OSS C compiler), so while my goal became perfectly defined, implementation seemed pretty hard.

Another variation of this idea was employing Nim - it allows to transform code AST at the compile time, which is exactly a kind of transformation I'm looking for. So, once algorithm was developed as low-level Nim code, it can be translated into C/C++ in usual way, or preprocessed by Magus replacing original statements with inline asm code.

Embedding

And at this point of discussion Eugene brought two great ideas to the table:

And combination of these ideas is absolute win, allowing us to quickly develop minimal practical translator and then extend it at comfortable pace. Since we plan to stay strictly within existing C/C++ syntax, and don't need anything but raw syntax parser, we can choose among well-known existing C/C++ parsers, such as ANTLR C++14 and Haskell C-language. C-language parser has additional advantage - it supports gcc asm statements in the AST type, and provides AST pretty-printer, so we can map C statements like eax=ebx into corresponding asm statements and then pretty-print code region back to feed any C/C++ compiler compatible with GCC asm statements.

Implementation

Implementation plan is: