Home

Awesome

Quick (incomplete) overview of the compilation process w/ LLVM:

*.c
 |
 | clang -cc1
 v
*.ll
 |         ^
 | llvm-as | llvm-dis
 v         |
*.bc
 |
 | llvm-link + *.a (native static libs) + *.bc (bitcode)
 v
*.bc
 |
 | llc
 v
*.s
 |
 | as
 v
*.o
 |
 | lld + *.a (native static libs) + *.o (native obj files)
 v
 🎉 (native binary)

There are many prior arts to getting parts of the LLVM toolchain running on the browser:

This is perhaps the first successful attempt at this. It (appears to?) execute the LLVM IR directly using an earlier version of Emscripten (which was in JavaScript). llvm-as and llvm-dis are used for LLVM IR validation and pretty-printing. This approach probably doesn't work anymore as Emscripten (or at least the SDK) now requires Node.js and Python, among other things.

This (appears to?) compile clang along with a (bespoke?) WASM runtime with Emscripten. No idea how all this works, the build scripts are... not pretty.

This is the latest attempt I can find, and makes use of WASI. It compiles clang and lld to WASI using a hacked LLVM source. It gets access to libc through a custom in-memory file system.

The approach done here is a mix between llvm.js and wasm-clang: we compile llc & lld using Emscripten. llc is used to compile the LLVM IR to a wasm32-wasi object file. The object file is run through lld along with (WASI) libc into a wasm32-wasi binary.

For lld to find libc, we need to create an in-memory file system, like in wasm-clang. Fortunately, Emscripten provides this, so all we need to do is to preload the WASI sysroot (which includes libc) into Emscripten's virtual filesystem.

After running the linker, we now have a wasm binary, but this isn't enough to run it on the browser. WASI hasn't been standardized yet, so there isn't native browser support for it, so we need some sort of polyfill. Fortunately, Wasmer provides just that with @wasmer/wasi, which they used for wasm-terminal.

And with that, we can run the wasm binary and you're off to the races! :)

Now for the build steps...

Building llc & lld

This was done on a AWS EC2 c6i.metal. Here we compile LLVM 14.0.6 - make sure the version number is consistent on every step.

Packages

sudo apt-get -y update
sudo apt-get -y install cmake g++ git lbzip2 ninja-build python3

Emscripten

git clone --branch 3.1.40 --depth 1 https://github.com/emscripten-core/emsdk
cd emsdk
./emsdk install 3.1.40
./emsdk activate 3.1.40
source ./emsdk_env.sh
echo "source $PWD/emsdk_env.sh" >> $HOME/.bashrc
cd ..

WASI sysroot

As mentioned in the preface, we need the WASI sysroot to provide the linker with libc. You also need the clang compiler runtime. Get these here. These are wasi-sysroot-x.y.tar.gz and libclang_rt.builtins-wasm32-wasi-x.y.tar.gz respectively.

wget -qO- https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-20/wasi-sysroot-20.0.tar.gz | tar -xz
wget -qO- https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-20/libclang_rt.builtins-wasm32-wasi-20.0.tar.gz | tar -xz
mkdir -p wasi-sysroot/lib/clang/16.0.4
mv lib wasi-sysroot/lib/clang/16.0.4/

Cross-compile llc & lld

git clone --branch llvmorg-16.0.4 --depth 1 https://github.com/llvm/llvm-project
cd llvm-project

# For the actual build, we need to have llvm-tblgen built for the host
cmake -G Ninja -S llvm -B build-host -DCMAKE_BUILD_TYPE=Release
cmake --build build-host --target llvm-tblgen

# No easy way to set flags just for lld, so we modify the cmake file directly
echo "set_target_properties(lld PROPERTIES LINK_FLAGS --preload-file=../../wasi-sysroot/lib@/lib)" >> llvm/CMakeLists.txt

EMCC_DEBUG=2 \
CXXFLAGS="-Dwait4=__syscall_wait4" \
LDFLAGS="-s NO_INVOKE_RUN -s EXIT_RUNTIME -s INITIAL_MEMORY=64MB -s ALLOW_MEMORY_GROWTH -s EXPORTED_RUNTIME_METHODS=FS,callMain -s MODULARIZE -s EXPORT_ES6 -s WASM_BIGINT" \
emcmake cmake -G Ninja -S llvm -B build \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_INSTALL_PREFIX=install \
  -DLLVM_TARGET_ARCH=wasm32-emscripten \
  -DLLVM_DEFAULT_TARGET_TRIPLE=wasm32-wasi \
  -DLLVM_ENABLE_PROJECTS=lld \
  -DLLVM_ENABLE_THREADS=OFF \
  -DLLVM_TABLEGEN=$PWD/build-host/bin/llvm-tblgen
cmake --build build

Download build artifacts

cd build
tar -czvf bin.tgz bin/{llc,lld}.*

And then locally:

scp <build-machine-address>:~/llvm-project/build/bin.tgz .
tar -zxf bin.tgz
mv bin/* .
rmdir bin

Now you can stop the build machine instance. You should have llc.js, llc.wasm, lld.data, lld.js, lld.wasm on your local machine.

WASI browser polyfill

We use @wasmer/wasi as the WASI polyfill.

Etc.

For more details on how to use the polyfill and resulting artifacts, feel free to pore through index.js. These references might be helpful:

Good luck!