Lately, I’ve been interested in programming language implementations in general, and LLVM in particular. You’ve probably heard of many projects based on LLVM, such as Emscripten, Rubinius, Clang, Rust, or RubyMotion.

Because I learn best by example, I wanted a large LLVM-based programming language codebase to experiment and tinker with. Given that I’m already familiar with part of its codebase, MacRuby seemed like the best option.

MacRuby is the predecessor of RubyMotion, and if you’re looking to build iOS or OS X apps, the latter is the better option for you. However, if you want to learn more about LLVM and programming language implementations, as I do, the MacRuby source code is perfect.

As I explain how to get your hands on a freshly compiled copy of MacRuby, I will try to explain not only what I did to fix the build system, but also how I reached those conclusions, which I think is more valuable.

Environment

For this process I will be using Mac OS X 10.9 with the Xcode command line tools installed. You can easily get the command line tools by running:
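The command itself did not survive in this copy of the post. On OS X 10.9 the tools are normally requested with xcode-select --install, so here is a minimal sketch, assuming that is the command meant. The tools_status variable is only there for illustration.

```shell
# Assumption: the missing command is `xcode-select --install`.
# The guards make the snippet harmless on machines without xcode-select.
if command -v xcode-select >/dev/null 2>&1; then
  # `xcode-select -p` prints the active developer directory, or fails if there is none.
  if tools_path=$(xcode-select -p 2>/dev/null); then
    tools_status="developer directory found at $tools_path"
  else
    xcode-select --install   # opens the installer dialog
    tools_status="installation requested"
  fi
else
  tools_status="xcode-select not found (probably not OS X)"
fi
echo "Command line tools: $tools_status"
```

If the dialog appears, follow it to the end before continuing with the LLVM build.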

Compiling LLVM 2.9

MacRuby is based on LLVM 2.9, which is a somewhat outdated version of LLVM (the current version is 3.4). MacRuby’s README recommends revision 127367 of the 2.9 branch. However, the final LLVM 2.9 release does not include that revision and falls slightly behind it:

https://llvm.org/viewvc/llvm-project/llvm/tags/RELEASE_29/final/autoconf/?view=log

As you can see, the latest revision with real code changes was 125172. Apparently there are only minor differences between the two revisions, and I have successfully compiled MacRuby with the final version of LLVM 2.9.

Check out the code using Subversion:

In order to compile it, I had to make some changes to the codebase. Newer versions of Clang are stricter about certain syntax. My strategy was to try to compile it and backport changes from more recent versions of LLVM whenever the compilation failed. To save you some time, I’ve generated a patch you can apply directly to a fresh copy of LLVM 2.9 to make it compile with a recent version of Clang.

https://gist.github.com/MarkVillacampa/8680814

You can apply the patch directly by running this command:

Or manually:

Next, compile it!

LLVM takes a while to compile, so we make use of all the CPU cores by giving the -j option the output of sysctl -n machdep.cpu.thread_count, which returns the number of CPUs (technically, hardware threads) in your computer.

We also pass the --prefix=/usr/local/llvm-29 option. This sets the installation path for LLVM, so it won’t conflict with other files and you can easily remove it afterwards.
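Putting the two options together, the configure and build step could look roughly like the sketch below. It assumes you are in the root of the LLVM 2.9 checkout. The prefix and njobs variable names are illustrative, and any extra configure flags that MacRuby’s README recommends are not shown.

```shell
# Assumption: run from the root of the LLVM 2.9 source tree.
prefix=/usr/local/llvm-29

# Logical CPU count on OS X; fall back to getconf elsewhere, then to 1.
njobs=$(sysctl -n machdep.cpu.thread_count 2>/dev/null \
  || getconf _NPROCESSORS_ONLN 2>/dev/null \
  || echo 1)

# Guard against an empty or non-numeric answer.
case "$njobs" in
  ''|*[!0-9]*) njobs=1 ;;
esac

if [ -x ./configure ]; then
  ./configure --prefix="$prefix" && make -j"$njobs"
else
  echo "No ./configure here; would run: ./configure --prefix=$prefix && make -j$njobs"
fi
```

Keeping the prefix in a variable makes it easy to reuse the same path later when telling MacRuby’s build where LLVM lives.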

Finally let’s install it.
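The install command is missing from this copy, so the following is only a sketch of the usual make install step. It assumes you are still in the LLVM build directory and that writing to /usr/local requires sudo on your machine. The install_status variable is illustrative.

```shell
# Assumption: run from the LLVM build directory after `make` has finished.
# Makefile.config is generated by LLVM's configure script, so it serves as a guard.
if [ -f Makefile.config ]; then
  if sudo make install; then
    install_status="installed"
  else
    install_status="failed"
  fi
else
  install_status="skipped (no configured LLVM build in this directory)"
fi
echo "LLVM install: $install_status"
```

Because everything lands under the dedicated prefix, removing that one directory later is enough to uninstall it.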

Compiling MacRuby

We’re done with the prerequisites. Let’s actually get our hands on MacRuby. First, pull a copy from GitHub:

The build system itself needs only one change, and a very succinct one.

There is a file called kernel.c which defines some very performance-critical functions, such as getting or setting instance variables and some arithmetic operations. This file gets compiled to LLVM bitcode and is stored as a static C array. MacRuby loads this bitcode at runtime and uses it to optimize certain method calls when it compiles Ruby code.

However, LLVM bitcode is usually not compatible between versions, and this is where our problem lies. We’re compiling MacRuby using Clang 5.0, the version of Clang bundled with OS X 10.9, which is based on LLVM 3.3. The version of LLVM that MacRuby uses internally is 2.9. So when MacRuby tries to load the kernel bitcode, it fails.
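To see the mismatch on your own machine, you can print both versions side by side. This is a sketch that assumes LLVM 2.9 was installed under /usr/local/llvm-29 (the prefix used earlier) and that the system compiler is on your PATH as clang. The variable names are illustrative.

```shell
# Assumption: LLVM 2.9 lives under the prefix chosen earlier in this post.
llvm_config=/usr/local/llvm-29/bin/llvm-config

if [ -x "$llvm_config" ]; then
  runtime_llvm=$("$llvm_config" --version)   # the LLVM that MacRuby links against
else
  runtime_llvm="unknown (llvm-config not found)"
fi

if command -v clang >/dev/null 2>&1; then
  # On Apple's toolchain the first line usually mentions the LLVM release
  # that Clang is based on.
  system_clang=$(clang --version 2>/dev/null | head -n 1)
else
  system_clang="unknown (clang not found)"
fi

echo "LLVM used by MacRuby: $runtime_llvm"
echo "System compiler:      $system_clang"
```

If the two lines report different LLVM releases, bitcode produced by the system compiler should not be expected to load in MacRuby’s embedded LLVM.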

The solution is to compile kernel.c with a compatible compiler. Yes, I’m saying you have to download a whole compiler just to build one file. Yay.

Grab a copy of clang+llvm-2.9 from the following link. It’ll do the job just fine, since it’s based on LLVM 2.9.

http://llvm.org/releases/2.9/clang+llvm-2.9-x86_64-apple-darwin10.tar.gz

Now we’ll change which compiler is used to build kernel.c. Open the rakelib/builder.rake file and, on line 117, replace #{llvm_gcc} with the full path to the clang executable from the compiler you just downloaded. For example:

Note that I have also replaced --emit-llvm with -emit-llvm (a subtle but required change).

On the next line, you’ll see that the build system runs the opt tool against the generated bitcode file. opt is LLVM’s optimization tool. The build will use the version of opt we just compiled alongside LLVM 2.9.

Now we’re ready to actually compile MacRuby! Let’s do it:

The CFLAGS argument silences an error that modern Clang versions raise if you apply a bitwise & operation to an Objective-C pointer.

The llvm_path argument tells the build system where to look for LLVM.
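The exact invocation is missing from this copy, so treat the following as a sketch of how the two arguments might be passed. The warning flag -Wno-deprecated-objc-pointer-introspection and the miniruby task name are assumptions, not something the text above confirms, and the variable names are illustrative.

```shell
# Assumptions: run from the root of the MacRuby checkout; the flag and the
# `miniruby` task name are best guesses because the original command is missing.
llvm_path=/usr/local/llvm-29
cflags="-Wno-deprecated-objc-pointer-introspection"

if [ -f Rakefile ] && command -v rake >/dev/null 2>&1; then
  rake miniruby CFLAGS="$cflags" llvm_path="$llvm_path"
else
  echo "Not in a MacRuby checkout; would run:"
  echo "  rake miniruby CFLAGS=\"$cflags\" llvm_path=$llvm_path"
fi
```

If Clang rejects the flag or rake does not know the task, check the actual error message and the task list (rake -T) for the right names.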

If everything went right you should be able to run:
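A quick smoke test could look like this sketch, which assumes the build left a miniruby executable in the root of the checkout. The smoke_status variable is illustrative.

```shell
# Assumption: the build produced ./miniruby in the current directory.
if [ -x ./miniruby ]; then
  # Single quotes keep the shell from touching Ruby's #{...} interpolation.
  if ./miniruby -e 'puts "Hello from miniruby, Ruby #{RUBY_VERSION}"'; then
    smoke_status="ok"
  else
    smoke_status="failed"
  fi
else
  smoke_status="skipped (./miniruby not found, build it first)"
fi
echo "Smoke test: $smoke_status"
```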

You’ll notice I’m running miniruby and not macruby here. miniruby is just all of MacRuby’s base files statically linked into a single executable. Compare that with the macruby executable, which contains only some statically linked files and dynamically links to the libmacruby.dylib shared library. miniruby is used during MacRuby’s compilation to compile MacRuby’s standard library, which is written in Ruby.

As you can see, we haven’t compiled MacRuby’s standard library. It turns out there is a bug, probably related to the version of the Objective-C runtime in 10.9, that causes some methods defined in Ruby not to be properly added to the runtime. As part of my learning process I will try to fix it and get a completely working build of MacRuby.

Hopefully this will be of some use to someone 🙂

If you have any questions or want to ask me anything, write in the comments below or reach me by email or Twitter.