
Improve build time of Rust, Java and Intel Fortran projects with Firebuild’s new release!

Rust is a hugely popular compiled programming language, and accelerating its builds had been an important goal for Firebuild for some time.

Firebuild’s v0.8.0 release finally added Rust support, in addition to numerous other improvements including support for Doxygen and Intel’s Fortran compiler and restored javac and javadoc acceleration.

Firebuild’s Rust + Cargo support

Firebuild treats programs as black boxes, intercepting C standard library calls and system calls. It shortcuts program invocations that predictably generate the same outputs, because the program itself is known to be deterministic and all of its inputs are known in advance. Rust’s compiler, rustc, is deterministic in itself, and simple rustc invocations were already accelerated, but parallel builds driven by Cargo needed a few enhancements in Firebuild.

Cargo’s jobserver

Cargo uses the Rust variant of GNU Make’s jobserver to control the parallelism of a build. The jobserver creates a pair of file descriptors: descendant processes read tokens from one of them and are allowed to run one extra thread or parallel process per token received. When the extra threads or processes finish, the tokens must be returned by writing them back to the other file descriptor the jobserver created. The jobserver’s file descriptors are shared with the descendant processes via environment variables:

# rustc's environment variables
...
CARGO_MAKEFLAGS="-j --jobserver-fds=4,5 --jobserver-auth=4,5"
...
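
As a rough illustration of the protocol (this is not Firebuild or Cargo code; the fd numbers are taken from the variables above and run_extra_job is a hypothetical placeholder), a jobserver client acquires a token by reading a byte and returns it by writing the byte back:

# sketch of a jobserver client, assuming fd 4 is the read end and fd 5 is the write end
IFS= read -r -n 1 token <&4     # block until a token byte becomes available
run_extra_job                   # hypothetical placeholder for the extra parallel work
printf '%s' "$token" >&5        # return the token so other clients may proceed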

Since getting tokens from the jobserver involves reading nondeterministic bytes from an inherited file descriptor, this is clearly an operation that depends on input not known in advance. Firebuild needs to make an exception and ignore jobserver-related reads and writes, since they are not meant to change the build results. However, there are programs that do not care about jobservers and their file descriptors at all. They happily close the inherited file descriptors and open new ones with the same numbers to use them for entirely different purposes. One such program is the widely used ./configure script, so the case is far from theoretical.

To stay on the safe side, Firebuild ignores jobserver fd usage only in programs that are known to use the jobserver properly. The list of such programs is now configurable in /etc/firebuild.conf, and since rustc is on the list by default, parallel Rust builds are accelerated out of the box!

Writable dependency dir

The other issue that prevented highly accelerated Rust builds was rustc’s -L dependency=<dir> parameter. In parallel builds this directory is populated in a not fully deterministic order. Firebuild, on the other hand, hashes the directory listings of open()-ed directories and treats them as inputs, assuming that the directory content will influence the intercepted programs’ outputs. Since rustc processes started in parallel scanned the dependency directory in different states, depending on which other Rust compilations had already finished, Firebuild had to store the full directory content as an input for each rustc cache entry, resulting in a low hit rate when rustc was started again with otherwise identical inputs.
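
For context, a Cargo-driven rustc invocation passes this shared dependency directory roughly like the sketch below (the crate names, hashes and paths are illustrative):

# illustrative rustc invocation similar to what Cargo issues; names and hashes are made up
rustc --edition 2021 --crate-name mycrate src/lib.rs \
    -L dependency=target/debug/deps \
    --extern serde=target/debug/deps/libserde-0123456789abcdef.rlib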

The solution here is to ignore rustc’s scanning of the dependency directory, because the dependencies actually used are still treated as inputs and are checked when shortcutting rustc. With that implemented in Firebuild as well, librsvg’s build, which uses Rust and Cargo, can be accelerated by more than 90%, even on a system with 12 cores/24 threads:

Firebuild accelerating librsvg’s Rust + Cargo build from 38s to 2.8s on a Ryzen 5900X (12C/24T) system

On the way to accelerate anything

Firebuild’s latest release incorporated more than 100 changes just from the last two months. They unlocked the acceleration of Rust builds with Cargo, fixed Firebuild to work with the latest Java update that slightly changed its behavior, started accelerating Intel’s Fortran compiler in addition to the already supported gfortran, and included many smaller changes improving the acceleration of other compilers and tools. If your favorite toolchain is not mentioned, there is still a good chance that it is already supported. Give Firebuild a try and tell us about your experience!

Update 1: A comparison to sccache came up in the Reddit topic about Firebuild’s Rust acceleration, thus, by popular demand, this is how sccache performs on the same project:

Firebuild 0.8.0 vs. sccache 0.4.2 accelerating librsvg’s Rust + Cargo build

All builds took place on the same Ryzen 5900X system with 12 cores / 24 threads, in LXC containers limited to using 1-12 virtual CPUs. A warm-up build took place before the vanilla (without any instrumentation) build to download and compile the dependency crates, so that only the project’s build time was measured. A git clean command cleared all build artifacts from the project directory before each build, and ./autogen.sh was run to measure only clean rebuilds (without autotools). See the test configuration in the Firebuild performance test repository for more details and easy reproduction.

Firebuild had lower overhead than sccache (2.83% vs. 6.10% on 1 CPU and 7.71% vs. 22.05% on 12 CPUs) and made the accelerated build finish much faster (2.26% vs. 19.41% of vanilla build’s time on 1 CPU and 7.5% vs. 27.4% of vanilla build’s time on 12 CPUs).

Building the Linux kernel in under 10 seconds with Firebuild

Russell published an interesting post about his first experience with Firebuild, accelerating refpolicy’s and the Linux kernel’s builds. It turned out that a few small tweaks could accelerate the builds even more, crossing the 10 second barrier with the Linux build.

Build performance with 18 cores

The Linux kernel’s build time is a widely used benchmark for compilers, making it a prime candidate for testing a build accelerator as well. In the first run on Russell’s 18-core test system the observed user+sys CPU time was cut by 44%, but with an actual increase in wall clock time, which was quite unusual since Firebuild had performed much better in prior tests. To replicate the results I set up a clean Debian Bookworm VM on my machine:

$ lxc launch images:debian/bookworm --vm -c limits.cpu=18 -c limits.memory=16GB bookworm-vm

Compiling Linux 6.1.10 in this clean Debian VM showed build times closer to what I expected to see, ~72% less wall clock time and ~97% less user+sys CPU time:

$ make defconfig && time make bzImage -j18
real	1m31.157s
user	20m54.256s
sys	2m25.986s

$ make defconfig && time firebuild make bzImage -j18
# first run:
real	2m3.948s
user	21m28.845s
sys	4m16.526s
# second run
real	0m25.783s
user	0m56.618s
sys	0m21.622s

There are multiple differences between Russell’s and my test systems, including different CPUs (E5-2696v3 vs. virtualized Ryzen 5900X) and different file systems (BTRFS RAID-1 vs. ext4), but I don’t think those could explain the observed mismatch in performance. The difference may be worth further analysis, but let’s get back to squeezing out more performance from Firebuild.

Firebuild was developed on Ubuntu. I was wondering if Firebuild was faster there, but I got only slightly better build times in an identical VM running Ubuntu 22.10 (Kinetic Kudu):

$ make defconfig && time make bzImage -j18
real	1m31.130s
user	20m52.930s
sys	2m12.294s

$ make defconfig && time firebuild make bzImage -j18
# first run:
real	2m3.274s
user	21m18.810s
sys	3m45.351s
# second run
real	0m25.532s
user	0m53.087s
sys	0m18.578s

The KVM virtualization certainly introduces an overhead, thus builds must be faster in LXC containers. Indeed, all builds are faster by a few percent:

$ lxc launch ubuntu:kinetic kinetic-container
...
$ make defconfig && time make bzImage -j18
real	1m27.462s
user	20m25.190s
sys	2m13.014s

$ make defconfig && time firebuild make bzImage -j18
# first run:
real	1m53.253s
user	21m42.730s
sys	3m41.067s
# second run
real	0m24.702s
user	0m49.120s
sys	0m16.840s
# Cache size:    1.85 GB

Apparently this ~72% reduction in wall clock time is what one should expect from simply prefixing the build command with firebuild on a similar configuration, but we should not stop here. By default Firebuild does not accelerate quicker commands, to save cache space. This howto suggests letting firebuild accelerate all commands, including even sh, by passing -o 'processes.skip_cache = []' to firebuild.

Accelerating all commands in this build’s case increases the cache size by only 9% and increases the wall clock time saving to 91%, not only making the build more than 10X faster, but finishing it in less than 8 seconds, which may be a new world record:

$ make defconfig && time firebuild -o 'processes.skip_cache = []' make bzImage -j18
# first run:
real	1m54.937s
user	21m35.988s
sys	3m42.335s
# second run
real	0m7.861s
user	0m15.332s
sys	0m7.707s
# Cache size:    2.02 GB

There are even faster CPUs on the market than this 5900X. If you happen to have access to one, please leave a comment if you can go below 5 seconds!

Scaling to higher core counts and comparison with ccache

Russell raised the very valid point that Firebuild’s single-threaded supervisor can be a bottleneck on systems with many cores, and a comparison to ccache also came up in the comments. Since ccache does not have a central supervisor it could scale better with more cores, but let’s see whether ccache can go below 10 seconds with its build times…

firebuild -o ‘processes.skip_cache = []’ and ccache scaling to 24 cores

Well, no. The best time for ccache is 18.81s, with -j24. Both firebuild and ccache keep gaining from extra cores up to 8 cores, but beyond that the wall clock time improvements diminish. The more interesting difference is that firebuild’s user and sys time is basically constant from -j1 to -j24, similarly to ccache’s user time, but ccache’s sys time increases steeply with the number of used cores. I suspect this is due to the many parallel ccache processes performing file operations to check whether cache entries can be reused, while in firebuild’s case the supervisor performs most of that work, not requiring in-kernel synchronization across multiple cores.

It is true that the single-threaded firebuild supervisor is a bottleneck, but the supervisor also implements a central filesystem cache, thus checking whether a command’s cache entry can be reused takes far fewer system calls and much less user-space hashing, making the architecture more efficient overall than ccache’s.

The beauty of Firebuild is not being faster than ccache, but being faster than ccache with basically no hard-coded information about how C compilers work. It can accelerate any other compiler or program that generates deterministic output from its input, just by observing what it did in prior runs. It is like having ccache for every compiler, including in-house developed ones, and also for random slow scripts.

How to speed up your next build 5-20x with Firebuild?

TL;DR: Just prefix your build command (or any command) with firebuild:

firebuild <build command>
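
For example, with a make-based project (the build command itself is entirely up to you):

firebuild make -j"$(nproc)"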

OK, but how does it work?

Firebuild intercepts all processes started by the command to cache their outputs. The next time the command or any of its descendant commands is executed with the same parameters, inputs and environment, the outputs are replayed from the cache (the command is shortcut) instead of running the command again.

This is similar to how ccache and other compiler-specific caches work, but firebuild can shortcut any deterministic command, not only a specific list of compilers. Since the inputs of each command are determined at run time, firebuild does not need a complete dependency graph maintained in the source tree like Bazel does. It can work with any build system that does not implement its own caching mechanism.

Determinism of commands is detected at run time by preloading libfirebuild.so and interposing standard library calls and syscalls. If the command’s and all its descendants’ inputs are available when the command starts and all outputs can be calculated from the inputs, then the command can be shortcut, otherwise it will be executed again. The interception comes with a 5-10% overhead, but rebuilds can be 5-20 times faster, or even more, depending on the changes between builds.

Can I try it?

It is already available in Debian Unstable and Testing and in Ubuntu’s development release, and the latest stable version is backported to supported Ubuntu releases via a PPA.

How can I analyze my builds with firebuild?

Firebuild can generate an HTML report showing each command’s contribution to the build time. Below are the “before” and “after” reports of json4s, a Scala project. The command call graphs (the lower ones) show that java (scalac) took 99% of the original build’s time. Since the scalac invocations are shortcut (cutting the second build’s time to less than 2% of the first one), they don’t even show up in the accelerated second build’s call graph. What’s left to be executed again in the second run are env, perl, make and a few simple commands.

The upper graphs are the process trees, with expandable nodes (in blue) also showing which command invocations were shortcut (green). Clicking on a node shows details of the command and the reason if it was not shortcut.

Could I accelerate my project more?

Firebuild works best for builds with CPU-intensive processes and comes with defaults that do not cache very quick commands, such as sh, grep, sed, etc., because caching those would take cache space while shortcutting them may not speed up the build that much. They can still be shortcut as part of their parent command. Firebuild’s strength is that it can find shortcutting points in the process tree automatically: for example, from sh -c 'bash -c "sh -c echo Hello World!"' the bash command would be shortcut, but none of the sh commands would be cached. In typical builds there are many such commands from the skip_cache list. Caching those commands with firebuild -o 'processes.skip_cache = []' can improve acceleration and make the reports smaller.

Firebuild also supports several debug flags, and -d proc helps find the reasons why some commands are not shortcut:

...
FIREBUILD: Command "/usr/bin/make" can't be short-cut due to: Executable set to be not shortcut, {ExecedProcess 1329.2, running, "make -f debian/rules build", fds=[{FileFD fd=0 {FileOFD ...
FIREBUILD: Command "/usr/bin/sort" can't be short-cut due to: Process read from inherited fd , {ExecedProcess 4161.1, running, "sort", fds=[{FileFD fd=0 {FileOFD ...
FIREBUILD: Command "/usr/bin/find" can't be short-cut due to: fstatfs() family operating on fds is not supported, {ExecedProcess 1360.1, running, "find -mindepth 1 ...
...
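
A run producing output like the above looks something like this (make stands in for your actual build command):

firebuild -d proc make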

make, ninja and other incremental build tool binaries are not shortcut because they compare file timestamps, but at least they are fast, and every build step they perform can still be shortcut. Ideally, the slower build steps that could not be shortcut can be reimplemented in ways that can be, by avoiding tools that perform unsupported operations.

I hope those tools help speed up your build with very little effort, but if not, and you find something to fix or improve in firebuild itself, please report it or just leave feedback!

Happy speeding, but not on public roads! 😉

Firefox on Ubuntu 22.04 (and 22.10, 23.04, and later) from .deb (not from snap)

It is now widely known that Ubuntu 22.04 LTS (Jammy Jellyfish) ships Firefox as a snap, but some people (like me) may prefer installing it from .deb packages to retain control over upgrades or to keep extensions working.

Luckily there is still a PPA serving firefox (and thunderbird) debs at https://launchpad.net/~mozillateam/+archive/ubuntu/ppa maintained by the Mozilla Team. (Thank you!)

You can block the Ubuntu archive’s version that just pulls in the snap by pinning it:

$ cat /etc/apt/preferences.d/firefox-no-snap 
Package: firefox*
Pin: release o=Ubuntu*
Pin-Priority: -1

Now you can remove the transitional package and the Firefox snap itself:

sudo apt purge firefox
sudo snap remove firefox
sudo add-apt-repository ppa:mozillateam/ppa
sudo apt update
sudo apt install firefox
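
To verify that the pin took effect and the .deb from the PPA is the installation candidate, check apt’s view of the package:

# the PPA version should be the candidate, while the Ubuntu archive's
# snap-transition package should show the -1 pin priority
apt policy firefox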

Since the package comes from a PPA, unattended-upgrades will not upgrade it automatically unless you enable this origin:

echo 'Unattended-Upgrade::Allowed-Origins:: "LP-PPA-mozillateam:${distro_codename}";' | sudo tee /etc/apt/apt.conf.d/51unattended-upgrades-firefox

Happy browsing!

Update: I have found a few other, similar guides at https://fostips.com/ubuntu-21-10-two-firefox-remove-snap and https://ubuntuhandbook.org/index.php/2022/04/install-firefox-deb-ubuntu-22-04 and I’ve updated the pinning configuration based on them.

Hello zstd compressed .debs in Ubuntu!

When Julian Andres Klode and I added initial Zstandard compression support to Ubuntu’s APT and dpkg in Ubuntu 18.04 LTS, we planned on getting the changes accepted into Debian quickly and making Ubuntu 18.10 the first release where the new compression could speed up package installations and upgrades. Well, it took slightly longer than that.

Since then many other packages have been updated to support zstd-compressed packages, and on Ubuntu’s side read-only zstd support has been backported to the 16.04 Xenial LTS release, too. In Debian, zstd support is now available in APT, debootstrap and reprepro (thanks, Dimitri!). It is still under review for inclusion in Debian’s dpkg (BTS bug 892664).

Given that there is sufficient archive-wide support for zstd, Ubuntu is switching to zstd-compressed packages in Ubuntu 21.10, the current development release. Please welcome hello/2.10-2ubuntu3, the first zstd-compressed Ubuntu package, which will be followed by many others built with dpkg (>= 1.20.9ubuntu2), and enjoy the speed!
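
If you are curious whether a particular .deb already uses the new compression, the archive members reveal it (the file name below is illustrative):

# a zstd-compressed package contains data.tar.zst instead of data.tar.xz
ar t hello_2.10-2ubuntu3_amd64.deb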

Introducing show-motd, Message Of The Day for WSL and container shells

People logging in to Ubuntu systems via SSH or on the virtual terminals are familiar with the Message Of The Day greeter, which contains useful URLs and important system information, including the number of updates that need to be installed manually.

However, when starting an Ubuntu container or an Ubuntu terminal on WSL, you enter a shell directly, which is way less welcoming and also hides whether there are software updates waiting to be installed:

user@host:~$ lxc shell bionic-container
root@bionic-container:~#

To make containers and the WSL shell friendlier to new users and more informative to experts, it would be nice to show the MOTD there, too, and this is exactly what the show-motd package does. The message is printed only once a day, in the first interactive shell started, to provide up-to-date information without becoming annoying. The package is now present in Ubuntu 19.10, and WSL users already get it installed when running apt upgrade.
Please give it a try and tell us what you think!
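
On systems where it does not arrive automatically, installing it takes the usual two commands:

sudo apt update
sudo apt install show-motd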

Bug reports and feature requests are welcome, and if the package proves to be useful it will be backported to current LTS releases!

Ubuntu on WSL: X and sound server detection explained

When I started using Ubuntu more than a decade ago I was impressed by how well it worked out of the box. It detected most of my machine’s hardware and I could start working on it immediately. The Windows Subsystem for Linux is a much newer Ubuntu platform, which can also run graphical applications without any extra configuration inside Ubuntu.

If you set up and start an X server or a PulseAudio server on Windows and start Ubuntu in WSL, the starting Ubuntu instance detects the presence of the servers and caches the configuration while the instance is running. The detection takes a few hundred milliseconds at the first start in the worst case, but using a cached configuration is fast and the delay is unnoticeable in subsequently started shells.

The detection is performed by /etc/profile.d/wsl-integration.sh from the wslu package and the configuration is cached in $HOME/.cache/wslu/integration.

If you would like to use a different X or sound configuration, such as redirecting graphical applications to a remote X server, you can prepopulate the cached configuration with a script that runs before /etc/profile.d/wsl-integration.sh, or you can disable the detection logic with a similar script by making $HOME/.cache/wslu/integration an empty file.
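
For example, disabling the detection entirely comes down to pre-creating that cache file as an empty file:

# create an empty cache file so wsl-integration.sh skips the detection
mkdir -p "$HOME/.cache/wslu"
: > "$HOME/.cache/wslu/integration"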

New tags on the block: update-excuse and friends!

In Ubuntu’s development process new package versions are not released immediately; they enter the -proposed pocket first, where they are built and tested. In addition to testing the package itself, other packages are also tested together with the updated package to make sure the update doesn’t break them either.

The packages in the -proposed pocket are listed on the update excuses page along with their testing status. When a package is built successfully and all triggered tests pass, the package can migrate to the release pocket, but when the builds or tests fail, the package is blocked from migrating to preserve the quality of the release.

Sometimes packages are stuck in -proposed for a longer period because the build or test failures can’t be solved quickly. In the past several people may have triaged the same problem without being able to easily share their observations, but from now on, if you figure out something about what broke, please open a bug against the stuck package with your findings and mark the bug with the update-excuse tag. The bug will be linked from the update excuses page so the next person picking up the problem can continue from there. You can even attach a patch to the bug so a developer with upload rights can find it easily and upload it right away.

The update-excuse tag applies to the current development series only, but it does not come alone. To leave notes for a specific release’s -proposed pocket, use the update-excuse-$SERIES tag, for example update-excuse-bionic to have the bug linked from 18.04’s (Bionic Beaver’s) update excuses page.

Fixing failures in -proposed is a big part of the integration work done by Ubuntu developers, and help is always very welcome. If you see your favorite package being stuck on update excuses, please take a look at why, and maybe open an update-excuse bug. You may be the one who helps the package make it to the next Ubuntu release!

(The new tags were added by Tiago Stürmer Daitx and me during the last Canonical engineering sprint’s coding day. Fun! 🙂 )

Downloadable Ubuntu WSL tarballs to the latest release and beyond!

Ubuntu LTS releases are already available as apps in the Microsoft Store, but there are other ways of installing Ubuntu for the Windows Subsystem for Linux. You can import an Ubuntu tarball with the wsl command or use the graphical interfaces listed at https://wiki.ubuntu.com/WSL#Third-party_tools for managing custom tarballs.
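
A hedged example of importing a downloaded tarball with the wsl command in a Windows terminal (the instance name, installation path and tarball file name are illustrative):

# import the tarball as a new WSL instance named "ubuntu-eoan"
wsl --import ubuntu-eoan C:\WSL\ubuntu-eoan eoan-wsl-amd64.tar.gz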

Such Ubuntu tarballs were not publicly available in the past, but now I proudly present the downloadable WSL tarballs for Ubuntu 16.04 LTS (Xenial), 18.04 LTS (Bionic), 19.04 (Disco), and for the still-in-development 19.10 (Eoan) release. Thanks to everyone in the Ubuntu Foundations, Certified Public Cloud and Desktop Teams who helped make this happen!

While we still recommend installing the Ubuntu WSL app from the Microsoft Store, using the tarballs lets you maintain multiple parallel Ubuntu instances, which can be handy for experiments, and you can also take a look at the next Ubuntu versions!

If you find a program or HOW-TO that installs other Ubuntu tarballs for WSL, please help by asking its author to use the official tarballs, because only the WSL tarballs include the WSL integration packages described in the previous post!

Introducing ubuntu-wsl, the package making Ubuntu better and better on WSL

The Ubuntu apps for the Windows Subsystem for Linux provide the very same packages you can find on Ubuntu servers, desktops, cloud instances and containers, and this ensures maximum compatibility with other Ubuntu installations. Until recently little work had been done to integrate Ubuntu with the Windows system running the WSL environment, but now this is changing.

In Ubuntu, metapackages collect packages useful for a common purpose by depending on them, and ubuntu-wsl is the new metapackage collecting the integration packages to be installed on every Ubuntu WSL system. It pulls in wslu, “a collection of utilities for WSL”, to let you create shortcuts on the Windows desktop with wslusc, start the default Windows browser with wslview, and do a few other things:
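
For example, wslview hands a URL over to the default browser on the Windows side:

wslview https://ubuntu.com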

With updates to the ubuntu-wsl metapackage we will add new features to Ubuntu WSL installations to make them even more comfortable to use, so if you have an older installation please install the package manually:

sudo apt update
sudo apt install ubuntu-wsl

Oh, and one more thing: you can even set up sound and run graphical apps with a few manual steps. For details check out https://wiki.ubuntu.com/WSL!