Dump the commit contents into the editor when editing a Jujutsu commit

I’m so happy with this config that I need to share it, even though I’ve only been using it for a few hours. Put this in ~/.config/jj/config.toml:

[templates]
draft_commit_description = '''
concat(
  description,
  surround(
    "\nJJ: This commit contains the following changes:\n", "",
    indent("JJ:     ", diff.stat(72)),
  ),
  surround("\nJJ: Diff:\n", "", indent("JJ:  ", diff.git(4)))
)
'''

It’s awesome. Now editing a commit message looks like this:

Bug 1930698 - Add invalidation logging for fuses r?iain

Differential Revision: https://phabricator.services.mozilla.com/D228689

JJ: This commit contains the following changes:
JJ:     js/src/vm/InvalidatingFuse.cpp | 6 ++++++
JJ:     js/src/vm/Logging.h            | 1 +
JJ:     2 files changed, 7 insertions(+), 0 deletions(-)

JJ: Diff:
JJ:  diff --git a/js/src/vm/InvalidatingFuse.cpp b/js/src/vm/InvalidatingFuse.cpp
JJ:  index 6fbc9b2aa0..e7b541ff20 100644
JJ:  --- a/js/src/vm/InvalidatingFuse.cpp
JJ:  +++ b/js/src/vm/InvalidatingFuse.cpp
JJ:  @@ -10,8 +10,9 @@
JJ:   #include "jit/Invalidation.h"
JJ:   #include "jit/JitSpewer.h"
JJ:   #include "vm/JSContext.h"
JJ:   #include "vm/JSScript.h"
JJ:  +#include "vm/Logging.h"
JJ:   
JJ:   #include "gc/StableCellHasher-inl.h"
JJ:   #include "vm/JSScript-inl.h"
JJ:   
JJ:  @@ -33,8 +34,10 @@
JJ:   
JJ:   void js::InvalidatingRuntimeFuse::popFuse(JSContext* cx) {
JJ:     // Pop the fuse in the base class
JJ:     GuardFuse::popFuse(cx);
JJ:  +  JS_LOG(fuseInvalidation, mozilla::LogLevel::Verbose,
JJ:  +         "Invalidating fuse popping: %s", name());
JJ:     // do invalidation.
JJ:     for (AllZonesIter z(cx->runtime()); !z.done(); z.next()) {
JJ:       // There's one dependent script set per fuse; just iterate over them all to
JJ:       // find the one we need (see comment on JS::Zone::fuseDependencies for
JJ:  @@ -70,8 +73,11 @@
JJ:       // before calling invalidate.
JJ:       if (script->hasIonScript()) {
JJ:         JitSpew(jit::JitSpew_IonInvalidate, "Invalidating ion script %p for %s",
JJ:                 script->ionScript(), reason);
JJ:  +      JS_LOG(fuseInvalidation, mozilla::LogLevel::Debug,
JJ:  +             "Invalidating ion script %s:%d for reason %s", script->filename(),
JJ:  +             script->lineno(), reason);
JJ:         js::jit::Invalidate(cx, script);
JJ:       }
JJ:     }
JJ:   }
JJ:  diff --git a/js/src/vm/Logging.h b/js/src/vm/Logging.h
JJ:  index f4b63e3773..a593c249bd 100644
JJ:  --- a/js/src/vm/Logging.h
JJ:  +++ b/js/src/vm/Logging.h
JJ:  @@ -83,8 +83,9 @@
JJ:   
JJ:   #define FOR_EACH_JS_LOG_MODULE(_)                                            \
JJ:     _(debug)                /* A predefined log module for casual debugging */ \
JJ:     _(wasmPerf)             /* Wasm performance statistics */                  \
JJ:  +  _(fuseInvalidation)     /* Invalidation triggered by a fuse  */            \
JJ:     JITSPEW_CHANNEL_LIST(_) /* A module for each JitSpew channel. */
JJ:   
JJ:   // Declare Log modules
JJ:   #define DECLARE_MODULE(X) inline constexpr LogModule X##Module(#X);

JJ: Lines starting with "JJ: " (like this one) will be removed.

Many thanks to Erich at work!

A Case for Feminism in Programming Language Design

I wish I had read this paper by Felienne Hermans and Ari Schlesinger before going to SPLASH.

Felienne’s blog post is worth reading as an introduction, and here’s the stream of her presentation, which I highly recommend -- she’s an excellent, compelling communicator.

I don't have much to add, beyond a few quotes I felt worthwhile to share:

Coming back to my insider-outsider perspective, I sometimes wonder what we are even researching. What exactly is a programming language for? What does it mean to design a programming language? And I keep coming back to the question: why are women of all colors so underrepresented in the programming languages community?

The spread-out nature of research on programming languages is problematic, since it prevents the PL community from having a more holistic view of programming language use. We are robbing ourselves of a place for conversations on the different perspectives on the ways people use programming languages.

SPLASH 2024: Impressions and Feelings

I thought it would be useful to sit down and write up some of my thoughts on SPLASH 2024 while they are still fresh.

Due to happy nuptials (& a pressing desire to get home), I was only able to attend SPLASH for 2.5 days: Wednesday, Thursday, and Friday morning.

The beauty of any conference is of course the Hallway Track, so I have many papers and presentations I need to read or watch that I missed. In this write-up I’ll just highlight papers / presentations I managed to catch. Missing something here says nothing other than I likely missed it :)

REBASE

Wednesday was REBASE. It was my first time attending, and I quite liked it. Industry / academic cross-overs are very valuable in my opinion.

After Rebase ended, a group of us ended up chatting in the room for so long that we missed the student research competition and the food!

Thursday

This day opened with a keynote by Richard P. Gabriel, talking about his career and how he sees AI, having lived through a few AI winters.

  • Wasm-R3: Record-Reduce-Replay for Realistic and Standalone WebAssembly Benchmarks was quite cool. As an engine developer it’s right up my alley, but it also addresses a real use-case I see: the generation of benchmarks from real applications.

  • WhiteFox: White-box Compiler Fuzzing Empowered by Large Language Models. This was quite neat, and honestly a decent use for an LLM in my mind. The basic idea is to provide the code of an optimization (in a deep-learning compiler like PyTorch, in the paper) to an LLM, and get it to describe some essential features of a test case, including example code. Then, using these essential features and example code, it creates fuzz-test cases. There’s a feedback loop here to make sure the test cases actually exercise the optimizations as predicted. Their results really seem to speak for themselves -- they’ve been called out by the PyTorch team for good work. Overall I was pretty impressed by the presentation.

  • Abstract Debuggers: Exploring Program Behaviors Using Static Analysis Results. This was a really neat piece of work. The basic thrust is that most static analyzers either say “Yep! This is OK” or “Nope, there’s a problem here”. The challenge is that interpreting why a problem exists is often a bit of a pain, and furthermore, all the intermediate work a static analyzer does is hidden within it, providing no value to users.

    The authors of this paper ask the question (and provide a compelling demo of): “What if you expose a static analyzer like a debugger?” What if you could set breakpoints, and step through the sets of program states that lead to an analysis failure? They make a compelling case that this is actually a pretty great interface, and I’m very excited to see more of this.

    As a fanatic about omniscient debugging, I found myself wondering what the Pernosco of static analysis looks like; alas, I never managed to formulate the question in time in the session, then didn’t get a chance to talk to the presenting author later.

Friday

  • Redressing the balance: a yin-yang perspective on information technology. Konrad Hinsen used the idea of Yin and Yang to interrogate the way in which we work in information technology. In his presentation, Yang is the action precipitated by the thought of Yin; his argument is that we have been badly imbalanced in information technology, focused on the Yang of “build fast and break things” and not nearly enough on the balancing Yin of “think and explore”. As a result, tools and environments for thought have been left unbuilt, while the focus has landed on tools for shipping products.

    His hope is that we can have a vision of software that’s more Yin-focused; his domain is scientific software, and he’s interested in software with layers -- documentation, formal models, execution semantics.

  • Mark-Scavenge: Waiting for Trash to Take Itself Out. This neat paper proposes a new concurrent GC algorithm that tries to eliminate wasted work caused by the evacuation of objects which end up being dead by the time they are evacuated. It does this by evacuating from the set of sparse pages selected during a previous GC cycle, only evacuating objects that are rediscovered on a second cycle.

    As a last-ditch GC, they can always choose to evacuate a sparse page, making use of headroom.

    It was a quite compelling presentation, with good results for the JVM.

The Things I Missed:

There’s a whole bunch of presentations and papers I missed that I would definitely like to catch up on:

Conclusion

Every year that I come to an academic conference as an industry practitioner, I am reminded of the value of keeping yourself even a little bit connected to the academic world. There’s interesting work happening there, and it’s always nice to hear dispatches from worlds which may be one possible future!

Gut Checking Jujutsu vs Sapling

To be honest, I continue to have no idea where I will land version control wise. Here’s some pro-cons that are in my head at the moment.

Pro Jujutsu

  • I appreciate the versioned working directory
  • The .git support is really nice.
  • I am getting used to the ability to type a 3-letter change id to do work with changes.

Con Jujutsu

  • No absorb
  • No histedit
    • I'll be honest, I find reworking history to be really exhausting in Jujutsu. It uses a weird conflict marker by default which I find confusing, and generally the requirement that you do all the rebasing yourself vs having a histedit script... not a fan.
  • The transparent conversion of working directory to commit can bite you -- it means you can accidentally add a file and not notice!
  • jj's versioned working directory seems to occasionally break the Mozilla build system, as it tries to figure out what tools should be bootstrapped and when, which seems to be based off the revision. This is not implicitly a pro-Sapling position, as I suspect I'd have equal pain with Sapling.

Pro Sapling

  • I kinda miss ISL when working in Jujutsu...
  • absorb!
  • histedit
  • I think the changeset evolution story in Sapling is probably a little easier to understand than in Jujutsu

Con Sapling

  • Stepping into the future... but less far
  • dotgit support still feels sufficiently experimental that I don't know I'd be comfortable using it. This means that until we do the switch for real, I'm probably stuck with the weird workflow.

Connecting my PiKVM Power Button to Home Assistant

This is mostly a note to myself if I ever want to figure out how I did this.

  1. Edit configuration.yaml. I added a shell service:

    shell_command:
       pikvm_power: "curl -X POST -k -u admin:super_secret_password https://pikvm-ip/api/atx/click?button=power"
  2. Reboot HA; needed for the shell_command.pikvm_power service to appear.

  3. Add a button helper

  4. Add an automation that calls the service when the helper button is pressed (a sketch follows this list).
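    For reference, the resulting automation looks roughly like this in YAML (a sketch only; the helper entity and alias names are made up):

       automation:
         - alias: "PiKVM Power Button"
           trigger:
             # Fires when the button helper is pressed
             - platform: state
               entity_id: input_button.pikvm_power
           action:
             # Calls the shell_command defined in step 1
             - service: shell_command.pikvm_power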

Success!

Jujutsu Two: A better experience

I've been working with Jujutsu for the last month or so. It's actually been really nice. It's yet another reminder that, despite the version control system monoculture encouraged by GitHub, there's still innovation and interesting things happening in this space. It reaffirms my belief that plain git is really no longer what we should be aiming for as a community.

Last time I investigated Jujutsu I had some real show-stopping issues that prevented me from giving it a fair shake. This time I managed to get it set up on my Linux machine such that it became my daily driver for the last month.

Experience

First and foremost, the ability to use your existing git-cinnabar enabled unified checkout as your repo, and seamlessly switch between git and jj, is a pretty winning feature. Now, it turns out that Sapling is adding what they're calling dotgit support, but it's still experimental, whereas this is pretty core to Jujutsu.

It took me a little while to really internalize the power of a 'versioned working directory' workflow, but I've come to believe it's actually kind of wonderful.

Here's roughly what it looks like:

  1. jj new central ("I would like to start working off of central"). This produces a working directory with an associated "change id". Change IDs stay the same over the evolution of a working directory / commit.
  2. jj desc -m "Figure out how to frob the blob". Describe your working directory. This is totally optional.
  3. Do your work. The work is automatically checkpointed along the way any time you run a jj command.
  4. If you're interrupted and have to go work on something else, just go to that revision without worrying about losing work.
  5. When it's time to return to what you were working on, simply reopen the working directory with jj edit <change id>.
  6. git pull, which uses Cinnabar to pull in new changes.
  7. jj rebase -s working-dir-change-id -d central to rebase your work onto the new central.
  8. jj desc -m "Bug ABCD - Frobnicate the Blob with the Hasher r?freud". Update your description once you have a bug and a reviewer.
  9. jj commit. No message editing here -- the description is used as the commit message.

Unfortunately, it's here where we have awkwardness; moz-phab doesn't understand detached heads and freaks out when you try to submit. So at this point you have to create a branch, switch git to it, then submit the change. Almost certainly fixable, but we'll not ask the engineering effectiveness team for this.
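For the record, the dance I ended up doing looks roughly like this (a sketch only: the branch name is made up, and this assumes a co-located checkout where git and jj share the same repo):

# Point a git branch at the change we just committed (it's now @-)...
git branch -f submit-me $(jj log -r @- --no-graph -T commit_id)
git checkout submit-me
# ...then submit as usual.
moz-phab submit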

Now, this is of course the happy path. There are certainly some brain-bending bits when you fall off of it. For example, the handling of conflicts is sort of strange: you edit a conflicted revision, then squash your resolution into the conflicted change, and it loses its conflict status. Dependent revisions and working directories are then rebased, which may have conflicts or not.
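Concretely, that resolution dance looks something like this (a sketch, with a made-up change id):

jj new xyz    # start a new change on top of the conflicted change 'xyz'
# ...fix up the conflict markers in the affected files...
jj squash     # fold the resolution into 'xyz'; descendants rebase automatically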

Some Setup Notes:

About the slowness I reported last time: everyone asked if I had set up watchman. I had not. So this time around, the first thing:

jj config set --user core.fsmonitor "watchman"

Next: In order to make Jujutsu's log make any sense for Mozilla central, you have to teach it about what commits are 'immutable'. We had to do the same dance for Sapling too -- a side effect of not using the standard main branch name.

I put this into ~/.config/jj/config.toml, though it definitely belongs in the repo's .jj/repo/config.toml:

[revset-aliases]
"immutable_heads()" = "central@origin | (central@origin.. & ~mine())"

Jujutsu vs Sapling?

Honestly, I have no idea where I'll land. If Sapling's dotgit support matures, it would be a really nice Mercurial replacement for people. But Jujutsu's versioned working directory is a legitimately interesting paradigm.

I feel like there's a bit of a philosophical-school thing going on here:

  • Sapling feels like the crystallization of all the good ideas of Mercurial into a modern tool, suited for large scale development, supported by a large corporation.
  • Jujutsu feels like an evolution of git, taking the position that computers today are fast and storage is fast, so why not track more and be more helpful? Yet it still feels connected to the git heritage in a way that occasionally feels clunky. The developer works at Google, but I didn't get the feeling that jj was going to become the default for Googlers any time soon.

To paint a word picture... Sapling is an electric car with swooping lines, and Jujutsu is the DeLorean from Back to the Future -- cobbled together from parts, but capable of amazing things.

Some Other Notes

  • The name of the tool is Jujutsu, not Jujitsu... which I have been thinking it was for 6+ months now 😨. My apologies to Martin von Zweigbergk, author of Jujutsu.
  • Use jj abandon to clean up your tree.
  • I find the default jj log to be quite noisy and a bit unpleasant.
    • Jujutsu's log formatting language is... very powerful (I don't get it all); but it has some cool built-in templates; try out jj log -T builtin_log_detailed or jj log -T builtin_log_oneline -- these defaults are defined here
  • Because your working directory is versioned, you can use jj obslog on it to explore the evolution of your working directory. This is super cool -- you can retrieve some code you deleted that you thought you didn't need, but it turns out you did.
  • Jujutsu has a revset language that's similar but not identical to the one used in Mercurial and Sapling. For example, in Mercurial, to list my published commits, I might do hg log -r 'author(mgaudet) and public()'. In jj I used jj log -r 'author(mgaudet) & ::central'.

Notes on a Hardware Upgrade

Just for my own edification, writing down the results of an upgrade:

Debug Browser:

  • Old: 15:42.73

  • New: 4:20.33

Debug Shell:

  • Old:

    • Cold Cache:  1:40.17

    • Hot Cache:  0:48.6

  • New:

    • Cold Cache:  0:46.58

    • Hot Cache: 0:26.60

Old Machine: Idle 48W, Build ~225W.

New Machine: Idle 98W, Build ~425W.

Sapling & A Workflow For Mozilla Work

In my continuing quest to not use git, I have spent the last few weeks coming up with a viable workflow for working with Sapling at Mozilla. While it still has rough edges, I've been enjoying it enough that I figure it's time to share.

Edit to add: It's worth highlighting that this workflow is 1000% unsupported, YMMV and please don't file bugs about this; let's not make the engineering workflow team's life harder.

What

Sapling is an SCM system built by Meta. Apparently it's been around for a while in different forms, but I only heard about it on its public debut.

Sapling is designed to handle large repos, and has a workflow that's extremely familiar to anyone who has used Mercurial + Evolve.

Experience

My experience with Sapling has actually been... pretty darn good. I'll detail the workflow below, and highlight a few gotchas, but overall Sapling seems like where I might end up in my continuing quest.

What's Good?

So first and foremost, I'm super happy with the user experience of Sapling. So much of my workflow in Mercurial simply moved over with no trouble (naked heads, frequent use of histedit, absorb, hiding commits, etc.). Some things are even nicer than they are in Mercurial: for example, Mozillians often use hg wip, an alias installed by bootstrap that shows a graphical overview of the local state of the tree. In Sapling, that's just the default output of a bare sl if you're in a Sapling repo -- which sounds silly, but is a legitimately nice workflow improvement.

Even better than all the familiarity is that, in my experience, almost everything in Sapling is fast, even with a Mozilla-central-sized repo. It feels as fast as or faster than git, and definitely faster than Mercurial. Because it is so fast, it can make some interesting decisions that surprised me. For example, if you amend a commit in the middle of a stack, it will automatically restack all dependent commits immediately.
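As a sketch of what that looks like in practice (the commit name here is made up):

sl goto middle_commit   # check out a commit in the middle of your stack
# ...edit files...
sl amend                # amend it; descendant commits are restacked automatically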

Sapling also has some really fantastic tooling:

  • The Interactive Smart Log (ISL) gives you a browser-based UI for dealing with your tree. I was totally skeptical, but it is very good. You can rebase with drag-and-drop and clean up your tree, all from the UI. It's impressive.
  • The Sapling VSCode Plugin is really nice. It builds the ISL directly into VSCode, and also adds some really delightful touches, like showing the history annotation for each line of code as you edit it. Using this for handling merge conflicts is legitimately quite nice.

What's Not as Good

Well, firstly: Mozilla's code base has no idea what to do about Sapling, with varying levels of problems as a result. I've made one fix so far, but organizationally I don't want to make more work for the engineering workflow teams, so some things I sort of expect will at best be clunky in a Sapling repo.

Some examples:

  • mach bootstrap doesn't error out or anything, but definitely seems to work incorrectly when run inside a sapling repo.
  • mach clang-format relies on figuring out the outgoing set, so it doesn't work at the moment. It's possible to work around this one however.

Sapling itself for sure isn't perfect:

  • I've run into a crash while rebasing once; nothing seemed to be lost though and sl rebase --continue finished the job.
  • The ISL seems finicky at times; occasionally it will throw an exception and be broken until reloaded.
  • Some aspects of my workflow haven't been implemented in Sapling. For example, I used to make heavy use of hg rebase --stop, which would stop a partially completed rebase and leave some dependent changes un-evolved; this doesn't seem to have an equivalent in Sapling, which provides only --abort and --continue.
  • Getting Sapling set up to work properly took some more effort and a few more gotchas than I expected.
  • Sapling's histedit doesn't use the lovely TUI that mercurial provides, and thus is just... clunky. Interestingly, the sl amend commit in the middle of the stack workflow is kind of nicer for quick edits.
  • I think Sapling's history-editing capabilities are only about 50% as powerful as evolve's -- I cannot figure out an equivalent to hg obslog.

One major pain point for me at the moment that I don't have a good answer for is format-on-commit, which I relied pretty heavily on. Apparently Sapling does have hooks, but I haven't yet figured out if they're usable as pre-commit hooks yet.

The Workflow

Basically, the workflow is the following diagram:

I'll explain it in more detail below.

Getting Started

  1. Get yourself Sapling
  2. Get yourself a git-cinnabar clone of central: See the So what's next heading of this blog post
  3. sl clone from the local git repo into a new sapling repo.
  4. Do your work inside your sapling repo! Check out the guide here
  5. To make the smartlogs work properly, and to teach Sapling what is 'public', you need to tell it what remote refs are public: sl config --local 'remotenames.publicheads=remote/central'. If you don't do this, expect ISL to fall over and sl to complain about the number of public commits.

Push to try:

  1. sl push --to tmp-PushToTry
  2. cd ../git-clone
  3. git checkout tmp-PushToTry
  4. ./mach try ...
  5. cd ../sapling-clone

Of course, replace tmp-PushToTry as appropriate. And if you've previously used that branch name, or need to update it, --force works wonders.

You'll also likely be interested in this git repo setting: git config --local receive.denyCurrentBranch updateInstead which is a nice quality of life improvement rather than getting yelled at.

moz-phab submit

  1. sl push --to FeatureSubmit
  2. cd ../git-clone
  3. moz-phab submit --upstream central
  4. cd ../sapling-clone
  5. sl pull
  6. sl (use smart log to find the updated commit with the differential ID added)
  7. sl goto the updated commit.
  8. sl hide the old stack (technically optional, but recommended).

Future Explorations

  • You can probably intuit that it seems totally feasible to script most of the above interactions with the git clone. Definitely a possible future path for me.
    • Hanging out in the Sapling discord has made me aware that there's experimental work happening on a dotgit mode that will have a .git repo; in that world, I suspect a lot of this workflow would be obviated, but it sounds like this is still experimental and I'm not sure how actively it's being developed.
  • Apparently there used to be a Phabricator extension, since deleted, which might be resuscitable. Ideally this would allow bypassing moz-phab submit.

Concerns

I do have some reservations about going whole-hog onto sapling.

  1. Sapling is first and foremost Meta's tool. I worry, despite a fairly clear CONTRIBUTING.md, that if I need to fix Sapling, it'll be a PITA to actually get fixes landed -- but the repo is already filled with a bunch of merged PRs, so this could be just paranoia!
  2. Add-ons (e.g. plugins) are an important workflow aid; however, I'm bad at Python, and from chatting in the Sapling discord, it definitely seems like it's a bit rough -- essentially you write against the internal Sapling Python API, which is perhaps more than I would like.

Other Notes for Explorers:

  • Launching ISL on a remote machine manually: sl isl --no-open -p 8081 -- this provides a token for remote access.
  • You can use sl goto and specify a remote git revision and it will just figure it out, though you have to use the full git hash.

Exciting times ahead.

Mozilla: Six Years!

I've now made it to six years at Mozilla. It's been an interesting year. I was off on parental leave for some of it as I had a second child.

Among the interesting things I tackled this year:

This year I handed ownership of the DOM Streams component over to Kagami Rosylight, who is a much more able steward of it in the long term than I could be. They have done a wonderful job.

Traditionally I update my Bugzilla statistics here as well:

  • Bugs filed 808 (+79)
  • Comments made 3848 (+489)
  • Assigned to 432 (+67)
  • Commented on 1458 (+249)
  • Patches submitted 1173 (+121)
  • Bugs poked 2498 (+685)

This year I've dropped the patches reviewed line, because it seems like with Phabricator I am no longer getting a good count on that. There's no way I've reviewed only 94 patches... I have reviewed more patches for Temporal alone in the last year!

You may notice that I've poked a large number of bugs this year. I've started taking time after every triage meeting to try and close old bugs that have lingered in our backlog for ages, and no longer have any applicability in 2023, for example bugs due to old makefile issues when we no longer use makefiles.

This is something more of us in the Triage team have started working on as well, based on the list of 'unrooted' SpiderMonkey bugs (see queries here). It's my sincere hope that sometime late next year our bug backlog will be quite a bit more useful to us.

Exploring Jujitsu (jj)

Edit: The tool is actually called Jujutsu, not Jujitsu… my apologies for getting this wrong throughout here. I’ve left the below intact for posterity, but it’s definitely wrong.

With the news that Firefox development is moving to git, and my own dislike of the git command line interface, I have a few months to figure out if there's a story that doesn't involve the git cli that can get me a comfortable development flow.

There's a few candidates, and each is going to take some time to correctly evaluate. Today, I have started evaluating Jujitsu, a git compatible version control system with some interesting properties.

  • The CLI is very mercurial inspired, and shares a lot of commonalities in supported processes (i.e anonymous head based development)
  • The default log is very similar to mozilla's hg wip
  • It considers the working directory to be a revision, which is an interesting policy.

Here's how I have started evaluating Jujitsu.

  1. First I created a new clone of unified which I called unified-git. Then, using the commands described by glandium in his recent blog post about the history of version control at mozilla I converted that repo to have a git object store in the background.
  2. I then installed Jujitsu. First I did cargo install cargo-binstall, then I did cargo binstall jj to get the binary of jj.
  3. I then made a co-located repository, by initializing jujitsu with the existing git repo: jj init --git-repo=.

After this, I played around, and managed to create a single commit which I have already landed (a comment fix, but still, it was a good exploration of workflow).

There is, however, what I believe to be a showstopper bug on my mac, which will prevent me from using jujitsu seriously on my local machine -- I will likely still investigate the potential on my linux build box however.

The bug is this one, and is caused by a poor interaction between jujitsu and case-insensitive file systems. It means that my working copy will always show changed files (at least on a gecko-dev derived repo), which makes a lot of the jujitsu magic and workflow hard.

Some notes from my exploration:

Speed:

This was gently disappointing. While the initial creation of the repo was fast (jj init took 1m11s on my M2 MacBook Pro), every operation by default does a snapshot of the repo state. Likely because of the aforementioned bug, this leads to surprising outcomes: for example, jj log is way slower than hg wip on the same machine (3.8s vs 2s). Of course, if you run jj log --ignore-working-copy, then it's way faster (0.6s), but I don't yet know if that's a usable working system.

Workflow

I was pretty frustrated by this, but in hindsight a lot of the issues came from having the working copy always seeming dirty. This needs more exploration.

  • jj split was quite nice. I was surprised to find out jj histedit doesn't yet exist
  • I couldn't figure out the jj equivalent of hg up . --clean -- this could be any of the history-navigating tools, but because of the bug, it didn't feel like it.

Interaction with Mozilla tools

moz-phab didn't like the fact that my head was detached, and refused to deal with my commit. I had to use git to make a branch (for some reason a Jujitsu branch didn't seem to suffice). Even then, I'm used to moz-phab largely figuring out what commits to submit, but for some reason it really really struggled here. I'm not sure if that's a git problem or a Jujitsu one, but to submit my commit I had to give both ends of a commit range to have it actually do something.

Conclusion

I doubt this will be the last jujitsu post I write -- I'm very interested in trying it in a non-broken state; the fact that it's broken on my Mac, however, is going to really harm its ability to become my default.

I've got some other tools I'd like to look into:

  • I've played with Sapling before, but because there's no backing git repo, it may not serve my purposes, as moz-phab won't work (push to try as well, I'll bet), but... maybe if I go the Steve Fink route and write my own Phabricator submission tool... maybe it would work.
  • git-branchless looks right up my alley, and is the next tool to be evaluated, methinks.

Edited: Fixed the cargo-binstall install instruction (previously I said cargo install binstall, but that's an abandoned crate, not the one you want).

CacheIR: The Benefits of a Structured Representation for Inline Caches

In less than a week (😱) my colleague Iain Ireland and I will be in Portugal, presenting our paper on CacheIR at MPLR, co-located with SPLASH 2023. Here’s our preprint (edit: and official ACM DL link), and here’s the abstract:

Inline Caching is an important technique used to accelerate operations in dynamically typed language implementations by creating fast paths based on observed program behaviour. Most software stacks that support inline caching use low-level, often ad-hoc, Inline-Cache (ICs) data structures for code generation. This work presents CacheIR, a design for inline caching built entirely around an intermediate representation (IR) which: (i) simplifies the development of ICs by raising the abstraction level; and (ii) enables reusing compiled native code through IR matching techniques. Moreover, this work describes WarpBuilder, a novel design for a Just-In-Time (JIT) compiler front-end that directly generates type-specialized code by lowering the CacheIR contained in ICs; and Trial Inlining, an extension to the inline-caching system that allows for context-sensitive inlining of context-sensitive ICs. The combination of CacheIR and WarpBuilder have been powerful performance tools for the SpiderMonkey team, and have been key in providing improved performance with less security risk.

This paper is the paper on CacheIR that I have wanted to exist for years, at least since I wrote this blog post in 2018. Since then, we’ve taken inline caching and pushed it even further with the addition of WarpBuilder, and so we cover even more of the power that CacheIR unlocks. I think this is a really fascinating design point which provides large amounts of software engineering leverage when building your system, and so I’m very happy to see that we’ve managed to publish a paper on this. We didn’t even cover everything about CacheIR in this paper — for example, we didn’t talk about tooling such as the CacheIR Analyzer or CacheIR health tool.

It’s my hope that we’ll seed conversations with this paper and find more academic collaborations and inspire more designs with high leverage. I’d be glad to answer questions or hear comments!

Thanks to our co-authors! Jan (who deserves the credit of having come up with CacheIR), Nathan (who did a bunch of work on the paper) and Nelson, always a happy guide to academia.

Viewing Missed Clang Optimizations in SpiderMonkey

Triggered by this blog post about -fsave-optimization-record, and Max Bernstein asking about it, and then pointing out this neat front end to the data, I figured I'd see what it said for SpiderMonkey.

Here's my procedure:

First I created a new mozconfig. The most important part being

ac_add_options --enable-application=js

ac_add_options --enable-optimize
ac_add_options --disable-debug

export CFLAGS="$CFLAGS -fsave-optimization-record"
export CXXFLAGS="$CXXFLAGS -fsave-optimization-record"

Then I built SpiderMonkey. Then I asked OptView2 to generate my results for the JS directory:

./optview2/opt-viewer.py --output-dir js --source-dir ~/unified/ ~/unified/obj-opt-shell-nodebug-opt-recordx86_64-pc-linux-gnu/  -j10

After waiting a bit, it filled a directory with HTML files. I've uploaded them to GitHub, and published on GitHub Pages.

It certainly seems like this has interesting information! But there's a ton to go through, so for now just posting this blog post so people can reproduce my method. The OptView2 index isn't amazing either, so it's worth looking at specific files too.

Working in the Open & Psychological Safety

It was really interesting to read the article "The Curious Side Effects of Medical Transparency" as an Open Source developer. The feelings the doctor describes are deeply familiar to me, as we struggle with transparency in open source projects.

These aren't original thoughts, but I don't know how we adequately manage psychological safety while working in the open. You want your team to be able to share ideas and have discussions without worrying about harassment or someone misconstruing (intentionally perhaps) the words being used.

At the same time, the whole point of being open is that there's value in open community; if planning happens exclusively in private, there's no opportunity for the community to provide input or to even come to your aid.

I wish I had good answers, or original thoughts here, but I don't, and I'd be happy to read thoughts from anyone who does have answers or good practices.

Mozilla: 5 years

I missed my 5 year anniversary at Mozilla by a few days here.

As is my tradition, here’s my Bugzilla user stats (from today — I am 3 days late from my real anniversary which was the 27th)

  • Bugs filed 729
  • Comments made 3359
  • Assigned to 365
  • Commented on 1209
  • Patches submitted 1052
  • Patches reviewed 94
  • Bugs poked 1813

The last year was a big one. Tackled my biggest project to date, which ironically, wasn’t even a SpiderMonkey project really: I worked on reimplementing the WHATWG Streams standard inside the DOM. With the help of other Mozillians, we now have the most conformant implementation of the Streams specification by WPT testing. I became the module owner of the resulting DOM Streams module.

I also managed to get a change into the HTML spec, which is a pretty neat outcome.

I’m sure there’s other stuff I did… but I’m procrastinating on something by writing this blog post, and I should get back to that.

Faster Ruby: Thoughts from the Outside

(This is Part II of the Faster Ruby posts, which started with a retrospective on Ruby+OMR, a Ruby JIT compiler I worked on five years ago)

As someone who comes from a compiler background, when asked to make a language fast, I’m sympathetic to the reaction: “Just throw a compiler at it!”. However, working on SpiderMonkey, I’ve come to the conclusion that a fast language implementation has many moving parts, and a compiler is just one part of it.

I’m going to get to Ruby, but before I get there, I want to take a tour briefly of some bits and pieces of SpiderMonkey that help make it a fast JavaScript engine; from that story, you may be able to intuit some of my suggestions for how Ruby ought to move forward!

Good Bones in a Runtime

It’s taken me many years of working on SpiderMonkey to internalize some of the implications of various design choices, and how they drive good performance. For example, let’s discuss the object model:

In SpiderMonkey, a JavaScript object consists of two pieces: a set of slots, which store values, and a shape, which describes the layout of the object (which property ends up in which slot).

Shapes are shared across many objects with the same layout:

var a = [];
for (var i = 0; i < 1000; i++) {
  var o = {a: 1, b: 2};
  a.push(o);
}

In the above example, there are a thousand objects in the array, but all those objects share the same shape.

Recall as well that JavaScript is a prototype-based language; each object has a prototype, so there’s a design decision: for a given object, where do you store the prototype?

It could well be in a slot on the object, but that would bloat objects. Similar to how layouts are shared across many different objects, there are many objects that share a prototype. In the above example, every object in the array has a prototype of Object.prototype. We therefore associate the prototype of an object not with the object itself, but rather with the shape of the object. This means that when you mutate the prototype of an object (Object.setPrototypeOf), we have to change the shape of the object.

Given that all property lookup is based on either the properties of an object, or the prototype chain of an object, we now have an excellent key upon which to build a cache for property access. In SpiderMonkey, these inline caches are associated with property access bytecodes; each stub in the inline cache chain for a bytecode trying to do a property load like o.b ends up looking like this:

if (!o.hasShape(X)) { try next stub; } 
return o.slots(X.slotOf('b'))

Inline Caches are Stonkingly Effective

I’ve been yammering on about inline caches to pretty much anyone who will listen for years. Ever since I finally understood the power of SpiderMonkey’s CacheIR system, I’ve realized that inline caches are not just a powerful technique for making method dispatch fast, but they’re actually fundamental primitives for handling a dynamic language’s dynamism.

So let’s look briefly at the performance possibilities brought by Inline Caches:

Octane Scores (higher is better):
Interpreter, CacheIR, Baseline, Ion: 34252  (3.5x) (46x total)
Interpreter, CacheIR, Baseline:      9887   (2.0x) (13x total)
Interpreter, CacheIR:                4890   (6.6x)
Interpreter:                         739

Now: Let me first say outright, Octane is a bad benchmark suite, and not really representative of the real web… but it runs fast and produces good enough results to share in a blog post (details here).

With that caveat however, you can see the point of this section: well designed inline caches can be STONKINGLY EFFECTIVE: just adding our inline caches improves performance by more than 6x on this benchmark!

The really fascinating thing about inline caches, as they exist in SpiderMonkey, is that they serve to accelerate not just property accesses, but also most places where the dynamism of JavaScript rears its head. For example:

function add(a,b) { return a + b; } 
var a = add(1,1);
var b = add(1,"1");
var c = add("1","1");
var d = add(1.5,1);

All these different cases have to be handled by the same bytecode op, Add.

loc     op
——   ——
main:
00000:  GetArg 0                        # a
00003:  GetArg 1                        # a b
00006:  Add                             # (a + b)
00007:  Return                          #

So, in order to make this fast, we add an Inline Cache to Add, where we attach a list of type-specialized stubs. So the first stub would be specialized to the Int32+Int32 case, the second to Int32+String, and so on and so forth.
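In the same pseudocode style as the property-load stub earlier, the Int32+Int32 stub might look something like this (a sketch, not SpiderMonkey’s actual generated code):

if (!a.isInt32() || !b.isInt32()) { try next stub; }  // guard: both operands must be Int32
return Int32Add(a, b)                                 // fast path; bails out on overflow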

Since types are typically relatively stable at a particular bytecode op, this strategy is very effective for speeding up execution time.

Making Ruby Fast: Key Pieces

Given the above story, you would be unsurprised to hear that I would suggest starting with improving the Ruby Object model, providing shapes.

The good news for Ruby is that there are people from Shopify pushing this exact thing. This blog post, The Future Shape of Ruby Objects, from Chris Seaton is a far more comprehensive and Ruby focused introduction to shapes than I wrote above, and the bug tracking this is here.

The second thing I’d do is invest in just enough JIT compilation technology to allow the creation of easy to reason about inline caches. Because I come from SpiderMonkey, I would almost certainly shamelessly steal the general shape of CacheIR, as I think Jan de Mooij has really hit on something special with its design. This would likely be a very simple template-JIT, done method-at-a-time.

When I worked on Ruby+OMR I didn’t have a good answer for how to handle the dynamism of Ruby, due to a lack of practical experience. There was a fair amount of hope that we could recycle the JIT profiling architecture from J9, accumulating data from injected counters in a basic compilation of a method, and feeding into a higher-optimizing recompilation that would specialize further. It’s quite possible that this could work! However, having seen the inline caching architecture of SpiderMonkey, I realize now that JIT profiling would have been maybe the costliest way we could generate the data we would need for type specialization. I may well have read this paper, but I don’t think I understood it.

Today in SpiderMonkey, all our type profiling is done through our inline caches. Our top-tier compiler frontend, WarpBuilder, analyzes the inline cache chains to determine which cases are actually important to speculate on. We even do a neat trick with ICs to power smart inlining. Today, the thing I wish a project like OMR would provide most is the building blocks for a powerful inline cache system.

In the real world, YJIT is a really interesting JIT for Ruby being built around the fascinating Basic Block Versioning (BBV) architecture that Maxime Chevalier-Boisvert built during her PhD, an architecture I and other people who have worked on SpiderMonkey really admired as innovative. As I understand it, YJIT doesn’t need to lean on inline caching nearly as much as SpiderMonkey does, as much of the optimizations provided naturally fall out of the versioning of basic blocks. Still, in her blog post introducing YJIT, Maxime does say that even YJIT would benefit from shapes, something I definitely can believe.

Open Questions, to me at least

  • Does Ruby in 2022 still have problems with C-extensions? Once upon a time we were really concerned about how opaque C-extensions were. TruffleRuby used the really neat Sulong technology to solve this.

    Does the C-extension API need to be improved to allow a fast implementation to exist? Unsure.

    SpiderMonkey has the advantage of working in a ‘closed world’ mostly, where native code integrations are fairly tightly coupled. This doesn’t describe Ruby Gems that lean on the C-APIs.

  • What kind of speedup is available for big Rails applications? If 90% of the time in an application is spent in database calls, then there’s little opportunity for improvement via JIT technologies.

Conclusion

I’ve been out of the Ruby game for a long time. Despite that, I find myself thinking back to it frequently. Ruby+OMR was, in hindsight, perhaps a poor architectural fit. As dynamic as Java is, languages like JavaScript and Ruby mean that the pressure on compilation technology is appreciably different.

With the lessons I’ve learned, it seems to me that a pretty fast Ruby is probably possible. JavaScript is a pretty terrible language to make fast, and yet it has been made fast (having a huge performance war between implementations, causing huge corporations to pour resources into JS performance, helped… maybe Ruby needs a performance war). I’m glad to see the efforts coming out of Shopify — I really think they’ll pay huge dividends over the next few years. I wish that team much luck.

(There’s some really excellent discussion about this post over at Hacker News)

Faster Ruby: A Ruby+OMR Retrospective

Years ago I worked on a project called Ruby+OMR. The goal of that project was to integrate Eclipse OMR, a toolkit for building fast language runtimes, into Ruby, to make it faster. I worked on the JIT compiler, but there was also work to integrate the OMR Garbage Collector to replace the Ruby one.

After the project had trickled down to a stop, I wrote a retrospective blog post about the project, but never published it. Then, I moved on from IBM and started working at Mozilla, on SpiderMonkey, their JavaScript engine.

Working at Mozilla I’ve learned enormous amounts about how dynamic languages can be made fast, and what kind of changes are the most important to seeing performance.

Now feels like a reasonable time to update and expand that retrospective. I also have a follow-up blog post, which I’ll post tomorrow, about how I’d make Ruby fast these days if I were to try, from the perspective of someone who’s not been involved in the community for five years.

Retrospective

It has been five years since I stopped working on Ruby+OMR, which is far enough in the past that I should refresh people’s memories.

Eclipse OMR is a project that came out of IBM. The project contains a series of building blocks for building fast managed language runtimes: Garbage collection technology, JIT compiler technology, and much more.

The origin of the project was the J9 Java Virtual machine (later open sourced as OpenJ9). The compiler technology, called Testarossa, was already a multi-language compiler, being used in production IBM compilers for Java, COBOL, C/C++, PL/X and more.

The hypothesis behind OMR was this: If we already had a compiler that could be used for multiple languages, could we also extend that to other technologies in J9? Could we convert the JIT compiler, GC and other parts, turning them into a library that could be consumed by other projects, allowing them to take advantage of all the advanced technology that already existed there?

Of course, this wasn’t a project IBM embarked on for 100% altruistic reasons: Runtimes built on top of OMR would, by their very nature, come with good support for IBM’s hardware platforms, IBM Z and POWER, a good thing considering that there had been challenges getting another popular language runtime onto those platforms.

In order to demonstrate the possibilities of this project, we attempted to connect OMR to two existing language runtimes: CPython, and MRI Ruby. I honestly don’t remember the story of what happened with CPython+OMR; I know it had more challenges than Ruby+OMR.

My Ruby+OMR Story

By the time I joined the Ruby+OMR Project, the initial implementation was well underway, and we were already compiling basic methods.

I definitely remember working on trying to help the project get out the door… but honestly, I have relatively little recollection of concrete things I did in those days. Certainly I recall doing lots of work to try to improve performance, running benchmarks, making it crash less.

I do know that we decided to make sure we landed with a Big Bang. So we submitted a talk to RubyKaigi 2015, which is the premier conference for Ruby developers in Japan, and a conference frequented by many of the Ruby Core team.

I would give a talk on JIT technology, and Robert Young and Craig Lehman gave a talk on the GC integration. Just days before the talks, we open sourced our Ruby work (squashing the commit history, which, as I try to write this retrospective, I understand and yet wish we hadn’t needed to do).

I spent ages building my RubyKaigi talk. It felt so important that we land with our best feet forward. I iterated on my slides many times, practiced, edited and practiced some more.

The thing I remember most from that talk was the moment when I looked down into the audience, and saw Matz, the creator of Ruby, sitting in the front row, his head down and eyes closed. I thought I had managed to put him to sleep. Somewhere in the video of that talk you can spot it happening: Suddenly I start to stumble over my slides, and my voice jumps a half-register, before I manage to recover.

That RubyKaigi was also interesting: that was the one where Matz announced his goal of Ruby3x3: Ruby 3 would be 3x faster than Ruby 2.0. It seemed like our JIT compiler would be a potentially key part of this!

We continued working on Ruby, and I returned to RubyKaigi ten months later, in September of 2016. I gave a talk, this time, about trying to nail down how specifically we would measure Ruby 3x3. To date, this is probably the favourite talk I have ever given; a relatively focused rant on the challenges of measuring computer performance and the various ways you can mislead yourself.

It was at this RubyKaigi that we had some conversations with the Ruby Core team about trying to integrate OMR into the Ruby Core. Overall, they weren’t particularly receptive. There were a number of concerns. In June of 2017, those concerns became part of a talk Matz gave in Singapore, where he called them the ‘hidden rules’ of Ruby 3x3:

  • Memory Requirements: He put it this way: Ruby's memory requirements are driven by Heroku's smallest dyno, which had 512MB of RAM at the time.

  • Dependency: Ruby is long lived, 25 years old almost, and so there was a definite fear of dependency on another project. He put it this way: If Ruby were to add another dependency, that dependency ought to be as stable as Ruby itself.

  • Maintainability: Overall maintainability matters: Unmaintainable code stops evolution, so the Ruby Core team must be able to maintain whatever JIT is proposed.

By this point, the OMR team had already scaled effort on Ruby+OMR down to effectively zero, but if we hadn’t done that, this talk would have been the death knell for Ruby+OMR, purely on the latter two points. While we had a road to improved memory usage, we were by definition a new project, and a complex one at that. We’d never become the default JIT compiler for Ruby.

The rest of the talk focused on a project being done by a Toronto based Red Hat developer named Vladimir Makarov, called MJIT. MJIT added a JIT compiler to Ruby by translating the bytecode of a Ruby method to a small C file, invoking GCC or Clang to compile that C File into a shared object, and then loading the newly compiled shared object to back the Ruby method.

Editorializing: MJIT was a fascinating approach. It's not quite a bytecode-level JIT, because it feeds the underlying compiler (gcc) not raw bytecode, but C code that executes the same things the bytecode would, along with a pre-compiled header containing all the required definitions. Since MJIT is looking at C code, it is free to do all sorts of interesting optimization at the C level that a bytecode-level JIT similar to Testarossa would never see. This turns out to be a really interesting workaround for a problem that Evan Phoenix pointed out in his 2015 RubyKaigi Keynote, which he called the Horizon Problem. In short, the issue is that a JIT compiler can only optimize what it sees: in a bytecode JIT for Ruby, like Ruby+OMR, huge swathes (possibly even the majority) of the important semantics are hidden away as calls to C routines, and therefore provide barriers to optimization. MJIT, in turn, is limited in what optimizations are possible by the contents of the pre-compiled header, which ultimately defines most of the 'optimization horizon'.

Furthermore, MJIT solved, in a relatively nice way, many of the maintenance problems that concerned the Ruby core community: by producing C code, the JIT process would be relatively easy to debug, since you can reason about it via C code, which the Ruby Core developers are obviously proficient at.

I haven’t paid a lot of attention to the Ruby community since 2017, but MJIT did get integrated into Ruby, and at least according to the git history, appears to still be maintained.

I was very excited to see Maxime Chevalier-Boisvert announce YJIT, as I loved her idea of basic block versioning. I’m excited to see that project grow inside of Ruby. One thing that project has done excellently is include core developers early, and get into the official Ruby tree early.

What did Ruby+OMR accomplish?

Despite Ruby+OMR’s failure to form the basis of Ruby’s JIT technology, or replace Ruby’s GC technology, the project did serve a number of purposes:

  • Ruby was an important test bed for a lot of OMR. It served as a proving ground for ideas over and over again, and helped the team firm up ideas about how consumption of the OMR libraries should work. Ruby made OMR better by forcing us to think about and work on our consumption story.
  • We managed to influence the Ruby community in a number of ways:
    • We showed that GC technology improvements were possible, and that they could bring performance improvement.
    • We helped influence some of the Ruby community's thoughts on benchmarking, with my talk at RubyKaigi having been called out explicitly in the development of a Rails benchmark that was used to track Ruby performance for a few years hence.

What should we have done differently in Ruby+OMR?

There's a huge number of lessons I learned from working on Ruby+OMR.

  • At the time we did the work on Ruby+OMR, the integration story between OMR and a host language was pretty weak. It required coordination between two repos, and a fairly gross layer of ‘glue’ code to make the two systems talk to each other.

    A new interface, called JitBuilder was developed that may have helped, but by the time it arrived on the scene we were already knee deep in our integration into Ruby.

  • We should have made it dramatically easier, much earlier, for people to try out Ruby+OMR. The Ruby community uses version managers like RVM and rbenv to match Ruby versions to their apps, and so we would have been very well served by pushing hard to get acceptance into these tools early.

  • Another barrier to having people try out Ruby+OMR with the JIT enabled was our lack of asynchronous compilation. Not having asynchronous compilation left us in a state where we couldn’t be run, or basically even tested, on latency-sensitive tasks like a Rails server application.

    I left tackling this one far too late, and never actually succeeded in getting it up and running. For future systems, I suspect it would be prudent to tackle async compilation very early, to ensure the design is able to cope with it robustly.

One question people have asked about Ruby+OMR is how difficult it was to keep up with Ruby’s evolution. Overall, my recollection is that it wasn’t too challenging, because we chose an initial compiler design that limited the challenge: Ruby+OMR produced IL from Ruby bytecode (which didn’t change a lot release to release), and a lot of the Ruby bytecodes were implemented in the JIT purely by calling directly into the appropriate RubyVM routines. This meant that the OMR JIT compiler naturally kept up with relative ease, as we weren’t doing much of anything fancy that would have posed a challenge. Longer term, integration challenges would have gotten larger, but we had hoped at some point we’d end up in-tree, and have an easier maintenance story.

Conclusion

I greatly enjoyed working on Ruby+OMR, and I believed for the majority of my time working on it that we were serious contenders to become the default JIT for Ruby. The Ruby community is a fascinating group of individuals, and I really enjoyed getting to know some people there.

Ultimately, the failure of the Ruby+OMR project was, in my opinion, our lack of maturity. We simply hadn’t nailed down a cohesive story that we could tell to projects that was compelling, rather than scary. It’s too bad, as there are still pieces of the Testarossa compiler technology that I miss, almost five years since I’ve stopped working with it.

Edit History

  • Section on MJIT updated August 8, 2022, 10:45am to clarify a bit what I found to be special about MJIT after an illuminating conversation with Chris Seaton on twitter

Throttling Home Assistant Automations

Suppose you have a Home Assistant Automation, for example one that sends a notification to your phone, that you’d only like to run at most once every four hours or so.

You might google Debouncing an automation, because that’s the word that jumps into your head first, and end up here, which suggests a template condition like this:

 condition:
    condition: template
    value_template: "{{ (as_timestamp(now()) - as_timestamp(state_attr('automation.doorbell_alert', 'last_triggered') | default(0)) | int > 5)}}"

But then you have to do math, and it’s awkward.

There’s a much nicer way!

 condition:
    condition: template
    value_template: "{{now() - state_attr('automation.doorbell', 'last_triggered') > timedelta(hours=4, minutes = 1)}}"
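One caveat from my own experience with Home Assistant templates (not from the original forum thread): state_attr('automation.doorbell', 'last_triggered') is None before the automation has ever fired, which makes the subtraction blow up. A guarded version might look like:

 condition:
    condition: template
    value_template: "{{ state_attr('automation.doorbell', 'last_triggered') is none or now() - state_attr('automation.doorbell', 'last_triggered') > timedelta(hours=4, minutes=1) }}"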

Four Years at Mozilla

Tomorrow will be my fourth anniversary of working at Mozilla. Time flies.

This year has seen me work on everything from frontend features like class static initialization blocks to tackling the large task of re-hosting the Streams implementation in the DOM (one day, I will blog about that project).

Bugzilla Statistics

As is my tradition, here's this year's Bugzilla User Statistics. I've also done last year's because I'd gathered the data to write this post for last year, and then never posted

Year 3

  • Bugs filed: 459 (+183)
  • Comments made: 2113 (+624)
  • Assigned to: 208 (+90)
  • Commented on: 631 (+222)
  • Patches submitted: 762 (+230)
  • Bugs poked: 784 (+277)

Year 4

  • Bugs filed: 601 (+142)
  • Comments made: 2718 (+605)
  • Assigned to: 275 (+67)
  • Commented on: 930 (+299)
  • Patches submitted: 894 (+132)
  • Bugs poked: 1241 (+457)