Bors Merge Bot

Overview

To guard against merge skews we merge pull requests using a bot. Craig is that bot, an instance of bors-ng. It compiles approved PRs as if they were merged, and merges them only after the CI on the original PR passes and the merge commit compiles successfully (which is used as a rough proxy for there being no merge-skews). This is in contrast to the default GitHub workflow, which tests the PR itself, but doesn't test the merge to master until after merging.

Workflow

For master branch

The normal workflow is pretty simple: once a PR has received an LGTM, add a comment saying bors r=<reviewer>[,<reviewer>]*. You can also use bors r+ which will make your name show up as the reviewer in the git log (See the Bors docs for a full list of commands). Bors will typically wait till the CI on your PR turns green before attempting to compile the merge commit. If you're amending + force pushing a commit and bors r+ immediately, you might see it not wait. What we think happens is that bors looks towards the previous commit status instead of the newest one (because on webhook race behavior - this is all speculation). If you don't see the following message, you've done it too soon, and risk bors failing after the fact.

Waiting for PR status (Github check) to be set, probably by CI. Bors will automatically try to run when all required PR statuses are set.

Craig will take the approved PR (along with any other PRs approved at around the same time), merge them to master (or whatever the PR's target branch is), and push that as the branch named staging. This will trigger the TeamCity step for the merge candidate. If everything goes right and the build compiles, Craig will fast-forward master to staging, and each of the PRs will be marked as merged. If at any time you need to cancel your in progress merge, type bors r-.

If the combined changes end up breaking the build, you've found a situation where Craig saved us from "merge skew". Craig will then proceed to bisect the PRs to identify which one(s) can merge safely and which one(s) caused the problem. If your PR merges successfully at this stage, great! Otherwise, Craig will comment with details about the failed build. At that point you'll need to investigate and resolve the issue. If the failure was due to merge skew, you'll need to resolve the merge.

If you'd like Craig to build your PR without merging to master, you can say bors try.

For non-master branches

Bors is only configured to work for the master branch. For all other branches, manually merge using the GitHub Merge button.

Priority

Generally you shouldn’t provide a priority for your PR when merging with bors. As an exception to this rule, some PR’s should be merged with the standard priority p=99.

Pull requests should be bors’ed with the standard priority 99 (bors r+ p=99) if and only if they a) skip or fix a flaky test or otherwise make CI more stable, and b) the flakiness being fixed is somewhat likely to be encountered in CI or as part of a bors (staging) CI run.

The rationale for these criteria is that bors splits up batches by priority. Should you choose a higher priority than all other PR’s in the queue, then bors will put the PR in a batch by itself, eschewing anything else already in the queue. This is not desirable unless that PR fixes an important issue that could conceivably be encountered in a staging build. If not, then all the other (potentially more important) PR’s in the queue were “jumped” for no reason. We choose one standard priority number 99 (arbitrarily) as bors can group together PR’s with the same priority, so if there are 4 test-skip PR’s in the queue and 6 other PR’s, bors knows to batch the 4 test-skip PR’s together and then only try merging the other 6 PR’s after the fact.

The criterion (b) is somewhat vague by design. If you’re in doubt, setting p=99 is probably OK especially if the flake has already been seen several times in CI or in automation. If the flake is encountered only under rare circumstances then it’s probably not necessary to configure a higher priority, and we benefit from letting the queue progress normally.

In general, there is no need to use a non-standard priority other than p=99 or the default. Do not set a high priority just because you would like your change to get in faster: while it may be convenient for you, you end up slowing down everyone else (and in all likelihood you are not saving yourself that much time anyway).

Using single on, even with p=99, is not necessary. In general, we want different PR’s with the same priority to be batched together, and single on disables this desirable behavior.

The above policy applies broadly and in general across PR’s to cockroach. However, breaking these rules can be legitimately useful at times for solving operational problems, or for fast-tracking very high-priority changes that legitimately need to take precedence over everything else for a reason unrelated to CI stability. If you think your PR merits an exception to the policy, ask in the #dev-inf channel to make sure.

FAQ

Craig (bors) told me "Permission denied"!

Other FAQs:

  • How is Bors configured? There's not many configuration options, but what there is can be found in bors.toml in the .github directory.

  • When does Bors process the config file? At the time commands are issued, Bors looks for a bors.toml in the PR to be merged. This means if you haven't changed it, the config file from your PR's target branch is used.

  • Is there a log of what Bors does somewhere? Yep! For this repo, you can find it here.

  • How do I add Craig to other repos? Craig is already listening to all the repos under @cockroachdb. There are two steps to making Craig respond to commands somewhere else: part one is to add a bors.toml config file to the repo that tells Bors what CI status to wait on, and part two is adding everyone as reviewers on the repo's settings page on the Bors dashboard.

  • Should I use bors for backports? No, use the green merge button instead.

  • My question isn't answered here! Ask on the mailing list, or if you're an internal Cockroach dev, on Slack in the channel #bors.

Known Limitations

  • #163: currently Bors will not automatically close issues if they are listed only in commit messages, however they will be closed if they appear in the PR description.

  • #165: user management is currently relatively manual. If you need permissions to act as a reviewer, ask someone to add you.

  • See workflow race condition above. Maybe we could add an arbitrary sleep in bors code to prevent that from happening.

  • If bors fails for the above reason, it doesn't notify the PRs that it's done so, it fails silently. This is annoying.

More information

Copyright (C) Cockroach Labs.
Attention: This documentation is provided on an "as is" basis, without warranties or conditions of any kind, either express or implied, including, without limitation, any warranties or conditions of title, non-infringement, merchantability, or fitness for a particular purpose.