We’ve migrated the cockroach build system to Bazel. Bazel is a modern build system with better performance characteristics and correctness guarantees than we had with make/go build. Today, you can perform almost all day-to-day CRDB dev tasks with Bazel rather than with make. make is deprecated and we will remove this option for building Cockroach at some point in the next release cycle.

...

Follow the directions on "Getting and building CockroachDB from source" or "Building from source on macOS" to get your development environment set up.

Note that you do need a full installation of Xcode to build on macOS. (No, a command-line tools instance does not suffice.) If you’ve already installed a command-line tools instance as part of setting up Homebrew or other tools, switch to the full Xcode instance and accept the Xcode license agreement with:

Code Block
languagebash
sudo xcode-select -s /Applications/Xcode.app && sudo xcodebuild -license accept

(You may also have to start the Xcode application one time after installing or upgrading to initialize it. After this it does not need to be opened again.)

Getting started

Introduction to dev

...

./dev build short also works and will stage the binary at ./cockroach. Bazel will pretty-print build output to the terminal.

You can also build the full cockroach binary, which includes the JavaScript build.

Code Block
bazel build pkg/cmd/cockroach --config=with_ui

./dev build (or equivalently, ./dev build cockroach) will do the same and will stage the binary at ./cockroach like above.

dev build is a light wrapper for bazel build that supports aliases for common targets (for example, ./dev build crlfmt instead of the harder-to-remember bazel build @com_github_cockroachdb_crlfmt//:crlfmt). dev also copies binaries out of the Bazel output directories for you into your workspace; for example, bazel build pkg/cmd/cockroach-short puts the binary in _bazel/bin/pkg/cmd/cockroach-short/cockroach-short_/cockroach-short, but if you dev build short the binary will be staged at ./cockroach instead.
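As a rough illustration of the staging described above, the sketch below derives the Bazel output path from a package path. It follows only the single cockroach-short example in the text; whether other targets use the same layout is an assumption, and bazel_bin_path is a hypothetical helper, not part of dev.

```shell
#!/usr/bin/env bash
# Sketch: guess where Bazel leaves a Go binary for a given package path.
# The "<pkg>/<name>_/<name>" layout is taken from the one example in the text.
bazel_bin_path() {
  local pkg="$1"            # e.g. pkg/cmd/cockroach-short
  local name="${pkg##*/}"   # last path component, e.g. cockroach-short
  printf '_bazel/bin/%s/%s_/%s\n' "$pkg" "$name" "$name"
}

bazel_bin_path pkg/cmd/cockroach-short
```

With dev build short you do not need this path at all, since the binary is copied to ./cockroach for you.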

Warning

If you encounter build errors related to the cluster-ui dependency, you may have to clean the cache. Run ./dev ui clean --all && ./dev cache reset and retry the build.

Info

To build cockroach with the UI on older versions of the code, adding --config with_ui to the bazel build may be necessary.

Run ./dev help build for more information about what you can do with dev build. Note that you can pass additional arguments directly to bazel build by adding them after --:

...

--cross takes an optional argument which is the platform to cross-compile to: --cross=linux, --cross=windows, --cross=macos, --cross=linuxarm, --cross=macosarm. dev will copy the built binaries into the artifacts directory in this case. Note that cross-building requires Docker or a Docker-compatible system such as Rancher Desktop. Cross-compiling should work on M1 Macs, but this support is experimental, so report any issues you observe.
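A minimal sketch of checking a platform name against the --cross values listed above before invoking dev. The cross_build_cmd helper is hypothetical and only prints the command; it does not run a build.

```shell
#!/usr/bin/env bash
# Sketch: validate a --cross platform against the values named in the text
# and print the corresponding dev invocation.
cross_build_cmd() {
  case "$1" in
    linux|windows|macos|linuxarm|macosarm)
      printf './dev build --cross=%s\n' "$1" ;;
    *)
      printf 'unknown cross platform: %s\n' "$1" >&2
      return 1 ;;
  esac
}

cross_build_cmd linuxarm
```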

For more debugging tips on building with Bazel, see “How to ensure your code builds with Bazel”.

...

  • dev test has a --stress flag for running tests under stress and --race for running tests with the race detector enabled.

  • Next to the test.log file produced by your test, you can find a test.xml file. This file contains specific information on each test run and its status, as well as timing information.

  • The -v argument to dev test will result in more verbose logging as well as more detailed information written to the test.xml. You can make this the default behavior on your machine by adding test --test_env=GO_TEST_WRAP_TESTV=1 to your .bazelrc.user file.

  • As with dev build, dev test allows you to pass additional arguments directly to Bazel by putting them after --: for example, dev test pkg/sql/types -- --verbose_failures --sandbox_debug.

  • To get test results printed as tests are being run, add -v -- --test_output streamed to the test command. Note that this reduces test parallelism.

  • To attach a debugger to a hung dev test process, tack -- -c dbg onto the end of your command. This disables stripping, which otherwise breaks dlv (https://github.com/bazelbuild/intellij/issues/2313).

  • For more tips on debugging test failures, see “How to ensure your tests can run in the Bazel sandbox”.
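The .bazelrc.user tip above can be applied without creating duplicate entries. The following is a sketch only: add_bazelrc_line is a hypothetical helper, and the demo writes to a temporary file rather than to your real .bazelrc.user.

```shell
#!/usr/bin/env bash
# Sketch: append a line to a bazelrc-style file only if it is not already there.
add_bazelrc_line() {
  local file="$1" line="$2"
  touch "$file"
  # -x matches whole lines, -F treats the line as a fixed string.
  grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

rc="$(mktemp)"   # stand-in for .bazelrc.user in the root of the repo
add_bazelrc_line "$rc" 'test --test_env=GO_TEST_WRAP_TESTV=1'
add_bazelrc_line "$rc" 'test --test_env=GO_TEST_WRAP_TESTV=1'   # second call is a no-op
cat "$rc"
```

In real use you would pass the path to .bazelrc.user instead of the temporary file.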

Other tasks

Code Block
# Build dev
./dev build dev
# Build crlfmt
./dev build crlfmt
# Build roachprod
./dev build roachprod
# Run acceptance tests
./dev acceptance
# Run compose tests
./dev compose
# Run benchmarks for pkg/sql/parser
./dev bench pkg/sql/parser

dev vs. make

This is a (non-exhaustive) 1-to-1 mapping of dev commands to their make equivalents. Feel free to add to this.

| dev/bazel command | equivalent non-bazel command |
|---|---|
| ./dev build | make build |
| ./dev build short | make buildshort |
| ./dev build pkg/sql/... | make build PKG=./pkg/sql/... |
| ./dev test | make test |
| ./dev test pkg/sql/parser -f TestParse | make test PKG=./pkg/sql/parser TESTS=TestParse |
| ./dev test pkg/sql/parser -f TestParse --test-args '-test.count=5 -show-logs' | make test PKG=./pkg/sql/parser TESTS=TestParse TESTFLAGS='-count=5 -show-logs' |
| ./dev bench pkg/sql/parser -f BenchmarkParse | make bench PKG=./pkg/sql/parser BENCHES=BenchmarkParse |
| ./dev build --cross | build/builder.sh mkrelease |
| ./dev builder | build/builder.sh |
| ./dev testlogic base --files=fk --subtests=20042 --config=local | make testbaselogic FILES=fk SUBTESTS=20042 TESTCONFIG=local |
| Add build --define gotags=bazel,gss,X,Y to .bazelrc.user, then run dev | make ... TAGS=X,Y |
| Add gc_goopts = ["-S"], to the go_library target in the BUILD.bazel file for the package you’re interested in, then run dev | make ... GOFLAGS=-gcflags=-S |
| Update the go_repository() declaration in DEPS.bzl for your dependency to point to a new remote and commit (see top-level comment in DEPS.bzl for more information), then build/test | Update local sources in vendor including your changes, then build/test |

Code Block
# Generate code and docs (run this before submitting your PR).
./dev generate
# Generate changes to BUILD.bazel files
./dev generate bazel --short
# Run lints
./dev lint
# logic tests!
./dev testlogic --files=$FILES --subtests=$SUBTESTS --config=$CONFIG
# Open a container running the "bazelbuilder" image. Requires Docker/Rancher Desktop/Podman/etc.
./dev builder
# Remove artifacts from building the UI
./dev ui clean --all
# Start the Bazel cache server after rebooting
./dev cache

To pass -gcflags to the build of a library, add gc_goopts = ["-S"], to the go_library target in the BUILD.bazel file for the package you’re interested in, then run dev as usual.

To override a dependency for local builds when doing automation, update the go_repository() declaration in DEPS.bzl for your dependency to point to a new remote and commit (see top-level comment in DEPS.bzl for more information), then build/test as usual.

General dev tips

The top-level dev script uses Bazel to build pkg/cmd/dev before running unless another dev binary with the same increasing integer ID has already been built. Generally dev will invoke the dev binary “as of” that commit, which should usually be the correct behavior. However, if the default behavior does not work for some reason, you can find all the built versions of dev under bin/dev-versions.
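The caching behavior described above can be pictured with the following sketch. It is not the actual dev script: the dev.<ID> file naming inside the versions directory and the helper names are assumptions made for illustration, and the demo uses a temporary directory in place of bin/dev-versions.

```shell
#!/usr/bin/env bash
# Sketch: rebuild dev only when no binary exists for the current version ID.
# Hypothetical layout: one cached binary per ID, named dev.<ID>.
dev_binary_path() {
  printf '%s/dev.%s\n' "$1" "$2"
}

needs_rebuild() {
  local versions_dir="$1" id="$2"
  [ ! -x "$(dev_binary_path "$versions_dir" "$id")" ]
}

demo_dir="$(mktemp -d)"   # stand-in for bin/dev-versions
needs_rebuild "$demo_dir" 42 && echo "version 42: build needed"
touch "$demo_dir/dev.42" && chmod +x "$demo_dir/dev.42"
needs_rebuild "$demo_dir" 42 || echo "version 42: reuse cached binary"
```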

A (hopefully) fast and error proof dev workflow

1. Switch to a new branch

2. If your workflow involves an IDE, generate your protos: ./dev gen protobuf

  • Your IDE relies on generated files for many tasks (e.g. code navigation, IntelliSense, debugging), and will complain unless you have re-generated those files.

    • If you need to re-generate all generated go files, use the slower ./dev gen go

    • If the above fails, run the slowest ./dev gen to update all of your generated files.

    • If this fails too, try git clean. If GoLand complains about dependent packages, try git clean -dfx pkg instead. Then repeat the steps above.

    • You may recall that with make, this step was not necessary. If you’re curious why, see this slack thread.

3. If your workflow involves UI development, you’ll additionally want to do the following:

Code Block
./dev gen protobuf 
./dev generate js
# start a cockroach node, e.g.
./dev build && ./cockroach start-single-node
# in separate window, start UI watch for incremental UI builds
./dev ui watch
# now you're ready to write UI code!

4. Write some code!

  • If you don’t have crlfmt already, you’ll need to ./dev build crlfmt to use it for formatting.

  • If you add new files or imports, run ./dev gen bazel before compiling or running a test. compilepkg: missing strict dependencies: is usually the indicator that ./dev gen bazel needs to be re-run.

    • To skip this step, see the tip below on ALWAYS_RUN_GAZELLE.

  • Build the binary: ./dev build short

5. Run a test

  • In an IDE: your normal workflow should work if your generated files are up to date (see step 2).

  • From the command line: ./dev test [path/to/pkg] --filter [test_name]

6. Before opening/updating a PR:

  • Run ./dev lint --short (maybe additionally make lintshort as dev's linter doesn’t have 100% coverage yet)

  • Assert your workspace is clean by running ./dev gen bazel. If you modified other generated files, run the appropriate ./dev gen [file_type] command.

Rapidly iterating with dependencies

The file DEPS.bzl tells Bazel how to download dependencies. For production, we point to .zip files that are mirrored on our internal infrastructure, protecting us against dependency yanking/“left-pad”-style failures. However, for local development, you have a few other options.

The comment at the top of DEPS.bzl explains how to point to a custom remote for a dependency, for example:

Code Block
go_repository(
    name = "com_github_cockroachdb_sentry_go",
    build_file_proto_mode = "disable_global",
    importpath = "github.com/cockroachdb/sentry-go",
    vcs = "git",
    remote = "https://github.com/rickystewart/sentry-go",  # Custom fork.
    commit = "6c8e10aca9672de108063d4953399bd331b54037",  # Custom commit.
)

In this example, github.com/cockroachdb/sentry-go will point to the given remote and commit instead of using the production version of the library. Note the remote can be either a normal git https remote or it can be a local clone.

In this case, iterating can be cumbersome as you have to update the commit whenever you want to pull a new version of the dependency. You can use the Bazel flag --override_repository to optimize for this case, so you can make changes locally on your machine and immediately re-build cockroach with your latest local changes instead of updating the dependency to point to a new commit whenever you want to test your changes. The following explanation is copy-pasted from internal Slack:

The process doesn't vary per dependency so I'll demonstrate with github.com/google/btree. First I'm going to clone that repo and check out the version of the code I want.

Code Block
google$ git clone https://github.com/google/btree
Cloning into 'btree'...
remote: Enumerating objects: 163, done.
remote: Counting objects: 100% (40/40), done.
remote: Compressing objects: 100% (22/22), done.
remote: Total 163 (delta 16), reused 24 (delta 10), pack-reused 123
Receiving objects: 100% (163/163), 77.18 KiB | 1.07 MiB/s, done.
Resolving deltas: 100% (84/84), done.
google$ cd btree
btree$ git checkout v1.0.1
Note: switching to 'v1.0.1'.
........
HEAD is now at 479b5e8 Minor documentation fix, DescendGreaterThan starts with the last item in the tree and decends to the least item greater than the pivot
btree$ pwd
/Users/ricky/go/src/github.com/google/btree

Back in cockroach I update .bazelrc.user to point to the clone I just made. The form of the flag is --override_repository=REPO_NAME=/path/to/local/repo. The flag tells Bazel to ignore where REPO_NAME "really is", and instead just use the local clone. Note that DEPS.bzl declares the "name of the repo" which in this context is Java-style, like com_github_google_btree. I am going to add it to .bazelrc.user so I don't have to remember to add the flag every time, although it's a normal Bazel flag so you can just include it on the command-line too.

Code Block
cockroach$ echo 'build --override_repository=com_github_google_btree=/Users/ricky/go/src/github.com/google/btree' >> .bazelrc.user

The first thing I'll do is build just to demonstrate that what I've done so far is a no-op.

Code Block
cockroach$ ./dev build short 
$ bazel build //pkg/cmd/cockroach-short:cockroach-short
INFO: Invocation ID: bc808544-70de-4fc9-968c-66a815f64437
ERROR: /Users/ricky/go/src/github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/BUILD.bazel:4:11: //pkg/ccl/changefeedccl:changefeedccl depends on @com_github_google_btree//:btree in repository @com_github_google_btree which failed to fetch. no such package '@com_github_google_btree//': No WORKSPACE file found in /private/var/tmp/_bazel_ricky/be70b24e7357091e16c49d70921b7985/external/com_github_google_btree

Oh, whoops. The BUILD.bazel file and WORKSPACE files are missing because I didn't run Gazelle. Let me fix that. From back in the btree directory:

Code Block
# NB: The WORKSPACE file needs to exist, it can be empty though.
btree$ touch WORKSPACE
btree$ go install github.com/bazelbuild/bazel-gazelle/cmd/gazelle@latest
go: downloading github.com/bazelbuild/bazel-gazelle v0.29.0
go: downloading github.com/bazelbuild/buildtools v0.0.0-20230111132423-06e8e2436a75
go: downloading github.com/bmatcuk/doublestar/v4 v4.6.0
btree$ ~/go/bin/gazelle -go_prefix=github.com/google/btree -repo_root=.
# Validate the BUILD.bazel file was created
btree$ git status
HEAD detached at v1.0.1
Untracked files:
  (use "git add <file>..." to include in what will be committed)
	BUILD.bazel
	WORKSPACE

nothing added to commit but untracked files present (use "git add" to track)

Now the build will succeed back in cockroach.

You only need to use gazelle to generate files once, unless as part of your changes you create a new file, update a dependency, or do something else that changes the actual build process. In that case you can re-run gazelle to fix it.

When you're done with testing your local changes, you can remove the --override_repository line from .bazelrc.user.
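The "Java-style" repository names mentioned above can be derived mechanically from a Go import path. The sketch below reproduces the pattern seen in the two examples in this section (com_github_google_btree and com_github_cockroachdb_sentry_go); it is an approximation of the naming convention rather than the exact rule Gazelle applies, bazel_repo_name is a hypothetical helper, and /path/to/local/repo is a placeholder.

```shell
#!/usr/bin/env bash
# Sketch: turn a Go import path into a Bazel repository name by reversing the
# host part and replacing every non-alphanumeric character with "_".
bazel_repo_name() {
  local import_path="$1"
  local host="${import_path%%/*}"   # e.g. github.com
  local rest=""
  case "$import_path" in
    */*) rest="${import_path#*/}" ;;  # e.g. google/btree
  esac
  local reversed="" part
  local IFS=.
  for part in $host; do             # split the host on dots
    reversed="${part}${reversed:+.$reversed}"
  done
  local name="${reversed}${rest:+/$rest}"
  printf '%s\n' "${name//[^a-zA-Z0-9]/_}"
}

# Print the line you would add to .bazelrc.user for a local clone.
printf 'build --override_repository=%s=%s\n' \
  "$(bazel_repo_name github.com/google/btree)" "/path/to/local/repo"
```

It is still worth checking the name against the go_repository() entry in DEPS.bzl, since that declaration is what Bazel actually uses.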

General Bazel tips

  • Bazel has a configuration file called .bazelrc. You can put a global configuration file at ~/.bazelrc or a per-repository file at .bazelrc.user in the root of your cockroach repo.

  • Stripping: stripping of symbols in built binaries is enabled by default, as disabling stripping slows down linking drastically. You can disable stripping with the Bazel flags -c dbg or -c opt (if you are making a binary you wish to debug, you will use -c dbg), or you can turn it off regardless of compilation mode with --strip=never.

    • Binaries built with dev build --cross, by the release process, or for nightly roachtests are built with -c opt and will therefore be unstripped.

  • While Bazel is the “official” build system, you do not have to use it for normal development. For example, many people do development from their IDEs, and this is expected to “just work”. Note that since not all generated code is checked into the repo, you’ll first have to generate code to get much of it to build from a non-Bazel build system. We refer to this as the “escape hatch”. This escape hatch is specifically supported, so if you have difficulty running a test in another build system after generating code, that’s a bug you should report. You can run the following commands to make this happen:

    • dev gen go

      • Generates all .go code that goes into the build, including cgo code

    • dev gen cgo

      • Generates some stub files that tell cgo how to link in the c-deps; part of dev gen go

    • dev gen protobuf

      • Generates all .pb.go/.pb.gw.go files; part of dev gen go

  • Tired of running ./dev gen bazel? Set the ALWAYS_RUN_GAZELLE env-var to automatically run gazelle before every dev test or dev build incantation. Note this does add a tiny delay – noticeable when iterating on small tests through dev test.

    • e.g. echo 'export ALWAYS_RUN_GAZELLE=1' >> ~/.zshrc

    • Note that gazelle is only a subset of the actions that dev gen bazel performs. This by itself is able to handle most updates to the code, but is not able to handle things like vendoring new dependencies (dev gen bazel can do this for you).

  • If you have ccache installed, bazel will fail with an error like ccache: error: Failed to create temporary file for /home/alyshanjahani/.ccache/tmp/message_li.stdout: Read-only file system. To avoid this you should get the ccache links out of your PATH manually (i.e. uninstall ccache), and then you might need to do bazel clean --expunge.

    • Alternatively, if you would like to use Bazel with ccache, you can enable support for writing outside the sandbox by adding the following to your $HOME/.bazelrc or <repo>/.bazelrc.user file:
      - For macOS/Darwin:

      Code Block
      build --sandbox_writable_path=/Users/<USER>/Library/Caches/ccache/

      - For Linux:

      Code Block
      build --sandbox_writable_path=/home/<USER>/.ccache

If you’re using a different ccache directory (ccache --get-config cache_dir) point to that instead.
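To avoid hard-coding the cache directory, the line for your bazelrc file could be generated as sketched below. This assumes your ccache version supports --get-config (the command mentioned above); when ccache is missing or the query fails, the sketch falls back to the two default locations listed above. ccache_dir is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: print a sandbox_writable_path line for the ccache directory in use.
ccache_dir() {
  local dir=""
  if command -v ccache >/dev/null 2>&1; then
    # Ask ccache where its cache lives; ignore failures from old versions.
    dir="$(ccache --get-config cache_dir 2>/dev/null)" || dir=""
  fi
  if [ -z "$dir" ]; then
    # Fall back to the per-OS defaults named in the text.
    case "$(uname -s)" in
      Darwin) dir="$HOME/Library/Caches/ccache/" ;;
      *)      dir="$HOME/.ccache" ;;
    esac
  fi
  printf '%s\n' "$dir"
}

printf 'build --sandbox_writable_path=%s\n' "$(ccache_dir)"
```

The printed line can then be appended to $HOME/.bazelrc or <repo>/.bazelrc.user.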

dev vs. Bazel

You can always use bazel directly instead of going through dev but there are some things you might want to keep in mind:

  • You should still ask dev doctor if your machine is up-to-snuff before you try to bazel build. The checks it performs aren’t dev-specific. dev doctor also sets up a local cache for you.

  • dev prints out the (relevant) calls to bazel it makes before it does so. You can therefore run dev once just to learn how to ask Bazel to perform your build/test and then just directly call into bazel on subsequent iterations.

    • When running tests under stress, race, or --rewrite, dev does the legwork to invoke bazel with the necessary flags. This involves running under another binary (stress), running with certain gotags (race), or allowing certain paths outside the bazel sandbox to be written to (testdata). Feel free to see the actual bazel command invoked and tweak as necessary.

  • If you want to build a test without running it, you must include the --config test argument to bazel build. (dev takes care of this for you if you are using it.)

Managing CPU resources available to a test under Bazel

In Bazel, all tests under a given Go package belong to the same test target. Test targets can be sharded into multiple shards and each shard will run a subset of the tests in a given test target. Shards can run in parallel but tests within each shard run sequentially.
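To make the sharding model concrete, here is a sketch of a round-robin split of tests across shards. Bazel tells a sharded test process which shard it is through the TEST_SHARD_INDEX and TEST_TOTAL_SHARDS environment variables; the round-robin assignment itself is an assumption for illustration, and the test names are made up.

```shell
#!/usr/bin/env bash
# Sketch: list the tests one shard would run, in order, under a round-robin split.
tests_for_shard() {
  local shard_index="$1" total_shards="$2"
  shift 2
  local i=0 t
  for t in "$@"; do
    if [ $(( i % total_shards )) -eq "$shard_index" ]; then
      printf '%s\n' "$t"
    fi
    i=$(( i + 1 ))
  done
}

# Shard 0 of 2 runs every other test, one after another.
tests_for_shard 0 2 TestA TestB TestC TestD TestE
```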

By default, each shard gets 1 CPU core and as a result each test has 1 CPU core available to it. This can be adjusted by adding the following to the go_test rule for that test target (found in BUILD.bazel):

Code Block
tags = ["cpu:n"]

Note: Adjusting the number of CPU cores using the method above will adjust the number of CPU cores available to all tests in that test target. If you need to adjust the number of cores for a single test (or a few tests), extract it into a separate package and adjust the number of cores there to avoid reserving extra CPU cores for tests that don’t need them.
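A back-of-the-envelope sketch of why the note above matters: if every shard of a target reserves n cores and all shards happen to run at the same time, the reservation grows with the shard count. The shard counts and core counts below are made-up numbers, and reserved_cores is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: cores reserved by one test target when all of its shards run at once.
reserved_cores() {
  local shards="$1" cpus_per_shard="$2"
  echo $(( shards * cpus_per_shard ))
}

reserved_cores 4 1   # default of 1 core per shard
reserved_cores 4 8   # after setting tags = ["cpu:8"] on the whole target
```

Moving the one CPU-hungry test into its own package keeps the larger reservation limited to that package's shards.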