Your first CockroachDB PR
This document is a long-winded guide to preparing and submitting your first contribution to CockroachDB. It's primarily intended as required reading for new Cockroach Labs engineers, but may prove useful to external contributors too.
The development cycle
At the time of this writing, CockroachDB is several hundred thousand lines of Go code, plus a smattering of C++ and TypeScript. This section is a whirlwind tour of what lives where.
First things first
Since CockroachDB is open source, most of the instructions for hacking on CockroachDB live in this repo, cockroachdb/cockroach. You should start by reading the top level of this section of the wiki.
Then, look at our Go (Golang) coding guidelines and the Google Go Code Review guide they link to. These are linked to from the other pages in this wiki, but they're easy to miss. If you haven't written any Go before CockroachDB, you may want to hold off on reviewing the style guide until you've written your first few functions in Go.
Repository layout
Here's what's in each top-level directory in this repository:
build/
Build support scripts.
c-deps/
Glue to convince our build system to build non-Go dependencies. At the time of writing, "non-Go dependencies" means C or C++ dependencies.
cloud/kubernetes/
Kubernetes configuration to auto-launch CockroachDB clusters.
docs/
Documentation for CockroachDB developers. See "Internal documentation" below.
monitoring/
Configuration to integrate monitoring frameworks, namely Prometheus and Grafana, with CockroachDB. This configuration powers our internal monitoring dashboard as well.
pkg/
First-party Go code. See "Internal documentation" below for details.
scripts/
Handy shell scripts that aren't part of the build process. You'll likely interact with scripts/gceworker.sh most, which spins up a personal Linux VM for you to develop on in the GCE cloud.
Other important repositories
Besides cockroachdb/cockroach, the cockroachdb GitHub organization is home to several other important open-source components of Cockroach:
cockroachdb/docs, which houses the code behind our user-facing documentation at cockroachlabs.com/docs. At the time of writing, our stellar docs team handles essentially all documentation.
cockroachdb/examples-go, which contains small, self-contained Go programs that exercise CockroachDB via the PGWire protocol. You're likely to hear most about block_writer, which writes uniformly random values into a table, and photos, which simulates a more-realistic workload of a photo-sharing site, where some photos and users are orders of magnitude more popular. The other example programs are of a similar scope and purpose, but block_writer and photos are deemed important enough to run constantly against our production clusters.
cockroachdb/examples-orms, which showcases a toy API that uses an ORM to prepare its responses, implemented in several different languages.
Most of the remaining repositories under the cockroachdb organization are forks of existing Go libraries with some small, Cockroach-specific patches.
Internal documentation
Documentation on the first-party Go packages that make up CockroachDB is, as of this writing, essentially nonexistent. This is par for the course with code that's evolving as quickly as Cockroach, but it's something we're hoping to improve over time, especially as internal packages stabilize.
The internal documentation that we do have lives in cockroachdb/cockroach/docs. At the time of writing, most of this documentation covers the high-level architecture of the system. Only a few documents home in on specifics, and even those only cover the features that were found to cause significant developer frustration. For most first-party packages, you'll need to read the source for usage instructions.
For our internal docs, I recommend the following reading order.
First, browse through the design document, which describes the architecture of the entire system at the highest possible level. You'll likely find there's too much information here to digest in one sitting: you should instead strive to remember what topics are covered, so you can refer to it later with more specific questions in mind.
Then, look through the docs/tech-notes folder and determine if any of the tech notes are relevant to your starter project. Again, you'll likely find that the tech notes contain too much information to process, so instead try to identify the sections that are likely to be useful as you make progress on your starter project.
The one exception to this rule is docs/tech-notes/contexts.md. It's worth learning why so many of our function signatures take a context as their first parameter, like so:
func doOperation(ctx context.Context, ...)
Otherwise, plumbing contexts everywhere will feel like a chore with no upsides.
Finally, I feel obligated to reproduce this disclaimer from the tech notes README:
Standard disclaimer: each document contains parts from one or more author. Each part was authored to reflect its author's perspective on the project at the time it was written. This understanding is necessarily subjective: its context is both the state of the project and the authors', and their reviewer