Introduction and strategy
Multi-tenant CockroachDB is a new way to structure the CockroachDB technology so that logical clusters are isolated from one another. This is most useful when a common distributed storage layer is shared across competing customers.
(As an analogy, multi-tenant CockroachDB virtualizes CockroachDB SQL in much the same way that containers or VMs virtualize hosted servers.)
Today (early 2022), the multi-tenant architecture is only available inside the CockroachCloud Serverless product. Eventually, however, we wish to evolve CockroachDB to serve all application traffic using the multi-tenant architecture, including inside CockroachCloud Dedicated and for licensed self-hosted CockroachDB customers.
In the words of our CTO, “Multi-tenant CockroachDB is the way CockroachDB should have been designed from the start.”
This also means that we are now focusing our development on maximizing the application developer experience on top of the multi-tenant architecture.
Care must be taken to distinguish the internal product architecture, discussed here, from the ability to actually run two or more tenants side-by-side:
Cockroach Labs would retain the exclusive right to define more than one tenant side-by-side on a shared storage cluster, via the Serverless product offering.
In CockroachCloud Dedicated and for self-hosted deployments, applications will be able to utilize a single pre-defined virtual cluster layered on top of the multi-tenant architecture, without the capability to define more tenants.
Overview of run-time components
Summary table
Deployment components: the deployment/SRE view
Description | In-code abstraction | In-memory instance | Unix process | Running container |
Routes SQL clients to the right server | “SQL proxy” | “SQL proxy instance” | “SQL proxy server” | “SQL proxy pod” |
Runs SQL queries | “SQL” or “SQL gateway” | “SQL instance” | “SQL server”, or “SQL-only server” to highlight that the server contains no KV instance | “SQL pod” (implies “SQL-only server”) |
Runs KV queries | “KV components” (plural) | “KV instance” | “KV server”; the term includes mixed servers, since we don't yet support KV-only servers. | N/A: we don't currently run KV-only servers. |
Runs both SQL and KV queries | | | NEW: “Mixed SQL/KV servers” | NEW: “Mixed SQL/KV pods” |
Stores data for multiple tenants, 1 unit | | | NEW: “Shared storage/DB server” | NEW: “Shared storage/DB pod” |
Stores data for all tenants, fleet of all servers | | | NEW: “Shared storage cluster” | NEW: “Shared storage cluster” |
We also use the word “node” to designate either a Unix process or a Docker container, when the distinction does not matter.
Logical components: the account administrator's view
What's virtualized | New name for the virtualized logical concept | Previous terminology | New name for the physical infrastructure |
The CockroachDB cluster service, as a whole | “Virtual cluster” | “Cluster” | N/A: the underlying infrastructure is not visible to end-users any more. |
Run-time state for a (virtual) cluster | “Tenant servers/pods” | “Servers/pods” | NEW: “Shared storage/DB servers/pods” |
On-disk state for a (virtual) cluster | “Tenant-specific data” | “CockroachDB data” | NEW: “Shared storage/DB data” |
The virtual cluster used to administer other virtual clusters | NEW: “System cluster” | | |
Ownership (not data) | “Tenant” | “User” | |
Beware of the difference between “Shared storage cluster” (deployed system) and “System cluster” (logical cluster an administrator connects to, to create additional virtual clusters).
Architectural terms
SQL Proxy
Role:
Accepts incoming connections from client apps
Determines which tenant the connection is for
Routes each connection to a SQL instance
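The three responsibilities above can be sketched as a small routing function. This is a minimal illustration, not the actual proxy implementation: the `SQLProxy` class, the `tenant` connection parameter, the instance addresses, and the round-robin choice among a tenant's SQL instances are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import Dict, Iterator, List


@dataclass
class SQLProxy:
    """Toy model of a SQL proxy (hypothetical names, not the real code)."""

    # Hypothetical directory: tenant name -> addresses of its SQL instances.
    directory: Dict[str, List[str]]
    _cursors: Dict[str, Iterator[str]] = field(
        default_factory=dict, init=False, repr=False
    )

    def tenant_for(self, conn_params: Dict[str, str]) -> str:
        """Determine which tenant an incoming connection is for."""
        # Assumption: the client names its tenant in a "tenant" parameter.
        tenant = conn_params.get("tenant")
        if tenant is None or tenant not in self.directory:
            raise LookupError(f"unknown tenant: {tenant!r}")
        return tenant

    def route(self, conn_params: Dict[str, str]) -> str:
        """Return the address of the SQL instance that should serve the connection."""
        tenant = self.tenant_for(conn_params)
        if not self.directory[tenant]:
            raise LookupError(f"no SQL instance available for tenant {tenant!r}")
        # Assumption: plain round-robin across the tenant's SQL instances.
        if tenant not in self._cursors:
            self._cursors[tenant] = cycle(self.directory[tenant])
        return next(self._cursors[tenant])


# Example values only; the addresses are made up.
proxy = SQLProxy({"acme": ["10.0.0.1:26257", "10.0.0.2:26257"]})
print(proxy.route({"tenant": "acme"}))  # prints 10.0.0.1:26257
```

The point of the sketch is that a connection is never routed to a SQL instance belonging to a different tenant; how instances are chosen within a tenant is a separate policy decision.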
Segue: Instances, servers, pods and nodes
“Instance”: a run-time realization of a data structure in the source code. Think: class vs object.
TCP/UDP ports are attached to instances.
“Server”: a Unix process started from an executable file. Contains diverse instances.
CPU/memory/IOPS accounting commonly happens here.
“Pod”: a container, a kind of reduced virtual machine that can be managed by Kubernetes.
Usually contains 1 process, can contain more.
IP addresses and storage volumes are attached to containers.
We use the word “Node” when the distinction between “server” and “pod” does not matter.
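The containment relationships in this segue (a pod contains servers, a server contains instances, and each level owns different resources) can be modeled with a few data classes. This is only a sketch for orientation: the class and field names are hypothetical, and the port numbers, PID, IP address, and volume path are example values.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Instance:
    """Run-time realization of an in-code abstraction; ports attach here."""
    kind: str  # e.g. "SQL", "KV", "SQL proxy"
    ports: List[int] = field(default_factory=list)


@dataclass
class Server:
    """A Unix process; contains instances; resource accounting happens here."""
    pid: int
    instances: List[Instance] = field(default_factory=list)

    def ports(self) -> List[int]:
        # A server listens on whatever ports its instances own.
        return [port for inst in self.instances for port in inst.ports]


@dataclass
class Pod:
    """A container; usually one process; IP addresses and volumes attach here."""
    ip_address: str
    volumes: List[str] = field(default_factory=list)
    servers: List[Server] = field(default_factory=list)

    def endpoints(self) -> List[str]:
        # Combine the pod-level IP address with the instance-level ports.
        return [
            f"{self.ip_address}:{port}"
            for server in self.servers
            for port in server.ports()
        ]


sql = Instance(kind="SQL", ports=[26257, 8080])
pod = Pod(ip_address="10.0.0.7", volumes=["/data"], servers=[Server(4242, [sql])])
print(pod.endpoints())  # prints ['10.0.0.7:26257', '10.0.0.7:8080']
```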
SQL
NB: The name is just “SQL”.
Derived as “SQL instance”, “SQL server”, “SQL pod”, “SQL node” depending on the run-time properties of interest.
Role:
Accepts incoming connections from SQL proxy.
Responsible for SQL query execution for client apps.
Performs KV data requests to a shared storage cluster.
Also offers tenant-specific HTTP APIs.
Also known as “SQL-only server, pod, node” when the process only contains a SQL instance.
Shared storage cluster
Role (collective):
Accepts (KV) data requests from SQL instances.
Shared by many tenants.
Responsible for persisting (storing) data.
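A toy model can illustrate how one storage cluster might serve KV requests from the SQL instances of several tenants while keeping their data apart. The text does not describe how isolation is achieved, so everything here is an assumption for illustration: the in-memory dictionary standing in for persisted data, the idea of scoping every key by a tenant ID, and all class and method names.

```python
from typing import Dict, Optional, Tuple


class SharedStorageCluster:
    """Toy shared KV store: one keyspace, many tenants (illustrative only)."""

    def __init__(self) -> None:
        # A single shared map stands in for the persisted data.
        self._data: Dict[Tuple[int, bytes], bytes] = {}

    def put(self, tenant_id: int, key: bytes, value: bytes) -> None:
        # Assumption: each stored key is scoped by the requesting tenant's ID.
        self._data[(tenant_id, key)] = value

    def get(self, tenant_id: int, key: bytes) -> Optional[bytes]:
        return self._data.get((tenant_id, key))


class SQLInstance:
    """Toy SQL instance bound to one tenant; its KV requests carry that ID."""

    def __init__(self, tenant_id: int, storage: SharedStorageCluster) -> None:
        self.tenant_id = tenant_id
        self.storage = storage

    def write(self, key: bytes, value: bytes) -> None:
        self.storage.put(self.tenant_id, key, value)

    def read(self, key: bytes) -> Optional[bytes]:
        return self.storage.get(self.tenant_id, key)


storage = SharedStorageCluster()
tenant_a = SQLInstance(tenant_id=10, storage=storage)
tenant_b = SQLInstance(tenant_id=11, storage=storage)
tenant_a.write(b"/table/users/1", b"alice")
print(tenant_b.read(b"/table/users/1"))  # prints None
```

Both SQL instances talk to the same storage object, yet neither can observe the other's keys, which is the property the shared storage cluster has to provide.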
Abstract concept: KV-only server, pod, node
“KV instance”: Accepts and serves KV requests for SQL instances. This does exist.
“KV-only server”: This does not exist yet: we have not yet built the capability to run a process containing only a KV instance.
Storage server, pod, node
“Storage server”: a process that contains both a KV instance and a SQL instance.
Alternatively: “mixed KV/SQL server”.
Multiple storage servers collectively make up a “shared storage cluster”.
The SQL component here is “System SQL”:
Invisible to tenants.
Used to administer tenants and KV.
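The naming rules for SQL-only, KV-only, and storage servers can be summarized as a small classifier over the kinds of instances a process contains. This is a sketch of the terminology only; the function name and the string labels used for instance kinds are hypothetical.

```python
from typing import Set


def classify_server(instance_kinds: Set[str]) -> str:
    """Name a server process after the kinds of instances it contains."""
    has_sql = "SQL" in instance_kinds
    has_kv = "KV" in instance_kinds
    if has_sql and has_kv:
        # The SQL instance in this case is "System SQL".
        return "storage server (mixed KV/SQL server)"
    if has_sql:
        return "SQL-only server"
    if has_kv:
        # Per the text, a process containing only a KV instance cannot be run yet.
        raise NotImplementedError("KV-only servers are not supported yet")
    # Other servers (for example a SQL proxy server) are outside this sketch.
    raise ValueError("not a SQL or KV server")


print(classify_server({"SQL"}))        # prints SQL-only server
print(classify_server({"SQL", "KV"}))  # prints storage server (mixed KV/SQL server)
```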