Log and error redactability
Our customers routinely send crash reports and log files to us, but they want confidence that this data does not contain confidential information. How do we achieve this?
For this purpose, the CockroachDB source code uses redactability. This is a crdb-specific combination of data types and APIs on top of Go’s string manipulation, logging and errors APIs.
Redactability makes it possible to remove sensitive information from a string after the string has been constructed.
This wiki page explains how to maintain redactability when adding or modifying CockroachDB’s source code. For more details, see the section References at the bottom.
Main concepts / definition
We use the word “sensitive” or “unsafe” to designate information that's potentially PII-laden or confidential, and “safe” for information that's certainly known to not be unsafe.
Notice the “priority order” in this definition: information is unsafe by default, until proven safe. For example, the basic string type in Go will be considered unsafe.
A confidentiality leak occurs when unsafe information is incorrectly marked as safe.
The APIs discussed below make it possible to annotate information with proofs/promises that things are safe.
In summary:
A redactable string (or byte array) is a string where unsafe information is enclosed between special delimiters. For example,
var s RedactableString = "hello ‹secret›"
contains the safe word “hello” and the unsafe word “secret”.
String redaction is a function that deletes the data between delimiters to produce a redacted string. For example,
RedactableString("hello ‹secret›").Redact()
returns "hello ‹×›".
The remaining APIs can:
introduce the guarantee of safety from scratch,
promise that unsafe information is, in fact, safe; and that redactable strings are, in fact, redactable. (See below for the definition of “promise”. This is unrelated to the similarly-named JavaScript concept.)
transform information and compose redactable strings in a way that is proven to preserve redactability without leaking sensitive information.
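The redaction step described above can be illustrated with a toy model. The sketch below does not use the redact package: redactToy and unsafeSpan are hypothetical names, and the model assumes that the ‹ and › delimiters never occur inside the unsafe data itself.

```go
package main

import (
	"fmt"
	"regexp"
)

// unsafeSpan matches one delimited unsafe span: ‹ ... ›.
// Toy assumption: markers never appear inside the unsafe data.
var unsafeSpan = regexp.MustCompile(`‹[^›]*›`)

// redactToy replaces every delimited span with the redacted marker ‹×›,
// leaving the safe text outside the delimiters untouched.
func redactToy(s string) string {
	return unsafeSpan.ReplaceAllString(s, "‹×›")
}

func main() {
	fmt.Println(redactToy("hello ‹secret›"))
	fmt.Println(redactToy("user ‹alice› connected from ‹10.0.0.1›"))
}
```

The point of the design is that redaction happens after the string has been built: the safe parts survive, and only the delimited parts are erased.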
Where can users observe redactable information?
Redactability (i.e. information where sensitive and safe bits can be separated from each other) can be found:
In log files or network log entries produced by CockroachDB with the redactable: true configuration flag set. Here is an example redactable log entry:
I210413 09:51:03.906798 14 heapprofiler.go:49 ⋮ [n2] 34 writing go heap profiles to ‹/home/kena/cockroach/cockroach-data/logs/heap_profiler› at least every 1h0m0s
The ⋮ symbol means that the rest of the entry is redactable. The text enclosed between ‹ and › is unsafe information.
In crash reports, prior to sending to Cockroach Labs for monitoring and error tracking. The unsafe information is redacted on the way out.
For example: (The filename was marked as redactable-unsafe in the error message, and was redacted into “×” before sending. The web display erased the redaction markers.)
In Go error objects produced / managed by the CockroachDB errors library.
Why? Because errors eventually get translated into log entries or crash reports, see above. If the errors were not redactable to start with, we couldn’t make log entries / crash reports redactable.
(The redaction markers inside errors are hidden when looking via the .Error() method, but appear when using redact.Sprint() or crdb’s log functions.)
At some point in the future (relative to the date of this writing, May 2021), in distributed traces produced by CockroachDB, which can be inspected during troubleshooting.
(At the time of this writing, traces are not redactable and thus should be considered thoroughly unsafe as per the definition above.)
Certain data structures and strings held in RAM inside CockroachDB, when they are likely to be included in log messages or error payloads.
How to make information redactable?
Any data inside CockroachDB’s source code that may be included in an error message or a log entry should be made redactable.
Otherwise, it will be considered as unsafe by our tooling and removed when customers send log entries / errors to technical support.
More redactability = more observability + more troubleshootability.
The various APIs try to minimize the work needed by CockroachDB programmers, but sometimes extra care must be taken.
Simple cases
Redactability is mostly noticeable when emitting a log entry. A good way to check the redactability properties of an object is thus to log it and see what happens.
Here is what CockroachDB’s APIs already provide for you:
The constant literal string used as first argument to errors.New / Newf / Wrap etc, as well as log.Info, Infof etc is axiomatically considered safe.
The redactable contents of error objects are automatically recognized and propagated when constructing further error objects or log entries from them. (Certain common error types from the Go runtime are also properly recognized to separate their safe vs unsafe payloads.)
Certain data types outside of CockroachDB’s own source code have been marked as always-safe using the “safe type registry” (redact.RegisterSafeType), because we consider that they can never be traced back to individual customers or PII. This includes, for example:
Go’s native booleans, integers & float types
time.Time, time.Duration
os.Interrupt
Remaining data types are considered unsafe by default.
To make more information redactable, the CockroachDB programmer should thus spend extra effort to annotate information as safe or redactable that would be considered as unsafe otherwise.
This is especially the case with struct types and other Go types that alias basic types.
API Basics
To mark a data type as safe or redactable when it would be considered unsafe otherwise, use:
For simple types that alias a Go numeric type, you can axiomatically mark the type as always-safe by marking it with the redact.SafeValue interface.
For more complex types or types that alias the Go string type, implement a SafeFormat method (i.e. the redact.SafeFormatter interface). The primitives available in the body of SafeFormat provably generate redactable strings. See below for examples.
Compose RedactableString values upfront, then store them until later, instead of composing a string value, storing it into a struct, and then later trying to include it into an error or log message.
To compose a redactable string from a mix of safe and unsafe information, use:
redact.Sprint() / redact.Sprintf() to create a RedactableString from various bits using fmt.Print / Printf-like formatting.
redact.Join() / redact.JoinTo() to adjoin a list of various bits using a delimiter and form a RedactableString, a bit like strings.Join.
redact.StringBuilder to compose a RedactableString programmatically like strings.Builder or bytes.Buffer.
As you start learning about these mechanisms, you will slowly start noticing that Go’s native fmt.Stringer interface (and the String() method) becomes less and less relevant in your code: none of the logging or error code ever uses it if your objects implement SafeFormatter or SafeValue. In fact, we are likely to slowly phase out String() methods over time.
Examples
Before:
    // type MetricSnap does not implement SafeFormat and its representation
    // as string is thus considered fully unsafe by default.
    func (m MetricSnap) String() string {
        suffix := ""
        if m.ConnsRefused > 0 {
            suffix = fmt.Sprintf(", refused %d conns", m.ConnsRefused)
        }
        return fmt.Sprintf("infos %d/%d sent/received, bytes %dB/%dB sent/received%s",
            m.InfosSent, m.InfosReceived,
            m.BytesSent, m.BytesReceived,
            suffix)
    }
After:
    // SafeFormat implements the redact.SafeFormatter interface.
    func (m MetricSnap) SafeFormat(w redact.SafePrinter, _ rune) {
        // Notice how similar the code below is to the original code (the
        // “before” version). The SafePrinter API has been designed to make
        // it easy to “migrate” existing String() methods into SafeFormat().
        //
        // Why this “does the right thing” without special annotations:
        // - The format string for w.Printf() is a literal constant and considered safe.
        // - The numeric arguments are simple integers and thus considered safe.
        // As a result, the entire string produced is automatically considered
        // safe. No special “this is safe” annotations are needed.
        w.Printf("infos %d/%d sent/received, bytes %dB/%dB sent/received",
            m.InfosSent, m.InfosReceived,
            m.BytesSent, m.BytesReceived)
        if m.ConnsRefused > 0 {
            w.Printf(", refused %d conns", m.ConnsRefused)
        }
    }

    func (m MetricSnap) String() string {
        // StringWithoutMarkers applies the SafeFormat method
        // then removes the redaction markers to produce a “flat” string.
        // This helps avoid code duplication between String()
        // and SafeFormat().
        //
        // Note: The resulting String() method is only rarely
        // called, since most relevant uses of MetricSnap
        // will now use .SafeFormat() directly.
        return redact.StringWithoutMarkers(m)
    }
When to use SafeFormatter vs SafeValue
When in doubt, implement the SafeFormatter interface. This creates redactable strings that provably do not leak confidential information.
The SafeValue marker interface is reserved for “leaf” data types that are so simple that one can argue, just by looking at the source code, that they can never contain sensitive information. We do this e.g. for roachpb.NodeID, descpb.DescID and other such integer types.
Generally, avoid using the SafeValue interface for non-simple types. The main problem this general rule solves is that nothing prevents a programmer from later adding more data into values of that type and starting to leak confidential information without noticing.
For the same reason, generally avoid using redact.Safe and its aliases errors.Safe / log.Safe. The promise made at the time the call is introduced, namely that its argument is safe, can be too easily broken “at a distance” by someone else later, for example by changing the type definition of the argument so that it starts leaking unsafe information.
General rules
Proofs vs promises
A promise is when a person (e.g. a member of the CockroachDB team) expresses in the source code that some information is safe or redactable according to their opinion or understanding.
A proof is a function or algorithm that takes a combination of safe/unsafe information and is guaranteed, by construction (and as long as it compiles without type errors), to avoid confidentiality leaks.
Whenever we have a choice between a “proof API” or a “promise API”, we always prefer the proof, because it ensures that the code is not sensitive to human mistakes.
An axiom is an argument expressed in the code that a bit of information is safe or unsafe in a way that is provably always true, regardless of which data is processed by CockroachDB. Axioms thus have the same general quality as proofs and are superior to promises. We prefer axioms whose argument can be verified locally at the position in the code where it is made, without relying on knowledge pulled from elsewhere.
Redactability in error objects
The default error constructors from CockroachDB’s error library (Newf, New, AssertionFailedf, etc.) automatically implement redactability:
(axiom) The constant literal string argument to the non-formatting variants (e.g. New, Wrap) is considered safe.
(axiom) The characters in the constant literal format string passed as first argument to the formatting variants (e.g. Newf, Wrapf, AssertionFailedf) are considered safe.
(proof) The remaining arguments are turned into redactable information as per the rules below.
(axiom) The data type name of the error objects (as well as their Go package import path) are considered safe. Although users don’t see this, it is included in crash reports for further troubleshootability.
The points above emphasize “constant literal string”. The fact that a string is a constant literal (i.e. statically embedded in the CockroachDB executable) is what makes it safe. We enforce this property using a linter.
Redactability in log entries
CockroachDB’s log functions first transform their parameters into an error object internally, as per the rules above. Then, that error object is formatted into a log entry.
Therefore, the rules for error objects described above apply equally when constructing log entries:
(axiom) The constant literal string argument to the non-formatting variants, as well as the first formatting argument to the formatting variants, is considered safe.
(proof) The remaining arguments are turned into redactable information as per the rules below.
The redactability properties of that error object are preserved throughout the logging system.
Redactability through the redact package
Marking things as safe or unsafe
The best way to mark things as safe or unsafe is to implement the SafeFormatter interface or use redact.Sprint / redact.Sprintf.
The other ways below are detailed only for reference but are error-prone.
(promise) Constant literal strings used to construct redact.SafeString values are considered safe. This is not axiomatic because if the constant literal is modified separately from the cast to accidentally contain redaction markers, confidentiality leaks may occur. Use redact.Sprintf in case of doubt.
(promise) Constant literal strings cast to redact.RedactableString are considered redactable. This is not axiomatic because if the constant literal is modified separately from the cast to accidentally contain redaction markers, confidentiality leaks may occur. Use redact.Sprintf in case of doubt.
(promise) Variable strings cast to redact.SafeString values are considered safe; those cast to redact.RedactableString are considered redactable. Again, this is highly dependent on programmer knowledge that the string doesn’t contain redaction markers, or unsafe information not marked as unsafe.
(promise) Values of the types marked as always-safe via redact.RegisterSafeType are always considered safe. This is used e.g. in CockroachDB to mark bool, integer types, float types, time.Time, time.Duration and os.Interrupt as safe.
(promise) Values of the types that implement the redact.SafeValue interface are considered safe. (See the section above “When to use SafeFormatter vs SafeValue” about why the SafeValue interface should only be used for the simplest types.)
(promise) The result of redact.Safe applied to a value is considered safe. Generally, redact.Safe (and its aliases errors.Safe and log.Safe) is not recommended. Implement a SafeFormatter or SafeValue instead.
(axiom) The result of redact.Unsafe applied to a safe or redactable value becomes unsafe. (This function is provided for symmetry and testing, but is unlikely to be useful in practice.)
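To see why a string that accidentally contains redaction markers can cause a leak, consider the toy model below. It does not use the redact package: redactToy, wrapNaive and wrapEscaped are hypothetical helpers, and the use of "?" as the replacement character for stray markers is an assumption made for this sketch.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// unsafeSpan matches one delimited unsafe span: ‹ ... ›.
var unsafeSpan = regexp.MustCompile(`‹[^›]*›`)

// redactToy replaces every delimited span with ‹×›.
func redactToy(s string) string {
	return unsafeSpan.ReplaceAllString(s, "‹×›")
}

// wrapNaive encloses v in markers without inspecting it. This models a
// promise that v contains no markers of its own.
func wrapNaive(v string) string {
	return "‹" + v + "›"
}

// escaper neutralizes markers found inside unsafe data (toy assumption: "?").
var escaper = strings.NewReplacer("‹", "?", "›", "?")

// wrapEscaped neutralizes stray markers before enclosing v, so the whole
// value stays inside a single unsafe span.
func wrapEscaped(v string) string {
	return "‹" + escaper.Replace(v) + "›"
}

func main() {
	input := "x› card=1234 ‹y" // unsafe data that happens to contain markers
	fmt.Println(redactToy("value: " + wrapNaive(input)))
	fmt.Println(redactToy("value: " + wrapEscaped(input)))
}
```

With the naive wrapper, the text between the stray markers ends up outside any unsafe span and survives redaction. The escaping wrapper keeps the whole value inside one span, which is the kind of guarantee a composing function can give and a manual cast cannot.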
Combining things together
(proof) redact.Sprintf creates a redactable string using printf-like formatting (see below). (NB: CockroachDB’s error library uses redact.Sprintf internally, hence the overlap in rules.)
(proof) redact.Sprint formats its positional arguments using the “extra argument” rules of printf-like formatting, see below.
(proof) redact.Join, redact.JoinTo preserve the redactability of their arguments, and can be used to create delimited lists of values.
(proof) the redact.SafePrinter object available inside SafeFormat methods composes redactability provably.
(proof) the redact.StringBuilder type composes redactability provably.
Recursive rules during printf-like formatting
(axiom) The format string is considered safe.
(proof) An argument of type RedactableString or RedactableBytes is included as-is. Whatever produced a value of that type made it properly redactable.
(proof) An argument of type error is formatted using errors.FormatError which takes care of exposing safe bits from known error types.
(proof) An argument that implements the SafeFormatter interface is formatted using that interface.
(proof) An argument whose Go type has been registered as safe (see above for details) is considered safe.
(proof) Certain additional rules apply to make types safe by default (see above “marking things safe or unsafe” for details).
(proof) At some point in the future relative to the date of this writing (May 2021), array and struct types whose elements are redactable will be formatted recursively in a way that preserves redactability. (At the point of this writing, arrays and structs are considered unsafe unless one of the previous rules applies to them.)
(axiom) Other values are considered as unsafe.
The future of redactability in CockroachDB
See the last section in the blog post: both our external customers and our internal product team want to introduce a new separation inside log and error data:
Operational sensitive data: customer-owned data that can identify one of our customers, but not their end-users or the data that they store inside CockroachDB.
For example, the IP addresses of our customers' client apps running in the cloud are operational sensitive data.
Application sensitive data: customer-owned data that contains PII or can identify end-users of our customers, with whom we don’t have a contractual relationship.
This is the most protected type of data and access is heavily regulated by law in most jurisdictions.
For example, the contents of the SQL tables are application sensitive data.
We wish to introduce this distinction because it would enable us to ingest operational data into telemetry without redaction; most of our customers have expressed willingness to share operational data (but not application data) with us.
Until we make this distinction in the code, we are unable to distinguish them so any sensitive data must be considered as application sensitive data by default.
If/when we study this distinction further, we will need to be extra careful about the following:
The names of databases, schemas, tables, types and columns inside the SQL schema (the “SQL metaschema”) must be considered application sensitive data by default, because sometimes SQL applications generate these names dynamically from data stored in the tables. We cannot blindly promote the SQL metaschema to the status of operational data without our customer’s explicit consent.
Same applies to the application_name field of SQL sessions.
The names of SQL users used to log into CockroachDB must be considered application sensitive data by default, because in most deployments they are derived from actual people and thus contain PII.
References
The RFC that describes the motivation, the approach etc in detail (April 2020):
Slide deck that presented the approach to the team (August 2020):
Blog post that introduces the concepts to customers (Jan 2021): Log and error redaction in CockroachDB v20.2
The Go library which implements the core redactability API: https://github.com/cockroachdb/redact
This is independent of CockroachDB and can be reused in other Go projects.
The location in the source code of CockroachDB’s errors library where error objects integrate redactability: https://github.com/cockroachdb/errors/blob/master/errutil/redactable.go
Copyright (C) Cockroach Labs.
Attention: This documentation is provided on an "as is" basis, without warranties or conditions of any kind, either express or implied, including, without limitation, any warranties or conditions of title, non-infringement, merchantability, or fitness for a particular purpose.