Using viewcore to analyze core dumps
Viewcore is a tool, originally created by the Go team, for analyzing and exploring the memory in Go core dumps. It’s useful for debugging memory leaks, reproducible OOM issues, and workloads that use more memory than expected. Here are some of its capabilities:
Understand how much memory is live, and how much is garbage
Understand which objects are taking up the most memory in the heap
Understand the types of objects in the heap
Browse the object graph and see all values within every object at the time of the core dump
Understand which objects are retaining an object, preventing it from being collected by the GC
See objects on the stacks of all goroutines
Visualize the retained object graph using pprof
To use viewcore, there are a few steps:
Create a cockroach binary with optimizations disabled for best results
Create a core dump from a running cockroach that’s doing something interesting
Run viewcore and explore the heap
When to use viewcore
You should use viewcore when studying OOMs or high memory behavior that is reproducible with synthetic data or workloads like TPCH or TPCC. Once you have an OOM that you can reproduce at will, you should induce the OOM condition, and get a core dump a little before the OOM would happen. At this point, spending some quality time staring at the core dump with viewcore will likely lead to revelations.
Some examples of bugs found with viewcore:
Do not try to get core dumps from customers! Core dumps should be treated as toxic waste, since they can contain arbitrarily sensitive, in-the-clear user data.
How to install viewcore
As of 2021, viewcore is not actively maintained by the Go team, and is completely broken for Go versions newer than Go 1.11. @Jordan Lewis developed some bugfixes and patches to improve matters, which are located here:
https://github.com/golang/debug/pull/7 (fixes to make viewcore work at all)
GitHub - jordanlewis/debug at crl-stuff (patches to make viewcore work with Cockroach, improved types and pprof visualizers)
You should download and install the code in the linked crl-stuff branch to use viewcore.
Create a cockroach binary with optimizations disabled
In order for viewcore to be able to accurately inspect the heap, it’s best to disable compiler optimizations (the -N flag) and inlining (the -l flag), with the following invocation:
make build GOFLAGS="-gcflags=all='-N -l'"
The tool works without this special build, but not nearly as well, so given the chance you should use a special build to generate cores.
How to create a core dump
You need to be running on Linux to create a core dump. As of 2021, you cannot create a Go core dump on Mac.
The basic steps to create a Go core dump are as follows:
1. Ensure that your shell has ulimit -c unlimited set.
2. Run Cockroach (or your other Go program) with the GOTRACEBACK=crash environment variable set, e.g. GOTRACEBACK=crash ./cockroach start-single-node --insecure
3. Send a fatal signal to the program (that isn’t a SIGKILL), like this: killall -SIGSEGV cockroach. I like to use -SIGSEGV because it feels particularly evil, but I think other signals work too.
4. You should see (core dumped) somewhere at the end of cockroach's stderr. If you don’t, it means you missed one of the first 2 steps.
5. Now you should have a core file. Its location is defined by /proc/sys/kernel/core_pattern.
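Before hunting for the core file, it can help to check what the kernel setting actually says. The following is a minimal sketch, not part of viewcore or cockroach: on Linux, a core_pattern value that starts with | means the kernel pipes the core to a helper program instead of writing a file, and a value without a leading / is resolved relative to the crashing process's working directory. The function name describe_core_pattern is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: explain where a given core_pattern value sends cores.
describe_core_pattern() {
  case "$1" in
    '|'*) echo "piped to helper: ${1#?}" ;;   # strip the leading pipe character
    /*)   echo "written to absolute path template: $1" ;;
    *)    echo "written relative to the process working directory: $1" ;;
  esac
}

# Inspect the live setting when the file is readable (Linux only).
if [ -r /proc/sys/kernel/core_pattern ]; then
  describe_core_pattern "$(cat /proc/sys/kernel/core_pattern)"
fi
```

If the pattern is piped to a helper, the core will not appear as a plain file in the working directory, and you will need to consult that helper's own storage location.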
On Roachprod
Roachprod makes it easy to collect a Go core dump. Steps 1 and 2 are already complete by default, and the core files are written to /mnt/data1/cores. So all you have to do is send -SIGSEGV to a cockroach that was invoked via roachprod start, and it’ll dump core.
Alternate approach: gcore
Instead of killing the process, you can use gcore/gdb to get the core dump. To do this:
Install gdb with sudo apt-get install gdb
Look up the pid of the cockroach process.
sudo gcore -o /mnt/data1/cores $PID
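The gcore steps above can be sketched as one small script. Two things here are assumptions rather than anything the steps prescribe: pgrep is just one way to look up the pid, and both function names are invented for illustration. Note that gcore treats the -o argument as a filename prefix and appends a dot and the pid, so with the prefix used above the core lands at /mnt/data1/cores.<pid>.

```shell
#!/bin/sh
# Hypothetical helper: the file gcore writes for a given -o prefix and pid.
# gcore names its output "<prefix>.<pid>".
gcore_output_path() {
  echo "$1.$2"
}

# Hypothetical wrapper: dump core for the oldest process named exactly
# "cockroach". Assumes gdb (which provides gcore) and pgrep are installed.
dump_cockroach_core() {
  prefix="${1:-/mnt/data1/cores}"
  pid="$(pgrep -o -x cockroach)" || {
    echo "no cockroach process found" >&2
    return 1
  }
  sudo gcore -o "$prefix" "$pid" && echo "wrote $(gcore_output_path "$prefix" "$pid")"
}
```

Call dump_cockroach_core with no arguments to use the same prefix as the command above, or pass a different prefix as the first argument.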
Run Viewcore
Now you’re ready to viewcore! Invoke it like this to get the interactive prompt. Make sure the core was generated by the binary you pass in; you will get opaque errors otherwise:
viewcore path/to/core --exe path/to/binary
Once it loads, which might take a little while, run help for a command summary.
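Since a missing or mistyped path wastes a slow startup, a thin wrapper around the invocation can catch the obvious mistakes first. This is only a sketch: run_viewcore is a hypothetical name, and the check cannot confirm that the core really came from the given binary.

```shell
#!/bin/sh
# Hypothetical wrapper: refuse to start viewcore unless both inputs exist.
# It does not verify that the core was produced by this binary.
run_viewcore() {
  core="$1"
  exe="$2"
  if [ ! -f "$core" ]; then
    echo "core file not found: $core" >&2
    return 1
  fi
  if [ ! -f "$exe" ]; then
    echo "binary not found: $exe" >&2
    return 1
  fi
  # Same invocation as shown above.
  viewcore "$core" --exe "$exe"
}
```

Usage would look like run_viewcore path/to/core path/to/binary.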
Note: there are a few slow operations in viewcore that might take seconds to minutes. They are all cached, so be patient if things are moving slowly. Things that are slow:
On first load, the program runs a full “gc trace” of the entire heap to discover live and dead objects
Operations that produce type information (like histogram or peek) need to type all GC roots and propagate typings through the entire heap
Operations that allow introspection of what objects retain another need to produce the reverse edges map, another expensive full-heap iteration and map creation
breakdown
The breakdown command produces a high-level summary of the memory in the core. The key lines here are the live vs garbage quantities, which tell you how much actual data is reachable from GC roots (stack vars or global vars) and therefore not collectable, vs how much data is not reachable but hasn’t yet been collected.
(viewcore) breakdown
all 5.4 GB 100.00%
  text 95 MB 1.77%
  readonly 60 MB 1.12%
  data 14 MB 0.26%
  bss 1.7 GB 30.84% (grab bag, includes OS thread stacks, ...)
  heap 3.4 GB 63.83%
    in use spans 141 MB 2.62%
      alloc 98 MB 1.83%
        live 67 MB 1.25%
        garbage 31 MB 0.57%
      free 42 MB 0.78%
      round 586 kB 0.01%
    manual spans 3.5 MB 0.07% (Go stacks)
      alloc 3.0 MB 0.06%
      free 508 kB 0.01%
    free spans 3.3 GB 61.14%
      retained 40 MB 0.75% (kept for reuse by Go)
      released 3.2 GB 60.39% (given back to the OS)
  ptr bitmap 113 MB 2.11%
  span table 3.5 MB 0.07%
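In this output the live and garbage rows add up to the alloc row under in use spans (67 MB + 31 MB = 98 MB). As a quick, hedged illustration of turning those two numbers into a ratio, here is a small helper; garbage_pct is a made-up name, and the inputs are simply copied by hand from the output above.

```shell
#!/bin/sh
# Hypothetical helper: percentage of allocated heap bytes that are garbage.
# Arguments: the "live" and "garbage" values from breakdown, in the same unit.
garbage_pct() {
  awk -v live="$1" -v garbage="$2" \
    'BEGIN { printf "%.1f\n", 100 * garbage / (live + garbage) }'
}

garbage_pct 67 31   # prints 31.6
```

Here roughly a third of the allocated heap was uncollected garbage at the time of the dump, so the live figure is the one to focus on when looking for what is actually being retained.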
histogram
Histogram produces a histogram of all types in the program sorted by total size. Pass the --top n argument to limit to the top n types.
(viewcore) histo --top 10
count size bytes live% sum% type
363 66 kB 24 MB 35.32 35.32 [65536]uint8
895 8.2 kB 7.3 MB 10.89 46.20 [1+1023?]float64
60 49 kB 2.9 MB 4.38 50.58 [6144]int64
2 1.0 MB 2.1 MB 3.11 53.70 [1048576]uint8
359 4.9 kB 1.7 MB 2.59 56.29 [1025+191?]int32
178 8.2 kB 1.5 MB 2.16 58.45 [1024]int
1 1.1 MB 1.1 MB 1.63 60.08 [1089]bucket<github.com/cockroachdb/cockroach/pkg/geo/geopb.SRID,github.com/cockroachdb/cockroach/pkg/geo/geoprojbase.ProjInfo>
2 524 kB 1.0 MB 1.56 61.64 [524288]uint8
4 262 kB 1.0 MB 1.56 63.20 [262144]uint8
16 41 kB 655 kB 0.97 64.17 github.com/cockroachdb/pebble/internal/record.block
peek
Peek (only in the crl-stuff branch) takes a type and shows a breakdown of all object types that retain objects of the input type, and a breakdown of all object types that are retained by objects of the input type. This is somewhat akin to pprof’s peek.
(viewcore) peek github.com/cockroachdb/cockroach/pkg/col/coldata.Bytes
count size bytes