MemorySnapshotAnalyzer

MemorySnapshotAnalyzer is an analyzer for memory snapshots. While designed to support multiple backends/file formats, the initial version only supports Unity Memory Snapshots captured through the Unity Memory Profiler window or Unity's MemoryProfiler.TakeSnapshot API.

Building and Running

MemorySnapshotAnalyzer was developed in C# and tested with Visual Studio 2022 and .NET 6.0 on Windows, as well as with VS Code and .NET 7.0 on macOS.

Use the solution file MemorySnapshotAnalyzer.sln to build and run the console application. On macOS/Linux, you can build and run from the shell using the ./run.sh script.

The Read-Eval Print Loop

The interface to the tool is organized around a "read-eval-print loop" that reads commands from the console. Type help to list the available commands.

Interactive Command Syntax

The syntax for command lines is, admittedly, a bit idiosyncratic; see CommandProcessing/CommandLineParser.cs for the grammar (note that not all forms specified in the grammar are fully implemented yet). A quick guide:

Note that some commands take indices of different kinds (as well as addresses). Make sure not to confuse these indices with one another:

Some commands take a "type index or pattern" argument. This can be either:

Pagination

Output of commands is paginated to the console window height. Type q to get back to the command prompt or hit space for the next screenful of output. Commands can be interrupted using Ctrl-C; the interruption takes effect the next time the command outputs a line of text.

Loading Snapshots into Contexts

MemorySnapshotAnalyzer allows for several snapshots to be loaded at the same time. For instance, this allows snapshots taken at different times during execution of the same application to be compared.

When the tool starts up, it runs in a context with ID 0 and no snapshot loaded. You can use the context command with an integer ID to switch to another context (which will be created if it does not exist yet).

Use the load command to load a snapshot; by default, the snapshot is loaded into a new context unless the current context has no snapshot loaded. You can use load with the 'replace option to force loading into the current context.
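
The context rules above can be modeled with a small sketch. This is a toy illustration in Python, not the tool's C# implementation: the Session class, its method names, the way a new context ID is picked, and the assumption that the tool switches to the context it just loaded into are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Session:
    """Toy model of analysis contexts (hypothetical, for illustration only)."""

    # Maps context ID to the loaded snapshot name, or None if nothing is loaded.
    contexts: Dict[int, Optional[str]] = field(default_factory=lambda: {0: None})
    current: int = 0

    def switch(self, context_id: int) -> None:
        # Switching to an unknown ID creates an empty context.
        self.contexts.setdefault(context_id, None)
        self.current = context_id

    def load(self, snapshot: str, replace: bool = False) -> int:
        # Reuse the current context if it is empty or replacement is forced;
        # otherwise load into a fresh context.
        if replace or self.contexts[self.current] is None:
            target = self.current
        else:
            target = max(self.contexts) + 1  # assumption: next unused ID
        self.contexts[target] = snapshot
        self.current = target  # assumption: the loaded context becomes current
        return target
```

In this model, the first load lands in context 0 (which starts empty), a second load creates a new context, and a load with replace=True overwrites whatever the current context holds.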

The Analysis Stack

MemorySnapshotAnalyzer allows analysis of what's in a heap snapshot at different levels of abstraction, ranging from bytes within ranges of committed memory up to object graphs. The levels of analysis, and which of them have been computed within a given context, are listed by the context command. Note that analyses at these levels are computed implicitly by commands as needed, and that higher levels are increasingly expensive to compute.
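
The "computed implicitly as needed" behavior can be pictured as a stack of cached, lazily computed levels. The following Python sketch is only a conceptual model: the AnalysisStack class and the level names passed to it are hypothetical and do not correspond to the tool's actual level names or code.

```python
from typing import Callable, Dict, List


class AnalysisStack:
    """Computes analysis levels lazily, in order, caching each result."""

    def __init__(self, levels: List[str], compute: Callable[[str], object]):
        self.levels = levels      # ordered from cheapest to most expensive
        self.compute = compute    # hypothetical per-level computation
        self.results: Dict[str, object] = {}

    def require(self, level: str) -> object:
        # Compute every missing level up to and including the requested one.
        for name in self.levels[: self.levels.index(level) + 1]:
            if name not in self.results:
                self.results[name] = self.compute(name)
        return self.results[level]

    def computed(self) -> List[str]:
        # Levels already available in this context, in stack order.
        return [name for name in self.levels if name in self.results]
```

A command that needs a high level of analysis pays for all levels below it the first time; later commands in the same context reuse the cached results.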

Selecting the Heap to Analyze

Some heap snapshots contain multiple heaps. You can select the heap(s) to analyze as follows:

Analysis Options

Some of these analyses are configurable. To set the options for analysis, use the options command. To see the currently configured options, use the context command.

Reference Classifiers

In many scenarios, a given object can be referenced from multiple other objects (such as caches or parent pointers), and it cannot be inferred automatically which is the object that is the primary "owner". This can make for cluttered backtraces, and - worse - make many objects float all the way up to the top of the dominator tree. With this option, you can provide configuration information that marks certain object fields as "owning" references. If an object is found to be referenced by both owning and non-owning references, the non-owning references are discarded from the backtrace.
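
The effect of owning references on a backtrace can be sketched as a simple filter. This Python snippet is a minimal illustration of the rule described above, assuming each incoming reference has already been classified; the Reference type and the function name are hypothetical.

```python
from typing import List, NamedTuple


class Reference(NamedTuple):
    referrer: str     # description of the referring object
    field_name: str   # field through which the target is referenced
    owning: bool      # True if a classifier rule marks this field as owning


def backtrace_references(incoming: List[Reference]) -> List[Reference]:
    """Keep only owning references if any exist; otherwise keep all."""
    owning = [ref for ref in incoming if ref.owning]
    return owning if owning else list(incoming)
```

An object referenced both from a cache (non-owning) and from its parent container (owning) would show only the parent container in its backtrace; an object with no owning references keeps all of its referrers.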

There are two options for specifying reference classifiers for a given codebase:

The referenceclassifier 'load command takes the name of a configuration file with the following format:

Specific rule keywords are:

referenceclassifier 'save can be used to save a set of reference classifiers to a file.

referenceclassifier 'fromdll can be used to extract reference classifiers from all *.dll files directly within a given directory.

Identifying the Need for New "Owns" Rules

To help identify where configuring more owning references could be useful, you can use the heapdomstats command. When run without arguments, this provides statistics on the most frequent types of objects that have floated all the way to the top of the dominator tree. Running listobj 'dominatedby -1 (optionally with 'type and a type index or pattern) lists the specific object instances that have floated to the top. Then use backtrace 'depth 1 on some of the given object indices to see whether one of the references in the backtrace clearly should be considered the "owning" reference, and add the referring field to the configuration file. To verify the effectiveness of the expanded configuration, reload the updated file by re-issuing options 'referenceclassifier, and rerun heapdomstats.
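
As a rough picture of the kind of statistic involved, the following Python sketch counts the types of objects whose immediate dominator is the top of the tree. It is not the heapdomstats implementation; the dictionary-based inputs are hypothetical, and using -1 as the top-of-tree marker is an assumption borrowed from the 'dominatedby -1 argument mentioned above.

```python
from collections import Counter
from typing import Dict, List, Tuple

TOP = -1  # assumed marker for "dominated only by the top of the tree"


def top_level_type_stats(
    dominator: Dict[int, int], type_of: Dict[int, str]
) -> List[Tuple[str, int]]:
    """Return (type name, count) pairs for top-level objects, most frequent first."""
    counts = Counter(
        type_of[obj] for obj, parent in dominator.items() if parent == TOP
    )
    return counts.most_common()
```

Types that dominate this list are good candidates for new "owns" rules, since many of their instances have no single clear owner in the current configuration.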

Enabling Specific Groups for Individual Contexts

All reference classifiers are loaded into a single global store, shared across all analysis contexts. The contents of the store can be examined with the referenceclassifier 'list command. When more reference classifiers are loaded using referenceclassifier 'load, they are added to the global store.

However, for individual analysis contexts, you can choose to enable a specific set of groups. E.g., referenceclassifier 'disable 'group "*" would disable all loaded reference classifier groups in the current context, to be able to see the raw object graph in backtraces. Continuing from the previous example, subsequently running referenceclassifier 'enable 'group "thirdparty.*" would enable just the rules for third-party frameworks.
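
The interplay between the global store and per-context enablement can be sketched as follows. This Python model assumes that group patterns behave like shell-style globs and that all loaded groups start out enabled; both are assumptions, and the ContextGroups class and the group names used with it are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Set


class ContextGroups:
    """Per-context view of which globally loaded groups are enabled."""

    def __init__(self, loaded: Iterable[str]):
        self.loaded = list(loaded)                 # groups in the global store
        self.enabled: Set[str] = set(self.loaded)  # assumption: enabled by default

    def disable(self, pattern: str) -> None:
        # Remove every loaded group whose name matches the glob pattern.
        self.enabled -= {g for g in self.loaded if fnmatchcase(g, pattern)}

    def enable(self, pattern: str) -> None:
        # Add every loaded group whose name matches the glob pattern.
        self.enabled |= {g for g in self.loaded if fnmatchcase(g, pattern)}
```

Calling disable("*") followed by enable("thirdparty.*") mirrors the two-command example above: the context ends up with only the third-party groups enabled, while the global store itself is unchanged.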

Default Third-Party Reference Classifier Configuration

MemorySnapshotAnalyzer ships with a thirdparty.rcl reference classifier configuration file that is automatically read (and enabled for the current context) on startup. It classifies some references in third-party frameworks. Specifically:

Tips for Chasing Down Memory Leaks Using Reference Classifiers and Lifelines

An object on a managed heap is considered "leaked" when, according to the design of the program, it should no longer be accessed after this point in the program's execution, but it is still reachable in the object graph. The leaked object may, in turn, keep a possibly substantial graph of objects alive (the subtree under the object's node in the dominator tree).

Here are some tips that can be helpful for identifying leaked objects and fixing the cause for the leak:

  1. If you suspect objects of a specific type to be leaked, construct a reference classifier configuration file that identifies the "owning references" through which instances of this type would usually be reached, before they are leaked. Then use listobj 'unowned to dump instances of this type that are still reachable on the heap, but only in other (unintended) ways than the owning reference.
  2. Use backtrace 'lifelines to dump a compressed view of representative, somewhat short paths to either roots or strongly-owned nodes (according to the reference classifier in use).
  3. Inspect the lifeline diagram to find references that should no longer exist (e.g., be nulled out, or removed from a collection).
  4. Modify your program accordingly, rerun your scenario, and run the same analysis on a new memory snapshot. Confirm whether the object has become eligible for garbage collection, or inspect the updated lifeline diagram to find other references to break.
  5. To identify multiple leaks with a single snapshot while removing the clutter of already-identified leaks, use the ignorebackref command, passing a child index and a parent index as arguments. Backtraces will be recomputed while ignoring the given parent-child reference.
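
To illustrate the idea behind steps 2 and 5, here is a minimal Python sketch that finds a shortest path from an object up to a root by following back references, while skipping (child, parent) pairs that have been marked as ignored. It is a simplified stand-in, not the tool's lifeline algorithm; the function name and the dictionary-based graph representation are hypothetical.

```python
from collections import deque
from typing import Dict, FrozenSet, List, Optional, Set, Tuple


def lifeline(
    target: int,
    referrers: Dict[int, List[int]],
    roots: Set[int],
    ignored: FrozenSet[Tuple[int, int]] = frozenset(),
) -> Optional[List[int]]:
    """Breadth-first search from target to the nearest root via back references."""
    queue = deque([[target]])
    seen = {target}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in roots:
            return path
        for parent in referrers.get(node, []):
            # Skip references the user has chosen to ignore (child, parent).
            if (node, parent) in ignored or parent in seen:
                continue
            seen.add(parent)
            queue.append(path + [parent])
    return None  # no remaining path: the object would be collectable
```

Once a leaking reference has been identified, adding it to the ignored set reveals the next-shortest path, if any. A result of None corresponds to the object no longer being reachable through the remaining references.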

Visualizing the Heap Dominator Tree as a Treemap

The HTML file written by the heapdom command can be used to visualize the dominator tree as a treemap. In the case of a heap dominator tree, the sizes of nodes correspond to the sizes in bytes of objects or subtrees, and nesting corresponds to the dominator relationship - a node that is dominated by another will visually be nested within that other node's rectangle.

When the heap is structured relatively hierarchically, possibly with the help of "owns" reference classifier rules, this visualization nicely displays the relative size cost of different functional elements (components) of an application.

The output of the heapdom "foo.html" command is written to an HTML file "foo.html" that embeds JavaScript representing the dominator tree as a JSON data structure. To display the treemap, open this file in a browser. (Note that nodes can have a subnode with the name intrinsic to indicate the node's own intrinsic size, which - for arrays - can be significantly larger than the sizes of contained objects.)
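
The role of the intrinsic subnode can be shown with a small Python sketch that turns a dominator tree into nested JSON-like data. The exact JSON schema embedded by heapdom is not described here, so the name/value/children keys below are an assumed shape, and the function names are hypothetical.

```python
import json
from typing import Dict, List


def to_treemap(
    node: int,
    children: Dict[int, List[int]],
    size: Dict[int, int],
    name: Dict[int, str],
) -> dict:
    """Build a nested dict; inner nodes get an 'intrinsic' child for their own size."""
    kids = children.get(node, [])
    if not kids:
        return {"name": name[node], "value": size[node]}
    return {
        "name": name[node],
        "children": [{"name": "intrinsic", "value": size[node]}]
        + [to_treemap(k, children, size, name) for k in kids],
    }


def total_size(tree: dict) -> int:
    """Sum of all leaf values, i.e. the size of the whole subtree in bytes."""
    if "children" not in tree:
        return tree["value"]
    return sum(total_size(child) for child in tree["children"])


if __name__ == "__main__":
    tree = to_treemap(0, {0: [1, 2]}, {0: 100, 1: 10, 2: 20},
                      {0: "Foo[]", 1: "Foo", 2: "Foo"})
    print(json.dumps(tree, indent=2))
```

Because a treemap sizes inner rectangles by the sum of their children, giving the array its own intrinsic leaf keeps its 100 bytes visible next to the 30 bytes of the objects it dominates.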

A word of warning: Chrome restricts the memory consumed by an individual browser tab to 4 GiB, which has occasionally not been sufficient in practice to render dominator trees consisting of more than ~1.75M nodes. For graphs larger than that, use the heapdom options to limit depth, width, or minimum object size, or use options 'rootobject, to reduce the amount of data displayed. Nodes with the name elided will then appear, summarizing the size of the objects or subtrees that have been elided from the output. Also note that Chrome may fail to load trees that are deeper than 128 nodes, which accordingly is the default depth limit for the heapdom command. In our experience, Firefox Developer Edition may be better at dealing with large (very detailed) treemaps.
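
How a depth limit can preserve total sizes while dropping detail is sketched below in Python. This is only an illustration of the elided-node idea under an assumed name/value/children tree shape; it is not the heapdom command's actual pruning logic, and the function names are hypothetical.

```python
def subtree_size(tree: dict) -> int:
    """Sum of all leaf values under this node."""
    if "children" not in tree:
        return tree["value"]
    return sum(subtree_size(child) for child in tree["children"])


def limit_depth(tree: dict, max_depth: int) -> dict:
    """Copy the tree, collapsing anything below max_depth into an 'elided' leaf."""
    if "children" not in tree:
        return dict(tree)
    if max_depth <= 1:
        # Depth budget exhausted: summarize the whole subtree in a single node.
        return {
            "name": tree["name"],
            "children": [{"name": "elided", "value": subtree_size(tree)}],
        }
    return {
        "name": tree["name"],
        "children": [limit_depth(c, max_depth - 1) for c in tree["children"]],
    }
```

The pruned tree has the same total size as the original, so rectangle proportions at the retained levels stay accurate even though the detail underneath is gone.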

Within the treemap.html file, you can also edit the value of the colorGrouping constant to select different rectangle fill color schemes, to visualize dominator trees by:

License

MemorySnapshotAnalyzer is licensed under the MIT License.