Squashed 'third_party/gperftools/' content from commit 54505f1

Change-Id: Id02e833828732b0efe7dac722b8485279e67c5fa
git-subtree-dir: third_party/gperftools
git-subtree-split: 54505f1d50c2d1f4676f5e87090b64a117fd980e
diff --git a/doc/heap_checker.html b/doc/heap_checker.html
new file mode 100644
index 0000000..ea2ade6
--- /dev/null
+++ b/doc/heap_checker.html
@@ -0,0 +1,534 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<HTML>
+
+<HEAD>
+  <link rel="stylesheet" href="designstyle.css">
+  <title>Gperftools Heap Leak Checker</title>
+</HEAD>
+
+<BODY>
+
+<p align=right>
+  <i>Last modified
+  <script type=text/javascript>
+    var lm = new Date(document.lastModified);
+    document.write(lm.toDateString());
+  </script></i>
+</p>
+
+<p>This is the heap checker we use at Google to detect memory leaks in
+C++ programs.  There are three parts to using it: linking the library
+into an application, running the code, and analyzing the output.</p>
+
+
+<H1>Linking in the Library</H1>
+
+<p>The heap-checker is part of tcmalloc, so to install the heap
+checker into your executable, add <code>-ltcmalloc</code> to the
+link-time step for your executable.  Also, while we don't necessarily
+recommend this form of usage, it's possible to add in the profiler at
+run-time using <code>LD_PRELOAD</code>:</p>
+<pre>% env LD_PRELOAD="/usr/lib/libtcmalloc.so" <binary></pre>
+
+<p>This does <i>not</i> turn on heap checking; it just inserts the
+code.  For that reason, it's practical to just always link
+<code>-ltcmalloc</code> into a binary while developing; that's what we
+do at Google.  (However, since any user can turn on the profiler by
+setting an environment variable, it's not necessarily recommended to
+install heapchecker-linked binaries into a production, running
+system.)  Note that if you wish to use the heap checker, you must
+also use the tcmalloc memory-allocation library.  There is no way
+currently to use the heap checker separate from tcmalloc.</p>
+
+
+<h1>Running the Code</h1>
+
+<p>Note: For security reasons, heap profiling will not write to a file
+-- and is thus not usable -- for setuid programs.</p>
+
+<h2><a name="whole_program">Whole-program Heap Leak Checking</a></h2>
+
+<p>The recommended way to use the heap checker is in "whole program"
+mode.  In this case, the heap-checker starts tracking memory
+allocations before the start of <code>main()</code>, and checks again
+at program-exit.  If it finds any memory leaks -- that is, any memory
+not pointed to by objects that are still "live" at program-exit -- it
+aborts the program (via <code>exit(1)</code>) and prints a message
+describing how to track down the memory leak (using <A
+HREF="heapprofile.html#pprof">pprof</A>).</p>
+
+<p>The heap-checker records the stack trace for each allocation while
+it is active. This causes a significant increase in memory usage, in
+addition to slowing your program down.</p>
+
+<p>Here's how to run a program with whole-program heap checking:</p>
+
+<ol>
+  <li> <p>Define the environment variable HEAPCHECK to the <A
+       HREF="#types">type of heap-checking</A> to do.  For instance,
+       to heap-check
+       <code>/usr/local/bin/my_binary_compiled_with_tcmalloc</code>:</p>
+       <pre>% env HEAPCHECK=normal /usr/local/bin/my_binary_compiled_with_tcmalloc</pre>
+</ol>
+
+<p>No other action is required.</p>
+
+<p>Note that since the heap-checker uses the heap-profiling framework
+internally, it is not possible to run both the heap-checker and <A
+HREF="heapprofile.html">heap profiler</A> at the same time.</p>
+
+
+<h3><a name="types">Flavors of Heap Checking</a></h3>
+
+<p>These are the legal values when running a whole-program heap
+check:</p>
+<ol>
+  <li> <code>minimal</code>
+  <li> <code>normal</code>
+  <li> <code>strict</code>
+  <li> <code>draconian</code>
+</ol>
+
+<p>"Minimal" heap-checking starts as late as possible in a
+initialization, meaning you can leak some memory in your
+initialization routines (that run before <code>main()</code>, say),
+and not trigger a leak message.  If you frequently (and purposefully)
+leak data in one-time global initializers, "minimal" mode is useful
+for you.  Otherwise, you should avoid it for stricter modes.</p>
+
+<p>"Normal" heap-checking tracks <A HREF="#live">live objects</A> and
+reports a leak for any data that is not reachable via a live object
+when the program exits.</p>
+
+<p>"Strict" heap-checking is much like "normal" but has a few extra
+checks that memory isn't lost in global destructors.  In particular,
+if you have a global variable that allocates memory during program
+execution, and then "forgets" about the memory in the global
+destructor (say, by setting the pointer to it to NULL) without freeing
+it, that will prompt a leak message in "strict" mode, though not in
+"normal" mode.</p>
+
+<p>"Draconian" heap-checking is appropriate for those who like to be
+very precise about their memory management, and want the heap-checker
+to help them enforce it.  In "draconian" mode, the heap-checker does
+not do "live object" checking at all, so it reports a leak unless
+<i>all</i> allocated memory is freed before program exit. (However,
+you can use <A HREF="#disable">IgnoreObject()</A> to re-enable
+liveness-checking on an object-by-object basis.)</p>
+
+<p>"Normal" mode, as the name implies, is the one used most often at
+Google.  It's appropriate for everyday heap-checking use.</p>
+
+<p>In addition, there are two other possible modes:</p>
+<ul>
+  <li> <code>as-is</code>
+  <li> <code>local</code>
+</ul>
+<p><code>as-is</code> is the most flexible mode; it allows you to
+specify the various <A HREF="#options">knobs</A> of the heap checker
+explicitly.  <code>local</code> activates the <A
+HREF="#explicit">explicit heap-check instrumentation</A>, but does not
+turn on any whole-program leak checking.</p>
+
+
+<h3><A NAME="tweaking">Tweaking whole-program checking</A></h3>
+
+<p>In some cases you want to check the whole program for memory leaks,
+but waiting for after <code>main()</code> exits to do the first
+whole-program leak check is waiting too long: e.g. in a long-running
+server one might wish to simply periodically check for leaks while the
+server is running.  In this case, you can call the static method
+<code>NoGlobalLeaks()</code>, to verify no global leaks have happened
+as of that point in the program.</p>
+
+<p>Alternately, doing the check after <code>main()</code> exits might
+be too late.  Perhaps you have some objects that are known not to
+clean up properly at exit.  You'd like to do the "at exit" check
+before those objects are destroyed (since while they're live, any
+memory they point to will not be considered a leak).  In that case,
+you can call <code>NoGlobalLeaks()</code> manually, near the end of
+<code>main()</code>, and then call <code>CancelGlobalCheck()</code> to
+turn off the automatic post-<code>main()</code> check.</p>
+
+<p>Finally, there's a helper macro for "strict" and "draconian" modes,
+which require all global memory to be freed before program exit.  This
+freeing can be time-consuming and is often unnecessary, since libc
+cleans up all memory at program-exit for you.  If you want the
+benefits of "strict"/"draconian" modes without the cost of all that
+freeing, look at <code>REGISTER_HEAPCHECK_CLEANUP</code> (in
+<code>heap-checker.h</code>).  This macro allows you to mark specific
+cleanup code as active only when the heap-checker is turned on.</p>
+
+
+<h2><a name="explicit">Explicit (Partial-program) Heap Leak Checking</h2>
+
+<p>Instead of whole-program checking, you can check certain parts of your
+code to verify they do not have memory leaks.  This check verifies that
+between two parts of a program, no memory is allocated without being freed.</p>
+<p>To use this kind of checking code, bracket the code you want
+checked by creating a <code>HeapLeakChecker</code> object at the
+beginning of the code segment, and call
+<code>NoLeaks()</code> at the end.  These functions, and all others
+referred to in this file, are declared in
+<code>&lt;gperftools/heap-checker.h&gt;</code>.
+</p>
+
+<p>Here's an example:</p>
+<pre>
+  HeapLeakChecker heap_checker("test_foo");
+  {
+    code that exercises some foo functionality;
+    this code should not leak memory;
+  }
+  if (!heap_checker.NoLeaks()) assert(NULL == "heap memory leak");
+</pre>
+
+<p>Note that adding in the <code>HeapLeakChecker</code> object merely
+instruments the code for leak-checking.  To actually turn on this
+leak-checking on a particular run of the executable, you must still
+run with the heap-checker turned on:</p>
+<pre>% env HEAPCHECK=local /usr/local/bin/my_binary_compiled_with_tcmalloc</pre>
+<p>If you want to do whole-program leak checking in addition to this
+manual leak checking, you can run in <code>normal</code> or some other
+mode instead: they'll run the "local" checks in addition to the
+whole-program check.</p>
+
+
+<h2><a name="disable">Disabling Heap-checking of Known Leaks</a></h2>
+
+<p>Sometimes your code has leaks that you know about and are willing
+to accept.  You would like the heap checker to ignore them when
+checking your program.  You can do this by bracketing the code in
+question with an appropriate heap-checking construct:</p>
+<pre>
+   ...
+   {
+     HeapLeakChecker::Disabler disabler;
+     &lt;leaky code&gt;
+   }
+   ...
+</pre>
+Any objects allocated by <code>leaky code</code> (including inside any
+routines called by <code>leaky code</code>) and any objects reachable
+from such objects are not reported as leaks.
+
+<p>Alternately, you can use <code>IgnoreObject()</code>, which takes a
+pointer to an object to ignore.  That memory, and everything reachable
+from it (by following pointers), is ignored for the purposes of leak
+checking.  You can call <code>UnIgnoreObject()</code> to undo the
+effects of <code>IgnoreObject()</code>.</p>
+
+
+<h2><a name="options">Tuning the Heap Checker</h2>
+
+<p>The heap leak checker has many options, some that trade off running
+time and accuracy, and others that increase the sensitivity at the
+risk of returning false positives.  For most uses, the range covered
+by the <A HREF="#types">heap-check flavors</A> is enough, but in
+specialized cases more control can be helpful.</p>
+
+<p>
+These options are specified via environment varaiables.
+</p>
+
+<p>This first set of options controls sensitivity and accuracy.  These
+options are ignored unless you run the heap checker in <A
+HREF="#types">as-is</A> mode.
+
+<table frame=box rules=sides cellpadding=5 width=100%>
+
+<tr valign=top>
+  <td><code>HEAP_CHECK_AFTER_DESTRUCTORS</code></td>
+  <td>Default: false</td>
+  <td>
+    When true, do the final leak check after all other global
+    destructors have run.  When false, do it after all
+    <code>REGISTER_HEAPCHECK_CLEANUP</code>, typically much earlier in
+    the global-destructor process.
+  </td>
+</tr>
+
+<tr valign=top>
+  <td><code>HEAP_CHECK_IGNORE_THREAD_LIVE</code></td>
+  <td>Default: true</td>
+  <td>
+    If true, ignore objects reachable from thread stacks and registers
+    (that is, do not report them as leaks).
+  </td>
+</tr>
+
+<tr valign=top>
+  <td><code>HEAP_CHECK_IGNORE_GLOBAL_LIVE</code></td>
+  <td>Default: true</td>
+  <td>
+    If true, ignore objects reachable from global variables and data
+    (that is, do not report them as leaks).
+  </td>
+</tr>
+
+</table>
+
+<p>These options modify the behavior of whole-program leak
+checking.</p>
+
+<table frame=box rules=sides cellpadding=5 width=100%>
+
+<tr valign=top>
+  <td><code>HEAP_CHECK_MAX_LEAKS</code></td>
+  <td>Default: 20</td>
+  <td>
+    The maximum number of leaks to be printed to stderr (all leaks are still
+    emitted to file output for pprof to visualize). If negative or zero,
+    print all the leaks found.
+  </td>
+</tr>
+
+
+</table>
+
+<p>These options apply to all types of leak checking.</p>
+
+<table frame=box rules=sides cellpadding=5 width=100%>
+
+<tr valign=top>
+  <td><code>HEAP_CHECK_IDENTIFY_LEAKS</code></td>
+  <td>Default: false</td>
+  <td>
+    If true, generate the addresses of the leaked objects in the
+    generated memory leak profile files.
+  </td>
+</tr>
+
+<tr valign=top>
+  <td><code>HEAP_CHECK_TEST_POINTER_ALIGNMENT</code></td>
+  <td>Default: false</td>
+  <td>
+    If true, check all leaks to see if they might be due to the use
+    of unaligned pointers.
+  </td>
+</tr>
+
+<tr valign=top>
+  <td><code>HEAP_CHECK_POINTER_SOURCE_ALIGNMENT</code></td>
+  <td>Default: sizeof(void*)</td>
+  <td>
+    Alignment at which all pointers in memory are supposed to be located.
+    Use 1 if any alignment is ok.
+  </td>
+</tr>
+
+<tr valign=top>
+  <td><code>PPROF_PATH</code></td>
+  <td>Default: pprof</td>
+<td>
+    The location of the <code>pprof</code> executable.
+  </td>
+</tr>
+
+<tr valign=top>
+  <td><code>HEAP_CHECK_DUMP_DIRECTORY</code></td>
+  <td>Default: /tmp</td>
+  <td>
+    Where the heap-profile files are kept while the program is running.
+  </td>
+</tr>
+
+</table>
+
+
+<h2>Tips for Handling Detected Leaks</h2>
+
+<p>What do you do when the heap leak checker detects a memory leak?
+First, you should run the reported <code>pprof</code> command;
+hopefully, that is enough to track down the location where the leak
+occurs.</p>
+
+<p>If the leak is a real leak, you should fix it!</p>
+
+<p>If you are sure that the reported leaks are not dangerous and there
+is no good way to fix them, then you can use
+<code>HeapLeakChecker::Disabler</code> and/or
+<code>HeapLeakChecker::IgnoreObject()</code> to disable heap-checking
+for certain parts of the codebase.</p>
+
+<p>In "strict" or "draconian" mode, leaks may be due to incomplete
+cleanup in the destructors of global variables.  If you don't wish to
+augment the cleanup routines, but still want to run in "strict" or
+"draconian" mode, consider using <A
+HREF="#tweaking"><code>REGISTER_HEAPCHECK_CLEANUP</code></A>.</p>
+
+<h2>Hints for Debugging Detected Leaks</h2>
+
+<p>Sometimes it can be useful to not only know the exact code that
+allocates the leaked objects, but also the addresses of the leaked objects.
+Combining this e.g. with additional logging in the program
+one can then track which subset of the allocations
+made at a certain spot in the code are leaked.
+<br/>
+To get the addresses of all leaked objects
+  define the environment variable <code>HEAP_CHECK_IDENTIFY_LEAKS</code>
+  to be <code>1</code>.
+The object addresses will be reported in the form of addresses
+of fake immediate callers of the memory allocation routines.
+Note that the performance of doing leak-checking in this mode
+can be noticeably worse than the default mode.
+</p>
+
+<p>One relatively common class of leaks that don't look real
+is the case of multiple initialization.
+In such cases the reported leaks are typically things that are
+linked from some global objects,
+which are initialized and say never modified again.
+The non-obvious cause of the leak is frequently the fact that
+the initialization code for these objects executes more than once.
+<br/>
+E.g. if the code of some <code>.cc</code> file is made to be included twice
+into the binary, then the constructors for global objects defined in that file
+will execute twice thus leaking the things allocated on the first run.
+<br/>
+Similar problems can occur if object initialization is done more explicitly
+e.g. on demand by a slightly buggy code
+that does not always ensure only-once initialization.
+</p>
+
+<p>
+A more rare but even more puzzling problem can be use of not properly
+aligned pointers (maybe inside of not properly aligned objects).
+Normally such pointers are not followed by the leak checker,
+hence the objects reachable only via such pointers are reported as leaks.
+If you suspect this case
+  define the environment variable <code>HEAP_CHECK_TEST_POINTER_ALIGNMENT</code>
+  to be <code>1</code>
+and then look closely at the generated leak report messages.
+</p>
+
+<h1>How It Works</h1>
+
+<p>When a <code>HeapLeakChecker</code> object is constructed, it dumps
+a memory-usage profile named
+<code>&lt;prefix&gt;.&lt;name&gt;-beg.heap</code> to a temporary
+directory.  When <code>NoLeaks()</code>
+is called (for whole-program checking, this happens automatically at
+program-exit), it dumps another profile, named
+<code>&lt;prefix&gt;.&lt;name&gt;-end.heap</code>.
+(<code>&lt;prefix&gt;</code> is typically determined automatically,
+and <code>&lt;name&gt;</code> is typically <code>argv[0]</code>.)  It
+then compares the two profiles.  If the second profile shows
+more memory use than the first, the
+<code>NoLeaks()</code> function will
+return false.  For "whole program" profiling, this will cause the
+executable to abort (via <code>exit(1)</code>).  In all cases, it will
+print a message on how to process the dumped profiles to locate
+leaks.</p>
+
+<h3><A name=live>Detecting Live Objects</A></h3>
+
+<p>At any point during a program's execution, all memory that is
+accessible at that time is considered "live."  This includes global
+variables, and also any memory that is reachable by following pointers
+from a global variable.  It also includes all memory reachable from
+the current stack frame and from current CPU registers (this captures
+local variables).  Finally, it includes the thread equivalents of
+these: thread-local storage and thread heaps, memory reachable from
+thread-local storage and thread heaps, and memory reachable from
+thread CPU registers.</p>
+
+<p>In all modes except "draconian," live memory is not
+considered to be a leak.  We detect this by doing a liveness flood,
+traversing pointers to heap objects starting from some initial memory
+regions we know to potentially contain live pointer data.  Note that
+this flood might potentially not find some (global) live data region
+to start the flood from.  If you find such, please file a bug.</p>
+
+<p>The liveness flood attempts to treat any properly aligned byte
+sequences as pointers to heap objects and thinks that it found a good
+pointer whenever the current heap memory map contains an object with
+the address whose byte representation we found.  Some pointers into
+not-at-start of object will also work here.</p>
+
+<p>As a result of this simple approach, it's possible (though
+unlikely) for the flood to be inexact and occasionally result in
+leaked objects being erroneously determined to be live.  For instance,
+random bit patterns can happen to look like pointers to leaked heap
+objects.  More likely, stale pointer data not corresponding to any
+live program variables can be still present in memory regions,
+especially in thread stacks.  For instance, depending on how the local
+<code>malloc</code> is implemented, it may reuse a heap object
+address:</p>
+<pre>
+    char* p = new char[1];   // new might return 0x80000000, say.
+    delete p;
+    new char[1];             // new might return 0x80000000 again
+    // This last new is a leak, but doesn't seem it: p looks like it points to it
+</pre>
+
+<p>In other words, imprecisions in the liveness flood mean that for
+any heap leak check we might miss some memory leaks.  This means that
+for local leak checks, we might report a memory leak in the local
+area, even though the leak actually happened before the
+<code>HeapLeakChecker</code> object was constructed.  Note that for
+whole-program checks, a leak report <i>does</i> always correspond to a
+real leak (since there's no "before" to have created a false-live
+object).</p>
+
+<p>While this liveness flood approach is not very portable and not
+100% accurate, it works in most cases and saves us from writing a lot
+of explicit clean up code and other hassles when dealing with thread
+data.</p>
+
+
+<h3>Visualizing Leak with <code>pprof</code></h3>
+
+<p>
+The heap checker automatically prints basic leak info with stack traces of
+leaked objects' allocation sites, as well as a pprof command line that can be
+used to visualize the call-graph involved in these allocations.
+The latter can be much more useful for a human
+to see where/why the leaks happened, especially if the leaks are numerous.
+</p>
+
+<h3>Leak-checking and Threads</h3>
+
+<p>At the time of HeapLeakChecker's construction and during
+<code>NoLeaks()</code> calls, we grab a lock
+and then pause all other threads so other threads do not interfere
+with recording or analyzing the state of the heap.</p>
+
+<p>In general, leak checking works correctly in the presence of
+threads.  However, thread stack data liveness determination (via
+<code>base/thread_lister.h</code>) does not work when the program is
+running under GDB, because the ptrace functionality needed for finding
+threads is already hooked to by GDB.  Conversely, leak checker's
+ptrace attempts might also interfere with GDB.  As a result, GDB can
+result in potentially false leak reports.  For this reason, the
+heap-checker turns itself off when running under GDB.</p>
+
+<p>Also, <code>thread_lister</code> only works for Linux pthreads;
+leak checking is unlikely to handle other thread implementations
+correctly.</p>
+
+<p>As mentioned in the discussion of liveness flooding, thread-stack
+liveness determination might mis-classify as reachable objects that
+very recently became unreachable (leaked).  This can happen when the
+pointers to now-logically-unreachable objects are present in the
+active thread stack frame.  In other words, trivial code like the
+following might not produce the expected leak checking outcome
+depending on how the compiled code works with the stack:</p>
+<pre>
+  int* foo = new int [20];
+  HeapLeakChecker check("a_check");
+  foo = NULL;
+  // May fail to trigger.
+  if (!heap_checker.NoLeaks()) assert(NULL == "heap memory leak");
+</pre>
+
+
+<hr>
+<address>Maxim Lifantsev<br>
+<!-- Created: Tue Dec 19 10:43:14 PST 2000 -->
+<!-- hhmts start -->
+Last modified: Fri Jul 13 13:14:33 PDT 2007
+<!-- hhmts end -->
+</address>
+</body>
+</html>