+ Regroups functions to deal with Erlang's memory allocators, or
+ particularly, to try to present the allocator data in a way that
+ makes it simpler to discover the presence of possible problems.
+
- Regroups functions to deal with Erlang's memory allocators, or
- particularly, to try to present the allocator data in a way that
- makes it simpler to discover the presence of possible problems.
+ Provides production-safe tracing facilities, to dig into the
+ execution of programs and function calls as they are running.
@@ -79,6 +85,16 @@
Recon Application
information that can be useful in determining the most common causes
of node failure.
+
+
queue_fun.awk
+
+ Awk script to run on an Erlang crash dump as
+ awk -v threshold=<queue size> -f queue_fun.awk <crashdump>. It
+ shows which function processes with queue sizes larger than or equal to
+ <queue size> were running at the time of the crash dump. May help
+ find out if most processes were stuck blocking on a given function
+ call while accumulating messages forever.
+
load a snapshot from a given file. The format of the data in the
file can be either the same as output by snapshot_save(),
or the output obtained by calling
- {erlang:memory(),[{A,erlang:system_info({allocator,A})} || A <- element(3,erlang:system_info(allocator))]}.
+ {erlang:memory(),[{A,erlang:system_info({allocator,A})} || A <- erlang:system_info(alloc_util_allocators)++[sys_alloc,mseg_alloc]]}.
and storing it in a file.
If the latter option is taken, please remember to add a full stop at the end
of the resulting Erlang term, as this function uses file:consult/1 to load
diff --git a/recon_lib.html b/recon_lib.html
index 42c4345..c3aac77 100644
--- a/recon_lib.html
+++ b/recon_lib.html
@@ -13,7 +13,7 @@
The Erlang Trace BIFs allow you to trace any Erlang code at all. They work in
+two parts: pid specifications, and trace patterns.
+
+
Pid specifications let you decide which processes to target. They can be
+ specific pids, all pids, existing pids, or new pids (those not
+spawned at the time of the function call).
+
+
The trace patterns represent functions. Functions can be specified in two
+ parts: specifying the modules, functions, and arguments, and then with
+ Erlang match specifications to add constraints to arguments (see
+ calls/3 for details).
+
+
What defines whether you get traced or not is the intersection of both:
In order to see the content we want, we should change the trace patterns
+ to use a fun that matches on all arguments in a list (_) and returns
+ return_trace(). This last part will generate a second trace for each
+call that includes the return value:
Note that in the pattern above, no specific function ('_') was
+ matched against. Instead, the fun used restricted functions to those
+ having two arguments, the first of which is either a list or an integer
+ greater than 1.
+
+
The limit was also set using {10,100} instead of an integer, which sets the
+rate limit to 10 messages per 100 milliseconds instead of an absolute
+value.
+
+
Any tracing can be manually interrupted by calling recon_trace:clear(),
+or killing the shell process.
+
+
Be aware that extremely broad patterns with lax rate-limiting (or very
+ high absolute limits) may impact your node's stability in ways
+ recon_trace cannot easily help you with.
+
+
When in doubt, start with the most restrictive tracing possible, with low
+limits, and progressively increase your scope.
+
+
See calls/3 for more details and tracing possibilities.
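
The advice about starting narrow and widening the scope can be sketched as a shell session. This is only an illustration using the call forms documented for calls/2 and clear/0; the queue module and the specific limits are arbitrary choices, not recommendations from the library.

```erlang
%% Step 1: a single function and arity, with a low absolute limit
%% (at most 10 trace messages in total).
recon_trace:calls({queue, in, 2}, 10).

%% Stop tracing before changing the pattern.
recon_trace:clear().

%% Step 2: widen to every function in the module, but switch to a
%% rate limit of 10 messages per 100 milliseconds.
recon_trace:calls({queue, '_', '_'}, {10, 100}).

%% Always clear when done.
recon_trace:clear().
```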
The tracer process receives trace messages from the node, and enforces
+limits in absolute terms or trace rates, before forwarding the messages
+to the formatter. This is done so the tracer can do as little work as
+possible and never block while building up a large mailbox.
+
+
The tracer process is linked to the shell, and the formatter to the
+tracer process. The formatter also traps exits to be able to handle
+all received trace messages until the tracer termination, but will then
+shut down as soon as possible.
+
+ In case the operator is tracing from a remote shell which gets
+ disconnected, the links between the shell and the tracer should make it
+ so tracing is automatically turned off once you disconnect.
+
Allows you to set trace patterns and pid specifications to trace
+function calls.
+
+
The basic calls take the trace patterns as tuples of the form
+ {Module, Function, Args} where:
+
+
+
Module is any atom representing a module
+
Function is any atom representing a function, or the wildcard
+ '_'
+
Args is either the arity of a function (0..255), a wildcard
+ pattern ('_'), a
+ match specification,
+ or a function from a shell session that can be transformed into
+ a match specification
+
+
+
There is also an argument specifying either a maximal count (a number)
+ of trace messages to be received, or a maximal frequency ({Num, Millisecs}).
+
+
Here are examples of things to trace:
+
+
+
All calls from the queue module, with 10 calls printed at most:
+ recon_trace:calls({queue, '_', '_'}, 10)
+
All calls to lists:seq(A,B), with 100 calls printed at most:
+ recon_trace:calls({lists, seq, 2}, 100)
+
All calls to lists:seq(A,B), with 100 calls per second at most:
+ recon_trace:calls({lists, seq, 2}, {100, 1000})
+
All calls to lists:seq(A,B,2) (all sequences increasing by two)
+ with 100 calls at most:
+ recon_trace:calls({lists, seq, fun([_,_,2]) -> ok end}, 100)
+
All calls to iolist_to_binary/1 made with a binary as an argument
+ already (kind of useless conversion!):
+ recon_trace:calls({erlang, iolist_to_binary, fun([X]) when is_binary(X) -> ok end}, 10)
+
Calls to the queue module only in a given process Pid, at a rate
+ of 50 per second at most:
+ recon_trace:calls({queue, '_', '_'}, {50,1000}, [{pid, Pid}])
+
Print the traces with the function arity instead of literal arguments:
+ recon_trace:calls(MFA, Max, [{args, arity}])
+
Matching the filter/2 functions of both dict and lists modules,
+ across new processes only:
+ recon_trace:calls([{dict,filter,2},{lists,filter,2}], 10, [{pid, new}])
+
Tracing the handle_call/3 functions of a given module for all new processes,
+ and those of an existing one registered with gproc:
+ recon_trace:calls({Mod,handle_call,3}, {10,100}, [{pid, [{via, gproc, Name}, new]}])
+
Show the result of a given function call:
+ recon_trace:calls({Mod,Fun,fun(_) -> return_trace() end}, Max, Opts)
+ or
+ recon_trace:calls({Mod,Fun,[{'_', [], [{return_trace}]}]}, Max, Opts),
+ the important bit being the return_trace() call or the
+ {return_trace} match spec value.
+
+
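
To see how a shell fun relates to a match specification, OTP's dbg:fun2ms/1 can be used in a shell session. This is a minimal sketch: it assumes the runtime_tools application is available, and that Mod, Fun, and Max are already bound. It is not necessarily how recon_trace performs the conversion internally.

```erlang
%% Convert a shell fun into a match specification. The result should be
%% equivalent to the literal [{'_', [], [{return_trace}]}] form above.
MatchSpec = dbg:fun2ms(fun(_) -> return_trace() end).

%% The resulting term can be passed wherever Args is expected.
recon_trace:calls({Mod, Fun, MatchSpec}, Max).
```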
+
There are a few more combinations possible, with multiple trace patterns per call, and more
+options:
+
+
+
{pid, PidSpec}: which processes to trace. Valid options are any of
+ all, new, existing, or a process descriptor ({A,B,C},
+ "<A.B.C>", an atom representing a name, {global, Name},
+ {via, Registrar, Name}, or a pid). It's also possible to specify
+ more than one by putting them in a list.
+
{timestamp, formatter | trace}: by default, the formatter process
+ adds timestamps to messages received. If accurate timestamps are
+ required, it's possible to force the usage of timestamps within
+ trace messages by adding the option {timestamp, trace}.
+
{args, arity | args}: whether to print the arity of function calls
+ or (by default) their literal arguments.
+
{scope, global | local}: by default, only 'global' calls (fully qualified
+ function calls) are traced, not calls made internally. To force tracing
+ of local calls, pass in {scope, local}. This is useful whenever
+ you want to track the changes of code in a process that isn't called
+ with Module:Fun(Args), but just Fun(Args).
+
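
The options above can be combined in a single list. The following sketch assumes Pid is bound to a process of interest; the module and limits are arbitrary examples.

```erlang
%% Trace local and global calls to the queue module in one process,
%% printing arities instead of arguments, with timestamps taken from
%% the trace messages, at a rate of 50 messages per second at most.
recon_trace:calls({queue, '_', '_'}, {50, 1000},
                  [{pid, Pid},
                   {args, arity},
                   {timestamp, trace},
                   {scope, local}]).

%% Stop tracing once done.
recon_trace:clear().
```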
+
+ Also note that putting extremely large Max values (e.g. 99999999 or
+ {10000,1}) will probably negate most of the safe-guarding this library
+ does and be dangerous to your node. Similarly, tracing extremely large
+ amounts of function calls (all of them, or all of io for example)
+ can be risky if more trace messages are generated than any process on
+ the node could ever handle, despite the precautions taken by this library.