Releases: mstange/samply
0.13.1 - 2025-02-01
Release Notes
This release adds Windows support. It uses ETW via `xperf` to record the system activity to an ETL file. Then samply converts the ETL file into a profile.
Samply asks for Administrator privileges during profiling. This is necessary for ETW to work.
Thanks to @jrmuizel for getting this off the ground. Most of the Windows implementation was initially written by him. ETW is rather lightly documented, so this required a lot of research.
Also thanks to @vvuk, who integrated Jeff's code into samply and contributed hugely to getting this ready for production!
And thanks to the authors of the https://github.com/n4r1b/ferrisetw crate; samply uses etw-reader which started out as a fork of ferrisetw.
Known issues:
- By default, you won't get Windows symbols, but you can use `samply record --windows-symbol-server https://msdl.microsoft.com/download/symbols` to fix this - this will download symbols for Windows system libraries and kernel stacks from Microsoft's server. I'm planning to add a config file for samply so that symbol servers can be configured more permanently, but it doesn't exist yet.
- Missing symbols for precompiled .NET code: This is getsentry/pdb#153, which has a potential patch in getsentry/pdb#154.
- CoreCLR support could be better - some of it isn't working correctly any more (see #483)
Breaking changes
- The minimum supported Rust version is now 1.77.
Features
- Windows: Initial support.
- macOS: Support attaching to running processes and their subprocesses (#190, by @vvuk, and #425, by @tmm1)
- macOS: Add `samply setup` to code-sign samply so that attaching to running processes can work (#217 + #353, by @vvuk)
- All platforms: `samply import` has much better support for Android simpleperf now
- All platforms: Add `--main-thread-only` flag
- All platforms: Add `--include-args` argument
- Windows, Linux: Add `--per-cpu-threads` flag
- All platforms: Add `--symbol-dir`, `--windows-symbol-server`, `--windows-symbol-cache`, `--breakpad-symbol-server`, `--breakpad-symbol-dir`, `--breakpad-symbol-cache`, and `--simpleperf-binary-cache` arguments (various PRs, including some by @ishitatsuyuki)
- All platforms: Add `--address` option to specify the IP address at which the local server is listening (#234, by @Rjected)
- All platforms: Add `--unstable-presymbolicate` flag (#202, by @vvuk)
Fixes
- Fix build errors related to `zerocopy` and `zerocopy_derive` (#356, by @mox692)
- macOS: Fix library enumeration on macOS 15 Sequoia (#403, by @Maaarcocr)
Install samply 0.13.1
Install prebuilt binaries via shell script
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/mstange/samply/releases/download/samply-v0.13.1/samply-installer.sh | sh
Install prebuilt binaries via powershell script
powershell -ExecutionPolicy Bypass -c "irm https://github.com/mstange/samply/releases/download/samply-v0.13.1/samply-installer.ps1 | iex"
Download samply 0.13.1
File | Platform | Checksum |
---|---|---|
samply-aarch64-apple-darwin.tar.xz | Apple Silicon macOS | checksum |
samply-x86_64-apple-darwin.tar.xz | Intel macOS | checksum |
samply-x86_64-pc-windows-msvc.zip | x64 Windows | checksum |
samply-aarch64-unknown-linux-gnu.tar.xz | ARM64 Linux | checksum |
samply-x86_64-unknown-linux-gnu.tar.xz | x64 Linux | checksum |
samply-x86_64-unknown-linux-musl.tar.xz | x64 MUSL Linux | checksum |
0.12.0 - 2024-04-16
Release Notes
Breaking changes
- The minimum supported Rust version is now 1.74.
- `samply load perf.data` is now called `samply import perf.data`.
- The `--port` alias has changed from `-p` to `-P`.
Features
- Linux: Allow attaching to running processes with `samply record -p [pid]` (#18, by @ishitatsuyuki)
- Linux, macOS: Support Jitdump in `samply record`.
- Linux: Support Jitdump in `samply import perf.data` without `perf inject --jit`.
- Linux, macOS: Support `/tmp/perf-[pid].map` (#34 + #36, by @bnjbvr)
- Linux, macOS: Support specifying environment variables after `samply record`.
- Linux, macOS: Add `--iteration-count` and `--reuse-threads` flags to `samply record`.
- Linux: Support symbolication with `.dwo` and `.dwp` files.
- Linux: Support unwinding and symbolicating VDSO frames.
- Linux, macOS: Support overriding the launched browser with `$BROWSER` (#50, by @ishitatsuyuki)
- Linux, macOS: Add `--profile-name` argument to `samply record` and `samply import` to allow overriding the profile name (#68, by @rukai)
- Linux, macOS: Support Scala Native demangling (#109, by @keynmol)
- macOS: Support `--main-thread-only` in `samply record`, for lower-overhead sampling
- macOS, Linux: Unstable support for adding markers from `marker-[pid].txt` files which are opened (and, on Linux, mmap'ed) during profiling.
- Linux: Support kernel symbols when importing `perf.data` files with kernel stacks, if `/proc/sys/kernel/kptr_restrict` is `0`.
- Android: Support importing `perf.data` files recorded with simpleperf's `--trace-offcpu` flag.
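
As an illustration of the `/tmp/perf-[pid].map` item above, here is a minimal sketch of what such a file contains. It assumes the conventional perf map layout: one line per JIT-compiled function, holding the start address and size in hexadecimal (without a `0x` prefix) followed by the symbol name. The helper name `perf_map_line` and the two sample functions are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print one perf map entry for a JIT-compiled function.
# Arguments: start address (decimal), size in bytes (decimal), symbol name.
# Assumes the conventional perf map layout: hex START, hex SIZE, then the name.
perf_map_line() {
    printf '%x %x %s\n' "$1" "$2" "$3"
}

# Two made-up functions, laid out back to back starting at address 4096 (0x1000).
perf_map_line 4096 32 jit_add
perf_map_line 4128 64 jit_mul
```

A JIT would typically append lines like these to `/tmp/perf-[pid].map`, with its own process ID in place of `[pid]`, as it emits code.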
In progress
- Linux: Groundwork to support profiling Wine apps (by @ishitatsuyuki)
Fixes
- Linux, macOS: Don't discard information from processes with reused process IDs (e.g. due to exec).
- Linux: Support recording on more types of machines, by falling back to software perf events in more cases. (#70, by @rkd-msw)
- Linux: Fix out-of-order samples. (#30 + #62, by @ishitatsuyuki)
- Linux: Fix unwinding and symbolicating in processes which have forked without exec.
- Linux: Capture startup work of launched processes more reliably.
- Linux: Fix debuglink symbolication in certain cases. (#38, by @zecakeh)
- Linux: Fix stackwalking if unwinding information is stored in compressed `.debug_frame` sections. (#10, by @bobrik)
- macOS: Fix symbolication of system libraries on x86_64 macOS 13+.
- Android: Allow building samply for Android. (#76, by @flxo)
- macOS: Fix Jitdump symbolication for functions which were JITted just before the sample was taken (#128, by @vvuk)
- macOS, Linux: More reliable handling of Ctrl+C during profiling.
- macOS: Support recording workloads with deep recursion by eliding the middle of long stacks and not running out of memory.
- x86_64: Improve disassembly of relative jumps by displaying the absolute target address (#54, by @jrmuizel)
- macOS: Use yellow instead of blue, for consistency with Linux which uses yellow for user stacks and orange for kernel stacks.
Install samply 0.12.0
Install prebuilt binaries via shell script
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/mstange/samply/releases/download/samply-v0.12.0/samply-installer.sh | sh
Install prebuilt binaries via powershell script
powershell -c "irm https://github.com/mstange/samply/releases/download/samply-v0.12.0/samply-installer.ps1 | iex"
Download samply 0.12.0
File | Platform | Checksum |
---|---|---|
samply-aarch64-apple-darwin.tar.xz | Apple Silicon macOS | checksum |
samply-x86_64-apple-darwin.tar.xz | Intel macOS | checksum |
samply-x86_64-pc-windows-msvc.zip | x64 Windows | checksum |
samply-x86_64-unknown-linux-gnu.tar.xz | x64 Linux | checksum |
samply-x86_64-unknown-linux-musl.tar.xz | x64 MUSL Linux | checksum |
v0.11.0
This release comes with the following fixes:
- Fixed a panic when closing the profiler tab during loading. (#11)
- On Linux, `samply load perf.data` will now include kernel symbols from `/proc/kallsyms` if run with root privileges.
- On Linux, if the `DEBUGINFOD_URLS` environment variable is set, samply will fetch symbols from the listed debuginfod servers.
- In the profile JSON, additional properties are set to hide some unnecessary UI elements.
- In the profile JSON, macOS library information now has the right "arch" values. Furthermore, symbolication of macOS system libraries from the dyld shared cache will only check the dyld cache files for that architecture.
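
To illustrate the `DEBUGINFOD_URLS` item above, here is a minimal sketch that assumes the usual debuginfod convention of a space-separated list of server URLs. The helper name `list_debuginfod_servers` is hypothetical, and the URL is only an example value.

```shell
#!/bin/sh
# Hypothetical helper: print each server from a space-separated URL list,
# one per line. Assumes the URLs contain no spaces or glob characters.
list_debuginfod_servers() {
    for url in $1; do
        printf '%s\n' "$url"
    done
}

# Example value; substitute the servers that match your distribution.
DEBUGINFOD_URLS="https://debuginfod.elfutils.org/"
export DEBUGINFOD_URLS

# Show which servers a later command in this shell would inherit.
list_debuginfod_servers "$DEBUGINFOD_URLS"
```

With the variable exported, a `samply load perf.data` run from the same shell would inherit it.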
v0.10.1
This release raises the minimum supported Rust version to 1.61.
It comes with the following fixes compared to 0.9:
- On macOS 13, system libraries will have symbols again. (adjusted dyld shared cache paths)
- On Linux, there will be fewer panics during recording.
- On Linux, there will be a useful error message if perf event paranoid settings are inadequate.
- CLI argument parsing is improved when recording executables with arguments. 0.9 sometimes required the use of `--`.
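
On Linux, the perf event paranoid setting mentioned above is exposed at `/proc/sys/kernel/perf_event_paranoid`. Below is a minimal sketch of a pre-flight check, assuming that a value of 1 or lower is enough for unprivileged recording. That threshold and the helper name `paranoid_level_ok` are assumptions, not something these notes specify.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given perf_event_paranoid level is at
# most 1. The threshold is an assumption; adjust it for your setup.
paranoid_level_ok() {
    [ "$1" -le 1 ]
}

setting=/proc/sys/kernel/perf_event_paranoid
if [ -r "$setting" ]; then
    level=$(cat "$setting")
    if paranoid_level_ok "$level"; then
        echo "perf_event_paranoid=$level: looks permissive enough"
    else
        echo "perf_event_paranoid=$level: recording may be refused"
    fi
else
    echo "$setting is not readable (probably not a Linux system)"
fi
```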
v0.8.0
This release rewrites stackwalking to make use of various types of unwinding info. Stacks should now be higher quality, and frame pointers are no longer required.