Usage
Analyze an existing recording
java -jar jvmlens.jar analyze recording.jfr
The output is markdown: ranked hot paths (application-frame attributed), allocation sites, lock contention, GC pressure, and a one-line heuristic cause.
Each ranked row shows its absolute hit count next to the share (e.g.
99% (1040 samples)), so a high percentage built on one stray hit is obvious. Every
section is tagged with how the data was obtained — [sampled] (statistical: CPU and
allocation) or [measured] (exact: lock blocked time, GC) — so the reliability of each
signal is explicit.
Reading the rows:
- Source-line anchoring: a hot-path row’s teaser lists the top leaves where time actually goes, with counts, each carrying its source line (com.example.Svc.compute:88 30/168); allocation sites carry the allocation call-site’s line (:120 · byte[] 4.2 GB). The locator turns "diagnose → find the spot → edit" into one step for a coding agent.
- Leaf confidence: when no single leaf holds more than ~20% of a path’s samples the teaser is flagged ⚠ diffuse, so you don’t chase a frame that isn’t where the cost is.
- Allocation confidence: a short recording yields few allocation samples; a ⚠ Only N allocation samples caveat means the per-site byte shares are noisy (the total bytes stay reliable).
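The leaf-confidence rule above can be sketched in a few lines of Java. This is only an illustration, not jvmlens's implementation: the class and method names are hypothetical, and the 0.20 cutoff simply mirrors the "~20%" figure in the text.

```java
import java.util.Map;

// Hypothetical sketch of the "diffuse" check; not jvmlens's actual code.
class LeafConfidence {
    // Assumed cutoff, taken from the "~20%" figure in the text.
    static final double DIFFUSE_THRESHOLD = 0.20;

    /** True when no single leaf holds more than the threshold share of the path's samples. */
    static boolean isDiffuse(Map<String, Integer> leafSamples, int pathSamples) {
        if (pathSamples <= 0) {
            return true; // no samples, so there is no leaf to anchor on
        }
        int top = leafSamples.values().stream().mapToInt(Integer::intValue).max().orElse(0);
        return (double) top / pathSamples <= DIFFUSE_THRESHOLD;
    }

    public static void main(String[] args) {
        // 30 of 168 samples is about 18%, so this path would be flagged as diffuse.
        System.out.println(isDiffuse(Map.of("com.example.Svc.compute:88", 30), 168));
    }
}
```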
Differential analysis (before → after)
The agent optimize→measure loop — "did the fix work, what changed" — is a single flag.
-b / --baseline diffs <file.jfr> (the after) against a baseline (the before) and
names what moved, instead of making you eyeball two summaries:
java -jar jvmlens.jar analyze --baseline before.jfr after.jfr
java -jar jvmlens.jar analyze -b before.jfr after.jfr -f prompt # wrapped for an LLM
# JVM profile diff (before.jfr → after.jfr)
## Totals
- Exec samples: 236 → 691 (+455, +193%)
- Allocation: 9.6 GB → 3.4 GB (-6.2 GB, -64%)
- GC pause: 1663 ms → 35 ms (-1628 ms, -98%)
## Allocation sites
- `GoFmt.floatString` — 517.6 KB → 299.8 KB (▼ 42%) [share 22%→51%]
It diffs the totals (incl. total allocation bytes — the absolute memory anchor), hot
paths, allocation sites, locks, and every extended section. Each row is anchored on its
absolute weight (bytes / ms / samples) with the direction (▲/▼) and the share change as
secondary context, ranked by the size of the absolute change; NEW / GONE are called out,
and rows that barely moved drop. Anchoring on absolute matters: in an optimize loop the total
shrinks, so a site whose absolute bytes fell can show a rising share — share alone would
mislabel a real win as a regression. -a/-x scoping applies to both recordings.
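To make the share-versus-absolute point concrete, here is a small worked example in Java. The site sizes come from the sample row above; the two totals are hypothetical values picked so the shares come out at 22% and 51%, and the class name is illustrative.

```java
// Illustrative arithmetic only; the totals are assumed numbers, not taken from a real recording.
class ShareVsAbsolute {
    /** Share of a site as a percentage of the recording's total. */
    static double sharePct(double siteKb, double totalKb) {
        return 100.0 * siteKb / totalKb;
    }

    /** Relative change of the absolute weight in percent (negative means it shrank). */
    static double changePct(double beforeKb, double afterKb) {
        return 100.0 * (afterKb - beforeKb) / beforeKb;
    }

    public static void main(String[] args) {
        double siteBefore = 517.6, siteAfter = 299.8;     // KB, from the sample row
        double totalBefore = 2352.7, totalAfter = 587.8;  // KB, assumed totals
        System.out.printf("absolute change: %.0f%%%n", changePct(siteBefore, siteAfter));
        System.out.printf("share: %.0f%% -> %.0f%%%n",
                sharePct(siteBefore, totalBefore), sharePct(siteAfter, totalAfter));
    }
}
```

The site's bytes fall by about 42% while its share more than doubles, because the total shrank faster than the site did. A share-only comparison would read that as a regression.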
CI perf-gate (--assert)
Add --assert "<rules>" (with --baseline) and jvmlens exits non-zero on regression —
a backend-free perf gate for a PR, ground the SaaS APMs can’t take (they own prod, not the
pull request). Rules are comma-separated metric < threshold:
| Metric | Fails when |
|---|---|
| | the after GC pause time (ms) is not under the limit |
| gc-pct | GC pause increased by more than the limit (%) |
| alloc-pct | total allocation bytes increased by more than the limit (%), the absolute memory gate |
| oldobj-delta | retained (old-object) samples grew by more than the limit |
| regression-pp | a hot path’s share rose by more than the limit (percentage points) |
| new-hotpath-pp | a new hot path appears at/above the limit (% share) |
Prefer the absolute gates (gc-*, alloc-pct, oldobj-delta) for memory/GC; the
*-pp ones are share-based and can shuffle when the leader shrinks in an optimize loop.
java -jar jvmlens.jar analyze -b before.jfr after.jfr \
--assert "gc-pct < 10, regression-pp < 5, new-hotpath-pp < 20" || echo "perf regressed"
Each rule prints ✅/❌ with the actual value; the process exits 1 if any rule fails,
0 if all pass (2 for bad arguments) — drop it straight into a CI step.
JMH benchmarks (a directory of forks)
Optimizing a JVM library is almost always a JMH loop, and JMH’s -prof jfr writes one
.jfr per fork (often all named profile.jfr in per-benchmark subdirs). Point analyze
(and --baseline) at a directory and jvmlens finds every .jfr under it and merges
the forks into one summary, so the signal isn’t split:
java -jar benchmarks.jar -prof "jfr:dir=/tmp/run-after" # JMH writes per-fork .jfr there
java -jar jvmlens.jar analyze /tmp/run-after -a com.example # merged summary of all forks
# before/after across two JMH runs — the diff header uses the directory names, so it stays
# unambiguous even though every fork file is `profile.jfr`:
java -jar jvmlens.jar analyze --baseline /tmp/run-before /tmp/run-after \
--assert "alloc-pct < 0, gc-pct < 10"
(JMH records the whole fork including warm-up; for measurement-only signal, configure JMH to
delay the recording past warm-up, compare steady-state runs, or pass --skip-warmup <ms> to
analyze — it drops the first <ms> of each recording, measured per file from that file’s
earliest event, so each fork’s own warm-up is trimmed and hot paths reflect steady state.)
Or skip the separate step entirely with the JMH profiler plugin — it records the fork and
prints the summary inline when the trial ends. Put jvmlens-jmh.jar (a tiny
engine+profiler jar, no Spring/picocli/jmh) on the benchmark’s classpath and:
java -cp benchmarks.jar:jvmlens-jmh.jar org.openjdk.jmh.Main \
-prof "org.alexmond.jvmlens.jmh.JvmlensProfiler:appPackage=com.example;report=cpu"
appPackage (+-separate several) scopes to your code; report focuses the summary. An
unknown option key is a hard error with a did-you-mean (no silent misconfiguration). The
plugin runs in JMH’s host process and reuses the same engine as analyze.
For a before→after diff entirely inside JMH — no separate analyze — keep the fork’s
recording with keep=<path> (so it can seed the next run’s baseline) and diff against a prior
one with baseline=<prev.jfr> (the profiler prints the change report instead of the summary):
# run 1 (baseline): keep the recording
... -prof "org.alexmond.jvmlens.jmh.JvmlensProfiler:appPackage=com.example;keep=/tmp/before.jfr"
# run 2 (after the fix): print the diff vs run 1
... -prof "org.alexmond.jvmlens.jmh.JvmlensProfiler:appPackage=com.example;baseline=/tmp/before.jfr"
Fix-direction hints (--hints, opt-in)
--hints appends a hedged ## Likely fix directions [possible] section that maps recognized
hot-frame / allocation shapes to a one-line direction, each grounded in the row that
triggered it — e.g. DoubleToDecimal/formatUnsignedInt → "number→string formatting", a
LinkedList$ListItr.<init> self-time → "per-iteration iterator allocation",
AbstractStringBuilder.ensureCapacity → "presize the buffer". It is off by default (the
report stays clean data) and every line is tagged [possible] — a direction to investigate,
never an assertion.
java -jar jvmlens.jar analyze recording.jfr --hints
Budget-dialing the size
--top-k <n> keeps only the top n rows per section; --max-tokens <n> shrinks top-k until
the summary fits roughly n tokens (chars/4). For an always-on agent that adjusts limits at
runtime, the agent control plane’s topn does the same per-dimension (see Runtime control).
java -jar jvmlens.jar analyze recording.jfr --top-k 3
java -jar jvmlens.jar analyze recording.jfr --max-tokens 250
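The budget dialing described above can be sketched as follows. This is a hypothetical illustration of "shrink top-k until the chars/4 estimate fits", not jvmlens's code; the class, the method names, and the stand-in renderer are all assumptions.

```java
import java.util.function.IntFunction;

// Hypothetical sketch of budget dialing; names are illustrative, not jvmlens's API.
class TokenBudget {
    /** Rough token estimate used in the text: one token per four characters. */
    static int estimateTokens(String text) {
        return text.length() / 4;
    }

    /**
     * Lowers top-k from startK until the rendered summary fits maxTokens,
     * never going below one row per section. Returns the chosen k.
     */
    static int fitTopK(IntFunction<String> render, int startK, int maxTokens) {
        int k = Math.max(1, startK);
        while (k > 1 && estimateTokens(render.apply(k)) > maxTokens) {
            k--;
        }
        return k;
    }

    public static void main(String[] args) {
        // Stand-in renderer: each row costs 40 characters (about 10 tokens).
        IntFunction<String> render = k -> "x".repeat(40 * k);
        System.out.println(fitTopK(render, 10, 50)); // prints 5
    }
}
```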
Report focus
-r / --report narrows the output to one concern (reusing the same sections the MCP
server exposes):
| Report | Shows |
|---|---|
| | Everything. |
| cpu | Hot paths + leaf methods (sampled). |
| memory | Allocation sites + types. |
| | Lock contention + contended monitors (measured). |
| | GC pressure and the allocation that drives it. |
| io | External (network + file) blocking I/O by endpoint (measured). |
| pinning | Virtual-thread pinning sites, by pinned time (measured). |
| | Top SQL statements (agent JDBC instrumentation). |
| | Top HTTP endpoints (agent servlet instrumentation). |
| | Top messaging operations (agent Kafka/JMS instrumentation). |
| | Top cache operations (agent Spring-Cache instrumentation). |
| | Top Micrometer timers (consumed from an existing registry). |
| | Deadlocked threads and their wait-for cycle (agent recordings). |
The agent always runs a deadlock check (ThreadMXBean.findDeadlockedThreads) — no
option needed. A true deadlock is distinct from ordinary lock contention: the threads block
forever and never acquire the monitor, so JFR’s JavaMonitorEnter never fires; the
ThreadMXBean check is the reliable signal (and only sees the JVM it runs in, hence
agent-only). When present, a Deadlocked threads (wait-for cycle) section names each stuck
thread and the lock it waits on / who holds it.
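For reference, an in-JVM deadlock check built on the standard java.lang.management API looks roughly like this. It is a minimal sketch: the class name and the output format are illustrative and are not jvmlens's own.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Minimal sketch of a deadlock check using the standard ThreadMXBean API.
// The class name and output format are illustrative, not jvmlens's own.
class DeadlockCheck {
    /** Returns one line per deadlocked thread, or an empty string when there is no deadlock. */
    static String describeDeadlocks() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when no cycle exists
        if (ids == null) {
            return "";
        }
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : threads.getThreadInfo(ids)) {
            if (info == null) {
                continue; // thread ended between the two calls
            }
            out.append(info.getThreadName())
               .append(" waits on ").append(info.getLockName())
               .append(" held by ").append(info.getLockOwnerName())
               .append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String report = describeDeadlocks();
        System.out.println(report.isEmpty() ? "no deadlock" : report);
    }
}
```

Because the bean only inspects the JVM it runs in, a check like this has to execute inside the target process, which matches the agent-only restriction described above.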
The full report also appends a hedged Cross-dimension correlation note when two or more
dimensions carry signal — it co-locates the dominant endpoint / query / I/O / hot path / lock
/ GC so an LLM sees the candidate chain in one place. It’s co-occurrence, not proof (jvmlens
has no per-request trace linkage), so it suggests rather than asserts.
The io and pinning sections appear in the full report whenever the recording carries
those events. io aggregates jdk.SocketRead/jdk.SocketWrite (by remote host:port) and
jdk.FileRead/jdk.FileWrite (by path), ranked by blocked time with a bytes/op teaser.
pinning aggregates jdk.VirtualThreadPinned by site, surfacing the pinnedReason
(MONITOR / NATIVE_METHOD on JDK 24+) — the carrier-pinning that silently caps virtual-thread throughput.
java -jar jvmlens.jar analyze --report cpu recording.jfr
java -jar jvmlens.jar analyze -r memory -f json recording.jfr
Live capture from a running JVM
profile <pid> attaches to a running JVM, captures a timed JFR recording, and
summarizes it — no pre-recorded .jfr needed:
java -jar jvmlens.jar profile 12345 # 20s, markdown
java -jar jvmlens.jar profile -d 30 -w 5 12345 # warm up 5s, record 30s
java -jar jvmlens.jar profile -d 30 -k run.jfr 12345 # keep the recording
-w / --warmup waits before recording so startup/JIT noise is skipped.
-e / --engine selects the capture engine: jfr (default, prod-safe, also works over
remote JMX) or async (async-profiler — higher fidelity, adds native frames; local
<pid> only, and writes JFR so the same summarizer consumes it):
java -jar jvmlens.jar profile --engine async -d 30 12345
Benchmark a workload without JMH (bench)
bench is the no-JMH harness: point it at any class’s main(String[]) and it runs a
warmup→timed loop, captures a JFR over only the timed phase, and summarizes — so an
ordinary app or library with no benchmark module doesn’t need a hand-rolled driver:
java -jar jvmlens.jar bench --main com.example.RenderDriver -w 20 -i 200 -a com.example
# load the workload from its own classpath (it needn't be on jvmlens's), keep the JFR,
# pass args to its main after `--`:
java -jar jvmlens.jar bench --main com.example.RenderDriver --cp target/classes:$(cat cp.txt) \
-w 20 -i 200 -a com.example --jfr /tmp/before.jfr -- arg1 arg2
Each main invocation is one iteration. -w / --warmup iterations run before the
recording starts (so JIT/classload churn stays out of the steady-state signal); -i /
--iters are timed. --cp / --classpath loads the workload through a separate class
loader; --jfr <file> keeps the recording (else a temp file) so it can be a --baseline for
the next run; --no-analyze captures without printing. A one-line timing summary (iters,
ms/iter) goes to stderr; the report to stdout, so it stays pipeable.
Remote servers (run on the host)
For a JVM deployed elsewhere, run jvmlens on that host through the access channel you
already have and let it ship back the compact summary — no JMX ports, no extra start
flags, works on any running JVM (and --engine async works too, since the profiler is
local to the target):
ssh prod-host 'java -jar jvmlens.jar profile <pid> -f prompt'
kubectl exec pod -- java -jar jvmlens.jar profile 1 --engine async -f prompt
docker exec ctr java -jar jvmlens.jar watch 1 --on-gc-ms 200
This plays to jvmlens’s strength: the output is a few hundred tokens, so there is nothing heavy to move over the network. (A networked MCP endpoint and an in-process agent are on the roadmap for always-on remote querying.)
Continuous watch (rolling profile)
watch <pid> keeps a continuous JFR ring buffer on the target and dumps + summarizes
a rolling window every interval — the foundation of the production "dump-on-trigger"
mode (condition-based triggers build on this):
java -jar jvmlens.jar watch 12345 # every 30s, last 120s, forever
java -jar jvmlens.jar watch -i 60 --max-age 300 12345 # every 60s, last 5min
java -jar jvmlens.jar watch -n 5 -i 10 12345 # 5 snapshots, then stop
-i / --interval sets the dump cadence, --max-age the ring-buffer window, and
-n / --snapshots a fixed count (0 = until interrupted). Each snapshot is summarized
with the same -f / -a / -x options as analyze.
Dump on trigger
By default every interval is emitted. Pass any threshold and watch instead stays quiet
and emits only when a window breaches it — the production "dump-on-trigger" mode:
| Option | Fires when |
|---|---|
| --on-gc-ms | total GC pause time in the window reaches the threshold (ms) |
| | the top hot path reaches the threshold |
| --on-old-objects | retained (old-object) samples reach the threshold |
# emit a summary only when GC pauses exceed 200ms or a leak shows up
java -jar jvmlens.jar watch --on-gc-ms 200 --on-old-objects 5 12345
Scoping application code
By default a hot path is "application code" if it is outside the JDK and common
frameworks (Spring, Apache, BouncyCastle, Jackson, logging, …). To focus on your
own packages — or trim more noise — use -a / --app-package (include-only) and
-x / --exclude (both repeatable, comma-separable); they apply to analyze
and profile alike:
java -jar jvmlens.jar analyze -a org.alexmond recording.jfr
java -jar jvmlens.jar analyze -x com.thirdparty recording.jfr
A summary built from very few execution samples is flagged with a ⚠ adequacy
caveat — its hot-path shares are statistically noisy; record longer or under
steady-state load.
In-process agent
For always-on profiling — especially in containers — load jvmlens as a Java agent. It keeps a continuous JFR ring buffer inside the target and writes a fresh LLM-ready summary to a file every interval. No attach, no JMX, nothing external:
java -javaagent:jvmlens-agent.jar=out=/var/log/jvmlens.md,interval=60 -jar your-app.jar
Options (comma-separated key=value): out (latest-summary file), interval (seconds
between summaries), settings (JFR config, default profile), snapshot (see below),
db (instrument JDBC — see Database (SQL) below), web (instrument HTTP — see Web
(HTTP endpoints) below), messaging (time Kafka/JMS send + poll/receive), cache (time
Spring Cache get/put/evict), micrometer (summarize an existing Micrometer registry —
no extra instrumentation; degrades to nothing if Micrometer is absent), history (see
Long-running monitor below), paused (launch without emitting — start it after warm-up;
see Runtime control below), and control (a file the agent watches for in-flight
commands). The messaging and cache dimensions aggregate by Class.method operation and
render top operations by total time. The agent jar is the separate
jvmlens-agent.jar artifact (engine + agent + a relocated ByteBuddy); it can also be
attached dynamically via the Agent-Class entry.
Runtime control (in-flight adjustment)
Like a desktop profiler’s live controls, the agent can be steered at runtime — without a
restart — through a control file it watches (control=<file>). No ports, no JMX: an
operator appends commands over whatever access they already have, via the jvmlens control
CLI (run it on the host: kubectl exec, ssh, …):
java -javaagent:jvmlens-agent.jar=out=/agent/jvmlens.md,control=/agent/jvmlens.control,paused -jar app.jar
# then, on the host:
java -jar jvmlens.jar control /agent/jvmlens.control start # begin (e.g. after warm-up)
java -jar jvmlens.jar control /agent/jvmlens.control enable db # turn a dimension on (lazy-instruments)
java -jar jvmlens.jar control /agent/jvmlens.control topn db 5 # top 5 SQL queries with their stats
java -jar jvmlens.jar control /agent/jvmlens.control settings default # lighter sampling (profile = denser)
java -jar jvmlens.jar control /agent/jvmlens.control scope app com.example # adjust app-frame filtering
java -jar jvmlens.jar control /agent/jvmlens.control dump # emit a summary now
java -jar jvmlens.jar control /agent/jvmlens.control status # read current state back
Commands: start / stop, clear (reset the window + stores), dump (emit now),
enable <dim> / disable <dim> (db/web/messaging/cache/micrometer/snapshot/deadlock),
settings <profile|default> (sampling density), interval <seconds>, scope app|exclude
<prefix> / scope reset (filtering), topn [<category>] <n> / topn reset (rows per
section — category is cpu/perf, memory/mem, locks, io, pinning, or a plugin like db/web),
and status. Each command makes the agent publish its state to <control-file>.status,
which the CLI reads back and prints — so topn db 5 returns the resulting limits to you.
Launching paused and then issuing start after warm-up is the clean fix for short cold runs
that end up profiling startup rather than the workload, with no need to guess a --warmup duration.
Long-running monitor (history + trend)
out is overwritten each interval — only the latest window survives. For a multi-day
watch, add history=<file.jsonl> and the agent instead appends one compact sample per
interval (covering all three dimensions — CPU, memory, wait), so nothing is lost:
java -javaagent:jvmlens-agent.jar=out=/var/log/jvmlens.md,history=/var/log/jvmlens.jsonl,interval=300 \
-jar your-app.jar
Let it run, then trend reduces the accumulated run to a change-over-time report — what
moved across the days, not a single snapshot:
java -jar jvmlens.jar trend /var/log/jvmlens.jsonl # markdown digest
java -jar jvmlens.jar trend -f prompt /var/log/jvmlens.jsonl # wrapped for an LLM
java -jar jvmlens.jar trend -f json /var/log/jvmlens.jsonl # the raw samples
The digest reports each dimension’s first-third→last-third direction (rising / flat / falling), whether the hot path stayed stable or shifted, when lock contention appeared, and a hedged retention indicator — old-object growth alongside rising GC pressure is flagged as possible retention growth, never a confident "leak".
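A first-third versus last-third comparison can be sketched like this in Java. It is a hypothetical illustration: the class name is made up, and the 10% tolerance band that separates "flat" from a real move is an assumption, since the text does not state the cutoff trend uses.

```java
import java.util.List;

// Hypothetical sketch of a first-third vs last-third trend; the 10% tolerance is an assumption.
class TrendDirection {
    static final double TOLERANCE = 0.10;

    static double mean(List<Double> xs) {
        return xs.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);
    }

    /** Classifies a series as "rising", "flat" or "falling" by comparing its first and last thirds. */
    static String direction(List<Double> samples) {
        int third = samples.size() / 3;
        if (third == 0) {
            return "flat"; // too few samples to compare
        }
        double first = mean(samples.subList(0, third));
        double last = mean(samples.subList(samples.size() - third, samples.size()));
        double band = Math.abs(first) * TOLERANCE;
        if (last > first + band) {
            return "rising";
        }
        if (last < first - band) {
            return "falling";
        }
        return "flat";
    }

    public static void main(String[] args) {
        System.out.println(direction(List.of(10.0, 11.0, 12.0, 20.0, 21.0, 22.0))); // prints rising
    }
}
```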
Database (SQL)
Add db to the agent options and it instruments java.sql.Statement.execute* (ByteBuddy)
to time JDBC calls, aggregating them by sanitized SQL shape (literals parameterized, so
no values reach the summary) into a Top SQL (by total time) section — each shape with its
call count, average latency, and a hedged possible N+1 flag for high-count low-latency
shapes:
java -javaagent:jvmlens-agent.jar=out=/var/log/jvmlens.md,db,interval=60 -jar app.jar
The SQL comes from the statement argument (plain Statement) or the statement’s
toString() (most PreparedStatement drivers — H2, PostgreSQL); unknown shapes degrade to
?. This is profiling signal, not a query log — it never records literal values.
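As a rough picture of what "literals parameterized" means, the sketch below replaces string and numeric literals with ? using regular expressions. It is a simplified, hypothetical stand-in (class name and patterns are assumptions); jvmlens's real sanitizer may handle more cases, such as comments or dialect-specific quoting.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of literal parameterization; jvmlens's real sanitizer may differ.
class SqlShape {
    // Single-quoted string, with '' as an escaped quote inside it.
    private static final Pattern STRING_LITERAL = Pattern.compile("'(?:[^']|'')*'");
    // Standalone integer or decimal; digits inside identifiers such as t1 are left alone.
    private static final Pattern NUMBER_LITERAL = Pattern.compile("\\b\\d+(?:\\.\\d+)?\\b");
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    /** Replaces string and numeric literals with '?' and collapses runs of whitespace. */
    static String shape(String sql) {
        String s = STRING_LITERAL.matcher(sql).replaceAll("?");
        s = NUMBER_LITERAL.matcher(s).replaceAll("?");
        return WHITESPACE.matcher(s).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        System.out.println(shape("SELECT * FROM orders WHERE id = 42 AND status = 'NEW'"));
        // prints: SELECT * FROM orders WHERE id = ? AND status = ?
    }
}
```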
Web (HTTP endpoints)
Add web and the agent instruments HttpServlet.service (ByteBuddy; both jakarta.servlet
and javax.servlet, read reflectively so jvmlens needs no servlet dependency — Spring MVC’s
DispatcherServlet is covered by this one point). Requests aggregate by route shape
(numeric / UUID / long-token path segments become {}, query strings dropped) into a
Top HTTP endpoints (by total time) section, each with request count, average latency, and
an error count (status ≥ 400):
java -javaagent:jvmlens-agent.jar=out=/var/log/jvmlens.md,web,db -jar app.jar
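The route-shape reduction described above might look like the following sketch. It is hypothetical: the class name is illustrative, and the 24-character cutoff for a "long token" is an assumption, since the text does not say where jvmlens draws that line.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of route-shape reduction; the long-token cutoff is an assumption.
class RouteShape {
    private static final Pattern NUMERIC = Pattern.compile("\\d+");
    private static final Pattern UUID = Pattern.compile(
            "[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}");
    private static final int LONG_TOKEN_LENGTH = 24; // assumed cutoff

    /** Drops the query string and replaces variable-looking path segments with {}. */
    static String shape(String uri) {
        int q = uri.indexOf('?');
        String path = q >= 0 ? uri.substring(0, q) : uri;
        String[] segments = path.split("/", -1);
        for (int i = 0; i < segments.length; i++) {
            String s = segments[i];
            if (NUMERIC.matcher(s).matches() || UUID.matcher(s).matches()
                    || s.length() >= LONG_TOKEN_LENGTH) {
                segments[i] = "{}";
            }
        }
        return String.join("/", segments);
    }

    public static void main(String[] args) {
        System.out.println(shape("/orders/42/items?page=2")); // prints /orders/{}/items
    }
}
```

Collapsing variable segments keeps the number of distinct rows bounded, so requests to /orders/42 and /orders/43 aggregate into one endpoint instead of two.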
Variable snapshots
Beyond performance, the agent can answer correctness questions — what values flow
through a method — without stopping the app. Add snapshot=Class#method (semicolon-separate
several) and the agent instruments those methods and appends a Variable snapshots section
to the summary: per call site, the call count and a per-argument digest (distinct values,
null rate, numeric range):
java -javaagent:jvmlens-agent.jar=out=/var/log/jvmlens.md,snapshot='com.acme.OrderService#price;com.acme.Repo#find' -jar app.jar
## Variable snapshots
### `com.acme.OrderService.price` — 1410658 calls
- arg0: 3 distinct [STD, EXPRESS, FREE]
- arg1: 10 distinct [...] (range 0..9)
Method arguments need no debug info; capturing locals (requires -g) and conditions/PII
redaction are planned.
MCP server
jvmlens mcp runs a Model Context Protocol server
over stdio, exposing the analysis as scoped, navigable tools so an agent pulls only
the slice it needs (progressive disclosure) instead of one large blob:
| Tool | Returns |
|---|---|
| | Event counts, the heuristic cause, and which drill-down tool to use next. |
| | Application-attributed hot call paths, by sample share. |
| | Leaf (self-time) hot methods, runtime included. |
| | Top allocation sites and allocated types. |
| | Lock contention by method and contended monitors. |
| | External (network + file) blocking I/O by endpoint. |
| | Virtual-thread pinning sites, by pinned time. |
| | Deadlocked threads and their wait-for cycle (agent recordings). |
| profile | Capture a live local JVM by pid. |
The drill-down tools take a file (path to a .jfr); profile takes a pid. All accept
optional appPackages / exclude scoping. The server only serves structured data — it
never calls an LLM, so recordings never leave the host. Register it with an MCP client:
{ "mcpServers": { "jvmlens": { "command": "java", "args": ["-jar", "/path/to/jvmlens.jar", "mcp"] } } }
For a remote server, point the MCP client at the host’s access channel — no JMX, no extra ports:
{ "mcpServers": { "prod": { "command": "ssh", "args": ["prod-host", "java", "-jar", "jvmlens.jar", "mcp"] } } }
Output formats
-f / --format selects the rendering (case-insensitive); all three carry the
same ranked signal from one analysis pass:
| Format | Use |
|---|---|
| markdown | Compact markdown, readable by humans and agents alike. |
| json | Scoped JSON object, for tooling, or the future MCP server, to consume. |
| prompt | The markdown wrapped in an LLM task instruction, ready to paste. |
java -jar jvmlens.jar analyze --format json recording.jfr
java -jar jvmlens.jar analyze -f prompt recording.jfr
Producing a recording
Any JFR recording works. To capture one with the built-in profiler:
java -XX:StartFlightRecording=duration=30s,filename=recording.jfr,settings=profile -jar your-app.jar
The examples/ directory contains a planted-pathology workload (CPU hot path,
memory leak, lock contention) for producing sample recordings.