Usage

Analyze an existing recording

java -jar jvmlens.jar analyze recording.jfr

The output is markdown: ranked hot paths (application-frame attributed), allocation sites, lock contention, GC pressure, and a one-line heuristic cause.

Each ranked row shows its absolute hit count next to the share (e.g. 99% (1040 samples)), so a high percentage built on one stray hit is obvious. Every section is tagged with how the data was obtained — [sampled] (statistical: CPU and allocation) or [measured] (exact: lock blocked time, GC) — so the reliability of each signal is explicit.

Reading the rows:

Source-line anchoring — a hot-path row’s teaser lists the top leaves where time actually goes with counts, each carrying its source line (com.example.Svc.compute:88 30/168); allocation sites carry the allocation call-site’s line (:120 · byte[] 4.2 GB). The locator turns "diagnose → find the spot → edit" into one step for a coding agent.
Leaf confidence — when no single leaf holds more than ~20% of a path’s samples the teaser is flagged ⚠ diffuse, so you don’t chase a frame that isn’t where the cost is.
Allocation confidence — a short recording yields few allocation samples; a ⚠ Only N allocation samples caveat means the per-site byte shares are noisy (the total bytes stay reliable).
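
The leaf-confidence rule above can be sketched in a few lines. This is only an illustration of the idea, not jvmlens's code: the class and method names are hypothetical, and the exact cutoff is assumed to be 20% because the text only says "~20%".

```java
import java.util.Map;

public class LeafConfidence {
    // Assumed cutoff: the text says "~20%", so the exact value is a guess.
    static final double DIFFUSE_THRESHOLD = 0.20;

    /** True when no single leaf holds more than the threshold share of the path's samples. */
    public static boolean isDiffuse(Map<String, Integer> leafSamples, int pathSamples) {
        if (pathSamples <= 0) {
            return true; // no samples: there is no leaf to point at
        }
        int top = leafSamples.values().stream().mapToInt(Integer::intValue).max().orElse(0);
        return (double) top / pathSamples <= DIFFUSE_THRESHOLD;
    }

    public static void main(String[] args) {
        // 30 of 168 samples (about 18%) in the top leaf: flagged diffuse
        System.out.println(isDiffuse(Map.of("com.example.Svc.compute:88", 30), 168));
        // 120 of 168 samples (about 71%): a clear leaf
        System.out.println(isDiffuse(Map.of("com.example.Svc.compute:88", 120), 168));
    }
}
```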

Differential analysis (before → after)

The agent optimize→measure loop — "did the fix work, what changed" — is a single flag. -b / --baseline diffs <file.jfr> (the after) against a baseline (the before) and names what moved, instead of making you eyeball two summaries:

java -jar jvmlens.jar analyze --baseline before.jfr after.jfr
java -jar jvmlens.jar analyze -b before.jfr after.jfr -f prompt   # wrapped for an LLM

# JVM profile diff (before.jfr → after.jfr)
## Totals
- Exec samples: 236 → 691 (+455, +193%)
- Allocation: 9.6 GB → 3.4 GB (-6.2 GB, -64%)
- GC pause: 1663 ms → 35 ms (-1628 ms, -98%)
## Allocation sites
- `GoFmt.floatString` — 517.6 KB → 299.8 KB (▼ 42%) [share 22%→51%]

It diffs the totals (including total allocation bytes, the absolute memory anchor), hot paths, allocation sites, locks, and every extended section. Each row is anchored on its absolute weight (bytes / ms / samples), with the direction (▲/▼) and the share change as secondary context, and rows are ranked by the size of the absolute change. NEW / GONE rows are called out, and rows that barely moved are dropped. Anchoring on absolute values matters: in an optimize loop the total shrinks, so a site whose absolute bytes fell can show a rising share, and share alone would mislabel a real win as a regression. -a/-x scoping applies to both recordings.
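
A small worked example makes the share-versus-absolute point concrete. The numbers below are illustrative only (they are not taken from a real recording), and the class and method names are hypothetical.

```java
public class ShareVsAbsolute {
    /** Percent change in absolute weight; negative means the site shrank. */
    public static double absoluteChangePct(double before, double after) {
        return (after - before) / before * 100.0;
    }

    /** Change in share of the total, in percentage points. */
    public static double shareChangePp(double before, double totalBefore,
                                       double after, double totalAfter) {
        return (after / totalAfter - before / totalBefore) * 100.0;
    }

    public static void main(String[] args) {
        // The site drops from 500 KB to 300 KB while the total drops from 2000 KB to 500 KB.
        double abs = absoluteChangePct(500, 300);       // a 40% reduction in bytes
        double pp = shareChangePp(500, 2000, 300, 500); // share goes from 25% to 60%
        System.out.printf("absolute %+.0f%%, share %+.0f pp%n", abs, pp);
    }
}
```

Ranking this row by share would report a 35-point rise, even though the site allocates 40% fewer bytes than before.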

CI perf-gate (`--assert`)

Add --assert "<rules>" (with --baseline) and jvmlens exits non-zero on regression — a backend-free perf gate for a PR, ground the SaaS APMs can’t take (they own prod, not the pull request). Rules are comma-separated metric < threshold:

Metric	Fails when
`gc-ms`	the after GC pause time (ms) is not under the limit
`gc-pct`	GC pause increased by more than the limit (%)
`alloc-pct`	total allocation bytes increased by more than the limit (%) — the absolute memory gate
`oldobj-delta`	retained (old-object) samples grew by more than the limit
`regression-pp`	a hot path’s share rose by more than the limit (percentage points)
`new-hotpath-pp`	a new hot path appears at/above the limit (% share)

Prefer the absolute gates (gc-ms, gc-pct, alloc-pct, oldobj-delta) for memory/GC; the share-based -pp gates (regression-pp, new-hotpath-pp) can shuffle when the leader shrinks in an optimize loop.

java -jar jvmlens.jar analyze -b before.jfr after.jfr \
  --assert "gc-pct < 10, regression-pp < 5, new-hotpath-pp < 20" || echo "perf regressed"

Each rule prints ✅/❌ with the actual value; the process exits 1 if any rule fails, 0 if all pass (2 for bad arguments) — drop it straight into a CI step.
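
To show how a comma-separated `metric < threshold` rule string maps onto the exit codes, here is a minimal sketch. It is not jvmlens's parser: the class name, the PASS/FAIL wording, and the choice to treat an unknown metric as a bad argument are all assumptions.

```java
import java.util.Map;

public class AssertRules {
    /** Returns 0 when every rule passes, 1 when any rule fails, 2 when the spec is malformed. */
    public static int evaluate(String spec, Map<String, Double> actual) {
        int exit = 0;
        for (String part : spec.split(",")) {
            String[] sides = part.split("<");
            if (sides.length != 2) {
                return 2; // not of the form "metric < threshold"
            }
            String metric = sides[0].trim();
            double limit;
            try {
                limit = Double.parseDouble(sides[1].trim());
            } catch (NumberFormatException e) {
                return 2; // threshold is not a number
            }
            Double value = actual.get(metric);
            if (value == null) {
                return 2; // unknown metric treated as a bad argument (assumption)
            }
            boolean pass = value < limit;
            System.out.println((pass ? "PASS " : "FAIL ") + metric + " = " + value
                    + " (limit " + limit + ")");
            if (!pass) {
                exit = 1;
            }
        }
        return exit;
    }

    public static void main(String[] args) {
        // Hypothetical measured values for the two metrics.
        Map<String, Double> actual = Map.of("gc-pct", -98.0, "regression-pp", 7.5);
        System.out.println("exit " + evaluate("gc-pct < 10, regression-pp < 5", actual));
    }
}
```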

JMH benchmarks (a directory of forks)

Optimizing a JVM library is almost always a JMH loop, and JMH’s -prof jfr writes one .jfr per fork (often all named profile.jfr in per-benchmark subdirs). Point analyze (and --baseline) at a directory and jvmlens finds every .jfr under it and merges the forks into one summary, so the signal isn’t split:

java -jar benchmarks.jar -prof "jfr:dir=/tmp/run-after"     # JMH writes per-fork .jfr there
java -jar jvmlens.jar analyze /tmp/run-after -a com.example       # merged summary of all forks

# before/after across two JMH runs — the diff header uses the directory names, so it stays
# unambiguous even though every fork file is `profile.jfr`:
java -jar jvmlens.jar analyze --baseline /tmp/run-before /tmp/run-after \
  --assert "alloc-pct < 0, gc-pct < 10"

(JMH records the whole fork including warm-up; for measurement-only signal, configure JMH to delay the recording past warm-up, compare steady-state runs, or pass --skip-warmup <ms> to analyze — it drops the first <ms> of each recording, measured per file from that file’s earliest event, so each fork’s own warm-up is trimmed and hot paths reflect steady state.)
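
The per-file trimming described above can be sketched as follows. The sketch works on plain timestamps so it runs without a recording; with a real file the timestamps could come from the JDK's `jdk.jfr.consumer.RecordingFile.readAllEvents(Path)` and `RecordedEvent.getStartTime()`. The class name is hypothetical, and whether the cutoff is inclusive is an assumption.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

public class SkipWarmup {
    /** Keeps only timestamps at or after (earliest + skip); the cutoff is computed per file. */
    public static List<Instant> trim(List<Instant> eventTimes, Duration skip) {
        if (eventTimes.isEmpty()) {
            return List.of();
        }
        Instant earliest = eventTimes.stream().min(Instant::compareTo).get();
        Instant cutoff = earliest.plus(skip);
        return eventTimes.stream()
                .filter(t -> !t.isBefore(cutoff))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2024-01-01T00:00:00Z");
        // Two fork files that start at different wall-clock times (hypothetical data).
        List<Instant> forkA = List.of(t0, t0.plusMillis(500), t0.plusMillis(1500));
        List<Instant> forkB = List.of(t0.plusSeconds(60), t0.plusSeconds(60).plusMillis(1200));
        // Each fork loses its own first second, regardless of when it started.
        System.out.println(trim(forkA, Duration.ofMillis(1000)).size());
        System.out.println(trim(forkB, Duration.ofMillis(1000)).size());
    }
}
```

Computing the cutoff from each file's earliest event, rather than from one global start time, is what keeps a later fork from being trimmed entirely or not at all.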

Or skip the separate step entirely with the JMH profiler plugin — it records the fork and prints the summary inline when the trial ends. Put jvmlens-jmh.jar (a tiny engine+profiler jar, no Spring/picocli/jmh) on the benchmark’s classpath and:

java -cp benchmarks.jar:jvmlens-jmh.jar org.openjdk.jmh.Main \
  -prof "org.alexmond.jvmlens.jmh.JvmlensProfiler:appPackage=com.example;report=cpu"

appPackage scopes the summary to your code (separate several packages with +); report focuses the summary. An unknown option key is a hard error with a did-you-mean suggestion, so there is no silent misconfiguration. The plugin runs in JMH’s host process and reuses the same engine as analyze.

For a before→after diff entirely inside JMH — no separate analyze — keep the fork’s recording with keep=<path> (so it can seed the next run’s baseline) and diff against a prior one with baseline=<prev.jfr> (the profiler prints the change report instead of the summary):

# run 1 (baseline): keep the recording
... -prof "org.alexmond.jvmlens.jmh.JvmlensProfiler:appPackage=com.example;keep=/tmp/before.jfr"
# run 2 (after the fix): print the diff vs run 1
... -prof "org.alexmond.jvmlens.jmh.JvmlensProfiler:appPackage=com.example;baseline=/tmp/before.jfr"

Fix-direction hints (`--hints`, opt-in)

--hints appends a hedged ## Likely fix directions [possible] section that maps recognized hot-frame / allocation shapes to a one-line direction, each grounded in the row that triggered it — e.g. DoubleToDecimal/formatUnsignedInt → "number→string formatting", a LinkedList$ListItr.<init> self-time → "per-iteration iterator allocation", AbstractStringBuilder.ensureCapacity → "presize the buffer". It is off by default (the report stays clean data) and every line is tagged [possible] — a direction to investigate, never an assertion.

java -jar jvmlens.jar analyze recording.jfr --hints

Budget-dialing the size

--top-k <n> keeps only the top n rows per section; --max-tokens <n> shrinks top-k until the summary fits roughly n tokens (chars/4). For an always-on agent that adjusts limits at runtime, the agent control plane’s topn does the same per-dimension (see Runtime control).

java -jar jvmlens.jar analyze recording.jfr --top-k 3
java -jar jvmlens.jar analyze recording.jfr --max-tokens 250
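
The `--max-tokens` behaviour amounts to a simple loop: render, estimate tokens as chars/4, and lower top-k until the result fits. A minimal sketch follows, assuming a hypothetical renderer function; the names and the floor of one row are assumptions, not jvmlens internals.

```java
import java.util.function.IntFunction;

public class TokenBudget {
    /** Rough token estimate from the text above: characters divided by four. */
    public static int estimateTokens(String text) {
        return text.length() / 4;
    }

    /** Lowers topK from its starting value until the rendered summary fits, never below 1. */
    public static int fitTopK(IntFunction<String> render, int startTopK, int maxTokens) {
        int k = startTopK;
        while (k > 1 && estimateTokens(render.apply(k)) > maxTokens) {
            k--;
        }
        return k;
    }

    public static void main(String[] args) {
        // Hypothetical renderer: each row costs 40 characters (about 10 tokens).
        IntFunction<String> render = k -> "x".repeat(40 * k);
        System.out.println(fitTopK(render, 10, 50)); // prints 5
    }
}
```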

Report focus

-r / --report narrows the output to one concern (reusing the same sections the MCP server exposes):

Report	Shows
`full` (default)	Everything.
`cpu`	Hot paths + leaf methods (sampled).
`memory`	Allocation sites + types.
`locks`	Lock contention + contended monitors (measured).
`gc`	GC pressure and the allocation that drives it.
`io`	External (network + file) blocking I/O by endpoint (measured).
`pinning`	Virtual-thread pinning sites, by pinned time (measured).
`db`	Top SQL statements (agent JDBC instrumentation).
`web`	Top HTTP endpoints (agent servlet instrumentation).
`messaging`	Top messaging operations (agent Kafka/JMS instrumentation).
`cache`	Top cache operations (agent Spring-Cache instrumentation).
`metrics`	Top Micrometer timers (consumed from an existing registry).
`deadlock`	Deadlocked threads and their wait-for cycle (agent recordings).

The agent always runs a deadlock check (ThreadMXBean.findDeadlockedThreads) — no option needed. A true deadlock is distinct from ordinary lock contention: the threads block forever and never acquire the monitor, so JFR’s JavaMonitorEnter never fires; the ThreadMXBean check is the reliable signal (and only sees the JVM it runs in, hence agent-only). When present, a Deadlocked threads (wait-for cycle) section names each stuck thread and the lock it waits on / who holds it.
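
For reference, the JDK check named above can be called directly. The sketch below uses only standard `java.lang.management` calls; the class name and the output wording are illustrative and not the agent's actual report format.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public class DeadlockCheck {
    /** Describes each deadlocked thread in this JVM; empty when there is no deadlock. */
    public static List<String> describeDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Covers monitors and ownable synchronizers where supported, monitors only otherwise.
        long[] ids = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()
                : mx.findMonitorDeadlockedThreads();
        List<String> lines = new ArrayList<>();
        if (ids == null) {
            return lines; // null means no deadlock was found
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            lines.add(info.getThreadName() + " waits on " + info.getLockName()
                    + " held by " + info.getLockOwnerName());
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> lines = describeDeadlocks();
        System.out.println(lines.isEmpty() ? "no deadlock" : String.join("\n", lines));
    }
}
```

Because `ThreadMXBean` only inspects the JVM it runs in, code like this has to execute inside the target process, which is why the check is agent-only.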

The full report also appends a hedged Cross-dimension correlation note when two or more dimensions carry signal — it co-locates the dominant endpoint / query / I/O / hot path / lock / GC so an LLM sees the candidate chain in one place. It’s co-occurrence, not proof (jvmlens has no per-request trace linkage), so it suggests rather than asserts.

The io and pinning sections appear in the full report whenever the recording carries those events. io aggregates jdk.SocketRead/jdk.SocketWrite (by remote host:port) and jdk.FileRead/jdk.FileWrite (by path), ranked by blocked time with a bytes/op teaser. pinning aggregates jdk.VirtualThreadPinned by site, surfacing the pinnedReason (MONITOR / NATIVE_METHOD on JDK 24+) — the carrier-pinning that silently caps virtual-thread throughput.

java -jar jvmlens.jar analyze --report cpu recording.jfr
java -jar jvmlens.jar analyze -r memory -f json recording.jfr

Live capture from a running JVM

profile <pid> attaches to a running JVM, captures a timed JFR recording, and summarizes it — no pre-recorded .jfr needed:

java -jar jvmlens.jar profile 12345                 # 20s, markdown
java -jar jvmlens.jar profile -d 30 -w 5 12345      # warm up 5s, record 30s
java -jar jvmlens.jar profile -d 30 -k run.jfr 12345  # keep the recording

-w / --warmup waits before recording so startup/JIT noise is skipped.

-e / --engine selects the capture engine: jfr (default, prod-safe, also works over remote JMX) or async (async-profiler — higher fidelity, adds native frames; local <pid> only, and writes JFR so the same summarizer consumes it):

java -jar jvmlens.jar profile --engine async -d 30 12345

Benchmark a workload without JMH (`bench`)

bench is the no-JMH harness: point it at any class’s main(String[]) and it runs a warmup→timed loop, captures a JFR over only the timed phase, and summarizes — so an ordinary app or library with no benchmark module doesn’t need a hand-rolled driver:

java -jar jvmlens.jar bench --main com.example.RenderDriver -w 20 -i 200 -a com.example
# load the workload from its own classpath (it needn't be on jvmlens's), keep the JFR,
# pass args to its main after `--`:
java -jar jvmlens.jar bench --main com.example.RenderDriver --cp target/classes:$(cat cp.txt) \
  -w 20 -i 200 -a com.example --jfr /tmp/before.jfr -- arg1 arg2

Each main invocation is one iteration. -w / --warmup iterations run before the recording starts (so JIT/classload churn stays out of the steady-state signal); -i / --iters are timed. --cp / --classpath loads the workload through a separate class loader; --jfr <file> keeps the recording (else a temp file) so it can be a --baseline for the next run; --no-analyze captures without printing. A one-line timing summary (iters, ms/iter) goes to stderr; the report to stdout, so it stays pipeable.

Remote servers (run on the host)

For a JVM deployed elsewhere, run jvmlens on that host through the access channel you already have and let it ship back the compact summary — no JMX ports, no extra start flags, works on any running JVM (and --engine async works too, since the profiler is local to the target):

ssh prod-host        'java -jar jvmlens.jar profile <pid> -f prompt'
kubectl exec pod --   java -jar jvmlens.jar profile 1 --engine async -f prompt
docker exec ctr       java -jar jvmlens.jar watch 1 --on-gc-ms 200

This plays to jvmlens’s strength: the output is a few hundred tokens, so there is nothing heavy to move over the network. (For always-on use, see the in-process agent below; a networked MCP endpoint is on the roadmap.)

Continuous watch (rolling profile)

watch <pid> keeps a continuous JFR ring buffer on the target and dumps + summarizes a rolling window every interval — the foundation of the production "dump-on-trigger" mode (condition-based triggers build on this):

java -jar jvmlens.jar watch 12345                          # every 30s, last 120s, forever
java -jar jvmlens.jar watch -i 60 --max-age 300 12345      # every 60s, last 5min
java -jar jvmlens.jar watch -n 5 -i 10 12345               # 5 snapshots, then stop

-i / --interval sets the dump cadence, --max-age the ring-buffer window, and -n / --snapshots a fixed count (0 = until interrupted). Each snapshot is summarized with the same -f / -a / -x options as analyze.

Dump on trigger

By default every interval is emitted. Pass any threshold and watch instead stays quiet and emits only when a window breaches it — the production "dump-on-trigger" mode:

Option	Fires when
`--on-gc-ms <ms>`	total GC pause time in the window reaches `<ms>` (latency / memory pressure)
`--on-cpu-pct <pct>`	the top hot path reaches `<pct>` of samples (a hot loop)
`--on-old-objects <n>`	retained (old-object) samples reach `<n>` (suspected leak)

# emit a summary only when GC pauses exceed 200ms or a leak shows up
java -jar jvmlens.jar watch --on-gc-ms 200 --on-old-objects 5 12345

Scoping application code

By default a hot path is "application code" if it is outside the JDK and common frameworks (Spring, Apache, BouncyCastle, Jackson, logging, …). To focus on your own packages — or trim more noise — use -a / --app-package (include-only) and -x / --exclude (both repeatable, comma-separable); they apply to analyze and profile alike:

java -jar jvmlens.jar analyze -a org.alexmond recording.jfr
java -jar jvmlens.jar analyze -x com.thirdparty recording.jfr

A summary built from very few execution samples is flagged with a ⚠ adequacy caveat — its hot-path shares are statistically noisy; record longer or under steady-state load.

In-process agent

For always-on profiling — especially in containers — load jvmlens as a Java agent. It keeps a continuous JFR ring buffer inside the target and writes a fresh LLM-ready summary to a file every interval. No attach, no JMX, nothing external:

java -javaagent:jvmlens-agent.jar=out=/var/log/jvmlens.md,interval=60 -jar your-app.jar

Options (comma-separated key=value): out (latest-summary file), interval (seconds between summaries), settings (JFR config, default profile), snapshot (see below), db (instrument JDBC — see Database (SQL) below), web (instrument HTTP — see Web (HTTP endpoints) below), messaging (time Kafka/JMS send + poll/receive), cache (time Spring Cache get/put/evict), micrometer (summarize an existing Micrometer registry — no extra instrumentation; degrades to nothing if Micrometer is absent), history (see Long-running monitor below), paused (launch without emitting — start it after warm-up; see Runtime control below), and control (a file the agent watches for in-flight commands). The messaging and cache dimensions aggregate by Class.method operation and render top operations by total time. The agent jar is the separate jvmlens-agent.jar artifact (engine + agent + a relocated ByteBuddy); it can also be attached dynamically via the Agent-Class entry.

Runtime control (in-flight adjustment)

Like a desktop profiler’s live controls, the agent can be steered at runtime — without a restart — through a control file it watches (control=<file>). No ports, no JMX: an operator appends commands over whatever access they already have, via the jvmlens control CLI (run it on the host: kubectl exec, ssh, …):

java -javaagent:jvmlens-agent.jar=out=/agent/jvmlens.md,control=/agent/jvmlens.control,paused -jar app.jar

# then, on the host:
java -jar jvmlens.jar control /agent/jvmlens.control start          # begin (e.g. after warm-up)
java -jar jvmlens.jar control /agent/jvmlens.control enable db      # turn a dimension on (lazy-instruments)
java -jar jvmlens.jar control /agent/jvmlens.control topn db 5      # top 5 SQL queries with their stats
java -jar jvmlens.jar control /agent/jvmlens.control settings default   # lighter sampling (profile = denser)
java -jar jvmlens.jar control /agent/jvmlens.control scope app com.example   # adjust app-frame filtering
java -jar jvmlens.jar control /agent/jvmlens.control dump           # emit a summary now
java -jar jvmlens.jar control /agent/jvmlens.control status         # read current state back

Commands: start / stop, clear (reset the window + stores), dump (emit now), enable <dim> / disable <dim> (db/web/messaging/cache/micrometer/snapshot/deadlock), settings <profile|default> (sampling density), interval <seconds>, scope app|exclude <prefix> / scope reset (filtering), topn [<category>] <n> / topn reset (rows per section — category is cpu/perf, memory/mem, locks, io, pinning, or a plugin like db/web), and status. Each command makes the agent publish its state to <control-file>.status, which the CLI reads back and prints — so topn db 5 returns the resulting limits to you.

Launching paused and issuing start after warm-up is the clean fix for short cold runs that would otherwise profile startup rather than the workload; there is no need to guess a --warmup duration.

Long-running monitor (history + trend)

out is overwritten each interval — only the latest window survives. For a multi-day watch, add history=<file.jsonl> and the agent instead appends one compact sample per interval (covering all three dimensions — CPU, memory, wait), so nothing is lost:

java -javaagent:jvmlens-agent.jar=out=/var/log/jvmlens.md,history=/var/log/jvmlens.jsonl,interval=300 \
     -jar your-app.jar

Let it run, then trend reduces the accumulated run to a change-over-time report — what moved across the days, not a single snapshot:

java -jar jvmlens.jar trend /var/log/jvmlens.jsonl          # markdown digest
java -jar jvmlens.jar trend -f prompt /var/log/jvmlens.jsonl  # wrapped for an LLM
java -jar jvmlens.jar trend -f json   /var/log/jvmlens.jsonl  # the raw samples

The digest reports each dimension’s first-third→last-third direction (rising / flat / falling), whether the hot path stayed stable or shifted, when lock contention appeared, and a hedged retention indicator — old-object growth alongside rising GC pressure is flagged as possible retention growth, never a confident "leak".
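
The first-third→last-third comparison can be illustrated with a short sketch. The class name is hypothetical, and the 10% tolerance band used to call a series "flat" is an assumption; the real digest may use a different rule.

```java
import java.util.Arrays;

public class TrendDirection {
    /**
     * Compares the mean of the first third of the samples with the mean of the last third.
     * The 10% tolerance band for "flat" is an assumption.
     */
    public static String direction(double[] samples) {
        int third = samples.length / 3;
        if (third == 0) {
            return "flat"; // too few samples to compare
        }
        double first = Arrays.stream(samples, 0, third).average().orElse(0);
        double last = Arrays.stream(samples, samples.length - third, samples.length)
                .average().orElse(0);
        double tolerance = 0.10 * Math.max(Math.abs(first), 1e-9);
        if (last - first > tolerance) {
            return "rising";
        }
        if (first - last > tolerance) {
            return "falling";
        }
        return "flat";
    }

    public static void main(String[] args) {
        System.out.println(direction(new double[] {10, 11, 10, 14, 18, 19, 20, 21, 22})); // rising
        System.out.println(direction(new double[] {10, 10, 11, 10, 10, 10, 10, 11, 10})); // flat
    }
}
```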

Database (SQL)

Add db to the agent options and it instruments java.sql.Statement.execute* (ByteBuddy) to time JDBC calls, aggregating them by sanitized SQL shape (literals parameterized, so no values reach the summary) into a Top SQL (by total time) section — each shape with its call count, average latency, and a hedged possible N+1 flag for high-count low-latency shapes:

java -javaagent:jvmlens-agent.jar=out=/var/log/jvmlens.md,db,interval=60 -jar app.jar

The SQL comes from the statement argument (plain Statement) or the statement’s toString() (most PreparedStatement drivers — H2, PostgreSQL); unknown shapes degrade to ?. This is profiling signal, not a query log — it never records literal values.
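
To make "sanitized SQL shape" concrete, here is one way literals could be parameterized with regular expressions. It is a simplified sketch under stated assumptions (single-quoted strings and plain decimal numbers only), not the agent's actual sanitizer, and the class name is hypothetical.

```java
import java.util.regex.Pattern;

public class SqlShape {
    // Single-quoted string literals, with '' as an escaped quote.
    private static final Pattern STRING_LITERAL = Pattern.compile("'(?:[^']|'')*'");
    // Standalone numeric literals (not digits inside identifiers such as col1).
    private static final Pattern NUMBER =
            Pattern.compile("(?<![\\w.])\\d+(?:\\.\\d+)?(?![\\w.])");
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    /** Replaces literal values with ? and collapses whitespace so equal shapes aggregate. */
    public static String sanitize(String sql) {
        if (sql == null || sql.isBlank()) {
            return "?"; // unknown shapes degrade to ?
        }
        String s = STRING_LITERAL.matcher(sql).replaceAll("?");
        s = NUMBER.matcher(s).replaceAll("?");
        return WHITESPACE.matcher(s).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        System.out.println(sanitize("SELECT * FROM users WHERE id = 42 AND name = 'O''Brien'"));
        // prints: SELECT * FROM users WHERE id = ? AND name = ?
    }
}
```

Strings are replaced before numbers so that digits inside a quoted value never leak through as a separate match.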

Web (HTTP endpoints)

Add web and the agent instruments HttpServlet.service (ByteBuddy; both jakarta.servlet and javax.servlet, read reflectively so jvmlens needs no servlet dependency — Spring MVC’s DispatcherServlet is covered by this one point). Requests aggregate by route shape (numeric / UUID / long-token path segments become {}, query strings dropped) into a Top HTTP endpoints (by total time) section, each with request count, average latency, and an error count (status ≥ 400):

java -javaagent:jvmlens-agent.jar=out=/var/log/jvmlens.md,web,db -jar app.jar
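
The route-shape rule (numeric, UUID, and long-token segments become {}, query strings dropped) can be sketched like this. The class name is hypothetical, and the "long token" cutoff of 20 characters is an assumption, since the text does not give one.

```java
import java.util.regex.Pattern;

public class RouteShape {
    private static final Pattern NUMERIC = Pattern.compile("\\d+");
    private static final Pattern UUID = Pattern.compile(
            "[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}");
    // Assumed cutoff for a "long token": 20 or more URL-safe characters.
    private static final Pattern LONG_TOKEN = Pattern.compile("[A-Za-z0-9_\\-]{20,}");

    /** Collapses variable path segments to {} so requests aggregate by route shape. */
    public static String normalize(String uri) {
        int q = uri.indexOf('?');
        String path = q >= 0 ? uri.substring(0, q) : uri; // query string dropped
        String[] segments = path.split("/", -1);
        for (int i = 0; i < segments.length; i++) {
            String s = segments[i];
            if (NUMERIC.matcher(s).matches()
                    || UUID.matcher(s).matches()
                    || LONG_TOKEN.matcher(s).matches()) {
                segments[i] = "{}";
            }
        }
        return String.join("/", segments);
    }

    public static void main(String[] args) {
        System.out.println(normalize("/orders/42/items?page=2"));
        // prints: /orders/{}/items
        System.out.println(normalize("/users/123e4567-e89b-12d3-a456-426614174000"));
        // prints: /users/{}
    }
}
```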

Variable snapshots

Beyond performance, the agent can answer correctness questions — what values flow through a method — without stopping the app. Add snapshot=Class#method (semicolon-separate several) and the agent instruments those methods and appends a Variable snapshots section to the summary: per call site, the call count and a per-argument digest (distinct values, null rate, numeric range):

java -javaagent:jvmlens-agent.jar=out=/var/log/jvmlens.md,snapshot='com.acme.OrderService#price;com.acme.Repo#find' -jar app.jar

## Variable snapshots
### `com.acme.OrderService.price` — 1410658 calls
- arg0: 3 distinct [STD, EXPRESS, FREE]
- arg1: 10 distinct [...] (range 0..9)

Method arguments need no debug info; capturing locals (requires -g) and conditions/PII redaction are planned.
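
A per-argument digest of the kind shown above (distinct values, null rate, numeric range) might be accumulated as follows. This is a sketch only: the class name, the string-based distinct set, and the cap of 16 tracked values are assumptions rather than the agent's implementation.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class ArgDigest {
    private static final int MAX_DISTINCT = 16; // assumed cap so memory stays bounded
    private final Set<String> distinct = new LinkedHashSet<>();
    private long calls;
    private long nulls;
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;

    /** Folds one observed argument value into the digest. */
    public void record(Object value) {
        calls++;
        if (value == null) {
            nulls++;
            return;
        }
        if (distinct.size() < MAX_DISTINCT) {
            distinct.add(String.valueOf(value)); // count saturates at the cap
        }
        if (value instanceof Number) {
            double d = ((Number) value).doubleValue();
            min = Math.min(min, d);
            max = Math.max(max, d);
        }
    }

    public int distinctCount() { return distinct.size(); }
    public double nullRate() { return calls == 0 ? 0.0 : (double) nulls / calls; }
    public boolean hasRange() { return min <= max; } // true once a number was seen
    public double min() { return min; }
    public double max() { return max; }

    public static void main(String[] args) {
        ArgDigest arg1 = new ArgDigest();
        for (int i = 0; i < 10; i++) {
            arg1.record(i);
        }
        System.out.println(arg1.distinctCount() + " distinct (range "
                + (int) arg1.min() + ".." + (int) arg1.max() + ")");
    }
}
```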

MCP server

jvmlens mcp runs a Model Context Protocol server over stdio, exposing the analysis as scoped, navigable tools so an agent pulls only the slice it needs (progressive disclosure) instead of one large blob:

Tool	Returns
`overview`	Event counts, the heuristic cause, and which drill-down tool to use next.
`hot_paths`	Application-attributed hot call paths, by sample share.
`hot_leaves`	Leaf (self-time) hot methods, runtime included.
`allocations`	Top allocation sites and allocated types.
`lock_contention`	Lock contention by method and contended monitors.
`io`	External (network + file) blocking I/O by endpoint.
`pinning`	Virtual-thread pinning sites, by pinned time.
`deadlock`	Deadlocked threads and their wait-for cycle (agent recordings).
`profile`	Capture a live local JVM by `pid` (`engine` jfr/async, `report` focus) and summarize it.

The drill-down tools take a file (path to a .jfr); profile takes a pid. All accept optional appPackages / exclude scoping. The server only serves structured data — it never calls an LLM, so recordings never leave the host. Register it with an MCP client:

{ "mcpServers": { "jvmlens": { "command": "java", "args": ["-jar", "/path/to/jvmlens.jar", "mcp"] } } }

For a remote server, point the MCP client at the host’s access channel — no JMX, no extra ports:

{ "mcpServers": { "prod": { "command": "ssh", "args": ["prod-host", "java", "-jar", "jvmlens.jar", "mcp"] } } }

Output formats

-f / --format selects the rendering (case-insensitive); all three carry the same ranked signal from one analysis pass:

Format	Use
`md` (default)	Compact markdown — readable by humans and agents alike.
`json`	Scoped JSON object, for tooling or the MCP server to consume.
`prompt`	The markdown wrapped in an LLM task instruction, ready to paste.

java -jar jvmlens.jar analyze --format json recording.jfr
java -jar jvmlens.jar analyze -f prompt recording.jfr

Producing a recording

Any JFR recording works. To capture one with the JDK’s built-in Flight Recorder:

java -XX:StartFlightRecording=duration=30s,filename=recording.jfr,settings=profile -jar your-app.jar

The examples/ directory contains a planted-pathology workload (CPU hot path, memory leak, lock contention) for producing sample recordings.

Exit codes

Code	Meaning
`0`	Summary produced
`1`	An `--assert` rule failed (perf gate with `--baseline`)
`2`	Bad arguments (unreadable JFR file, non-numeric pid, invalid duration/warmup)
`3`	Live capture failed (could not attach to the target JVM)
