Transformer interpretability, end-to-end

See inside any transformer.

A unified observability workspace for language models. Capture every activation, attention pattern, sparse feature, and causal attribution from a single forward pass — then explore the full picture from one interactive view.

Start free · Self-host on-prem

No GPU required to start · Any open-source LLM · Open-source desktop build

stellaris · dim layer

26+ layers captured per pass

16k sparse features per layer

< 1s selection latency, any view

No data leaves your VPC when run on-prem

Forward-pass observability

Every signal, in one capture.

One click captures the residual stream, attention weights, MLP outputs, and logit-lens predictions at every layer. Nothing scattered across notebooks, nothing to re-run when you want to check one more thing.

Residual stream at any layer / component / position
Full attention pattern, head-by-head
Per-layer logit-lens token predictions
Sparse feature firings via per-layer encoders
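
For readers new to the logit lens, here is a minimal NumPy sketch of the idea: take the residual vector at each layer and push it through the unembedding matrix to see which tokens that layer already favors. The function name `logit_lens`, the random weights, and the simplified normalization (no learned scale or bias) are illustrative assumptions, not the product's API or any particular model's weights.

```python
import numpy as np

def logit_lens(residuals, w_unembed, top_k=3):
    """Return the top-k token ids predicted from each layer's residual.

    residuals: (n_layers, d_model), one position's residual vector per layer
    w_unembed: (d_model, vocab_size) unembedding matrix
    """
    # Rough stand-in for the final LayerNorm (assumption: no learned scale/bias).
    centered = residuals - residuals.mean(axis=-1, keepdims=True)
    normed = centered / (centered.std(axis=-1, keepdims=True) + 1e-5)
    logits = normed @ w_unembed  # (n_layers, vocab_size)
    # argsort is ascending: keep the last top_k columns, then reverse them.
    return np.argsort(logits, axis=-1)[:, -top_k:][:, ::-1]

# Toy data standing in for a real capture (hypothetical sizes).
rng = np.random.default_rng(0)
n_layers, d_model, vocab_size = 4, 16, 50
residuals = rng.normal(size=(n_layers, d_model))
w_unembed = rng.normal(size=(d_model, vocab_size))

top_tokens = logit_lens(residuals, w_unembed)  # shape (n_layers, top_k)
```

Each row of `top_tokens` lists one layer's top candidate token ids, best first, so reading down the rows shows how the prediction changes with depth.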

activations · L18 · pos 5

attribution · feature 4321

Causal attribution & intervention

Test the circuit, don't just stare at it.

Trace any output backwards to the upstream features that actually caused it, then ablate or override those features and re-run the model to confirm. Correlation isn't enough; we make the causal step a single click.

Gradient-based attribution patching to any depth
Ablate any sparse feature across selected layers
Override any neuron value mid-pass and replay
Compare patched vs base, cell-for-cell
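
The difference between estimating an effect and confirming it can be sketched on a toy model. Attribution patching approximates the effect of changing an activation with a first-order term, (new value minus base value) times the gradient. Ablation then re-runs the model to measure the real change. Everything below (the tiny network, its random weights, and the helper names) is a hypothetical stand-in, not the product's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, n_features, d_hidden = 8, 6, 5
W_enc = rng.normal(size=(n_features, d_in))           # toy sparse-feature encoder
W1 = rng.normal(size=(d_hidden, n_features)) * 0.3    # toy downstream layer
w2 = rng.normal(size=d_hidden)                        # toy readout to one logit

def encode(x):
    """Feature activations: ReLU of a linear encoder."""
    return np.maximum(W_enc @ x, 0.0)

def downstream(f):
    """Scalar output computed from the feature activations."""
    return float(w2 @ np.tanh(W1 @ f))

def grad_wrt_features(f):
    """Analytic gradient of the output with respect to each feature."""
    h = np.tanh(W1 @ f)
    return W1.T @ (w2 * (1.0 - h ** 2))

def ablate(f, i):
    """Copy of f with feature i set to zero."""
    g = f.copy()
    g[i] = 0.0
    return g

x = rng.normal(size=d_in)
f_base = encode(x)
y_base = downstream(f_base)

# Attribution patching: first-order estimate of zeroing each feature,
# (0 - f_base) * gradient, from one forward and one backward pass.
grad = grad_wrt_features(f_base)
estimated = -f_base * grad

# Ablation: actually zero each feature, re-run, and compare with the base output.
actual = np.array([downstream(ablate(f_base, i)) - y_base
                   for i in range(n_features)])
```

Comparing `estimated` against `actual` shows where the linear approximation holds and where only the re-run tells the truth, which is why the confirmation step matters.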

Latent-space geometry

The model's mind, projected into 3D.

The Dim Layer projects every (layer, position) residual into a learned 3D space. Watch token representations trace pathways through the network — semantic distance becomes something you can see and click.

PCA, UMAP, or learned projection bases
Pathway tunnels per token across layers
Click any point to drill into its activations
Compare prompt-to-prompt drift visually
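
As a rough illustration of the PCA option, the sketch below flattens a stack of residuals, finds the three directions of greatest variance with an SVD, and reshapes the result so each token's pathway across layers can be read off directly. The array shapes, random data, and the name `project_to_3d` are assumptions made for the example; the learned and UMAP bases are not shown.

```python
import numpy as np

def project_to_3d(residuals):
    """Project residuals of shape (n_layers, n_positions, d_model) to 3D with PCA."""
    n_layers, n_positions, d_model = residuals.shape
    flat = residuals.reshape(-1, d_model)
    centered = flat - flat.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    coords = centered @ vt[:3].T
    return coords.reshape(n_layers, n_positions, 3)

# Toy residuals standing in for a real capture (hypothetical sizes).
rng = np.random.default_rng(0)
residuals = rng.normal(size=(6, 5, 32))

points = project_to_3d(residuals)   # (6 layers, 5 positions, 3 coordinates)
# Pathway for the token at position 2: its 3D point at each layer, in order.
pathway = points[:, 2, :]
```

Fitting one basis across all layers, as done here, keeps every point in the same coordinate system, so distances between layers and between prompts remain comparable.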

dim layer · PCA · resid_post

What you can do

One workspace, every layer of the model.

Every capability you'd build piecemeal across notebooks, dashboards, and standalone scripts — unified.

Forward-pass capture

One click captures every layer's residual stream, attention weights, MLP outputs, and logit-lens predictions.

Sparse feature decomposition

Unpack each layer into thousands of interpretable features per token-position. See the concepts the model is using.

Causal attribution

Trace any target feature back to its upstream contributors via gradient-based path analysis.

Activation patching

Override any neuron, head, or position mid-pass. Re-run and see how the output actually changes.

3D latent geometry

Project every (layer, position) residual into 3D and watch token pathways trace through the network.

Paired-pass comparison

Two prompts, two models, two ablations — side-by-side with every cell of divergence surfaced.

Who it's for

Built for the people asking hard questions.

Interpretability researchers

Stop rebuilding pipeline plumbing for every paper. Capture once, slice every which way, and export the underlying data when you need to take it offline.

Model safety teams

Move from "this prompt triggers a refusal" to "this feature at layer 18 is what flipped the decision." Run it on-prem for sensitive evaluations.

ML engineers shipping LLMs

Diagnose unexpected behavior with surgical specificity. Patch the offending circuit, regenerate, ship the fix.

On-prem / enterprise

Run it on your own infrastructure.

Need air-gapped deployment, custom model uploads, live per-token circuit telemetry, or on-prem GPU? The desktop / on-prem build ships the same workspace without hosted-tier limits.

Get started in under a minute.

Free tier, no credit card. Run your first forward pass right now.

Open the app