Transformer interpretability, end-to-end

See inside any transformer.

A unified observability workspace for language models. Capture every activation, attention pattern, sparse feature, and causal attribution from a single forward pass — then explore the full picture from one interactive view.

No GPU required to start · Any open-source LLM · Open-source desktop build
[Screenshot: stellaris · dim layer]
  • 26+ layers captured per pass
  • 16k sparse features per layer
  • < 1s selection latency, any view
  • 0 data leaves your VPC, on-prem
Forward-pass observability

Every signal, in one capture.

One click captures the residual stream, attention weights, MLP outputs, and logit-lens predictions across every layer. Nothing is scattered across notebooks, and there is nothing to re-run when you want to check one more thing.

  • Residual stream at any layer / component / position
  • Full attention pattern, head-by-head
  • Per-layer logit-lens token predictions
  • Sparse feature firings via per-layer encoders
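For readers new to the logit lens, here is a minimal NumPy sketch of the idea behind the per-layer token predictions listed above: decode each layer's residual directly through the unembedding matrix. It is an illustration under stated assumptions, not the product's implementation. The function name `logit_lens`, the random weights, and the parameter-free normalization (real models apply their own learned final LayerNorm) are all hypothetical.

```python
import numpy as np

def logit_lens(resid, W_U, eps=1e-5):
    """Decode residual-stream vectors straight to vocabulary logits.

    resid: (n_layers, d_model) residual at one position, one row per layer.
    W_U:   (d_model, vocab) unembedding matrix.
    Returns the top-1 token id at each layer.
    """
    # Parameter-free stand-in for the model's final LayerNorm (assumption).
    centered = resid - resid.mean(axis=-1, keepdims=True)
    normed = centered / np.sqrt(centered.var(axis=-1, keepdims=True) + eps)
    logits = normed @ W_U            # (n_layers, vocab)
    return logits.argmax(axis=-1)

# Toy data: random weights stand in for a real model.
rng = np.random.default_rng(0)
n_layers, d_model, vocab = 4, 8, 10
W_U = rng.normal(size=(d_model, vocab))
resid = rng.normal(size=(n_layers, d_model))

top_tokens = logit_lens(resid, W_U)
print(top_tokens.shape)  # (4,)
```

With a real model, reading `top_tokens` from the first layer to the last shows where the final prediction first becomes decodable.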
[Screenshots: activations · layer 18 · pos 5; attribution · feature 4321]
Causal attribution & intervention

Test the circuit, don't just stare at it.

Trace any output backwards to the upstream features that actually caused it, then ablate or override those features and re-run the model to confirm. Correlation isn't enough; we make the causal step a single click.

  • Gradient-based attribution patching to any depth
  • Ablate any sparse feature across selected layers
  • Override any neuron value mid-pass and replay
  • Compare patched vs base, cell-for-cell
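The attribute-then-confirm loop above can be sketched on a toy model. This is a hedged illustration, not how the workspace is built: the two-layer network, the `forward` helper, and its `override` argument are hypothetical. Attribution patching estimates the effect of an intervention as gradient times (patched minus clean activation). The toy output is linear in the hidden features, so the estimate matches the real ablation exactly. In a real transformer it is only a first-order approximation, which is why the re-run matters.

```python
import numpy as np

def forward(x, W1, w2, override=None):
    """Tiny two-layer model: out = w2 . relu(W1 @ x).

    override: optional {feature_index: value} applied to the hidden
    activations mid-pass, before the output layer.
    """
    hidden = np.maximum(W1 @ x, 0.0)
    if override:
        hidden = hidden.copy()
        for idx, value in override.items():
            hidden[idx] = value
    return float(w2 @ hidden), hidden

rng = np.random.default_rng(0)
n_hidden, d_in = 6, 4
W1 = rng.normal(size=(n_hidden, d_in))
w2 = rng.normal(size=n_hidden)
x = rng.normal(size=d_in)

base_out, base_hidden = forward(x, W1, w2)

# Attribution patching: gradient * (patched - clean), with patched = 0.
# For this toy model, d(out)/d(hidden) is simply w2.
estimated = w2 * (0.0 - base_hidden)

# Confirmation: actually ablate each feature and re-run the model.
actual = np.array([
    forward(x, W1, w2, override={i: 0.0})[0] - base_out
    for i in range(n_hidden)
])

# Patched vs base, feature by feature.
for i in range(n_hidden):
    print(f"feature {i}: estimated {estimated[i]:+.3f}, actual {actual[i]:+.3f}")
```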
Latent-space geometry

The model's mind, projected into 3D.

The Dim Layer projects every (layer, position) residual into a learned 3D space. Watch token representations trace pathways through the network — semantic distance becomes something you can see and click.

  • PCA, UMAP, or learned projection bases
  • Pathway tunnels per token across layers
  • Click any point to drill into its activations
  • Compare prompt-to-prompt drift visually
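As a rough sketch of the PCA option above, the snippet below flattens every (layer, position) residual into one matrix, fits the principal directions with an SVD, and reads one token's pathway across layers. The function name `pca_project`, the array shapes, and the random data are assumptions for illustration. UMAP or a learned basis would replace the SVD step.

```python
import numpy as np

def pca_project(residuals, n_components=3):
    """Project residuals of shape (n_layers, n_positions, d_model)
    to (n_layers, n_positions, n_components) with PCA."""
    n_layers, n_pos, d_model = residuals.shape
    flat = residuals.reshape(-1, d_model)
    centered = flat - flat.mean(axis=0)
    # Rows of vt are principal directions, sorted by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    coords = centered @ vt[:n_components].T
    return coords.reshape(n_layers, n_pos, n_components)

# Toy data: 26 layers, 5 token positions, 64-dimensional residuals.
rng = np.random.default_rng(0)
residuals = rng.normal(size=(26, 5, 64))

coords = pca_project(residuals)
pathway = coords[:, 2, :]  # 3D path of the token at position 2, one point per layer
print(coords.shape, pathway.shape)  # (26, 5, 3) (26, 3)
```

Fitting one shared basis across all layers keeps the points comparable, so a pathway is a single curve through one space and not a series of unrelated per-layer plots.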
[Screenshot: dim layer · PCA · resid_post]
What you can do

One workspace, every layer of the model.

Every capability you'd build piecemeal across notebooks, dashboards, and standalone scripts — unified.

Forward-pass capture

One click captures every layer's residual stream, attention weights, MLP outputs, and logit-lens predictions.

Sparse feature decomposition

Unpack each layer into thousands of interpretable features per token position. See the concepts the model is using.
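One common way to get such features is a sparse-autoencoder-style encoder: a linear map followed by a ReLU, applied to the residual at each token position. The sketch below assumes that form. The names `encode_features` and `top_features`, the random weights, and the bias value are hypothetical, and a trained encoder would be far sparser than this toy one.

```python
import numpy as np

def encode_features(resid, W_enc, b_enc):
    """resid: (n_pos, d_model) -> non-negative feature activations (n_pos, n_features)."""
    return np.maximum(resid @ W_enc + b_enc, 0.0)

def top_features(acts, k=5):
    """Indices of the k strongest features at each position, strongest first."""
    order = np.argsort(-acts, axis=-1)
    return order[:, :k]

# Toy data: random weights stand in for a trained per-layer encoder.
rng = np.random.default_rng(0)
d_model, n_features, n_pos = 16, 256, 3
W_enc = rng.normal(size=(d_model, n_features))
b_enc = np.full(n_features, -2.0)  # a negative bias pushes more features to zero
resid = rng.normal(size=(n_pos, d_model))

acts = encode_features(resid, W_enc, b_enc)
top = top_features(acts)
print(acts.shape, top.shape)  # (3, 256) (3, 5)
```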

Causal attribution

Trace any target feature back to its upstream contributors via gradient-based path analysis.

Activation patching

Override any neuron, head, or position mid-pass. Re-run and see how the output actually changes.

3D latent geometry

Project every (layer, position) residual into 3D and watch token pathways trace through the network.

Paired-pass comparison

Two prompts, two models, two ablations — side-by-side with every cell of divergence surfaced.

Who it's for

Built for the people asking hard questions.

Interpretability researchers

Stop rebuilding pipeline plumbing for every paper. Capture once, slice every which way, and export the underlying data when you need to take it offline.

Model safety teams

Move from "this prompt triggers a refusal" to "this feature at layer 18 is what flipped the decision." Run it on-prem for sensitive evaluations.

ML engineers shipping LLMs

Diagnose unexpected behavior with surgical specificity. Patch the offending circuit, regenerate, ship the fix.

On-prem / enterprise

Run it on your own infrastructure.

Need air-gapped deployment, custom model uploads, live per-token circuit telemetry, or on-prem GPUs? The desktop / on-prem build ships the same workspace without hosted-tier limits.

Contact us

Get started in under a minute.

Free tier, no credit card. Run your first forward pass right now.

Open the app