Articles:
The Security System
Adversarial risk detection across the full ML pipeline: prompt injection and poisoned samples in training data, red teaming and jailbreak taxonomy in model inspection, model robustness scoring, weight trojan detection, and attack surface comparison across model versions in the training monitor.
Read article →
The Agent System
An autonomous interpretability copilot that runs the full inspection pipeline, drives every panel in the UI, and narrates what it finds — all without you touching a button. 27 tools, 5 canonical workflows, live UI control.
Read article →
The Training Inspect System
Live signal detection across five failure modes, gradient and loss monitoring per step, SAE feature diffs and behavioral model diffs post-training, and an agentic chat that reads from live training state at send time.
Read article →
The Data Inspection System
Eight analysis modules across toxicity, PII, synthetic detection, liability chain, bias, copyright, text quality, and quality scoring — with compliance coverage for EU AI Act, India DPDPA, and NIST AI RMF, plus a full audit trail exportable as PDF or JSON.
Read article →
The Attribution System
Causal mediation analysis, SAE feature extraction, circuit attribution graph, logit lens, feature steering, UMAP exploration, fact verification, bias detection, and censor auditing — all in one pipeline on Llama 3.2 1B Instruct.
Read article →
The Eval System
Consistency, suppression detection, and knowledge boundary probing. Behavioral evals that surface failure modes without requiring a trained SAE and work on any TransformerLens-compatible model.
Read article →
The Benchmark Builder
Conversational benchmark creation inside the agent chat. Describe what you want to measure, and the agent writes the prompt suite, scores each capability, and returns an interactive card inline. Bar, pie, radial, and line charts. Export as CSV, JSON, image, or PDF.
Read article →
Benchmarks
InterpScore, FeaturePurityScore, and MUI: integrated benchmarks for evaluating SAE features on Llama 3.2 1B Instruct. Plus thirteen weight editing benchmarks covering retention, generalization, locality, long-form generation, and more.
Read article →