Research

Articles:

Feature · Apr 2026

The Security System

Adversarial risk detection across the full ML pipeline: prompt injection and poisoned samples in training data, red teaming and jailbreak taxonomy in model inspection, model robustness scoring, weight trojan detection, and attack surface comparison across model versions in the training monitor.

Feature · Apr 2026

The Agent System

An autonomous interpretability copilot that runs the full inspection pipeline, drives every panel in the UI, and narrates what it finds — all without you touching a button. 27 tools, 5 canonical workflows, live UI control.

Feature · Apr 2026

The Training Inspect System

Live signal detection across five failure modes, gradient and loss monitoring per step, SAE feature diffs and behavioral model diffs post-training, and an agentic chat that reads from live training state at send time.

Feature · Apr 2026

The Data Inspection System

Eight analysis modules across toxicity, PII, synthetic detection, liability chain, bias, copyright, text quality, and quality scoring — with compliance coverage for EU AI Act, India DPDPA, and NIST AI RMF, plus a full audit trail exportable as PDF or JSON.

Feature · Apr 2026

The Attribution System

Causal mediation analysis, SAE feature extraction, circuit attribution graph, logit lens, feature steering, UMAP exploration, fact verification, bias detection, and censor auditing — all in one pipeline on Llama 3.2 1B Instruct.

Feature · Apr 2026

The Eval System

Consistency, suppression detection, and knowledge boundary probing. Behavioral evals that surface failure modes without requiring a trained SAE and work on any TransformerLens-compatible model.

Feature · Apr 2026

The Benchmark Builder

Conversational benchmark creation inside the agent chat. Describe what you want to measure; the agent writes the prompt suite, scores each capability, and returns an interactive card inline. Results render as bar, pie, radial, or line charts and export as CSV, JSON, image, or PDF.

Feature · Apr 2026

Benchmarks

InterpScore, FeaturePurityScore, and MUI: integrated benchmarks for evaluating SAE features on Llama 3.2 1B Instruct. Plus thirteen weight editing benchmarks covering retention, generalization, locality, long-form generation, and more.

Join the Aquin Research Community

LLM researchers & ML engineers — open research, fellowships, hackathons, and early beta access.

© 2026 Aquin. All rights reserved.