Qapitol QA

Free resources

The playbooks we use in production engagements

Whitepapers, technical papers, checklists, and templates — distilled from work with 25+ unicorns and regulated enterprises. Free, in exchange for an email.

Whitepapers

whitepaper · 42 pages · ★ Popular

The Enterprise AI Evaluation Playbook

The 2026 field guide to evaluating enterprise AI: model selection, eval design, scoring architecture, and regulatory sign-off.

  • Model selection scorecards
  • Eval design patterns for non-deterministic systems
  • LLM-as-judge calibration
  • Regulatory sign-off templates

whitepaper · 38 pages · ★ Popular

EU AI Act: A Technical Compliance Guide

Articles 9–15 decomposed into engineering obligations: risk management, data governance, documentation, human oversight, and robustness — with an obligation-mapping worksheet and compliance roadmap.

  • Article-by-article obligation mapping
  • Evidence generation patterns
  • High-risk classification flowchart
  • Compliance roadmap template

whitepaper · 34 pages · ★ Popular

Agentic QE: Designing Autonomous Quality Engineering Systems

Agent architecture for quality engineering: exploration, generation, execution, and self-healing agents — and the governance layer that makes them production-safe.

  • Reference agent architecture
  • Self-healing test patterns
  • Human-on-the-loop governance
  • Adoption sequencing

whitepaper · 28 pages

Sovereign AI in Regulated Industries: Architecture Patterns

VPC and air-gapped deployment patterns for banks, insurers, and government: in-perimeter model serving, evaluation, and continuous compliance monitoring.

  • Air-gap deployment blueprints
  • In-perimeter eval architecture
  • Compliance monitoring patterns

Technical papers

technical paper · 18 pages

Insurance-NER: Domain-Specific Named Entity Recognition

A fine-tuned Llama 3.1 8B model that achieves 94.2% F1 on insurance policy document extraction, with the methodology, training data design, and evaluation protocol behind it.

  • Fine-tuning methodology
  • Evaluation protocol behind the 94.2% F1 result
  • Deployment footprint analysis

technical paper · 16 pages

PolicyTrace: Explainable AI Obligation Reasoning

Generating policy reasoning traces that survive regulator scrutiny: how CHEQ produces defensible, step-by-step evidence for every compliance judgment.

  • Reasoning trace architecture
  • Audit trail formats
  • Regulator submission patterns

Checklists & templates

checklist · 6 pages

AI QE Maturity Assessment Checklist

Score your quality engineering practice across 40 dimensions, benchmarked against 350+ teams. Identify the three moves with the highest ROI for your maturity stage.

  • 40-dimension scoring grid
  • Benchmark percentiles
  • Prioritized action shortlist

template · 4 pages

LLM Evaluation Metrics Reference Card

The metrics that matter, including accuracy, groundedness, toxicity, PII leakage, consistency, and latency, each with when-to-use guidance and threshold starting points.

  • 50+ metrics with definitions
  • Threshold starting points
  • Judge-model pairing guide

template · 5 pages

Synthetic Data Quality Scorecard

The gate every synthetic dataset should pass: distribution fidelity, business-rule conformance, privacy guarantees, and downstream test effectiveness.

  • Four-gate scorecard
  • Statistical validation recipes
  • Privacy checklist

New playbooks land in the newsletter first

The Control Layer Brief — fortnightly AI governance intelligence.

Next step

Want it applied, not just explained?

Book a free 48-hour compliance gap scan or QE maturity assessment.