Free resources
The playbooks we use in production engagements
Whitepapers, technical papers, checklists, and templates — distilled from work with 25+ unicorns and regulated enterprises. Free, in exchange for an email.
Whitepapers
The Enterprise AI Evaluation Playbook
The 2026 field guide to evaluating enterprise AI: model selection, eval design, scoring architecture, and regulatory sign-off.
- ✓ Model selection scorecards
- ✓ Eval design patterns for non-deterministic systems
- ✓ LLM-as-judge calibration
- ✓ Regulatory sign-off templates
EU AI Act: A Technical Compliance Guide
Articles 9–15 decomposed into engineering obligations: risk management, data governance, documentation, human oversight, and robustness — with an obligation-mapping worksheet and compliance roadmap.
- ✓ Article-by-article obligation mapping
- ✓ Evidence generation patterns
- ✓ High-risk classification flowchart
- ✓ Compliance roadmap template
Agentic QE: Designing Autonomous Quality Engineering Systems
Agent architecture for quality engineering: exploration, generation, execution, and self-healing agents — and the governance layer that makes them production-safe.
- ✓ Reference agent architecture
- ✓ Self-healing test patterns
- ✓ Human-on-the-loop governance
- ✓ Adoption sequencing
Sovereign AI in Regulated Industries: Architecture Patterns
VPC and air-gapped deployment patterns for banks, insurers, and government: in-perimeter model serving, evaluation, and continuous compliance monitoring.
- ✓ Air-gap deployment blueprints
- ✓ In-perimeter eval architecture
- ✓ Compliance monitoring patterns
Technical papers
Insurance-NER: Domain-Specific Named Entity Recognition
Technical paper: a fine-tuned Llama 3.1 8B model achieving 94.2% F1 on insurance policy document extraction — methodology, training data design, and evaluation protocol.
- ✓ Fine-tuning methodology
- ✓ Evaluation protocol behind the 94.2% F1 result
- ✓ Deployment footprint analysis
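For readers unfamiliar with the headline metric: F1 is the harmonic mean of precision and recall. The sketch below shows one common way to compute it for entity extraction, using exact-match scoring over (start, end, label) spans. This is an assumption for illustration, not the paper's evaluation protocol, which is not described on this page; the function name and the entity labels are hypothetical.

```python
def entity_f1(gold, predicted):
    """Return (precision, recall, f1) for exact-match entity spans.

    Each entity is a (start, end, label) tuple. A prediction counts as
    correct only if all three fields match a gold entity exactly.
    NOTE: exact-match scoring is an assumed convention, not necessarily
    the protocol used in the paper.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)

    # Guard against empty inputs to avoid division by zero.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical labels and offsets, for illustration only.
gold = [(0, 2, "POLICY_NUMBER"), (5, 7, "INSURED"),
        (10, 12, "PREMIUM"), (15, 16, "DATE")]
predicted = [(0, 2, "POLICY_NUMBER"), (5, 7, "INSURED"),
             (10, 12, "DATE"),      # right span, wrong label
             (20, 21, "PREMIUM")]   # span not in gold

precision, recall, f1 = entity_f1(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Here two of four predictions match and two of four gold entities are found, so precision, recall, and F1 all come out to 0.50. Partial-overlap or token-level scoring would give different numbers for the same predictions, which is why the scoring convention matters when comparing reported F1 figures.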
PolicyTrace: Explainable AI Obligation Reasoning
Technical paper: generating policy reasoning traces that survive regulator scrutiny — how CHEQ produces defensible, step-by-step evidence for every compliance judgment.
- ✓ Reasoning trace architecture
- ✓ Audit trail formats
- ✓ Regulator submission patterns
Checklists & templates
AI QE Maturity Assessment Checklist
Score your quality engineering practice across 40 dimensions, benchmarked against 350+ teams. Identify the three moves with the highest ROI for your maturity stage.
- ✓ 40-dimension scoring grid
- ✓ Benchmark percentiles
- ✓ Prioritized action shortlist
LLM Evaluation Metrics Reference Card
Every metric that matters — accuracy, groundedness, toxicity, PII leakage, consistency, latency — with when-to-use guidance and threshold starting points.
- ✓ 50+ metrics with definitions
- ✓ Threshold starting points
- ✓ Judge-model pairing guide
Synthetic Data Quality Scorecard
The four gates every synthetic dataset should pass: distribution fidelity, business-rule conformance, privacy guarantees, and downstream test effectiveness.
- ✓ Four-gate scorecard
- ✓ Statistical validation recipes
- ✓ Privacy checklist
New playbooks land in the newsletter first
The Control Layer Brief — fortnightly AI governance intelligence.
Next step
Want it applied, not just explained?
Book a free 48-hour compliance gap scan or QE maturity assessment.
