Research — Certius Labs

FEB·2024 Air Canada ● court ruling: airline bound by chatbot promise
SEP·2025 Two Sigma ● $170M client losses · AI model manipulation
OCT·2025 Geneva Association ● 90%+ businesses want AI insurance
Q4·2025 Verisk ● AI exclusions from commercial general liability
DEC·2025 Amazon Kiro ● 13-hour AWS outage · AI coding agent
2025 Stanford HAI ● GenAI lawsuits up 137% YoY
JAN·2026 Singapore ● first agentic AI governance framework
FEB·2026 Lobstar Wilde ● $441K accidental crypto transfer
FEB·2026 Colorado AI Act ● effective · first US state AI law
MAR·2026 Alibaba ROME ● rogue agent mines crypto · disables firewall
2026 Gravitee ● 88% enterprises report AI incidents · 21% have visibility
2026 IBM ● +$670K average cost per shadow AI breach
2026 Gartner ● AI governance software · $492M market
AUG·2026 EU AI Act ● full enforcement · up to €35M or 7% revenue
2032 Deloitte ● AI insurance premiums projected at $4.77B
2034 Market forecast ● AI agent market · $7.6B → $236B

Featured Report ● Q1 · 2026

The State of AI Agent Risk, 2026.

47 pages. 512 agents audited across 14 industries. Median risk score 612. Top failure modes: indirect prompt injection (78% susceptible), over-permissioned tools (64%), system prompt leakage (57%). We publish the aggregated, anonymized data so the industry has a benchmark to reason from.

PDF · 4.2 MB · 47 pages · Published 14 Apr 2026

Read the report →

Inside

Methodology · how we score 512 agents
Findings by industry vertical
Top 10 attack patterns that worked
Regulatory alignment matrix
Insurance pricing implications
Appendix · raw scenario library

Who contributed

Certius Labs adversarial engineering
Partner carriers (under NDA)
14 enterprise pilot customers

Research from inside the adversarial lab.

Everything we've published.

Want the field notes in your inbox?
