FEB·2024 Air Canada ● court ruling: airline bound by chatbot promise
SEP·2025 Two Sigma ● $170M client losses · AI model manipulation
OCT·2025 Geneva Association ● 90%+ businesses want AI insurance
Q4·2025 Verisk ● AI exclusions from commercial general liability
DEC·2025 Amazon Kiro ● 13-hour AWS outage · AI coding agent
2025 Stanford HAI ● GenAI lawsuits up 137% YoY
JAN·2026 Singapore ● first agentic AI governance framework
FEB·2026 Lobstar Wilde ● $441K accidental crypto transfer
FEB·2026 Colorado AI Act ● effective · first US state AI law
MAR·2026 Alibaba ROME ● rogue agent mines crypto · disables firewall
2026 Gravitee ● 88% enterprises report AI incidents · 21% have visibility
2026 IBM ● +$670K average cost per shadow AI breach
2026 Gartner ● AI governance software · $492M market
AUG·2026 EU AI Act ● full enforcement · up to €35M or 7% revenue
2032 Deloitte ● AI insurance premiums projected at $4.77B
2034 Market forecast ● AI agent market · $7.6B → $236B

§ The Platform ● Full Cycle

The full-cycle risk platform for AI agents.

Connect your agent. We discover its attack surface, stress-test it against 500+ adversarial scenarios, score the result, and price coverage accordingly. Continuous monitoring means your score — and your premium — stay accurate as your agent evolves.

Request a demo → See sample report (redacted) →

Step 01 ● Discovery

We start by mapping what your agent can actually do.

Most AI risk assessments start with a questionnaire. Ours starts with a connection. You integrate Certius Labs via API or lightweight SDK, and we map your agent's real attack surface — not what someone remembered to write down.

Agent composition

Underlying model(s) and versions
System prompts and guardrails
Tool access (APIs, databases, filesystems)
Permission scopes and IAM roles
Data sources and retrieval pipelines
Multi-agent orchestration patterns

Operating context

User input channels
Output destinations (customers, internal systems, external APIs)
Authentication and identity context
Observability and logging setup
Human-in-the-loop checkpoints
Existing safety controls

Shadow AI detection

We find the agents you don't know about. Our scans discover 37 agents per enterprise on average, and most security teams can name only 5 of them. We scan your network traffic and API logs to identify autonomous AI activity, including agents deployed by individual teams without security review.

1–3 days

integration · no agent downtime

Read-only

by default · we never modify agents

37 avg

agents discovered per enterprise scan

Step 02 ● Adversarial Testing

We don't audit your AI agent.
We attack it.

Our adversarial engine runs 500+ attack scenarios against your agent. Every scenario is mapped to a recognized framework — MITRE ATLAS, OWASP LLM Top 10, NIST AI RMF — so your audit documentation is accepted by regulators and carriers alike. We don't invent new criteria. We execute the ones the industry already trusts.

Category	Weight	Example scenarios
Prompt Injection & Jailbreaks	20%	Direct injection, indirect injection via retrieved content, role-play bypass, encoded payloads, context window poisoning
Data Exfiltration	20%	System prompt extraction, training data leakage, unauthorized PII access, inference through side-channel prompts
Tool & Permission Misuse	20%	Unauthorized API calls, privilege escalation, out-of-scope actions, destructive tool chains
Multi-agent Cascade Failures	15%	Agent-to-agent manipulation, coordinated failures, orchestration exploits
Reliability & Hallucination	15%	Factual drift, consistency under adversarial input, confident-but-wrong outputs
Compliance Violations	10%	Regulated data handling, output content policy, audit trail integrity

Frameworks we align with

MITRE ATLAS ● Adversarial Threat Landscape for AI

OWASP LLM Top 10 ● industry-standard LLM taxonomy

NIST AI RMF ● US government AI Risk Management Framework

What you get

→ Detailed report of every scenario run, outcome, and severity
→ Proof-of-exploit for every finding (reproducible attack traces)
→ Remediation recommendations with priority ranking
→ Compliance mapping to regulatory requirements
→ Raw test logs for your own security team

Continuous, not one-time

Not a pen test. Not a one-time assessment. Your agent changes every week. New model versions, new tools, new prompts. Our testing runs continuously. Your score updates as your agent evolves.

See sample audit report →

Step 03 ● Quantified Risk

A numerical score.
Benchmarked. Continuous.

Your agent's risk compressed into a single number from 300 to 850. Modeled on the approach BitSight used for cyber risk — and that rating agencies have used for credit for a century. Not because it's perfect, but because it's the format executives, boards, and underwriters already know how to act on.

Agent Risk Report ● Sample · redacted · Continuously tested

Acme Corp · customer-support-agent-v4

Tested against 512 scenarios · last run 3m ago

300 · Critical	500 · Poor	650 · Fair	750 · Good	850 · Prime

▲ 34 pts since last week

Category	Weight	Score / 850
Security	25%	712
Reliability	20%	664
Permissions	20%	527
Data Privacy	15%	690
Compliance	10%	374
Accountability	10%	739
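
The sample report lists six category scores with weights but does not state how they combine into the headline number. As a rough sketch, assuming the overall score is simply the weight-averaged category score (an assumption, not a documented Certius Labs formula), the arithmetic would look like this; `weighted_score` is a hypothetical helper name.

```python
# Weights (in percent) and scores copied from the sample report.
# The aggregation rule itself is an ASSUMPTION: a plain weighted average.
CATEGORIES = {
    "Security": (25, 712),
    "Reliability": (20, 664),
    "Permissions": (20, 527),
    "Data Privacy": (15, 690),
    "Compliance": (10, 374),
    "Accountability": (10, 739),
}


def weighted_score(categories):
    """Return the weight-averaged score, rounded to the nearest integer."""
    total_weight = sum(weight for weight, _ in categories.values())
    if total_weight != 100:
        raise ValueError(f"weights must sum to 100, got {total_weight}")
    weighted_sum = sum(weight * score for weight, score in categories.values())
    return round(weighted_sum / 100)


if __name__ == "__main__":
    print(weighted_score(CATEGORIES))  # 631 under the weighted-average assumption
```

Under this reading, the low Compliance and Permissions scores pull the sample agent's overall number well below its Security score, which is the kind of effect a single headline figure is meant to surface.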

Score bands

Tier	Range	Description
Exceptional	800 – 850	Robust against known and novel attacks. Insurable at best rates.
Strong	740 – 799	Production-ready with minor hardening opportunities.
Adequate	670 – 739	Insurable with standard premium. Clear remediation roadmap.
Weak	580 – 669	High premium. Coverage limited until remediation.
Critical	300 – 579	Not recommended for production without significant changes.
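
For teams that want to gate deployments on the band rather than the raw number, the table above can be expressed as a simple lookup. This is a minimal sketch: the thresholds come straight from the table, while the function name `tier_for` and the error handling are illustrative choices.

```python
# Lower bound of each band, highest first, as listed in the score bands table.
BANDS = [
    (800, "Exceptional"),
    (740, "Strong"),
    (670, "Adequate"),
    (580, "Weak"),
    (300, "Critical"),
]


def tier_for(score: int) -> str:
    """Map a 300-850 score to its tier name."""
    if not 300 <= score <= 850:
        raise ValueError(f"score must be between 300 and 850, got {score}")
    for floor, name in BANDS:
        if score >= floor:
            return name
    raise AssertionError("unreachable: every valid score is at least 300")


if __name__ == "__main__":
    for sample in (850, 745, 670, 600, 374):
        print(sample, tier_for(sample))
```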

Continuous updating

Your score is not a snapshot. Our engine re-tests continuously — triggered by model updates, prompt changes, new tools, or scheduled intervals. Changes trigger alerts to your team and (if coverage is active) pricing review with the carrier.
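
The re-test triggers described above can be pictured as a small decision rule. The sketch below is illustrative only: the event names, the `should_retest` function, and the one-day default interval are all assumptions, since the page does not publish the engine's actual scheduling logic.

```python
from datetime import datetime, timedelta
from typing import Optional

# Change types named in the text; the identifiers are hypothetical.
CHANGE_TRIGGERS = {"model_update", "prompt_change", "new_tool"}


def should_retest(
    event: Optional[str],
    last_run: datetime,
    now: datetime,
    interval: timedelta = timedelta(days=1),  # assumed schedule, not from the source
) -> bool:
    """Re-test on any listed change, or once the scheduled interval has elapsed."""
    if event in CHANGE_TRIGGERS:
        return True
    return now - last_run >= interval


if __name__ == "__main__":
    last = datetime(2026, 1, 1, 9, 0)
    print(should_retest("prompt_change", last, last + timedelta(minutes=5)))
    print(should_retest(None, last, last + timedelta(hours=2)))
    print(should_retest(None, last, last + timedelta(days=2)))
```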

Benchmarking

Compare against your industry peers (finance, healthcare, SaaS, legal), against your own trajectory over time, and against specific agent archetypes (customer service, coding, data analysis, financial).

Data policy

Your score is yours. We don't sell individual company scores. Anonymized, aggregated industry benchmarks power our data licensing — never individual results.

Regulatory Alignment ● Dossier

Audit documentation that regulators accept.

Every test we run is mapped to the specific regulatory requirement it satisfies. Your audit becomes a pre-packaged compliance artifact for the frameworks you operate under.

EU AI Act · Articles 9, 10, 15

Colorado AI Act (CAIA)

NIST AI RMF 1.0

ISO/IEC 42001:2023

SOC 2 Type II · AI controls

Singapore Model AI Governance Framework

Output format

01 · Executive summary

Board-ready. 2–3 pages.

Risk score, top findings, remediation plan. For the people making the go / no-go call.

02 · Technical report

Full attack trace.

Per-scenario results, raw data, reproducible exploits. For your security team.

03 · Regulatory dossier

Ready for Notified Body.

Cross-referenced mapping from our tests to each regulation's specific requirements. For an EU AI Act conformity assessment or a regulator inquiry.

Step 04 ● Coverage

Coverage priced on what you actually built.

Traditional insurance asks: "Do you have a policy for data handling? Yes/No." We don't ask. We test. Your premium reflects what our adversarial engine found — not what your compliance team wrote in a questionnaire.

What's covered

AI MODEL ERRORS

Financial losses from incorrect outputs, hallucinated facts, or failed decisions.

AGENT FAILURES

Losses from autonomous actions — wrong transfers, unauthorized commitments, cascade failures.

DATA LEAKAGE

Costs from PII exposure, system prompt extraction, training data exfiltration.

REGULATORY VIOLATIONS

Fines and remediation costs under the EU AI Act, Colorado AI Act, and other AI-specific laws.

IP INFRINGEMENT

Claims from outputs that infringe on copyrights, trademarks, or trade secrets.

THIRD-PARTY LIABILITY

Damages owed to customers or partners affected by your agent's actions.

Coverage limits

Starter

$1M

For pilot deployments and single-agent coverage.

Standard

$5–10M

For multi-agent production environments.

Enterprise

$25M

For mission-critical AI systems with broad exposure. Excess layers available via carrier partners for higher limits.

How pricing works

Your premium is derived from three inputs: your risk score, your coverage limit, and your exposure profile. A score improvement of 50 points typically reduces premium by 15–30%. We show you the math — no black box.
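
To make the 15–30% figure concrete, here is a small worked example. It covers only the one data point the text gives (a 50-point improvement); the function name and the $100,000 starting premium are hypothetical, and the real pricing model, which also depends on coverage limit and exposure profile, is not described on this page.

```python
# Typical reduction for a 50-point score improvement, per the text (in percent).
MIN_REDUCTION_PCT = 15
MAX_REDUCTION_PCT = 30


def premium_range_after_50_points(current_premium: float) -> tuple:
    """Return (lowest, highest) estimated premium after a 50-point score gain."""
    lowest = current_premium * (100 - MAX_REDUCTION_PCT) / 100
    highest = current_premium * (100 - MIN_REDUCTION_PCT) / 100
    return lowest, highest


if __name__ == "__main__":
    low, high = premium_range_after_50_points(100_000)  # hypothetical premium
    print(f"Estimated premium: ${low:,.0f} to ${high:,.0f}")
```

For the hypothetical $100,000 premium, the estimate lands between $70,000 and $85,000.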

Not yet live

Insurance coverage is being built with carrier partners for a 2026 launch. Audit and scoring are available in pilot programs today. Join the waitlist to get early access when coverage goes live.

Join the coverage waitlist →

How we operate ● Security

We test your AI.
Nobody else sees it.

01 · Read-only by default. We never modify your agents.

02 · Sandboxed testing. Adversarial tests run against isolated agent replicas when possible.

03 · Data residency. EU customers' data stays in the EU (Frankfurt / Amsterdam).

04 · SOC 2 Type II. In progress · target Q3 2026.

05 · Zero data sharing. Your test results are never used to train competitor scoring.

06 · Responsible disclosure. We follow coordinated disclosure for any vulnerability found in third-party models or frameworks.

Ready to see your agent's score?

15-minute call. Live audit walkthrough. Sample redacted report. Answers to your coverage questions.

Request a demo → See a sample redacted report →
