§ The Platform Full Cycle

The full-cycle risk platform for AI agents.

Connect your agent. We discover its attack surface, stress-test it against 500+ adversarial scenarios, score the result, and price coverage accordingly. Continuous monitoring means your score — and your premium — stay accurate as your agent evolves.
Step 01 Discovery

We start by mapping what your agent can actually do.

Most AI risk assessments start with a questionnaire. Ours starts with a connection. You integrate Certius Labs via API or a lightweight SDK, and we map your agent's real attack surface, not what someone remembered to write down.

Agent composition

  • Underlying model(s) and versions
  • System prompts and guardrails
  • Tool access (APIs, databases, filesystems)
  • Permission scopes and IAM roles
  • Data sources and retrieval pipelines
  • Multi-agent orchestration patterns

Operating context

  • User input channels
  • Output destinations (customers, internal systems, external APIs)
  • Authentication and identity context
  • Observability and logging setup
  • Human-in-the-loop checkpoints
  • Existing safety controls
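
The two lists above can be pictured as a single inventory record per agent. The sketch below is illustrative only: `AgentProfile`, its field names, and the `unreviewed_surface` helper are assumptions made for this example, not the Certius Labs SDK or its data model.

```python
from dataclasses import dataclass, field


@dataclass
class AgentProfile:
    """Hypothetical inventory record; fields mirror the two lists above."""

    name: str
    # Agent composition
    models: list[str] = field(default_factory=list)
    tools: list[str] = field(default_factory=list)
    permission_scopes: list[str] = field(default_factory=list)
    data_sources: list[str] = field(default_factory=list)
    # Operating context
    input_channels: list[str] = field(default_factory=list)
    output_destinations: list[str] = field(default_factory=list)
    human_checkpoints: list[str] = field(default_factory=list)
    safety_controls: list[str] = field(default_factory=list)

    def unreviewed_surface(self) -> list[str]:
        """Tools that have no human-in-the-loop checkpoint attached."""
        return [t for t in self.tools if t not in self.human_checkpoints]


# Example values are made up for illustration.
profile = AgentProfile(
    name="customer-support-agent-v4",
    models=["example-model-1"],
    tools=["crm_api", "refund_api", "knowledge_base"],
    human_checkpoints=["refund_api"],
)
print(profile.unreviewed_surface())  # ['crm_api', 'knowledge_base']
```

A record like this makes the gap between "what the agent can do" and "what a person reviews" something you can query rather than remember.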
Shadow AI detection
We find the agents you don't know about. Our scans discover 37 agents per enterprise on average, and most security teams can name 5 of them. We scan your network traffic and API logs to identify autonomous AI activity, including agents deployed by individual teams without security review.
1–3 days: integration · no agent downtime
Read-only: by default · we never modify agents
37 avg: agents discovered per enterprise scan
Step 02 Adversarial Testing

We don't audit your AI agent.
We attack it.

Our adversarial engine runs 500+ attack scenarios against your agent. Every scenario is mapped to a recognized framework — MITRE ATLAS, OWASP LLM Top 10, NIST AI RMF — so your audit documentation is accepted by regulators and carriers alike. We don't invent new criteria. We execute the ones the industry already trusts.

Category | Weight | Example scenarios
Prompt Injection & Jailbreaks | 20% | Direct injection, indirect injection via retrieved content, role-play bypass, encoded payloads, context window poisoning
Data Exfiltration | 20% | System prompt extraction, training data leakage, unauthorized PII access, inference through side-channel prompts
Tool & Permission Misuse | 20% | Unauthorized API calls, privilege escalation, out-of-scope actions, destructive tool chains
Multi-agent Cascade Failures | 15% | Agent-to-agent manipulation, coordinated failures, orchestration exploits
Reliability & Hallucination | 15% | Factual drift, consistency under adversarial input, confident-but-wrong outputs
Compliance Violations | 10% | Regulated data handling, output content policy, audit trail integrity
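
One way to read the weights above is as the share each category contributes to an overall result. The page does not state how the weights are applied, so the sketch below is an assumption: it treats them as percentages in a weighted average of per-category pass rates. The function name and the example inputs are hypothetical.

```python
# Category weights (percent) copied from the table above.
WEIGHTS = {
    "Prompt Injection & Jailbreaks": 20,
    "Data Exfiltration": 20,
    "Tool & Permission Misuse": 20,
    "Multi-agent Cascade Failures": 15,
    "Reliability & Hallucination": 15,
    "Compliance Violations": 10,
}
assert sum(WEIGHTS.values()) == 100  # the published weights add up to 100%


def weighted_pass_rate(pass_rates: dict[str, float]) -> float:
    """Combine per-category pass rates (0.0 to 1.0) using the table weights.

    Assumption: a plain weighted average. The real aggregation may differ.
    """
    missing = set(WEIGHTS) - set(pass_rates)
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    return sum(WEIGHTS[c] * pass_rates[c] for c in WEIGHTS) / 100


# Hypothetical agent that passes everything except the compliance scenarios.
rates = {category: 1.0 for category in WEIGHTS}
rates["Compliance Violations"] = 0.0
print(weighted_pass_rate(rates))  # 0.9
```

Under this reading, a total failure in the lowest-weighted category still costs a tenth of the overall result, which is why no category can be ignored.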
Frameworks we align with
MITRE ATLAS: Adversarial Threat Landscape for Artificial-Intelligence Systems
OWASP LLM Top 10: industry-standard LLM taxonomy
NIST AI RMF: US government AI Risk Management Framework
What you get
  • Detailed report of every scenario run, outcome, and severity
  • Proof-of-exploit for every finding (reproducible attack traces)
  • Remediation recommendations with priority ranking
  • Compliance mapping to regulatory requirements
  • Raw test logs for your own security team
Continuous, not one-time
Not a pen test. Not a one-time assessment. Your agent changes every week. New model versions, new tools, new prompts. Our testing runs continuously. Your score updates as your agent evolves.
Step 03 Quantified Risk

A numerical score.
Benchmarked. Continuous.

Your agent's risk is compressed into a single number from 300 to 850. The format is modeled on the approach BitSight used for cyber risk, and that rating agencies have used for credit for a century. We chose it not because it's perfect, but because it's the format executives, boards, and underwriters already know how to act on.

Agent Risk Report · Sample, redacted · Continuously tested
Acme Corp · customer-support-agent-v4
Tested against 512 scenarios · last run 3m ago
300 · Critical | 500 · Poor | 650 · Fair | 750 · Good | 850 · Prime
▲ 34 pts since last week

Category | Weight | Score / 850
Security | 25% | 712
Reliability | 20% | 664
Permissions | 20% | 527
Data Privacy | 15% | 690
Compliance | 10% | 374
Accountability | 10% | 739
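
The sample report lists category weights and scores but does not show how they combine into one number. As a rough illustration, and assuming a plain weighted average (an assumption, not a documented formula), the sample figures work out as follows. `composite_score` is a hypothetical helper.

```python
# (weight in percent, category score) taken from the sample report above.
SAMPLE_REPORT = {
    "Security": (25, 712),
    "Reliability": (20, 664),
    "Permissions": (20, 527),
    "Data Privacy": (15, 690),
    "Compliance": (10, 374),
    "Accountability": (10, 739),
}


def composite_score(report: dict[str, tuple[int, int]]) -> float:
    """Weighted average of category scores.

    Assumption: the overall score is the weight-averaged category score.
    """
    total_weight = sum(weight for weight, _ in report.values())
    if total_weight != 100:
        raise ValueError(f"weights sum to {total_weight}, expected 100")
    return sum(weight * score for weight, score in report.values()) / total_weight


print(composite_score(SAMPLE_REPORT))  # 631.0
```

Under that assumption the low Compliance and Permissions scores pull the overall figure well below the Security score, even though Security carries the largest weight.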

Score bands

Tier | Range | Description
Exceptional | 800 – 850 | Robust against known and novel attacks. Insurable at best rates.
Strong | 740 – 799 | Production-ready with minor hardening opportunities.
Adequate | 670 – 739 | Insurable with standard premium. Clear remediation roadmap.
Weak | 580 – 669 | High premium. Coverage limited until remediation.
Critical | 300 – 579 | Not recommended for production without significant changes.
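
The bands above translate directly into a lookup. A minimal sketch, with the thresholds copied from the table and a hypothetical function name `tier`:

```python
# Lower bound of each band, highest first, as listed in the table above.
BANDS = [
    (800, "Exceptional"),
    (740, "Strong"),
    (670, "Adequate"),
    (580, "Weak"),
    (300, "Critical"),
]


def tier(score: int) -> str:
    """Return the band name for a score on the 300 to 850 scale."""
    if not 300 <= score <= 850:
        raise ValueError("score must be between 300 and 850")
    for floor, name in BANDS:
        if score >= floor:
            return name
    raise AssertionError("unreachable: 300 is the lowest floor")


print(tier(739))  # Adequate
print(tier(740))  # Strong
```

The boundary cases matter in practice: a single point separates "Adequate" from "Strong", so a small score change can move an agent across a band.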
Continuous updating

Your score is not a snapshot. Our engine re-tests continuously, triggered by model updates, prompt changes, new tools, or scheduled intervals. Score changes trigger alerts to your team and, if coverage is active, a pricing review with the carrier.
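
Change-triggered re-testing needs some way to notice that an agent has changed. One common technique, shown here as a sketch and not as a description of the actual engine, is to fingerprint the agent's configuration and compare fingerprints between runs. The config keys and function names are hypothetical.

```python
import hashlib
import json


def config_fingerprint(config: dict) -> str:
    """Stable SHA-256 hash of a JSON-serializable agent configuration."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_fields(old: dict, new: dict) -> list[str]:
    """Top-level keys whose values differ between two configurations."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))


def needs_retest(old: dict, new: dict) -> bool:
    """True when the configuration changed since the last test run."""
    return config_fingerprint(old) != config_fingerprint(new)


# Made-up example: a new tool is added to the agent.
before = {"model": "example-model-1", "system_prompt": "Be helpful.", "tools": ["crm_api"]}
after = dict(before, tools=["crm_api", "refund_api"])

print(needs_retest(before, after))    # True
print(changed_fields(before, after))  # ['tools']
```

Sorting keys before hashing keeps the fingerprint stable when the same configuration is serialized in a different key order.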

Benchmarking

Compare against your industry peers (finance, healthcare, SaaS, legal), against your own trajectory over time, and against specific agent archetypes (customer service, coding, data analysis, financial).

Data policy
Your score is yours. We don't sell individual company scores. Anonymized, aggregated industry benchmarks power our data licensing — never individual results.
Regulatory Alignment Dossier

Audit documentation that regulators accept.

Every test we run is mapped to the specific regulatory requirement it satisfies. Your audit becomes a pre-packaged compliance artifact for the frameworks you operate under.

EU AI Act · Articles 9, 10, 15
Colorado AI Act (CAIA)
NIST AI RMF 1.0
ISO/IEC 42001:2023
SOC 2 Type II · AI controls
Singapore Model AI Governance Framework

Output format

01 · Executive summary
Board-ready. 2–3 pages.
Risk score, top findings, remediation plan. For the people making the go / no-go call.
02 · Technical report
Full attack trace.
Per-scenario results, raw data, reproducible exploits. For your security team.
03 · Regulatory dossier
Ready for Notified Body.
Cross-referenced mapping from our tests to each regulation's specific requirements. For an EU AI Act conformity assessment or a regulator inquiry.
Step 04 Coverage

Coverage priced on what you actually built.

Traditional insurance asks: "Do you have a policy for data handling? Yes/No." We don't ask. We test. Your premium reflects what our adversarial engine found — not what your compliance team wrote in a questionnaire.

What's covered

AI MODEL ERRORS
Financial losses from incorrect outputs, hallucinated facts, or failed decisions.
AGENT FAILURES
Losses from autonomous actions — wrong transfers, unauthorized commitments, cascade failures.
DATA LEAKAGE
Costs from PII exposure, system prompt extraction, training data exfiltration.
REGULATORY VIOLATIONS
Fines and remediation costs under the EU AI Act, the Colorado AI Act, and other AI-specific laws.
IP INFRINGEMENT
Claims from outputs that infringe on copyrights, trademarks, or trade secrets.
THIRD-PARTY LIABILITY
Damages owed to customers or partners affected by your agent's actions.

Coverage limits

Starter
$1M
For pilot deployments and single-agent coverage.
Standard
$5–10M
For multi-agent production environments.
Enterprise
$25M
For mission-critical AI systems with broad exposure. Excess layers available via carrier partners for higher limits.
How pricing works

Your premium is derived from three inputs: your risk score, your coverage limit, and your exposure profile. A score improvement of 50 points typically reduces premium by 15–30%. We show you the math — no black box.
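
To make the 15–30% figure concrete, here is the arithmetic for a single 50-point improvement. The starting premium of $100,000 is a made-up number, and the function name is hypothetical. The sketch covers only the 50-point case stated above and does not extrapolate to other score changes.

```python
# Typical premium reduction for a 50-point score improvement, per the text above.
REDUCTION_PCT_RANGE = (15, 30)


def premium_after_50_point_gain(current_premium: int) -> tuple[int, int]:
    """Return (lowest, highest) expected premium in whole dollars.

    Integer arithmetic avoids floating-point rounding in the result.
    """
    smallest_cut, largest_cut = REDUCTION_PCT_RANGE
    lowest = current_premium * (100 - largest_cut) // 100
    highest = current_premium * (100 - smallest_cut) // 100
    return lowest, highest


# Hypothetical starting premium of $100,000.
print(premium_after_50_point_gain(100_000))  # (70000, 85000)
```

On that hypothetical premium, a 50-point improvement would typically be worth between $15,000 and $30,000 a year.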

Not yet live
Insurance coverage is being built with carrier partners for a 2026 launch. Audit and scoring are available in pilot programs today. Join the waitlist to get early access when coverage goes live.
How we operate Security

We test your AI.
Nobody else sees it.

01 · Read-only by default. We never modify your agents.
02 · Sandboxed testing. Adversarial tests run against isolated agent replicas when possible.
03 · Data residency. EU customers' data stays in the EU (Frankfurt / Amsterdam).
04 · SOC 2 Type II. In progress · target Q3 2026.
05 · Zero data sharing. Your test results are never used to train competitor scoring.
06 · Responsible disclosure. We follow coordinated disclosure for any vulnerability found in third-party models or frameworks.

Ready to see your agent's score?

15-minute call. Live audit walkthrough. Sample redacted report. Answers to your coverage questions.