Agent assurance for European enterprises
Vidimus is the control plane that CIOs, Chief AI Officers, and CISOs use to approve, test, monitor, and audit AI agents before and after production deployment. Built EU-first: data stays in-region, controls map to the AI Act, GDPR, and DORA, and every decision lands in an immutable audit trail.
| Agent | Risk class | Status | 30‑day checks | Last review |
|---|---|---|---|---|
| Claims Triage Copilot (Insurance · claims) | High | In review | | 2 days ago |
| KYC Onboarding Agent (Banking · onboarding) | Limited | Approved | | 5 days ago |
| Patient Intake Assistant (Healthcare · admissions) | High | Monitoring | | 1 day ago |
| Underwriting Assistant (Insurance · pricing) | Limited | Approved | | 1 week ago |
| Fraud Signals Monitor (Banking · risk) | Minimal | Monitoring | | 3 days ago |
One platform from proposal to audit
Four surfaces that share one data model, so a probe failure in production can be traced back to the obligation it exercises and forward to the reviewer who signed the agent off.
Approve
A structured intake captures the agent’s purpose, model, data sources, tools, and human-oversight points. Vidimus classifies the agent against the EU AI Act risk tiers, evaluates a control checklist drawn from the AI Act, GDPR, and DORA, and routes the file to the right reviewer with an immutable sign-off trail.
Test
For each agent intake we synthesise an adversarial test plan from the regulation corpus: prohibition probes, transparency probes, oversight probes, jailbreak-resistance probes, and tool-overreach probes. Every probe is graded by an independent judge model. Runs are reproducible: prompt, response, verdict, and reasoning are all retained.
Monitor
Verdicts, drift signals, and configuration changes feed a live dashboard tied to the agents already in production. Regressions, new probe failures, and reviewer interventions surface in one place — not buried in separate observability stacks.
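As a rough illustration of how new probe failures might be surfaced between two runs, the sketch below compares judge verdicts from a previous run against the current one. The `ProbeVerdict` type, its fields, and the `new_failures` helper are hypothetical names chosen for this example, not Vidimus's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeVerdict:
    probe_id: str  # hypothetical stable identifier for a probe
    passed: bool   # verdict assigned by the judge model


def new_failures(previous: list[ProbeVerdict], current: list[ProbeVerdict]) -> list[str]:
    """Return ids of probes that fail now but were not failing in the previous run."""
    previously_failing = {v.probe_id for v in previous if not v.passed}
    return sorted(
        v.probe_id
        for v in current
        if not v.passed and v.probe_id not in previously_failing
    )
```

A probe that was already failing is not reported again, so the result contains only regressions and failures of newly added probes.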
Audit
Every approval, override, and test run is appended to a tamper-evident log. The evidence pack is generated on demand: risk classification, applicable obligations, probe results, control verdicts, document evidence, and the full chronological trail. The pack is mapped to AI Act Annex IV and your internal control catalogue.
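One common way to make a log tamper-evident is a hash chain, in which each entry commits to the hash of the one before it. The sketch below assumes that technique for illustration only; the source does not say how Vidimus implements its log, and the function and field names here are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, payload: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so the hash is deterministic.
    body = json.dumps({"prev": prev_hash, "payload": payload},
                      sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(log: list, actor: str, action: str, reason: str, timestamp: str) -> None:
    """Append an entry whose hash covers its payload and the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    payload = {"actor": actor, "action": action, "reason": reason, "timestamp": timestamp}
    log.append({"payload": payload, "prev": prev_hash, "hash": _digest(prev_hash, payload)})


def verify(log: list) -> bool:
    """Recompute every hash; any edited, removed, or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Changing any earlier entry invalidates its hash and every hash after it, which is what lets a reviewer detect edits rather than merely trust that none were made.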
How it works
The same path every agent takes. Idempotent at each step so a paused or interrupted flow resumes without losing state.
- 01
Declare the agent
Intake captures purpose, data sources, model provider, tools, customer-facing scope, and oversight design. Validated with Zod at the boundary so the downstream risk derivation works only from well-formed input.
- 02
Vidimus builds the regulatory test plan
A synthesiser crosses a pattern bank with the regulation corpus to produce obligation-specific probes against the AI Act, GDPR, and DORA. A critic model filters out loose and unwinnable probes so only defensible ones reach your agent.
- 03
Run the probes, collect the evidence
Probes are sent to your agent through HTTP, A2A, or MCP adapters. Each turn is graded; tool calls are observed; uploaded documentation is verified against the obligations it claims to address.
- 04
Hand a regulator the pack
Export a versioned, content-addressed evidence pack. It contains the append-only audit trail, citations to verbatim regulation text, and reviewer decisions with reasons. Each pack is re-issuable, replayable, and tied to a specific intake fingerprint.
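The steps above lean on two ideas: validating the intake at the boundary and content-addressing what is derived from it. The following is a minimal sketch of both, assuming SHA-256 over canonical JSON. The page names Zod for validation; this example uses plain Python checks instead, and the field names, `validate_intake`, `content_address`, and `build_pack` are all hypothetical.

```python
import hashlib
import json

# Field names are assumptions that mirror the intake items listed in step 01.
REQUIRED_FIELDS = ("purpose", "data_sources", "model_provider",
                   "tools", "customer_facing", "oversight")


def validate_intake(raw: dict) -> dict:
    """Reject malformed intakes at the boundary and drop unknown keys."""
    missing = [f for f in REQUIRED_FIELDS if f not in raw]
    if missing:
        raise ValueError(f"intake is missing fields: {', '.join(missing)}")
    if not isinstance(raw["customer_facing"], bool):
        raise ValueError("customer_facing must be a boolean")
    return {f: raw[f] for f in REQUIRED_FIELDS}


def content_address(document: dict) -> str:
    """Hash canonical JSON so identical content always yields the same address."""
    canonical = json.dumps(document, sort_keys=True,
                           separators=(",", ":"), ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_pack(intake: dict, probe_results: list) -> dict:
    """Tie the pack to the intake fingerprint and address the pack by its content."""
    body = {"intake_fingerprint": content_address(intake),
            "probe_results": probe_results}
    return {"pack_id": content_address(body), **body}
```

Because the address is derived from the content, re-issuing a pack from the same intake and results reproduces the same `pack_id`, while any change to either produces a different one.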
Built to hold up to a regulator
Compliance is the substrate, not a feature. The architectural choices that matter to a supervisor are load-bearing, not optional flags.
EU data residency
Postgres, storage, and the model providers we default to are EU-region. Tenant data does not leave the bloc.
Tenant isolation in the database
Every table is scoped by org and enforced through Postgres row-level security — not just application checks.
Append-only audit trail
Approval decisions, control overrides, evidence-pack exports, and deployment events are recorded immutably with actor, time, and reason.
Citable, replayable evidence
Probes quote the regulation verbatim; plans are content-addressed; uploaded documents are verified passage-by-passage against the obligations they cover.
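A verbatim-quote check can be as simple as confirming that the quoted passage occurs in the source text once whitespace differences are ignored. The sketch below assumes that approach; `quotes_verbatim` is a hypothetical helper, and a real passage-by-passage verifier would likely need more than substring matching.

```python
import re


def _normalise(text: str) -> str:
    # Collapse runs of whitespace (including line breaks) into single spaces.
    return re.sub(r"\s+", " ", text).strip()


def quotes_verbatim(quote: str, source_text: str) -> bool:
    """True if the non-empty quote appears word for word in the source text."""
    needle = _normalise(quote)
    return bool(needle) and needle in _normalise(source_text)
```

The check is deliberately strict: a paraphrase or a quote with altered wording returns `False`, so a citation either matches the source text or is flagged.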
Run a pilot on one agent
Bring one agent through the full path — intake, probe synthesis, run, and pack. Pilots take about two weeks and end with an evidence pack we walk through with your risk and compliance leads.