Customer Experience · From $25k/mo · 6–12 weeks

AI drafts. Humans send. 88% deflect rate on routine queries.

Your support team isn't slow — they're working from templates last updated before the last policy change. We fix the data problem, not the staffing problem, and pass every draft through a human before it hits a customer.

88%

routine-query deflect rate at a federal health body after one quarter in production

4 days → 1

wait-time reduction at a federal health engagement handling 8,000 queries a month

23%

response error rate found at Diagnose — caused by stale templates, not poor staff performance

240 hrs

staff hours freed per week within one quarter of the reply helper going live

The core problem

Your support team isn't slow. They're working from templates last updated before the last policy change.

A federal health body ran 8,000 staff queries a month. Four-day wait times. Headcount at capacity. Diagnose read 12 months of tickets and found that 23% of outgoing responses contained an error traceable to a policy change that was applied to the source guidance but never reflected in the reply templates. The team was not performing poorly — they were working from stale material at volume. That is a data problem, not a staffing problem.

The structural fix is the same across any support-heavy operation: AI reads the incoming query, drafts a response from a model trained on your current policy and past resolutions, and presents the draft for a human to review and send. When policy changes, the training update propagates to every future draft rather than relying on each agent to notice the memo.
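
The draft-then-review flow described above can be sketched in a few lines. This is a minimal illustration, not the delivered system: `Draft`, `draft_reply`, `approve`, and `send` are hypothetical names, the model call is replaced by a stand-in that returns the current guidance text, and the policy strings are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Draft:
    query: str
    body: str
    policy_version: str               # guidance version the draft was built from
    approved_by: Optional[str] = None


def draft_reply(query: str, policy: dict) -> Draft:
    """Stand-in for the model call: returns the current guidance text as the draft."""
    return Draft(query=query, body=policy["guidance"], policy_version=policy["version"])


def approve(draft: Draft, reviewer: str) -> Draft:
    """Record the human reviewer who checked the draft."""
    draft.approved_by = reviewer
    return draft


def send(draft: Draft) -> str:
    """Refuse to send anything a human has not approved."""
    if draft.approved_by is None:
        raise PermissionError("draft has not been approved by a human reviewer")
    return f"sent by {draft.approved_by} (policy {draft.policy_version})"


# Hypothetical policy source; a policy change replaces it in one place.
policy = {"version": "v1", "guidance": "guidance text, version 1"}
old = draft_reply("example query", policy)

policy = {"version": "v2", "guidance": "guidance text, version 2"}
new = draft_reply("example query", policy)   # every later draft reads v2
```

Because `send` checks for an approval rather than trusting the caller, the human gate is enforced in one place instead of depending on each agent's workflow.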

What changes

The same challenge. Two very different outcomes.

Without Effektiv

  • 4-day wait times with headcount at capacity
  • 23% error rate in outgoing responses from stale templates
  • Staff time consumed drafting from memory rather than current guidance
  • Policy changes propagated manually, inconsistently, slowly
  • No visibility into which queries drive the most error risk
  • Humans sent every response, with no draft assistance

With Effektiv

  • 1-day wait time after one quarter in production
  • Under 2% error rate — AI surfaces current guidance with each draft
  • Staff focus on review and judgment, not drafting from scratch
  • Policy changes propagate to every future draft automatically
  • Escalation precision dashboard — live visibility for supervisors
  • Human in the loop on every send — a contract clause, not a setting

How we deliver

Diagnose. Design. Deliver.

Two weeks of listening before a line of code. The price is fixed at the end of Design — not at kick-off.

Phase 1 · 1–2 weeks

Diagnose

We read 12 months of ticket history, identify the most common query categories, and audit reply accuracy against current guidance. This is where stale-template errors are found — before they compound further. The error rate and category breakdown become the baseline for the outcome contract.

Phase 2 · 1–2 weeks

Design

We define the agent structure, human-gate rules, eval gates, and the weekly reporting schedule. Human-in-the-loop is required by design on any response that goes to a customer. All model inference runs on AWS Bedrock in AU regions. An IRAP path for PROTECTED data work is available from Q4 2026.

Phase 3 · 6–12 weeks

Deliver

The outcome contract names the deflect rate target, response accuracy target, and wait-time target — each measured weekly against the baseline set in Diagnose. The eval rig, reply templates, and prompt rules are yours at exit.
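
As an illustration of weekly measurement against a baseline, the sketch below checks one week of observed figures against contract targets. The structure and names (`Metric`, `CONTRACT`, `weekly_report`) are hypothetical. The wait-time and accuracy baselines follow the sample engagement (four days, 23% error rate); the deflect-rate baseline and all three targets are placeholder assumptions, since real targets are set in the contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    baseline: float
    target: float
    higher_is_better: bool

    def on_target(self, observed: float) -> bool:
        """True when the observed value meets or beats the target."""
        if self.higher_is_better:
            return observed >= self.target
        return observed <= self.target

    def change_from_baseline(self, observed: float) -> float:
        """Signed change relative to the Diagnose baseline."""
        return observed - self.baseline


# Targets and the deflect-rate baseline are placeholders, not contract values.
CONTRACT = {
    "deflect_rate": Metric(baseline=0.00, target=0.80, higher_is_better=True),
    "response_accuracy": Metric(baseline=0.77, target=0.95, higher_is_better=True),
    "wait_days": Metric(baseline=4.0, target=2.0, higher_is_better=False),
}


def weekly_report(observed: dict) -> dict:
    """Map each measured metric to whether it met its target this week."""
    return {name: CONTRACT[name].on_target(value) for name, value in observed.items()}


week = {"deflect_rate": 0.88, "response_accuracy": 0.98, "wait_days": 1.0}
report = weekly_report(week)
```

Keeping the direction of improvement on each metric avoids a common reporting slip, where a falling wait time is scored as a miss because it is "below" the target.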

What you walk away with

Everything ships to your team at exit. No lock-in.

🛠

Reply helpers in production

Drafts for human approval. No autonomous send. A brand-voice eval gates every draft before it reaches a reviewer.

🧪

Brand voice reference set

200+ tagged examples your future model versions train against. Yours at exit.

🗄

Policy match rig

Every reply checked against current policy before it reaches a human reviewer, with citations attached.

📒

Escalation precision dashboard

Live human-in-the-loop dashboard for the agent and the supervisor. Drift triggers an eval refresh.

🎓

Reviewer playbook

Calibrated for your support team. Weekly drift reports built in. Yours to extend.

Quality gates

What the eval rig measures.

Every output passes a multi-gate evaluation before it reaches a reviewer. Outputs that fail do not proceed. The eval rig and all gate code are yours at exit.

  • Brand-voice match score — threshold agreed in Design against samples from your prior responses
  • False-reply rate — target zero on customer-facing outputs. Drafts that fail go to escalation, not to send
  • Policy match rate — scored against the live policy corpus, with citations attached to every draft
  • Response-time delta vs the prior baseline, measured weekly against the contract target
  • Escalation precision — correct escalations divided by total escalations. Drift triggers an eval refresh
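
The gates listed above could be expressed roughly as follows. This is a sketch under stated assumptions, not the shipped eval rig: the gate names, the `GATES` table, and every threshold are hypothetical placeholders (real thresholds are agreed in Design), and the input figures are invented for the example. Only the escalation-precision formula, correct escalations divided by total escalations, comes from the list itself.

```python
def escalation_precision(correct: int, total: int) -> float:
    """Correct escalations divided by total escalations."""
    if total == 0:
        return 1.0  # assumption: with no escalations, none were wrong
    return correct / total


# name: (kind, threshold). "min" means the score must be at least the
# threshold, "max" means at most. All thresholds are placeholders.
GATES = {
    "brand_voice_match": ("min", 0.85),
    "false_reply_rate": ("max", 0.0),
    "policy_match_rate": ("min", 0.95),
    "response_time_delta_days": ("max", 0.0),   # observed minus prior baseline
    "escalation_precision": ("min", 0.90),
}


def run_gates(metrics: dict, gates: dict = GATES) -> dict:
    """Return PASS or FAIL for every gate."""
    results = {}
    for name, (kind, threshold) in gates.items():
        value = metrics[name]
        ok = value >= threshold if kind == "min" else value <= threshold
        results[name] = "PASS" if ok else "FAIL"
    return results


def proceeds(results: dict) -> bool:
    """Outputs proceed only when every gate passes."""
    return all(status == "PASS" for status in results.values())


# Invented example figures for one evaluation run.
metrics = {
    "brand_voice_match": 0.91,
    "false_reply_rate": 0.0,
    "policy_match_rate": 0.99,
    "response_time_delta_days": -3.0,
    "escalation_precision": escalation_precision(correct=46, total=50),
}
results = run_gates(metrics)
```

A single failing gate is enough to stop the output, which matches the rule that failed drafts go to escalation rather than to send.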

Eval rig · sample run

Brand-voice match score: PASS
False-reply rate: PASS
Policy match rate: PASS
Response-time delta vs the prior baseline: PASS
Escalation precision: PASS

Eval rig source code shipped to your repo at exit.

Sample engagement

A federal health body ran 8,000 staff queries a month with a four-day wait time and headcount at full capacity. Diagnose found a 23% response error rate driven by stale templates. The reply helper was built in six weeks, trained on current guidance with a human-in-the-loop gate on every send. Within one quarter, 240 staff hours a week were freed, wait time fell to one day, and the routine-query deflect rate reached 88%. IRAP sign-off in progress for PROTECTED data work.

Read the full case →

Compliance posture

  • ISO 27001 in progress (Q3 2026)
  • ISO 42001 aligned
  • NIST AI RMF mapped
  • IRAP path Q4 2026

Full governance posture →

Human in the loop. Always.

See what your support operation looks like with AI doing the drafting.

Show us your ticket volume, your current wait times, and your policy corpus. We price on outcomes: deflect rate, accuracy, and wait time — all measured weekly against the baseline we find in Diagnose.