Skip to main content

AI Triage (deep dive)

For the user-facing walkthrough, see Maintenance → AI Triage. This page covers the prompt itself, how we evaluate it, and the heuristic fallback.

Prompt structure

ClaudeTriageAssistant sends Claude a two-part prompt:

  1. System: role, output format, categories + urgency levels with definitions, priority rules ("safety first"), and format constraints (strict JSON, no prose before or after).
  2. User: the tenant report text, verbatim.

Key decisions in the system prompt:

  • Categories are closed-set. Claude can't invent a new one; must pick from the 11 supported.
  • Urgency maps to SLA. We explain EMERGENCY = 4h SLA, URGENT = 24h so Claude understands the real-world consequence of its choice.
  • Safety wins ties. If input is ambiguous between APPLIANCE and EMERGENCY_SAFETY (e.g. gas heater yellow flame), we instruct escalation to EMERGENCY_SAFETY.
  • Actions, not just classification. Prompt v2 added suggestedActions[] so the PM has next-steps, not just a label.
  • Confidence calibration. We ask Claude to self-score. In evals we find Haiku is slightly over-confident (0.9 when we'd mark 0.8); we haven't re-calibrated yet but threshold the UI at 0.7 conservatively.

Prompt version history

  • prompt-v1 (2026-03) — initial: category + urgency + confidence + one-sentence reasoning.
  • prompt-v2 (2026-04) — added suggestedActions[] array. Triage went from "classifier" to "triage advisor".
  • prompt-v3 (planned, see AI Roadmap) — adds sentimentScore + escalationFlag for tenants writing in ALL CAPS about a repeat issue.

Heuristic fallback

HeuristicTriageAssistant — cheap, deterministic, zero-cost. Keyword rules fire top-down; first match wins. Abbreviated:

gas|carbon monoxide|smoke|fire            → EMERGENCY_SAFETY, EMERGENCY
burst|flood|water coming through ceiling → PLUMBING, URGENT
no hot water|cold shower → PLUMBING, URGENT
power out|no electricity|sparks → ELECTRICAL, URGENT
oven|fridge|washing machine|dishwasher → APPLIANCE, NORMAL
lock|key|broken door|break.?in → SECURITY, URGENT
pest|rat|mice|cockroach → PEST, NORMAL

(default) → OTHER, NORMAL, confidence 0.4

Confidence for the fallback is capped at 0.5 so the UI always shows the "Human triage required" badge. We want a PM reading the request when the heuristic is doing the classification.

Evaluation

We maintain a fixture set (ClaudeTriageAssistantTest) of ~40 labelled tenant reports, including:

  • Clear single-category (broken tap → PLUMBING URGENT)
  • Ambiguous multi-signal (gas heater making noise + tenant has headache → EMERGENCY_SAFETY EMERGENCY)
  • Long rambling prose with buried safety signal
  • Short one-liner
  • Non-English phrasing (tenant-written broken English)

We run the fixture against every prompt change and look at:

  • Category accuracy (target ≥90%)
  • Urgency correctness within ±1 level (target 100%)
  • No EMERGENCY_SAFETY downgrade in presence of safety keywords (hard gate)
  • Confidence calibration (target Brier score improvement)

Anthropic API integration

AnthropicClient is a direct java.net.http.HttpClient wrapper. No SDK — deliberate, keeps the dependency surface small and the behaviour debuggable.

Key config:

  • Model: claude-haiku-4-5
  • Max output tokens: 600
  • Timeout: 8 seconds
  • API key: ANTHROPIC_API_KEY env var
  • Circuit breaker: 3 consecutive failures → 5-minute open state

See also