AI Triage (deep dive)
For the user-facing walkthrough, see Maintenance → AI Triage. This page covers the prompt itself, how we evaluate it, and the heuristic fallback.
Prompt structure
ClaudeTriageAssistant sends Claude a two-part prompt:
- System: role, output format, categories + urgency levels with definitions, priority rules ("safety first"), and format constraints (strict JSON, no prose before or after).
- User: the tenant report text, verbatim.
Key decisions in the system prompt:
- Categories are closed-set. Claude can't invent a new one; must pick from the 11 supported.
- Urgency maps to SLA. We explain
EMERGENCY = 4h SLA,URGENT = 24hso Claude understands the real-world consequence of its choice. - Safety wins ties. If input is ambiguous between
APPLIANCEandEMERGENCY_SAFETY(e.g. gas heater yellow flame), we instruct escalation toEMERGENCY_SAFETY. - Actions, not just classification. Prompt v2 added
suggestedActions[]so the PM has next-steps, not just a label. - Confidence calibration. We ask Claude to self-score. In evals we find Haiku is slightly over-confident (0.9 when we'd mark 0.8); we haven't re-calibrated yet but threshold the UI at 0.7 conservatively.
Prompt version history
- prompt-v1 (2026-03) — initial: category + urgency + confidence + one-sentence reasoning.
- prompt-v2 (2026-04) — added
suggestedActions[]array. Triage went from "classifier" to "triage advisor". - prompt-v3 (planned, see AI Roadmap) — adds
sentimentScore+escalationFlagfor tenants writing in ALL CAPS about a repeat issue.
Heuristic fallback
HeuristicTriageAssistant — cheap, deterministic, zero-cost. Keyword
rules fire top-down; first match wins. Abbreviated:
gas|carbon monoxide|smoke|fire → EMERGENCY_SAFETY, EMERGENCY
burst|flood|water coming through ceiling → PLUMBING, URGENT
no hot water|cold shower → PLUMBING, URGENT
power out|no electricity|sparks → ELECTRICAL, URGENT
oven|fridge|washing machine|dishwasher → APPLIANCE, NORMAL
lock|key|broken door|break.?in → SECURITY, URGENT
pest|rat|mice|cockroach → PEST, NORMAL
…
(default) → OTHER, NORMAL, confidence 0.4
Confidence for the fallback is capped at 0.5 so the UI always shows the "Human triage required" badge. We want a PM reading the request when the heuristic is doing the classification.
Evaluation
We maintain a fixture set (ClaudeTriageAssistantTest) of ~40 labelled
tenant reports, including:
- Clear single-category (broken tap → PLUMBING URGENT)
- Ambiguous multi-signal (gas heater making noise + tenant has headache → EMERGENCY_SAFETY EMERGENCY)
- Long rambling prose with buried safety signal
- Short one-liner
- Non-English phrasing (tenant-written broken English)
We run the fixture against every prompt change and look at:
- Category accuracy (target ≥90%)
- Urgency correctness within ±1 level (target 100%)
- No
EMERGENCY_SAFETYdowngrade in presence of safety keywords (hard gate) - Confidence calibration (target Brier score improvement)
Anthropic API integration
AnthropicClient is a direct java.net.http.HttpClient wrapper. No SDK —
deliberate, keeps the dependency surface small and the behaviour
debuggable.
Key config:
- Model:
claude-haiku-4-5 - Max output tokens: 600
- Timeout: 8 seconds
- API key:
ANTHROPIC_API_KEYenv var - Circuit breaker: 3 consecutive failures → 5-minute open state