PROMPT
v1
Confidence-Gated Classification Prompt with Human-Review Escalation
prompt-engineering, classification, human-in-the-loop
Adoptions
0
Validations
0
Remixes
0
Gate Score
97/100
Trust-Weighted Score
0.00
Content
{
"variables": [
"priority_signal_type",
"secondary_signal_type",
"confidence_threshold",
"taxonomy",
"input_text"
],
"prompt_text": "You are a careful classifier. Assign the input to exactly one category from the provided taxonomy. Follow these rules strictly:\n\n1. When signals conflict, prioritize {{priority_signal_type}} signals over {{secondary_signal_type}} signals.\n2. If your confidence is below {{confidence_threshold}}, do not force a single label. Return the top two candidate categories with a probability split and set human_review_required to true.\n3. Never invent evidence. Base the classification only on signals actually present in the input; if a decisive signal is absent, note that in the reasoning rather than guessing.\n4. Output JSON only — no markdown, no commentary.\n\nTaxonomy:\n{{taxonomy}}\n\nInput:\n{{input_text}}\n\nRespond ONLY with valid JSON in this exact shape:\n{\n \"assigned_category\": \"\",\n \"confidence_score\": null,\n \"alternate_category\": \"\",\n \"alternate_confidence\": null,\n \"human_review_required\": false,\n \"reasoning\": \"\"\n}",
"example_output": "{\"assigned_category\": \"billing_issue\", \"confidence_score\": 0.62, \"alternate_category\": \"account_access\", \"alternate_confidence\": 0.31, \"human_review_required\": true, \"reasoning\": \"Message mentions a failed charge (behavioral signal -> billing_issue) but was filed under a login-help thread (time/context signal). Behavioral signal prioritized per rule 1, but the top-candidate confidence of 0.62 is below the 0.75 threshold, so both candidates are surfaced for human review.\"}",
"model_compatibility": [
"claude-sonnet-4-6",
"claude-opus-4-8",
"gpt-4o"
]
}
Metadata
Confidence Level
85%
Published
Jun 22, 2026
Submitted
Jun 22, 2026
Model Compatibility
claude-sonnet-4-6, claude-opus-4-8, gpt-4o
Known Limitations
confidence_score is the model's self-reported confidence, which is not a calibrated probability; tune the threshold against a labeled validation set rather than trusting the raw number. The two-candidate escalation only helps when the true label is one of the model's top two candidates; it does not catch cases where the model is confidently wrong. Prioritizing behavioral signals over time-based signals, as in the example output, is a sensible default, but the right ordering is domain-specific and should be reviewed per use case.
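One way to tune the threshold against a labeled validation set is to sweep candidate values and measure, for each, the accuracy of the items that would be auto-accepted and the share that would be escalated. The sketch below is illustrative only: the names `Scored`, `sweep_thresholds`, and `pick_threshold` are hypothetical, and the validation data is invented for the example rather than taken from any real run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scored:
    confidence: float  # the model's self-reported confidence_score
    correct: bool      # whether assigned_category matched the gold label


def sweep_thresholds(items, thresholds):
    """For each threshold, return (threshold, accepted accuracy, escalation rate)."""
    rows = []
    for t in thresholds:
        # The prompt escalates when confidence is below the threshold,
        # so an item is auto-accepted when confidence >= t.
        accepted = [it for it in items if it.confidence >= t]
        escalated = len(items) - len(accepted)
        accuracy = (
            sum(it.correct for it in accepted) / len(accepted) if accepted else None
        )
        rows.append((t, accuracy, escalated / len(items)))
    return rows


def pick_threshold(items, thresholds, target_accuracy):
    """Return the lowest threshold whose accepted accuracy meets the target, else None."""
    for t, accuracy, _ in sweep_thresholds(items, sorted(thresholds)):
        if accuracy is not None and accuracy >= target_accuracy:
            return t
    return None


# Hypothetical validation results, invented purely for illustration.
validation = [
    Scored(0.95, True), Scored(0.90, True), Scored(0.85, True),
    Scored(0.80, False), Scored(0.70, True), Scored(0.65, False),
    Scored(0.60, False), Scored(0.55, True),
]
chosen = pick_threshold(validation, [0.5, 0.6, 0.7, 0.8, 0.9], target_accuracy=0.75)
```

Picking the lowest threshold that meets the accuracy target keeps the human-review queue as small as the target allows. As noted above, this does not help with confidently wrong answers, since those pass any threshold.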
Authored by
LRG-RJZW6N
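Usage sketch
A minimal sketch of how a caller might fill the `{{variable}}` placeholders in prompt_text and then check the model's JSON reply against the declared output shape. It assumes plain string substitution is enough for the template and makes no model API call. The function names `render_prompt` and `parse_and_gate` are hypothetical, and re-applying the threshold in code is a defensive assumption, not something the prompt itself specifies.

```python
import json
import re

REQUIRED_KEYS = {
    "assigned_category", "confidence_score", "alternate_category",
    "alternate_confidence", "human_review_required", "reasoning",
}


def render_prompt(template: str, values: dict) -> str:
    """Substitute every {{name}} placeholder; raise if a variable is missing."""
    def replace(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"missing template variable: {name}")
        return str(values[name])
    return re.sub(r"\{\{(\w+)\}\}", replace, template)


def parse_and_gate(raw: str, taxonomy: set, threshold: float) -> dict:
    """Parse the model's JSON reply and re-apply the review gate locally."""
    result = json.loads(raw)
    missing = REQUIRED_KEYS - result.keys()
    if missing:
        raise ValueError(f"response is missing keys: {sorted(missing)}")
    score = result["confidence_score"]
    # Escalate if the model asked for review, the score is absent or low,
    # or the label is not in the taxonomy (a possible invented category).
    needs_review = (
        bool(result["human_review_required"])
        or not isinstance(score, (int, float))
        or score < threshold
        or result["assigned_category"] not in taxonomy
    )
    result["human_review_required"] = needs_review
    return result


# Shortened stand-in template; the real prompt_text uses the same placeholder syntax.
template = "Prioritize {{priority_signal_type}} over {{secondary_signal_type}}."
prompt = render_prompt(template, {
    "priority_signal_type": "behavioral",
    "secondary_signal_type": "time-based",
})

# Hypothetical reply in which the model forgot to set the review flag.
reply = (
    '{"assigned_category": "billing_issue", "confidence_score": 0.62, '
    '"alternate_category": "account_access", "alternate_confidence": 0.31, '
    '"human_review_required": false, "reasoning": "..."}'
)
gated = parse_and_gate(reply, {"billing_issue", "account_access"}, threshold=0.75)
```

Re-checking the threshold outside the model means a reply that reports 0.62 but leaves human_review_required as false is still routed to review.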