Anti-Hallucination Architecture
The assistant uses LLM tool use to prevent the LLM from fabricating data. The LLM cannot see any user data unless it explicitly calls a tool to fetch it.

The Problem
LLMs can “hallucinate” - generate plausible-sounding but completely fabricated responses. In a workplace assistant, this could mean:

- Claiming you have 12 tasks when you have 3
- Inventing task names that don’t exist
- Making up mood trends or statistics
- Creating fake team member information
Our Solution: Tool Use (No Data Without Tools)
Instead of injecting data into the system prompt and hoping the LLM doesn’t make up more, we give the LLM zero data by default and require it to call tools to access anything.

Architecture
Key Insight
The LLM must call a tool to access any data. It cannot:

- Guess how many tasks you have (must call get_tasks)
- Fabricate mood history (must call get_mood_history)
- Invent task statistics (must call get_task_stats)

Until the LLM calls a tool like get_tasks, it literally has no data to fabricate from.
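To make the “zero data by default” idea concrete, here is a minimal sketch of how a request to the LLM might be assembled. This assumes a hypothetical OpenAI-style chat payload and a `build_request` helper that does not exist in the source; the point is only that nothing but the system rules, the user’s text, and the tool list ever reaches the model.

```python
# Hypothetical request builder: note that no tasks, moods, or stats
# are ever injected into the context — only the rules and the message.
def build_request(user_message: str, tools: list) -> dict:
    return {
        "messages": [
            {"role": "system", "content": "Only report data returned by tool calls."},
            {"role": "user", "content": user_message},
        ],
        "tools": tools,
    }
```

Whatever the user asks, the model starts from the same empty slate and must call a tool to learn anything.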
Tool Definitions
The LLM has access to 7 tools:

| Tool | Purpose | Returns |
|---|---|---|
| get_tasks | Fetch user’s tasks | Real task list from DB |
| create_task_draft | Create a task draft | Draft (pending user confirmation) |
| complete_task | Mark task as done | Completed task |
| log_mood | Log mood entry | Mood record |
| get_mood_history | Recent mood entries | Real mood data |
| get_task_stats | Task statistics | Computed from DB |
| get_pending_drafts | Queued/pending tasks | Real task list |
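A sketch of what two of these tool definitions might look like, assuming an OpenAI-style function-calling schema; the parameter shapes here are illustrative assumptions, not the project’s actual definitions.

```python
# Hypothetical OpenAI-style tool schemas. Tool names match the table above;
# the parameters are assumed for illustration.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_tasks",
            "description": "Fetch the user's real task list from the database.",
            "parameters": {
                "type": "object",
                "properties": {
                    "status": {
                        "type": "string",
                        "enum": ["open", "done", "all"],
                        "description": "Filter tasks by status.",
                    }
                },
                "required": [],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "create_task_draft",
            "description": "Create a task draft that the user must confirm.",
            "parameters": {
                "type": "object",
                "properties": {
                    "title": {"type": "string", "description": "Task title."}
                },
                "required": ["title"],
            },
        },
    },
]
```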
Tool Use Flow
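The flow can be sketched as a dispatch loop: the LLM emits tool calls, each call is executed against the real data layer, and only those results flow back. The `llm` client is omitted here, and the registry entries are illustrative assumptions.

```python
# Minimal sketch of the tool-use dispatch step: every tool call is routed
# to a real data-layer handler, never answered from model text.
def handle_tool_calls(tool_calls, registry):
    """Execute each tool call against the registry and collect results."""
    results = []
    for call in tool_calls:
        handler = registry[call["name"]]  # unknown tool names raise KeyError
        results.append(
            {"tool": call["name"], "result": handler(**call.get("args", {}))}
        )
    return results

# Illustrative registry: in the real system these would be database queries.
registry = {
    "get_tasks": lambda status="all": [
        {"id": 1, "title": "Ship report", "status": "open"}
    ],
    "get_task_stats": lambda: {"open": 1, "done": 0},
}

results = handle_tool_calls(
    [{"name": "get_tasks", "args": {"status": "open"}}], registry
)
```

Because the handlers are the only source of data, the model can summarize the results but cannot invent records that were never returned.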
System Prompt Enforcement
The system prompt explicitly instructs the LLM to report only data returned by tool calls.

Why This Is Better Than the Old Approach
Old: Rule-Based Parser + Data Injection
- Parser needed 260+ lines of NLP pattern matching
- Still failed on languages it hadn’t seen
- LLM had some data, tempting it to extrapolate
- “Don’t hallucinate” instructions are unreliable
New: Thin Parser + LLM Tool Use
- Parser is ~50 lines, with no NLP pattern matching to go wrong
- Works in any language (the LLM understands them natively)
- Nothing to hallucinate from (no user data in the LLM’s context)
- Tool results are real database records
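The thin parser described above can be sketched in a few lines. The command names here are hypothetical; the point is that only exact matches are handled locally, and everything else - in any language - is routed to the LLM.

```python
# Sketch of the thin parser: exact commands only, no NLP.
# The command set is an illustrative assumption.
EXACT_COMMANDS = {"/tasks", "/mood", "/stats", "/help"}

def route(message: str) -> str:
    """Return 'command' for an exact match, otherwise 'llm'."""
    return "command" if message.strip() in EXACT_COMMANDS else "llm"
```

`route("/tasks")` is handled locally; a message like `"¿cuántas tareas tengo?"` goes straight to the LLM, which then calls the appropriate tool.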
Human-in-the-Loop Safety
Even with tool use, task creation goes through a draft system:

- LLM calls create_task_draft (not create_task)
- A draft is created in the database
- User sees confirmation UI with the draft details
- User must click “Confirm” before the task exists
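The draft flow can be sketched with an in-memory store standing in for the real database; the function names mirror the steps above but are illustrative assumptions.

```python
# Sketch of the human-in-the-loop draft flow, with dicts standing in
# for the real database tables.
import uuid

drafts: dict[str, dict] = {}
tasks: dict[str, dict] = {}

def create_task_draft(title: str) -> str:
    """Called by the LLM tool: creates a draft, NOT a live task."""
    draft_id = str(uuid.uuid4())
    drafts[draft_id] = {"title": title, "status": "pending_confirmation"}
    return draft_id

def confirm_draft(draft_id: str) -> dict:
    """Called only from the confirmation UI, after the user clicks Confirm."""
    draft = drafts.pop(draft_id)
    task = {"title": draft["title"], "status": "open"}
    tasks[draft_id] = task
    return task

draft_id = create_task_draft("Review Q3 budget")
assert draft_id not in tasks  # no task exists until the user confirms
confirm_draft(draft_id)
```

Until `confirm_draft` runs, the draft is visible in the confirmation UI but no task exists, so a mistaken or hallucinated creation request can never silently land in the task list.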
Testing
The parser tests verify that natural language is routed to the LLM, not the parser.

Key Principles
- No data without tools: LLM has zero user data in its context
- Tools return real data: Every tool call goes to the actual database
- Human confirmation: Task creation always requires user approval
- Language-agnostic: LLM understands any language natively, no parser needed
- Simple parser: Only exact commands, no NLP pattern matching