Why Data Stays Locked
Data typically lives in warehouses as transactional tables, event logs, and curated views. That’s useful for storage and scale—but not always for humans. A “simple” question often spans multiple systems, mixed definitions, and messy edge cases. The warehouse has the answer, but it doesn’t speak in plain language.
Organizations spend heavily on infrastructure and reporting, yet the day-to-day value remains stuck behind specialist workflows. Analysts and engineers do the right things—cleaning, merging, validating—but that process takes time. The practical consequence is that even quick checks (“What changed this week?”) can have a long turnaround time.
AI as a Language Interpreter
AI changes the interface. Instead of asking users to learn schemas, SQL, and metric definitions, the system can interpret intent, map it to known business concepts, and translate it into queries. Done well, it doesn’t replace dashboards—it complements them by covering the questions you didn’t pre-design for.
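The intent-to-query mapping above can be pictured as a minimal sketch. The metric registry, table names, and SQL templates here are hypothetical, for illustration only:

```python
# Illustrative registry of business metrics -> SQL templates.
# Table and column names are made up for this sketch.
METRICS = {
    "downtime": "SELECT line, SUM(minutes_down) AS downtime FROM downtime_events",
    "scrap_rate": "SELECT line, AVG(scrap_pct) AS scrap_rate FROM production_runs",
}

def translate(intent: str, days: int) -> str:
    """Map a recognized intent plus a timeframe to a parameterized query."""
    base = METRICS[intent]
    return f"{base} WHERE event_date >= CURRENT_DATE - {days} GROUP BY line"

sql = translate("downtime", 7)
print(sql)
```

The point is not the lookup table itself but the separation of concerns: the user speaks in business terms ("downtime last week"), and the system owns the schema knowledge needed to turn that into SQL.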
Chatbot as a Natural-Language Bridge
A chatbot is simply the most natural front door. People ask questions the way they talk: “Which line had the most downtime last week?” or “Did scrap spike after the supplier change?” For ad-hoc and lightweight analytics, this removes friction and reduces dependency on ticket queues. You get an answer fast—and you can follow up immediately.
The Hard Part: Trust & Correctness
Text-to-SQL is not magic, and neither is AI. The risk isn’t slow answers—it’s wrong answers delivered confidently. While models are getting better every day, they still tend to hallucinate, especially under ambiguity: unclear metrics (which definition counts?), incorrect joins (wrong grain), or hidden assumptions (filters, time windows, exclusions). If users can’t verify the logic, trust collapses quickly.
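The wrong-grain failure mode is worth seeing concretely. In this self-contained sketch (toy tables, run on stdlib `sqlite3` for portability), joining orders to shipments before summing revenue silently double-counts any order shipped in multiple parts:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE orders (order_id INT, amount INT);
CREATE TABLE shipments (order_id INT, shipment_id INT);
INSERT INTO orders VALUES (1, 100), (2, 50);
-- Order 1 shipped in two parts: two shipment rows, one order row.
INSERT INTO shipments VALUES (1, 10), (1, 11), (2, 12);
""")

# Wrong grain: the join duplicates order 1's amount, one copy per shipment.
wrong = con.execute(
    "SELECT SUM(o.amount) FROM orders o "
    "JOIN shipments s ON o.order_id = s.order_id"
).fetchone()[0]

# Correct grain: aggregate at the order level.
right = con.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

print(wrong, right)  # 250 150
```

Both queries are syntactically valid and run without error, which is exactly why a confident-but-wrong answer is dangerous: nothing fails, the number is just inflated.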
Our Approach to Faithful SQL
We treat SQL generation as a controlled process, not a single-shot guess. The assistant first clarifies intent (metric, timeframe, scope), then grounds the query in the available schema and business definitions, and finally produces SQL that can be inspected and validated. When the question is ambiguous, it asks a short follow-up instead of hallucinating. The goal is simple: speed with accountability.
- Intent Classifier (LLM) — Determines whether the user query needs SQL or is a metadata request; routes accordingly.
- Plan Agent (LLM) — Designs the desired output features like columns, grain, filters, limit.
- Column Normalize Agent (LLM) — Maps planned fields (e.g., “member’s region”) to actual schema columns.
- Mapping Agent (LLM) — Classifies each field as Direct / Derived / Missing using schema + sample rows.
- SQL Generator (LLM) — Writes DuckDB-compatible SQL from the mapped plan and passes it to validate_ai_sql().
- Execution & Analysis — Executes the validated SQL on the dataset and returns an HTML table and a short analysis.
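The validation gate named in the pipeline can take many forms; the source does not show how validate_ai_sql() works, so the following is one plausible sketch: compile the generated query with EXPLAIN as a dry run before executing it. It is shown with stdlib `sqlite3` so it runs anywhere; DuckDB's Python API supports the same `con.execute` pattern:

```python
import sqlite3

def validate_ai_sql(con: sqlite3.Connection, sql: str) -> tuple[bool, str]:
    """Dry-run check: compile the query with EXPLAIN before executing it.

    Catches syntax errors and references to missing tables or columns
    without reading or modifying any data.
    """
    try:
        con.execute("EXPLAIN " + sql)
        return True, "ok"
    except sqlite3.Error as exc:
        return False, str(exc)

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE downtime_events (line TEXT, minutes_down INT)")

ok, _ = validate_ai_sql(
    con, "SELECT line, SUM(minutes_down) FROM downtime_events GROUP BY line"
)
bad, msg = validate_ai_sql(con, "SELECT nope FROM missing_table")
print(ok, bad)  # True False
```

A gate like this cannot catch semantic errors (the wrong-grain problem above compiles fine), which is why the earlier planning and mapping steps, plus human inspection of the SQL, still matter.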
Business Value for Operations
The payoff is faster learning cycles. Teams spend less time chasing basic answers and more time solving real problems. Leaders get quicker clarity, fewer surprises, and an easier way to cross-check what they’re told. In operations, that reduction in decision latency compounds—because the best time to act is usually before the next shift.