I build the data layer and skills for a production LLM agent. A skill is a set of repeatable instructions the agent loads on demand. When a customer asks a data question, the agent loads the data skill, which tells it what tools are available, when to use them, and how to use them.

One pattern I keep observing is product teams wanting to hardcode recommendation patterns into skills: when you see this pattern in the data, recommend this; when a customer asks about that metric, present it this way. The instinct is understandable: it's measurable and deterministic, it's easier to demo, and it has a clear beginning and end.

But if you tell the agent exactly what to recommend for every scenario, you've built in every bias you already have. You've lost the thing that makes an LLM useful: the ability to surface something no human thought to look for. There's a difference between precision and prescription, and most teams conflate the two. I learned this through process safety work in my prior career. OSHA's PSM standard is performance-based, not prescriptive: you're required to reach the safe outcome, but how you get there is yours to figure out. The path isn't dictated, and that flexibility is intentional.

Building agent skills has the same dynamic. Language precision matters enormously: "must" instead of "need to," "required" instead of "should." Without that precision, the agent finds loopholes: it skips steps, improvises, and returns inconsistent results. Tightening that language made our agent dramatically more reliable. But that's precision about how the agent works: load this resource, follow these steps, format the request this way. It's not prescription about what data the agent should consider or what it should conclude.
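As a toy illustration of the tightening step (every name and phrase mapping here is hypothetical, not our actual tooling), you could lint skill instructions for loophole-prone modal verbs and rewrite them into binding language:

```python
import re

# Hypothetical mapping of loophole-prone phrasing to strict wording.
WEAK_TO_STRICT = {
    r"\bneeds? to\b": "must",
    r"\bshould\b": "must",
    r"\bit is recommended to\b": "you are required to",
}

def tighten(instruction: str) -> str:
    """Replace loophole-prone modal verbs with binding language."""
    for pattern, strict in WEAK_TO_STRICT.items():
        instruction = re.sub(pattern, strict, instruction, flags=re.IGNORECASE)
    return instruction

print(tighten("You should load the data skill before answering."))
# -> You must load the data skill before answering.
```

The point of a pass like this is that it constrains wording, not conclusions: the instruction still says nothing about what the agent should find in the data.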

Some teams, when they see unreliable output, reach for more control over outcomes when what they actually need is more control over process. Give the agent guardrails on how it retrieves and structures data. Then give it enough context to actually think and let it find patterns and conclusions you've never considered.

When I built the universal data skill, I included a decision framework for how to present results (table, chart, or something else) and specific formatting requirements for each option. That's precision. But I didn't prescribe what insights the agent should surface from the data. That's where the LLM earns its keep.
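To make the distinction concrete, here's a minimal sketch of what a presentation decision framework might look like (the function name, thresholds, and categories are all hypothetical, not the actual skill). Note that it decides only how results are shown, never what they mean:

```python
def choose_presentation(rows: int, cols: int, has_time_axis: bool) -> str:
    """Hypothetical decision framework: constrain HOW results are
    presented, while leaving WHAT to conclude entirely to the agent."""
    if rows == 1 and cols <= 2:
        return "inline"        # a single value reads best as a sentence
    if has_time_axis and rows > 5:
        return "line_chart"    # trends over time want a chart
    if rows <= 50:
        return "table"         # small result sets stay scannable
    return "table_truncated"   # large sets: table with a row cap
```

Everything on the right-hand side is formatting. Nothing in the framework says "when revenue dips, recommend X" — that space stays open for the model.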

The pushback I hear is that training users is too time-consuming or expensive. It's easier to over-engineer the agent than to invest in helping customers use it well. So instead of teaching people how to engage in discovery, teams prescribe every outcome and ship something that looks impressive in a demo but can only return what its builders already thought to put in.

But you don't have to train users the old way. The LLM can help users find better questions too, if you build for that instead of building around it. The user still has to do work and participate. An agent that removes the need for human thinking isn't augmenting anything. It's an expensive vending machine.