When an agent starts behaving unreliably, the instinct is to add more instructions. More detail. More guardrails. More steps. I've done it myself, and I've watched it happen on my team. It feels like the responsible thing to do.

It made things worse.

I build skills for a production LLM agent. A skill tells the agent what tools are available and how to use them; it references resources that load conditionally depending on what the customer is asking. One of the skills I built includes a decision framework for data display: once you have results, here's how to decide whether to present them as a table, a chart, or something else. If you choose a chart, you must load additional resources because the charting tools require the request body in a very specific format.
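To make the conditional loading concrete, here's a hypothetical sketch of that decision framework. The display modes and the resource filename are illustrative placeholders, not the real skill's contents; the point is that choosing a chart carries a hard dependency on loading its format spec first.

```python
# Hypothetical sketch: map each display choice to the resources the agent
# MUST load before using it. Names here are illustrative, not the real skill.
REQUIRED_RESOURCES = {
    "table": [],                          # tables need no extra spec
    "chart": ["chart_request_spec.md"],   # charts require the exact request-body format
}

def resources_to_load(display_choice: str) -> list[str]:
    """Return the resources that must be loaded before the chosen display is used."""
    if display_choice not in REQUIRED_RESOURCES:
        raise ValueError(f"unknown display choice: {display_choice}")
    return REQUIRED_RESOURCES[display_choice]
```

Skipping the load for `"chart"` is exactly the failure mode described next: the agent proceeds without the spec and produces a malformed request body.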

Even with fully stepped-out instructions (step one, step two, step three), if the language gave the agent any room to skip a step, it would try. Charts rendered incorrectly about half the time. Request bodies for queries were malformed about half the time, throwing 400 errors from the MCP. During post-error review, the agent would explain that it thought it already knew enough, or that it had "memories" of the correct format, so it had skipped loading the resources that contained the actual specifications.

The obvious response is to add more instructions: spell it out further, add another guardrail. But the longer the prompt gets, the harder the LLM has to work to attend to all of it, and at scale the token cost compounds fast. Every skill load, every resource call, every agent invocation burns through that bloated prompt again. We'd seen this before when prompts got too large: the agent started ignoring some instructions, skipping steps, and improvising in ways that looked helpful but produced inconsistent, poor results. More words just added bloat, which compounded the original problem.

The fix was precision, not more words. "Must" instead of "need to." "Required" instead of "should." We went back through every instruction and tightened the language to eliminate loopholes. Humans read "you should do this step" and "you must do this step" as roughly equivalent; LLMs treat them as fundamentally different permission levels. The nuance of language becomes a reliability engineering problem.
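That tightening pass can even be partly mechanized. Here's a minimal sketch of a lint check we could run over skill text to flag soft permission language for a human to fix; the phrase list is an assumption, not an exhaustive rule set.

```python
import re

# Illustrative only: soft phrases worth flagging, with the stricter wording
# we'd tighten them to. This mapping is an assumption, not a complete list.
SOFT_TO_STRICT = {
    r"\bneed to\b": "must",
    r"\bshould\b": "required",
    r"\btry to\b": "must",
}

def flag_soft_language(instruction: str) -> list[str]:
    """Return the soft-language patterns found, so a human can tighten them."""
    return [pat for pat in SOFT_TO_STRICT
            if re.search(pat, instruction, re.IGNORECASE)]
```

A check like this won't catch every loophole, but it turns "reread everything for soft verbs" into a repeatable step.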

What took longer to recognize was the formatting. A lot of what we'd built into our skills and resources was decorative rather than functional: section dividers that duplicated what clear headers were already doing, emoji used as visual cues, bold text meant to signal emphasis. The agent didn't interpret any of it. It was just extra characters to process. The bolded word was still just the word.

The reason it took us a while to see this comes down to something more fundamental: we weren't thinking carefully enough about who the user of these instructions actually was. We're conditioned to make things clear for other humans. And most of these instructions were developed in collaboration with an LLM, which is itself trained to produce output that's readable and helpful to humans. So we were asking a human-facing system to write instructions, and it naturally wrote them for a human audience.

It wasn't until I explicitly told the LLM that these instructions would only ever be read by another LLM acting as an agent, and asked whether that changed its recommendations, that the framing shifted. It did. Immediately. The shift wasn't to strip out all formatting. It was more precise than that: if the formatting is doing something structural, keep it. If it's doing something visual, cut it. The agent doesn't have eyes. The one exception worth noting is code blocks, which are processed differently from inline text and do carry structural meaning for the agent. That's a narrow exception, not a reason to reach for formatting everywhere. The lesson wasn't really about formatting at all; it was about identifying your user. We just hadn't considered that the user might not be human.
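The structural-versus-visual rule can be sketched as a cleanup pass. This is a hypothetical illustration under the assumptions just stated: headers and fenced code blocks are structural and kept; divider lines, bold markers, and emoji are visual and cut.

```python
import re

def strip_visual_formatting(text: str) -> str:
    """Illustrative cleanup pass for agent-facing instruction text.

    Assumptions: fenced code blocks are structural (kept intact, including
    their contents); divider lines, bold markers, and emoji are visual (cut).
    """
    out_lines = []
    in_code = False
    for line in text.splitlines():
        if line.strip().startswith("```"):
            in_code = not in_code
            out_lines.append(line)   # code fences carry structure: keep
            continue
        if in_code:
            out_lines.append(line)   # never touch code block contents
            continue
        if re.fullmatch(r"[-=*_]{3,}", line.strip()):
            continue                 # decorative divider: drop
        line = re.sub(r"\*\*(.+?)\*\*", r"\1", line)  # bold is visual: unwrap
        line = re.sub(r"[\U0001F300-\U0001FAFF\u2700-\u27BF]", "", line)  # emoji: drop
        out_lines.append(line)
    return "\n".join(out_lines)
```

In practice we made these cuts by hand while rewriting the skills, but the rule is mechanical enough that a pass like this captures it.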

What I've taken from this, both in building agent skills and in how I use AI in my own work, is that precision and freedom aren't opposites. Make your instructions lean and exact, and they'll still be readable enough for humans to review. Precision in the structure gives the AI more room to think clearly about the substance. A lean set of instructions with exact language outperforms a verbose set with soft language. Not because the LLM is lazy, but because attention is a finite resource, and every unnecessary word dilutes the ones that matter.