Skills First, Memory Later
A lot of agent design starts in the wrong place. People reach for memory, graphs, and multi-layer orchestration before they have a small system that can already do one thing reliably. That usually creates more surface area, more token burn, and more ways for the agent to drift away from the actual task.
A better default is to keep the system simple and make behavior explicit. If the agent needs a new capability, add a skill. If it needs a new source of truth, add a deterministic step. The goal is not to make the agent feel impressive. The goal is to make it predictable enough that it can be trusted.
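One way to picture "add a skill" is a small registry of plain functions, each with a single responsibility. The sketch below is illustrative only: the names SKILLS, skill, and run_skill are hypothetical and do not come from any particular agent framework.

```python
from typing import Callable, Dict

# Hypothetical registry: each skill is a plain function with one job.
SKILLS: Dict[str, Callable[[str], str]] = {}


def skill(name: str):
    """Register a function under an explicit skill name."""
    def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
        if name in SKILLS:
            raise ValueError(f"duplicate skill: {name}")
        SKILLS[name] = fn
        return fn
    return decorator


@skill("word_count")
def word_count(text: str) -> str:
    # Deterministic step: no model call needed to count words.
    return str(len(text.split()))


@skill("uppercase")
def uppercase(text: str) -> str:
    return text.upper()


def run_skill(name: str, payload: str) -> str:
    """Dispatch to a registered skill and fail loudly on unknown names."""
    if name not in SKILLS:
        raise KeyError(f"unknown skill: {name}")
    return SKILLS[name](payload)
```

Adding a capability then means adding one function and one name, and an unknown skill raises an error instead of being improvised.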
That is why micro-agents and narrow skills are so compelling. A small prompt with one responsibility is easier to test, cheaper to run, and easier to replace when it breaks. Big agents can still be useful for multi-phase work, but they should not be the first answer to every problem.
Markdown works well as the backbone for that kind of system because it keeps the harness visible. Notes, decisions, and workflows can be stored in plain files while code handles the parts that must never be vague. That split matters: the LLM can help extract or transform information, but deterministic code should own the classification and routing.
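The split between extraction and routing can be sketched as follows. This assumes a model has already produced an Extracted record. The record type, the ROUTES table, and the folder names are all hypothetical, chosen only to show code owning the classification and the file path.

```python
import re
from dataclasses import dataclass


@dataclass
class Extracted:
    """Fields a model might extract from raw text (hypothetical shape)."""
    title: str
    kind: str
    body: str


# Hypothetical routing table: the only place that decides where files go.
ROUTES = {"decision": "decisions", "workflow": "workflows", "note": "notes"}


def classify(raw_kind: str) -> str:
    """Normalize the model's label; anything unrecognized becomes a note."""
    kind = raw_kind.strip().lower()
    return kind if kind in ROUTES else "note"


def route(item: Extracted) -> str:
    """Build a markdown path from the classified kind and a slugged title."""
    folder = ROUTES[classify(item.kind)]
    slug = "-".join(re.findall(r"[a-z0-9]+", item.title.lower())) or "untitled"
    return f"{folder}/{slug}.md"
```

Even if the model returns a sloppy or unexpected label, the destination is always one of the known folders.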
Context is part of the same equation. A context window is finite, and output quality tends to drop when the system carries too much into a session. Instead of stuffing more into the prompt, it is usually better to retrieve only what is needed and let the agent operate on a smaller, cleaner view of the task.
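A minimal sketch of "retrieve only what is needed" might rank stored chunks against the query and stop at a budget. The scoring here is naive word overlap and the budget is counted in characters as a rough stand-in for tokens. Both are simplifying assumptions, and select_context is a hypothetical helper.

```python
from typing import List


def score(query: str, chunk: str) -> int:
    """Count distinct words shared by the query and the chunk."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))


def select_context(query: str, chunks: List[str], budget_chars: int) -> List[str]:
    """Pick the most relevant chunks that fit inside a character budget."""
    ranked = sorted(chunks, key=lambda c: score(query, c), reverse=True)
    picked: List[str] = []
    used = 0
    for chunk in ranked:
        if score(query, chunk) == 0:
            break  # everything after this point is unrelated
        if used + len(chunk) > budget_chars:
            continue  # too big for the remaining budget; try a smaller one
        picked.append(chunk)
        used += len(chunk)
    return picked
```

A real system would likely swap in a better relevance measure, but the shape stays the same: rank, cap, and pass along only what fits.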
Token economics reinforce the same lesson. Every extra round trip, every redundant summary, and every oversized prompt has a cost. Programmatic tooling helps because it lets code handle the obvious steps first and leaves the model to do the part that actually needs judgment.
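As a rough illustration of letting code handle the obvious steps first, the sketch below tries cheap rules and only falls back to a model when none apply. The labels are invented, and ask_model is a hypothetical callable standing in for whatever model client is in use.

```python
from typing import Callable, Optional


def rule_based_label(message: str) -> Optional[str]:
    """Handle the obvious cases without spending any tokens."""
    text = message.strip().lower()
    if not text:
        return "empty"
    if text in {"hi", "hello"}:
        return "greeting"
    if text.endswith("?"):
        return "question"
    return None  # not obvious; leave it to the model


def label(message: str, ask_model: Callable[[str], str]) -> str:
    """Try deterministic rules first, then fall back to the model."""
    decided = rule_based_label(message)
    if decided is not None:
        return decided
    return ask_model(message)
```

Every message the rules catch is one round trip that never happens.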
Even then, human judgment does not go away. The best results still come from a loop where the model produces something, the system checks it, and a human or a clear rule decides whether it is good enough. Simple systems are not less capable. They are just easier to keep honest.
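That produce, check, decide loop can be sketched in a few lines. Here generate stands in for a model call that receives the previous round's problems, and check_summary is a hypothetical rule set. The specific rules and the three-attempt limit are assumptions made for illustration.

```python
from typing import Callable, List


def check_summary(draft: str) -> List[str]:
    """Return a list of problems; an empty list means the draft passes."""
    problems = []
    if not draft.strip():
        problems.append("empty output")
    if len(draft) > 200:
        problems.append("longer than 200 characters")
    if "TODO" in draft:
        problems.append("contains a TODO placeholder")
    return problems


def run_with_checks(
    generate: Callable[[List[str]], str],
    check: Callable[[str], List[str]],
    max_attempts: int = 3,
) -> dict:
    """Generate, check, and retry; escalate to a human if checks keep failing."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    feedback: List[str] = []
    draft = ""
    for attempt in range(1, max_attempts + 1):
        draft = generate(feedback)  # earlier problems are passed back in
        feedback = check(draft)
        if not feedback:
            return {"status": "accepted", "draft": draft, "attempts": attempt}
    return {
        "status": "needs_human",
        "draft": draft,
        "attempts": max_attempts,
        "problems": feedback,
    }
```

The loop never decides that a failing draft is good enough. It either passes the explicit rules or is handed to a person along with the list of problems.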
If an agent stack becomes hard to explain, that is usually the warning sign. Shrink the prompt, move logic into code, add a skill instead of a memory layer, and keep the harness small enough that you can reason about it end to end.